Decay, Recall, Stickiness: How a Living Context Engine Remembers and Forgets
Memory is what survives a forgetting function. This post walks through how Feather DB's decay model, recall counter, and stickiness multiplier compose into a memory that actually behaves like memory.
Theory · Living Context Engine Series · May 2026
Memory Is What Survives Forgetting
A fact you knew once and have never recalled since is, functionally, not a memory. It is a fossil. A useful memory system distinguishes between what is fossilized and what is live, and ranks accordingly. Static vector stores cannot — every entry is equally vivid, regardless of when it was added or how often it has been used. The result is a corpus where the present and the past compete on equal footing, and the past wins by volume.
A Living Context Engine inverts this. Forgetting is not a bug to be patched — it is the mechanism that keeps the present sharp. The challenge is making forgetting intelligent: shallow forgetting for frequently-used context, deep forgetting for context no one references.
The Three Counters
Every node in Feather DB carries three pieces of decay state:
{
"inserted_at": 1715760000, # epoch seconds
"recall_count": 0, # times retrieved
"importance": 1.0, # explicit priority
}
These three counters are enough to produce a rich model of memory. Each one captures a different dimension of "how present should this be."
Insertion Time → Calendar Age
Raw insertion time gives you calendar age. A node inserted 30 days ago has age_days = 30. Calendar age alone is too crude — it treats a daily-referenced doc the same as an untouched one — but it is the baseline signal.
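As a minimal sketch (the helper name and signature are illustrative, not Feather DB's API), calendar age is just the elapsed time since `inserted_at`, converted to days:

```python
import time

SECONDS_PER_DAY = 86_400

def age_days(inserted_at, now=None):
    """Calendar age of a node in days, from its epoch-seconds timestamp."""
    now = time.time() if now is None else now
    return (now - inserted_at) / SECONDS_PER_DAY

# A node inserted 30 days before "now" has age_days == 30.0.
```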
Recall Count → Stickiness
Every successful retrieval increments the recall counter. Stickiness is logarithmic in recall count:
stickiness = 1 + ln(1 + recall_count)
The log keeps the curve bounded. Ten recalls produces stickiness ~3.4. A hundred recalls produces ~5.6. A thousand recalls produces ~7.9. Nothing dominates the index just by being looked at often.
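The formula above translates directly into a one-liner; the function name is illustrative:

```python
import math

def stickiness(recall_count):
    """Logarithmic stickiness: growth in recall count stays bounded."""
    return 1 + math.log(1 + recall_count)

# stickiness(10) is ~3.4, stickiness(100) ~5.6, stickiness(1000) ~7.9:
# two orders of magnitude more recalls buy barely 2.3x the stickiness.
```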
Importance → Explicit Priority Multiplier
Some context is important regardless of when it was written or how often it has been recalled. A founder's strategic principle. A regulatory constraint. A safety guardrail. importance is an explicit multiplier on the final score — by default 1.0, configurable up to ~3.0 for material the system should never let drift.
The Composite Score
The three counters compose with similarity into a single final score at retrieval time:
stickiness = 1 + ln(1 + recall_count)
effective_age = age_days / stickiness
recency = 0.5 ** (effective_age / half_life)
score = ((1 - tw) * similarity + tw * recency) * importance
Here tw is the time weight: the fraction of the final score driven by recency rather than semantic similarity.
Five dimensions collapsed into one scalar. The retrieval kernel sorts by this scalar. The agent sees a ranked list where presence, freshness, fit, and explicit priority have all been considered.
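The whole pipeline fits in a few lines. This is a sketch, not Feather DB's retrieval kernel; the `tw=0.3` default in particular is an assumption, since the post does not specify one:

```python
import math

def composite_score(similarity, age_days, recall_count, importance,
                    half_life=90.0, tw=0.3):
    """Collapse similarity, age, recall, and importance into one scalar.

    tw (time weight) is an assumed default; tune it per deployment.
    """
    stickiness = 1 + math.log(1 + recall_count)      # bounded recall boost
    effective_age = age_days / stickiness            # recalls shrink age
    recency = 0.5 ** (effective_age / half_life)     # exponential decay
    return ((1 - tw) * similarity + tw * recency) * importance
```

A brand-new, perfectly similar node at default importance scores exactly 1.0: recency is 1 at age zero, so the blend `(1 - tw) + tw` collapses to 1.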
What the Math Produces
Three concrete behaviors you can observe in production:
Spaced Repetition Falls Out
A piece of context recalled today has recall_count incremented, which raises stickiness, which lowers effective_age, which raises recency, which raises the final score next time. The result is a virtual spaced-repetition schedule — frequently-used context stays sharp without any external scheduler.
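You can see the feedback loop in isolation by holding age fixed and varying only the recall count (helper name illustrative):

```python
import math

def recency(age_days, recall_count, half_life=90.0):
    """Recency after the stickiness divisor shrinks effective age."""
    effective_age = age_days / (1 + math.log(1 + recall_count))
    return 0.5 ** (effective_age / half_life)

# Same 60-day-old node, never recalled vs. recalled 20 times:
cold = recency(60, 0)    # effective age 60 days
warm = recency(60, 20)   # effective age ~15 days: the node "feels" recent
```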
The Long Tail Self-Suppresses
A vector store with 10 million entries, of which 100,000 are actively used, behaves at query time like a store with 100,000 entries. The other 9.9 million decay to near-zero recency. They are still in the index — recoverable via importance overrides or topic filters — but they no longer pollute hot-path queries.
Importance Survives Time
An importance multiplier of 2.0 keeps a piece of context near the top even at very high effective ages. This is how you encode "this still matters" without retaining everything. Strategic principles get importance 2.0–3.0; routine notes get the default 1.0; transient logs get 0.5–0.8.
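Under the composite formula, a ten-month-old principle at importance 2.0 can still outrank a day-old routine note, even with a similarity tie. A sketch, reusing the scoring math with an assumed time weight of 0.3:

```python
import math

def final_score(similarity, age_days, recall_count, importance,
                half_life=90.0, tw=0.3):
    s = 1 + math.log(1 + recall_count)
    recency = 0.5 ** ((age_days / s) / half_life)
    return ((1 - tw) * similarity + tw * recency) * importance

# 300-day-old strategic principle at importance 2.0, vs. a fresh
# default-importance note with identical similarity:
old_principle = final_score(0.7, 300, 5, 2.0)
fresh_note    = final_score(0.7, 1, 0, 1.0)
```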
Tuning the Half-Life
The half_life parameter is the lever you tune per use case:
- 7 days — operational logs, ad performance traces, support ticket context.
- 30 days — campaign briefs, mid-cycle strategy documents.
- 90 days (default) — quarterly briefs, audience research, competitor intelligence.
- 365 days — foundational strategy, brand guidelines, regulatory baselines.
You can run multiple half-lives in the same store by attaching the half-life as a per-node parameter. The retrieval kernel reads it per-node and applies the right curve.
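A sketch of that per-node lookup, assuming a `half_life` field on the node record and a 90-day store default (the field and function names are illustrative):

```python
import math

DEFAULT_HALF_LIFE = 90.0

def node_recency(node):
    """Apply the node's own half-life, falling back to the store default."""
    hl = node.get("half_life", DEFAULT_HALF_LIFE)
    s = 1 + math.log(1 + node["recall_count"])
    return 0.5 ** ((node["age_days"] / s) / hl)

# Two nodes of identical age decay at very different rates:
log_entry = {"age_days": 14, "recall_count": 0, "half_life": 7.0}
brand_doc = {"age_days": 14, "recall_count": 0, "half_life": 365.0}
```

After two weeks the operational log has lost three quarters of its recency, while the brand guideline has barely moved.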
The Failure Modes
Two honest pitfalls of decay-based scoring:
- Importance inflation. If everything is marked important, nothing is. Reserve importance >1.0 for genuinely cross-cutting material. The default should be 1.0.
- Recall pollution. If your application retrieves on every keystroke, recall counts inflate without semantic justification. Use a debounced retrieval pattern, or bump recall_count only on confirmed-useful results (the agent quoted them, the user clicked).
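The confirmed-useful pattern can be as simple as deferring the counter bump until the retrieval proves itself (function and parameter names are illustrative):

```python
def bump_on_confirmation(node, used_in_answer):
    """Increment recall_count only after the retrieval proved useful,
    e.g. the agent quoted the node or the user clicked it."""
    if used_in_answer:
        node["recall_count"] += 1

node = {"recall_count": 0}
bump_on_confirmation(node, used_in_answer=False)  # speculative retrieval: no bump
bump_on_confirmation(node, used_in_answer=True)   # agent quoted it: bump
```

The design choice here is to treat retrieval as a hypothesis and confirmation as the evidence; only evidence feeds back into stickiness.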
Memory as a Design Material
The lesson from decay-based scoring is that memory is not a storage problem. It is a ranking problem with side effects on storage. Once you accept that, the design space opens up — you start tuning forgetting curves the way you tune cache eviction policies, and the system starts behaving like a memory rather than a search index. That is the foundation of every other Living Context Engine property.
Part of the Living Context Engine series. Next: Why RAG Stops Working After 90 Days.