Decay, Recall, Stickiness: How a Living Context Engine Remembers and Forgets
Memory is what survives a forgetting function. This post walks through how Feather DB's decay model, recall counter, and stickiness multiplier compose into a memory that actually behaves like memory.
Theory · Living Context Engine Series · May 2026
Memory Is What Survives Forgetting
A fact you knew once and have never recalled since is, functionally, not a memory. It is a fossil. A useful memory system distinguishes between what is fossilized and what is live, and ranks accordingly. Static vector stores cannot — every entry is equally vivid, regardless of when it was added or how often it has been used. The result is a corpus where the present and the past compete on equal footing, and the past wins by volume.
A Living Context Engine inverts this. Forgetting is not a bug to be patched — it is the mechanism that keeps the present sharp. The challenge is making forgetting intelligent: shallow forgetting for frequently-used context, deep forgetting for context no one references.
The Three Counters
Every node in Feather DB carries three pieces of decay state:
{
"inserted_at": 1715760000, # epoch seconds
"recall_count": 0, # times retrieved
"importance": 1.0, # explicit priority
}
These three counters are enough to produce a rich model of memory. Each one captures a different dimension of "how present should this be."
Insertion Time → Calendar Age
Raw insertion time gives you calendar age. A node inserted 30 days ago has age_days = 30. Calendar age alone is too crude — it treats a daily-referenced doc the same as an untouched one — but it is the baseline signal.
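As a minimal sketch (the helper name and signature are illustrative, not Feather DB's API), calendar age is just the elapsed time since `inserted_at`, converted to days:

```python
import time

SECONDS_PER_DAY = 86_400

def age_days(inserted_at, now=None):
    """Calendar age of a node in days, from its epoch-seconds timestamp."""
    now = time.time() if now is None else now
    return (now - inserted_at) / SECONDS_PER_DAY

# A node inserted 30 days before "now" has age_days == 30.0.
```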
Recall Count → Stickiness
Every successful retrieval increments the recall counter. Stickiness is logarithmic in recall count:
stickiness = 1 + ln(1 + recall_count)
The log keeps the curve bounded. Ten recalls produces stickiness ~3.4. A hundred recalls produces ~5.6. A thousand recalls produces ~7.9. Nothing dominates the index just by being looked at often.
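The formula above translates directly into a one-liner; the function name is illustrative:

```python
import math

def stickiness(recall_count):
    """Logarithmic stickiness: growth in recall count stays bounded."""
    return 1 + math.log(1 + recall_count)

# stickiness(10) is ~3.4, stickiness(100) ~5.6, stickiness(1000) ~7.9:
# two orders of magnitude more recalls buy barely 2.3x the stickiness.
```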
Importance → Explicit Priority Multiplier
Some context is important regardless of when it was written or how often it has been recalled. A founder's strategic principle. A regulatory constraint. A safety guardrail. importance is an explicit multiplier on the final score — by default 1.0, configurable up to ~3.0 for material the system should never let drift.
The Composite Score
The three counters compose with similarity into a single final score at retrieval time:
stickiness = 1 + ln(1 + recall_count)
effective_age = age_days / stickiness
recency = 0.5 ** (effective_age / half_life)
score = ((1 - tw) * similarity + tw * recency) * importance
Here tw is the time weight: the fraction of the final score driven by recency rather than semantic similarity.
Five dimensions collapsed into one scalar. The retrieval kernel sorts by this scalar. The agent sees a ranked list where presence, freshness, fit, and explicit priority have all been considered.
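The whole pipeline fits in a few lines. This is a sketch, not Feather DB's retrieval kernel; the `tw=0.3` default in particular is an assumption, since the post does not specify one:

```python
import math

def composite_score(similarity, age_days, recall_count, importance,
                    half_life=90.0, tw=0.3):
    """Collapse similarity, age, recall, and importance into one scalar.

    tw (time weight) is an assumed default; tune it per deployment.
    """
    stickiness = 1 + math.log(1 + recall_count)      # bounded recall boost
    effective_age = age_days / stickiness            # recalls shrink age
    recency = 0.5 ** (effective_age / half_life)     # exponential decay
    return ((1 - tw) * similarity + tw * recency) * importance
```

A brand-new, perfectly similar node at default importance scores exactly 1.0: recency is 1 at age zero, so the blend `(1 - tw) + tw` collapses to 1.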
What the Math Produces
Three concrete behaviors you can observe in production:
Spaced Repetition Falls Out
A piece of context recalled today has recall_count incremented, which raises stickiness, which lowers effective_age, which raises recency, which raises the final score next time. The result is a virtual spaced-repetition schedule — frequently-used context stays sharp without any external scheduler.
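You can see the feedback loop in isolation by holding age fixed and varying only the recall count (helper name illustrative):

```python
import math

def recency(age_days, recall_count, half_life=90.0):
    """Recency after the stickiness divisor shrinks effective age."""
    effective_age = age_days / (1 + math.log(1 + recall_count))
    return 0.5 ** (effective_age / half_life)

# Same 60-day-old node, never recalled vs. recalled 20 times:
cold = recency(60, 0)    # effective age 60 days
warm = recency(60, 20)   # effective age ~15 days: the node "feels" recent
```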
The Long Tail Self-Suppresses
A vector store with 10 million entries, of which 100,000 are actively used, behaves at query time like a store with 100,000 entries. The other 9.9 million decay to near-zero recency. They are still in the index — recoverable via importance overrides or topic filters — but they no longer pollute hot-path queries.
Importance Survives Time
An importance multiplier of 2.0 keeps a piece of context near the top even at very high effective ages. This is how you encode "this still matters" without retaining everything. Strategic principles get importance 2.0–3.0; routine notes get the default 1.0; transient logs get 0.5–0.8.
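Under the composite formula, a ten-month-old principle at importance 2.0 can still outrank a day-old routine note, even with a similarity tie. A sketch, reusing the scoring math with an assumed time weight of 0.3:

```python
import math

def final_score(similarity, age_days, recall_count, importance,
                half_life=90.0, tw=0.3):
    s = 1 + math.log(1 + recall_count)
    recency = 0.5 ** ((age_days / s) / half_life)
    return ((1 - tw) * similarity + tw * recency) * importance

# 300-day-old strategic principle at importance 2.0, vs. a fresh
# default-importance note with identical similarity:
old_principle = final_score(0.7, 300, 5, 2.0)
fresh_note    = final_score(0.7, 1, 0, 1.0)
```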
Tuning the Half-Life
The half_life parameter is the lever you tune per use case:
- 7 days — operational logs, ad performance traces, support ticket context.
- 30 days — campaign briefs, mid-cycle strategy documents.
- 90 days (default) — quarterly briefs, audience research, competitor intelligence.
- 365 days — foundational strategy, brand guidelines, regulatory baselines.
You can run multiple half-lives in the same store by attaching the half-life as a per-node parameter. The retrieval kernel reads it per-node and applies the right curve.
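A sketch of that per-node lookup, assuming a `half_life` field on the node record and a 90-day store default (the field and function names are illustrative):

```python
import math

DEFAULT_HALF_LIFE = 90.0

def node_recency(node):
    """Apply the node's own half-life, falling back to the store default."""
    hl = node.get("half_life", DEFAULT_HALF_LIFE)
    s = 1 + math.log(1 + node["recall_count"])
    return 0.5 ** ((node["age_days"] / s) / hl)

# Two nodes of identical age decay at very different rates:
log_entry = {"age_days": 14, "recall_count": 0, "half_life": 7.0}
brand_doc = {"age_days": 14, "recall_count": 0, "half_life": 365.0}
```

After two weeks the operational log has lost three quarters of its recency, while the brand guideline has barely moved.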
The Failure Modes
Two honest pitfalls of decay-based scoring:
- Importance inflation. If everything is marked important, nothing is. Reserve importance >1.0 for genuinely cross-cutting material. The default should be 1.0.
- Recall pollution. If your application retrieves on every keystroke, recall counts inflate without semantic justification. Use a debounced retrieval pattern, or bump recall_count only on confirmed-useful results (the agent quoted them, the user clicked).
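The confirmed-useful pattern can be as simple as deferring the counter bump until the retrieval proves itself (function and parameter names are illustrative):

```python
def bump_on_confirmation(node, used_in_answer):
    """Increment recall_count only after the retrieval proved useful,
    e.g. the agent quoted the node or the user clicked it."""
    if used_in_answer:
        node["recall_count"] += 1

node = {"recall_count": 0}
bump_on_confirmation(node, used_in_answer=False)  # speculative retrieval: no bump
bump_on_confirmation(node, used_in_answer=True)   # agent quoted it: bump
```

The design choice here is to treat retrieval as a hypothesis and confirmation as the evidence; only evidence feeds back into stickiness.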
Memory as a Design Material
The lesson from decay-based scoring is that memory is not a storage problem. It is a ranking problem with side effects on storage. Once you accept that, the design space opens up — you start tuning forgetting curves the way you tune cache eviction policies, and the system starts behaving like a memory rather than a search index. That is the foundation of every other Living Context Engine property.
Part of the Living Context Engine series. Next: Why RAG Stops Working After 90 Days.