Theory · 7 min read · June 16, 2026

What Makes Context 'Living'? Decay, Stickiness, and Feedback Loops Explained

A static vector store grows. A living context engine evolves. The difference comes down to three properties: temporal decay, recall-based stickiness, and the feedback loop they create together.

Feather DB
Engineering

The static store problem

A vector store is a static artifact. You add vectors. You query vectors. The store has no opinion about which memories matter more, which have become stale, or which have proven consistently useful. After 6 months of operation, a store with 50,000 entries treats each one identically during retrieval. Day-one memories compete equally with yesterday's memories. Never-recalled junk competes equally with facts the agent has surfaced 100 times.

A context engine is different. It holds three properties that a static store lacks: temporal decay, recall-based stickiness, and a feedback loop between the two. Together, these properties make the store behave less like a filing cabinet and more like working memory — where relevance is adaptive, not fixed.

Property 1: Temporal decay

Memories become less relevant with time. A preference the user stated 18 months ago may have changed. A bug fix from two years ago is probably irrelevant to today's debugging session. Temporal decay models this intuition mathematically.

In Feather DB, the recency score for a node is:

recency = 0.5 ^ (age_in_days / half_life_days)

With a default half_life of 30 days:

  • A 0-day-old memory has recency = 1.0
  • A 30-day-old memory has recency = 0.5
  • A 60-day-old memory has recency = 0.25
  • A 90-day-old memory has recency = 0.125
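
The half-life formula is small enough to check by hand. A minimal sketch in plain Python reproduces the values listed above; the function name recency_score is illustrative and not part of the Feather DB API:

```python
def recency_score(age_in_days: float, half_life_days: float = 30.0) -> float:
    """Exponential decay: the score halves every `half_life_days`."""
    if half_life_days <= 0:
        raise ValueError("half_life_days must be positive")
    return 0.5 ** (age_in_days / half_life_days)


# Print the recency of memories that are 0, 30, 60 and 90 days old.
for age in (0, 30, 60, 90):
    print(f"{age:>2} days -> {recency_score(age):.3f}")
```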

Recency is combined with semantic similarity in the final score:

final_score = ((1 - time_weight) * similarity + time_weight * recency) * importance

With time_weight=0.3, a 30-day-old memory (recency 0.5) gives up 0.3 × 0.5 = 0.15 points of score to a brand-new memory of equal importance. To close that gap it needs a cosine similarity about 0.21 higher (0.15 / 0.7). Old memories can still surface; they just need to be more semantically relevant to compensate.
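
To make the trade-off concrete, here is a sketch of the scoring formula in plain Python. The function names are illustrative and not Feather DB's API, and the similarity of 0.60 is an arbitrary example value. The helper similarity_gap_to_match simply solves the formula for the extra similarity an older memory needs when both memories have the same importance:

```python
def final_score(similarity, recency, time_weight=0.3, importance=1.0):
    """Blend similarity and recency, then scale by importance."""
    return ((1 - time_weight) * similarity + time_weight * recency) * importance


def similarity_gap_to_match(recency_old, recency_new=1.0, time_weight=0.3):
    """Extra similarity the older memory needs to tie the newer one.

    Setting the two final scores equal (same importance) and solving gives
    time_weight * (recency_new - recency_old) / (1 - time_weight).
    """
    if time_weight >= 1:
        raise ValueError("with time_weight = 1, similarity cannot compensate")
    return time_weight * (recency_new - recency_old) / (1 - time_weight)


# A 30-day-old memory (recency 0.5) versus a brand-new one (recency 1.0).
gap = similarity_gap_to_match(recency_old=0.5)
fresh = final_score(similarity=0.60, recency=1.0)
aged = final_score(similarity=0.60 + gap, recency=0.5)
print(f"gap={gap:.3f} fresh={fresh:.3f} aged={aged:.3f}")
```

Raising time_weight widens the gap quickly because the denominator shrinks as the numerator grows.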

Property 2: Recall-based stickiness

Decay alone would cause a serious problem: frequently accessed, genuinely important facts would decay just as fast as rarely accessed junk. A user's name, written once and recalled every day for a year, would lose half its recency every 30 days like everything else.

Stickiness solves this. Each time a memory is retrieved, its recall count increments. The stickiness factor, computed with the natural logarithm, slows the effective aging rate:

stickiness    = 1 + log(1 + recall_count)
effective_age = age_in_days / stickiness
recency       = 0.5 ^ (effective_age / half_life_days)

Here is how stickiness progresses as recall count grows:

recall_count   stickiness   effective age (at day 60)   recency score
0              1.00         60.0 days                   0.25
5              2.79         21.5 days                   0.61
10             3.40         17.7 days                   0.66
20             4.04         14.8 days                   0.71
50             4.93         12.2 days                   0.75
100            5.62         10.7 days                   0.78

A memory recalled 50 times has an effective age of about 12 days even though it is actually 60 days old: it ages at roughly 20% of the normal rate (1 / 4.93). A memory recalled 100 times scores like an unrecalled memory about 11 days old. Because stickiness grows logarithmically, each additional recall adds less than the one before.

This is the mathematical model of the intuition: things you use often stay accessible. Things you don't use fade.
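
The stickiness formulas can be sketched in a few lines of plain Python. This is an illustration of the math as stated in this section, not Feather DB's internal code; the function names are assumptions, and log is taken as the natural logarithm:

```python
import math


def stickiness(recall_count: int) -> float:
    """1 + ln(1 + recall_count); equals 1.0 for a never-recalled memory."""
    return 1.0 + math.log(1.0 + recall_count)


def sticky_recency(age_in_days, recall_count, half_life_days=30.0):
    """Recency computed from the effective (stickiness-adjusted) age."""
    effective_age = age_in_days / stickiness(recall_count)
    return 0.5 ** (effective_age / half_life_days)


# Tabulate a 60-day-old memory at increasing recall counts:
# recall count, stickiness, effective age in days, recency score.
for count in (0, 5, 10, 20, 50, 100):
    s = stickiness(count)
    print(f"{count:>3}  {s:.2f}  {60 / s:5.1f}  {sticky_recency(60, count):.2f}")
```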

Property 3: The feedback loop

Decay and stickiness interact through a feedback loop that gives the context engine its "living" quality:

  1. A memory is retrieved → its recall count increments → its stickiness increases
  2. Higher stickiness → lower effective age → higher recency score
  3. Higher recency score → more likely to be retrieved again
  4. More retrievals → even higher recall count → even higher stickiness

This is a positive feedback loop. Useful memories become more accessible over time. The converse is also true: memories that are never retrieved gain no stickiness, age at the normal rate, and eventually drop out of the top-k results unless a query targets them closely.

The result is self-organization. Without any explicit curation by the developer, the agent's effective working memory concentrates around what has been useful. High-value facts rise. Stale junk sinks.
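
A toy simulation makes the loop visible. The sketch below is a simplification under stated assumptions, not Feather DB's retrieval code: two hypothetical nodes are stored on the same day with identical similarity (0.80) to a query that runs once per day, only the top result is returned (k=1), and one node starts with a single earlier recall.

```python
import math


def recency(age_days, recall_count, half_life_days=30.0):
    """Recency with stickiness applied to the age."""
    stickiness = 1.0 + math.log(1.0 + recall_count)
    return 0.5 ** ((age_days / stickiness) / half_life_days)


def simulate(days=60, k=1, time_weight=0.3):
    """Run one query per day and return the final recall counts."""
    # name -> [similarity to the recurring query, recall_count]
    nodes = {"seeded": [0.80, 1], "unseeded": [0.80, 0]}
    for day in range(1, days + 1):
        ranked = sorted(
            nodes,
            key=lambda n: (1 - time_weight) * nodes[n][0]
            + time_weight * recency(day, nodes[n][1]),
            reverse=True,
        )
        # Every returned node gets its recall count incremented.
        for name in ranked[:k]:
            nodes[name][1] += 1
    return {name: count for name, (_, count) in nodes.items()}


print(simulate())
```

With equal similarity, the single early recall is enough to tip every later ranking, so the seeded node collects all subsequent recalls while the other one never gains stickiness.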

Worked example: a coding assistant

Consider a coding assistant operating for 3 months. It has stored 2,000 memory nodes. Let's trace three specific nodes:

Node A: "User's name is Ashwath" — stored 90 days ago, recalled 45 times (every session).

stickiness    = 1 + log(1 + 45) = 1 + 3.83 = 4.83
effective_age = 90 / 4.83 = 18.6 days
recency       = 0.5 ^ (18.6 / 30) = 0.65

Node B: "Bug in payment handler — fixed in PR #88" — stored 90 days ago, recalled 3 times (during the debugging session).

stickiness    = 1 + log(1 + 3) = 1 + 1.39 = 2.39
effective_age = 90 / 2.39 = 37.7 days
recency       = 0.5 ^ (37.7 / 30) = 0.42

Node C: "User asked what time the standup is" — stored 90 days ago, recalled 0 times.

stickiness    = 1 + log(1 + 0) = 1.0
effective_age = 90 / 1.0 = 90 days
recency       = 0.5 ^ (90 / 30) = 0.125

Node A (the user's name) still scores 0.65 on recency after 90 days, the same as an unrecalled memory under three weeks old. Node B (the bug fix) is moderately aged. Node C (the standup question) has decayed to 12.5% recency, so it will rarely surface in top-k results unless the query is highly specific.

This matches the intuition perfectly: the user's name should always be accessible, the bug fix should be accessible during related debugging sessions, and the standup question is trivia that should fade.
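
The three nodes can be checked with a short script. This is a sketch of the arithmetic above in plain Python; the MemoryNode class is a hypothetical stand-in for illustration and not a Feather DB type:

```python
import math
from dataclasses import dataclass


@dataclass
class MemoryNode:
    label: str
    age_in_days: float
    recall_count: int

    def recency(self, half_life_days: float = 30.0) -> float:
        """Recency after dividing the age by the stickiness factor."""
        stickiness = 1.0 + math.log(1.0 + self.recall_count)
        return 0.5 ** ((self.age_in_days / stickiness) / half_life_days)


nodes = [
    MemoryNode("A: user's name", 90, 45),
    MemoryNode("B: payment handler bug fix", 90, 3),
    MemoryNode("C: standup question", 90, 0),
]

# Rank by recency alone; similarity and importance are left out here.
for node in sorted(nodes, key=lambda n: n.recency(), reverse=True):
    print(f"{node.label:<28} recency = {node.recency():.3f}")
```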

Tuning the parameters

Three parameters control the system:

  • half_life: how quickly unrecalled memories decay. Short half-life (7 days) for high-churn agents. Long half-life (90 days) for slow-evolving knowledge bases.
  • time_weight: how much recency contributes to the final score vs. pure semantic similarity. 0.0 = pure similarity (static vector store behavior). 1.0 = pure recency. 0.3 is a balanced default.
  • importance: a per-node multiplier set at ingest time. Use it to permanently boost high-confidence facts (name, preferences, explicit instructions) regardless of recall history.

These three controls let you tune the system for your workload without touching the underlying formula.
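
To see how the controls interact, the sketch below scores one memory under several presets. Everything here is illustrative: the ScoringParams class, the preset names, and the example memory (30 days old, never recalled, similarity 0.8, importance 1.0) are assumptions for the demonstration, not Feather DB defaults or types.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class ScoringParams:
    half_life_days: float = 30.0
    time_weight: float = 0.3


def score(similarity, age_in_days, recall_count, importance, params):
    """Final score from this article's formulas, for one memory."""
    stickiness = 1.0 + math.log(1.0 + recall_count)
    recency = 0.5 ** ((age_in_days / stickiness) / params.half_life_days)
    blended = (1 - params.time_weight) * similarity + params.time_weight * recency
    return blended * importance


presets = {
    "high-churn agent": ScoringParams(half_life_days=7.0),
    "balanced default": ScoringParams(),
    "slow knowledge base": ScoringParams(half_life_days=90.0),
    "pure similarity": ScoringParams(time_weight=0.0),
}

# Same memory under each preset: 30 days old, never recalled, similarity 0.8.
for name, params in presets.items():
    value = score(0.8, age_in_days=30, recall_count=0, importance=1.0, params=params)
    print(f"{name:<20} {value:.3f}")
```

A shorter half-life lowers the score of the same unrecalled memory, and time_weight=0.0 reduces the score to similarity times importance.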

What "living" actually means

A context engine is called "living" not because it's magic, but because it exhibits emergent behavior through feedback: the useful rises, the stale sinks, and the distinction is driven by actual usage patterns rather than manual curation.

It's the difference between a library where books never leave the shelf they were first placed on, and a desk where frequently-used references stay on top and rarely-touched papers gradually migrate to the bottom of the pile. The desk is more useful. The context engine is the desk.

In code, decay is configured at query time and importance is set at ingest:

import feather_db as fdb

db = fdb.DB.open("living_memory.feather", dim=768)

# query_vec and embed() are placeholders: supply a 768-dimensional query
# embedding and your own embedding function to match dim=768.

# Each context_chain call increments recall_count for returned nodes,
# so stickiness builds automatically with no extra code
results = db.context_chain(
    query_vec,
    k=10,
    hops=2,
    half_life=30,     # 30-day half-life for normal decay
    time_weight=0.3   # 30% recency weight
)

# Set a high importance at ingest for facts that should always surface
meta = fdb.Metadata(importance=0.95)  # User identity
meta.set_attribute("text", "User's name is Ashwath")
db.add(id=1, vec=embed("User's name is Ashwath"), meta=meta)

Install: pip install feather-db · GitHub: github.com/feather-store/feather