Context Drift: The Silent Bug Killing Long-Running AI Agents

What context drift looks like

Context drift is the gradual divergence between an agent's stored memories and current reality. It's insidious because it doesn't produce errors — it produces outdated, confidently-stated responses.

A coding assistant that remembered "we use React class components" from a conversation 8 months ago will still surface that memory today, even after 200 conversations where hooks were used exclusively. The fact is still in the vector store. Its embedding is still similar to queries about component structure. It will keep coming up.

A customer support agent that stored "user is on the Pro plan" from January will still retrieve that fact in June, even if the user downgraded in March. The vector store has no mechanism for "this fact was superseded."

The root cause: equal-weight retrieval

Most vector stores treat every stored chunk with equal retrieval weight. A fact from 3 years ago is as likely to be retrieved as a fact from yesterday, assuming similar cosine similarity to the query.

This is correct behavior for static knowledge bases — an embedding of a 1990 paper about sorting algorithms is as valid today as it was then. But for agent memory — where facts are observations about a changing world — equal-weight retrieval guarantees drift over time.

The three failure modes

1. Stale fact retrieval
A user's preference, plan status, or behavior has changed, but the old fact is still retrieved because it's similar to the query. The agent responds based on outdated information.

2. Noise accumulation
As the store grows, low-quality facts (misunderstandings, corrections, out-of-context chunks) accumulate. They're never removed and never deprioritized. Signal-to-noise degrades.

3. Missing feedback loop
The agent has no way to learn which retrieved facts actually helped answer questions correctly. Every retrieval is treated identically regardless of whether it produced a good outcome.

The fix: adaptive decay + stickiness

A context engine addresses all three failure modes:

Stale fact retrieval → temporal decay
Facts that haven't been recalled recently get a recency penalty. Stale facts stop competing with fresh ones unless the query specifically targets them.

Noise accumulation → importance weighting
Facts ingested with high importance scores (set at ingest time from signals like verification, source confidence, or frequency) maintain their ranking. Low-importance facts decay faster.

Missing feedback loop → recall-based stickiness
Each time a fact is retrieved, its recall_count increments and its last_recalled_at timestamp updates. Facts that are retrieved frequently resist decay — they become stickier. The agent implicitly learns which facts are valuable through usage patterns.

Implementing the feedback loop

import feather_db as fdb

db = fdb.DB.open("agent_memory.feather", dim=768)

# Search — automatically increments recall_count on returned nodes
results = db.search(
    query_vec, k=10,
    half_life=30,       # days until unrecalled memory halves in recency
    time_weight=0.3     # 30% weight on recency, 70% on similarity
)

# At the end of a successful interaction, update importance
# on the memories that led to a good outcome
for result in results[:3]:  # top 3 used memories
    meta = db.get_metadata(result.id)
    # Boost importance slightly for facts that helped
    new_importance = min(1.0, meta.importance + 0.05)
    db.set_metadata(result.id, fdb.Metadata(importance=new_importance))

Handling explicit updates and contradictions

When a fact is explicitly superseded ("I switched from React to Vue"), the right pattern is not just to add a new fact — it's to demote the old one:

# Find the old fact
old_facts = db.search(embed("frontend framework preference"), k=3)

# Demote it by lowering importance and adding a contradicted_by edge
for f in old_facts:
    meta = db.get_metadata(f.id)
    meta.importance = max(0.0, meta.importance - 0.5)  # penalize
    db.set_metadata(f.id, meta)
    db.link(from_id=new_fact_id, to_id=f.id, rel_type="supersedes", weight=1.0)

# Add the new fact with high importance
new_meta = fdb.Metadata(importance=0.9)
db.add(id=new_fact_id, vec=embed("User uses Vue 3 for frontend"), meta=new_meta)

The supersedes edge lets downstream reasoning (context_chain) understand that the new fact replaced the old one, rather than treating both as equally valid.

The drift timeline

With adaptive decay and stickiness:

Day 1: New fact added with fresh recency. Retrieves well.
Day 30: If not recalled, recency score has halved. Still retrieves on strong similarity queries.
Day 90: Recency contribution is near-zero. Only similarity drives retrieval — the fact needs to be very relevant to the specific query to surface.
Day 180: Effectively dormant unless directly queried by ID.

Facts that are recalled regularly stay fresh regardless of calendar time. A user's core preference for Python, recalled in every coding session, stays at day-1 recency even after a year.

This is what separates a context engine from a vector store: it doesn't just retrieve, it learns what's worth remembering.

Install: pip install feather-db · Docs: getfeather.store