Feather DB vs ChromaDB: Which Embedded Vector DB Should You Use?
ChromaDB is the fastest way to add a vector store to your prototype. Feather DB is the right engine when that prototype becomes a production agent that needs memories which fade, strengthen, and connect over time. Here is the technical breakdown.
The one-line version
ChromaDB is the fastest way to add a vector store to your prototype. Feather DB is the right engine when that prototype becomes a production agent that needs memories which fade, strengthen, and connect over time. This post gives you the numbers to decide which situation you are actually in.
Feature comparison
| Feature | Feather DB | ChromaDB |
|---|---|---|
| Adaptive decay (memories that fade and strengthen) | Yes — recall-count stickiness + half-life | No |
| Typed graph edges + BFS traversal | Yes — 9 predefined edge types, context_chain() |
No |
| Hybrid BM25 + dense search via RRF | Yes — built-in, sub-millisecond overhead | No (dense only by default) |
| MCP server support | Yes — feather-serve exposes 14 MCP tools |
No native MCP support |
| int8 quantization (in-RAM) | Yes — set_int8_ram(), 1.76× RAM reduction |
No |
| Cold load speed | 48ms (v0.16.0, typical agent index); 1.7s at 40K vectors with parallel load | Varies; no published cold-start number |
| Python API | Yes — pybind11 C++ core | Yes — native Python |
| Rust support | Yes — feather-db-cli on crates.io |
No |
| p50 ANN latency | 0.19ms (500K × 128-dim SIFT, ef=50) | Not published at this scale |
| LangChain integration | Yes (feather-db-langchain) |
Yes (default LangChain vector store) |
| Embedded (no server required) | Yes — single .feather file |
Yes — local persistent client |
| Community size | Early-stage, growing | Large — 15k+ GitHub stars |
| License | MIT | Apache 2.0 |
What ChromaDB does well
ChromaDB earned its position as the default starting point for RAG prototypes because it genuinely is easy. You can go from pip install chromadb to a working vector search in under 20 lines of Python, the docs are thorough, and LangChain uses it as the out-of-the-box vector store in hundreds of tutorials. That matters when you are trying to validate an idea in an afternoon.
Three concrete strengths:
- Familiar API. The
collection.add()/collection.query()pattern is the closest thing the vector DB world has to a standard interface. If you have built anything with RAG before, ChromaDB's API is probably already in your muscle memory. - Large community. Stack Overflow threads, Discord support, GitHub issues with fast responses, and a large corpus of tutorials all exist for ChromaDB. If you hit an edge case at 2am, someone has probably hit it before.
- LangChain default. For any project that starts from a LangChain example — which is most projects — ChromaDB is the zero-config vector store. You do not have to change anything to use it.
What Feather DB adds
Feather DB is built around a different hypothesis: production AI agents do not just need a vector index, they need a context engine — a store that knows which memories are still relevant, which are connected to what, and which exact terms cannot be paraphrased. ChromaDB does not attempt to solve any of those problems. Feather DB is designed around all three.
Adaptive decay: memories that evolve
The core formula lives in include/scoring.h:
stickiness = 1 + log(1 + recall_count)
effective_age = age_days / stickiness
recency = 0.5 ** (effective_age / half_life)
score = ((1 - tw) * similarity + tw * recency) * importance
Every time a memory is returned by a search and acknowledged, its recall_count increments. A memory recalled ten times at day 90 has an effective age of 37.5 days instead of 90 — it behaves as though it is recent, because it keeps being useful. A memory never recalled fades naturally. No manual curation required; the retrieval pattern becomes the memory signal.
Stickiness progression at a glance:
| recall_count | stickiness | effective age at day 90 |
|---|---|---|
| 0 | 1.00 | 90.0 days |
| 5 | 1.79 | 50.3 days |
| 10 | 2.40 | 37.5 days |
| 20 | 3.04 | 29.6 days |
| 50 | 3.93 | 22.9 days |
ChromaDB has no equivalent. Every document in Chroma is as "relevant" on day 90 as it was on day 1, with no signal about whether it has actually been useful to anyone.
Typed graph edges and graph traversal
Feather DB stores typed directional edges between nodes (same_ad, contradicts, relates_to, and six more). The context_chain() call runs a two-phase retrieval: vector ANN search for seed nodes at hop=0, then BFS expansion across typed edges for hop=1, hop=2. The result is a ranked list that includes both semantically similar nodes and their graph-connected context — without any manual join logic.
ChromaDB has no graph layer. Connections between documents must be maintained externally, and traversal requires application-level code.
Hybrid BM25 + dense search
Dense vector search is excellent at semantic similarity but fails on precise tokens: model names like gpt-4o-mini vs gpt-4o, version strings, user IDs, and technical acronyms all get averaged into similar embedding neighborhoods. Feather DB's hybrid search runs both HNSW ANN and BM25, fused via Reciprocal Rank Fusion (RRF). BM25 only changes the ranking when it has strong exact-match signal; otherwise it adds no noise. The overhead is typically 1.2–1.5× pure dense latency — still sub-millisecond.
ChromaDB defaults to dense-only search. There is no built-in BM25 path.
Latency and cold load
Feather DB is a C++17 core with AVX2/AVX512 SIMD distance kernels. At 500K × 128-dim SIFT vectors with ef=50, p50 latency is 0.19ms and p99 is 0.13ms. Cold load for a typical agent-scale index (v0.16.0) comes in at 48ms. With FEATHER_LOAD_THREADS=8, a 40K × 128-dim index loads in 1.7s versus 7.6s serial — a 4.7× improvement that matters for serverless functions and pod restarts.
MCP server support
feather-serve starts an HTTP server that exposes 14 MCP tools — feather_add, feather_search, feather_hybrid_search, feather_context_chain, feather_add_edge, and more — directly connectable from Claude Desktop, Cursor, or any MCP-compatible client. ChromaDB has no native MCP surface.
int8 quantization
At 60K × 768-dim, calling db.set_int8_ram("text", max_abs=1.0) drops RAM from 227MB to 129MB with recall@10 staying at ~0.88 (vs 0.972 at float32). That 1.76× reduction is meaningful on memory-constrained hosts — a 2GB VPS, a Lambda function, an edge device. ChromaDB has no equivalent in-RAM quantization option.
Code comparison: adding an item with metadata
ChromaDB
import chromadb
client = chromadb.PersistentClient(path="./chroma_store")
collection = client.get_or_create_collection("agent_memory")
collection.add(
ids=["mem_001"],
embeddings=[embedding_vector], # your pre-computed vector
documents=["User prefers concise responses."],
metadatas=[{
"session_id": "sess_42",
"importance": 0.9,
"created_at": "2026-06-16"
}]
)
Feather DB
import feather_db as fdb
db = fdb.DB.open("agent_memory.feather", dim=768)
meta = fdb.Metadata()
meta.importance = 0.9
meta.set_attribute("session_id", "sess_42")
db.add(
vec=embedding_vector,
id=1001,
text="User prefers concise responses.",
meta=meta
)
# Feather automatically tracks recall_count and last_recalled_at.
# At query time, this memory's score is: similarity × recency × importance.
# If it gets recalled 10 times over the next 30 days, it won't decay out.
The APIs are similar in shape. The difference is what happens after ingestion: Feather's scoring engine tracks that memory over time; Chroma treats it as a static document forever.
LongMemEval: why static retrieval falls short on long-horizon tasks
LongMemEval (Xu et al., ICLR 2025) tests 500 questions against a long conversation history — ~115K tokens, ~40 sessions, most of which are distractors. Five ability axes: information-extraction, multi-session reasoning, temporal reasoning, knowledge-updates, and abstention. It is a direct measure of whether the memory system gives the answerer the right context.
Feather DB v0.8.0 + GPT-4o scores 0.693 on LongMemEval_S — beating the paper's full-context GPT-4o ceiling of 0.640, which dumps the entire 115K-token history into the model's context window. A 10-snippet retrieval from a single .feather file delivers more useful signal than the full history, at 40× lower cost per query.
A naive vector RAG pipeline — the kind you would build with ChromaDB and a standard embedding model — scores approximately 0.31 on the same benchmark (paper-reported). The gap is not a coincidence. Static retrieval has no mechanism to handle the questions LongMemEval is designed to test:
- Temporal reasoning questions (e.g., "what did I say three weeks ago about X?") require a system that weights recency differently than static cosine similarity does. Without temporal decay, a 90-day-old mention of X is indistinguishable from a mention from last week.
- Knowledge-update questions require the system to prefer the most recently relevant version of a fact. Without decay, a superseded preference scores identically to the current one.
- Multi-session reasoning is improved by graph edges that link facts across sessions — the
context_chainhop=1 expansion that Feather provides and ChromaDB does not.
The 0.693 vs ~0.31 gap is the measurable cost of using a storage engine that does not model time or connection.
| System | Answerer | LongMemEval_S |
|---|---|---|
| Feather DB | GPT-4o | 0.693 |
| Feather DB | Gemini 2.5 Flash | 0.657 (~$2.40/run) |
| Full-context GPT-4o (paper ceiling) | GPT-4o | 0.640 |
| Naive vector RAG (paper-reported) | GPT-4o | ~0.31 |
What ChromaDB is missing
This is not a criticism of ChromaDB's design goals — it was built as a straightforward vector store, not as a memory layer. But if you are evaluating it for production agent memory, three gaps matter:
- No memory decay. A preference stated by a user 180 days ago and never mentioned again scores identically to a preference from yesterday. The agent has no signal that the older one may be stale.
- No temporal weighting. Time is not a retrieval signal. Queries that are inherently time-anchored — "what did the user say last week?" — cannot be answered correctly without temporal metadata being manually inserted and filtered. Even then, recency is not blended with similarity; it is an either/or hard filter.
- No typed graph edges. Documents are isolated nodes. Knowing that memory A relates to memory B, that B updates C, or that D was derived from E — all of that structure must live outside the database. Every traversal is a manual application-layer join against the metadata you stored.
When to choose each
Choose ChromaDB when
- You are building a RAG prototype and need to ship something fast
- Your retrieval task is purely semantic — no temporal weighting, no graph structure needed
- You are teaching an LLM course or writing tutorial content and want the zero-friction default
- Your team already has ChromaDB deployed and the problem does not require decay
- You need the LangChain integration with zero configuration friction
Choose Feather DB when
- You are building a production AI agent that needs to remember things across sessions and over time
- Memories in your system become stale and you need them to fade naturally without manual curation
- You need frequently accessed memories to stay "sticky" even as calendar time passes
- Your retrieval includes exact tokens — model names, version strings, user IDs — where BM25 matters
- You want graph edges linking memories across sessions without maintaining external join tables
- You need MCP tools for Claude Desktop or Cursor without writing a custom server
- You are on a memory-constrained host and need int8 quantization to fit the index in RAM
- Latency is a primary constraint and you need a C++ HNSW core, not a Python implementation
A note on complexity
Feather DB's scoring formula has more parameters than a plain cosine similarity search. half_life, time_weight, importance, and recall_count all interact. That is intentional — agent memory is more complex than document retrieval, and the formula reflects that complexity honestly rather than hiding it. The tradeoff is that there are more knobs to tune. For a RAG prototype where all documents are equally fresh and equally important, those knobs add friction you do not need. For a long-running agent where context evolves over weeks, they are the difference between a system that actually works and one that slowly fills with stale noise.
The "SQLite vs MySQL" framing applies here too: ChromaDB is excellent for the single-machine, no-administration, get-it-running use case. Feather DB adds the features that matter when the agent runs in production and time becomes a variable in your retrieval problem.
Getting started
# Feather DB
pip install feather-db
# Start the MCP server with a real embedder
feather-serve ~/memory/agent.feather --embed-provider gemini --port 7700
- GitHub: github.com/feather-store/feather
- PyPI:
pip install feather-db - Crates.io:
feather-db-cli - Cloud waitlist: getfeather.store/cloud
Feather DB is part of Hawky.ai — AI-native development tools.