Inside Feather DB: The Embedded Vector-Graph Architecture in 6,000 Lines of Rust
A walkthrough of Feather DB's internals: how a single binary fuses HNSW vector search, a typed property graph, and adaptive scoring into one embedded engine — and why that fusion is the real architectural primitive of a Living Context Engine.
Architecture Deep Dive · Feather DB Internals · May 2026
Why an Embedded Engine, Not a Service
Most vector databases ship as a service. You deploy a container, expose a port, wire up auth, scale a cluster, and watch a dashboard. That model makes sense when the database is the system of record for an entire company. It is a poor fit when the database is a context layer — a per-agent, per-tenant memory that needs to be cheap to spin up, easy to back up, and impossible to leak across boundaries.
Feather DB is embedded by design. The engine is a Rust crate. The Python package is a PyO3 binding. The on-disk format is one file. There is no daemon, no broker, no replica election, no admin UI. An agent that wants its own memory imports a library, opens a file, and starts writing context.
This is a deliberate constraint, not a missing feature. Every piece of operational surface you remove is one less thing that can desynchronize between an agent's reasoning and its memory.
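Concretely, the intended usage shape is a handful of library calls. The sketch below uses hypothetical names (Memory, open, remember, recall); it is not the crate's confirmed API, only an illustration of the deployment model the prose describes:

use std::path::Path;

// Stand-in handle for the store; every name in this sketch is hypothetical.
struct Memory;

impl Memory {
    // Open or create the single on-disk file.
    fn open(_path: &Path) -> std::io::Result<Self> {
        Ok(Memory)
    }
    // Write one context node: an embedding plus a payload.
    fn remember(&mut self, _vector: &[f32], _payload: &str) {}
    // Approximate nearest-neighbor query, returning (node id, score) pairs.
    fn recall(&self, _query: &[f32], _k: usize) -> Vec<(u32, f32)> {
        Vec::new()
    }
}

fn main() -> std::io::Result<()> {
    // No daemon, no port, no cluster: one file per agent, opened in-process.
    let mut memory = Memory::open(Path::new("agent_42.feather"))?;
    memory.remember(&[0.1; 768], "user prefers terse replies");
    let _hits = memory.recall(&[0.1; 768], 5);
    Ok(())
}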
The Three Components Inside the Binary
Feather DB is ~6,000 lines of Rust. Internally it composes three subsystems into a single store:
┌───────────────────────────────────────────────────┐
│                    Feather DB                     │
│                                                   │
│  ┌────────────┐  ┌────────────┐  ┌────────────┐   │
│  │    HNSW    │  │   Typed    │  │  Adaptive  │   │
│  │   index    │  │   graph    │  │  scoring   │   │
│  │ (vectors)  │  │  (edges)   │  │  (decay)   │   │
│  └─────┬──────┘  └─────┬──────┘  └─────┬──────┘   │
│        └───────────────┼───────────────┘          │
│                        ▼                          │
│                Unified node store                 │
│     (vector + payload + edges + decay state)      │
│                                                   │
└───────────────────────────────────────────────────┘
The unified node store is the architectural primitive. A node is not a vector with a sidecar payload. A node is a single record that owns its vector, its typed metadata, its outgoing edges, and the scoring state needed to decay over time.
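As a rough illustration of what "a node owns everything" means in code, here is a sketch of such a record. The field names are assumptions drawn from the prose in this post, not the actual in-memory layout:

// Illustrative record layout; field names are assumptions drawn from the prose.
#[derive(Debug)]
enum EdgeKind {
    DerivedFrom,
    RespondsTo,
    Contradicts,
    VariantOf,
}

#[derive(Debug)]
struct Edge {
    kind: EdgeKind, // typed edge, part of the schema
    target: u32,    // node ID of the neighbor
}

#[derive(Debug)]
struct DecayState {
    inserted_at_secs: u64, // insertion timestamp
    recall_count: u32,     // how many times retrieval has surfaced this node
    importance: f32,       // caller-supplied multiplier
}

#[derive(Debug)]
struct Node {
    vector: Vec<f32>,  // embedding indexed by HNSW
    payload: Vec<u8>,  // typed metadata, serialized
    edges: Vec<Edge>,  // outgoing typed edges
    decay: DecayState, // state the scoring kernel reads at query time
}

fn main() {
    // One record owns vector, payload, edges, and decay state together.
    let node = Node {
        vector: vec![0.0; 768],
        payload: b"observation".to_vec(),
        edges: vec![Edge { kind: EdgeKind::DerivedFrom, target: 17 }],
        decay: DecayState { inserted_at_secs: 0, recall_count: 0, importance: 1.0 },
    };
    println!("{} edges, {} dims", node.edges.len(), node.vector.len());
}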
1. The HNSW Index
Approximate nearest-neighbor search is implemented as a small, allocation-conscious HNSW. The graph layers are stored as flat slabs of u32 neighbor IDs to keep cache locality predictable. Insert is single-writer and uses heuristic neighbor selection from the Malkov & Yashunin paper. Search is multi-threaded across a Rayon pool.
What matters for a context engine is not raw QPS — most retrieval workloads on a context store are bursty and small — but recall stability under churn. Nodes are added and removed continuously as memory fades and new signals arrive. Feather's HNSW maintains a free-list of evicted IDs and reuses them, so the graph does not grow unbounded.
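A minimal sketch of those two details, fixed-width neighbor slabs and a free-list of evicted IDs, might look like the following. The constant M and the type names are assumed; the real index layers the full HNSW insert and search logic on top of structures like these:

// Sketch of fixed-width neighbor slabs plus an ID free-list. `M` is an assumed
// name for the max neighbors per node on a layer.
const M: usize = 16;

struct Layer {
    // One contiguous slab: node i's neighbors live at [i * M, i * M + M).
    // u32::MAX marks an empty slot, so rows stay fixed-width and cache-friendly.
    neighbors: Vec<u32>,
}

impl Layer {
    fn neighbors_of(&self, id: u32) -> &[u32] {
        let start = id as usize * M;
        &self.neighbors[start..start + M]
    }
}

struct IdAllocator {
    next_id: u32,
    free_ids: Vec<u32>, // IDs of evicted nodes, reused before growing the slab
}

impl IdAllocator {
    fn alloc(&mut self) -> u32 {
        // Reuse an evicted slot first, so continuous churn does not grow the graph.
        if let Some(id) = self.free_ids.pop() {
            return id;
        }
        let id = self.next_id;
        self.next_id += 1;
        id
    }
    fn evict(&mut self, id: u32) {
        self.free_ids.push(id);
    }
}

fn main() {
    let mut ids = IdAllocator { next_id: 0, free_ids: Vec::new() };
    let a = ids.alloc();
    ids.evict(a);
    let b = ids.alloc(); // takes over a's slot instead of extending the slab
    assert_eq!(a, b);

    let layer = Layer { neighbors: vec![u32::MAX; M] };
    println!("node {b} has {} neighbor slots", layer.neighbors_of(b).len());
}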
2. The Typed Property Graph
Every node can have typed outgoing edges. Edge types are part of the schema: derived_from, responds_to, contradicts, variant_of, and so on. Edges are first-class — they have payloads, they participate in retrieval, and they can be queried directly.
The graph is stored as a compressed sparse row (CSR) structure parallel to the HNSW layer. A traversal from a seed node never leaves CPU cache — it walks an array of edge IDs, dereferences neighbor records, and emits results. There is no separate graph database, no Cypher engine, no transaction coordinator. Graph traversal is a function over the same memory the vector index uses.
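A toy version of that CSR layout, with a filtered traversal over it, could look like this. Names are illustrative, and the real store also attaches payloads to edges:

// Toy CSR layout; the real store keeps this parallel to the node table and
// carries edge payloads as well.
#[derive(Clone, Copy, Debug, PartialEq)]
enum EdgeKind {
    DerivedFrom,
    RespondsTo,
    Contradicts,
    VariantOf,
}

struct Csr {
    // offsets[i] .. offsets[i + 1] is node i's slice of the edge arrays.
    offsets: Vec<u32>,
    targets: Vec<u32>,    // neighbor node IDs, tightly packed
    kinds: Vec<EdgeKind>, // edge type per entry, same indexing as `targets`
}

impl Csr {
    // Walk the outgoing edges of `node`, optionally filtering by edge type.
    fn neighbors(&self, node: u32, kind: Option<EdgeKind>) -> Vec<u32> {
        let lo = self.offsets[node as usize] as usize;
        let hi = self.offsets[node as usize + 1] as usize;
        (lo..hi)
            .filter(|&i| kind.map_or(true, |k| self.kinds[i] == k))
            .map(|i| self.targets[i])
            .collect()
    }
}

fn main() {
    // 0 -> 1 (derived_from), 0 -> 2 (contradicts), 1 -> 2 (responds_to), 2 -> nothing.
    let graph = Csr {
        offsets: vec![0, 2, 3, 3],
        targets: vec![1, 2, 2],
        kinds: vec![EdgeKind::DerivedFrom, EdgeKind::Contradicts, EdgeKind::RespondsTo],
    };
    println!("{:?}", graph.neighbors(0, None));                        // [1, 2]
    println!("{:?}", graph.neighbors(0, Some(EdgeKind::Contradicts))); // [2]
}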
3. Adaptive Scoring
Every node carries decay state: an insertion timestamp, a recall counter, and an importance multiplier. At query time, with tw the time weight that sets how much recency counts relative to raw similarity, the score is computed as:
stickiness = 1 + ln(1 + recall_count)
effective_age = age_days / stickiness
recency = 0.5 ** (effective_age / half_life)
score = ((1 - tw) * similarity + tw * recency) * importance
This is not a sidecar feature — it is woven into the retrieval kernel. Every ANN candidate is scored, every graph hop is scored, every blended result is scored. The scoring kernel is branch-free SIMD on the hot path; the per-result overhead is single-digit nanoseconds.
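Written out as a plain scalar function, the formula above is only a few lines. The parameter names time_weight (tw) and half_life_days are configuration knobs; the actual kernel computes the same arithmetic without branches and in SIMD:

// Plain transcription of the scoring formula above.
fn score(
    similarity: f32,
    age_days: f32,
    recall_count: u32,
    importance: f32,
    time_weight: f32, // tw: 0.0 = pure similarity, 1.0 = pure recency
    half_life_days: f32,
) -> f32 {
    // Frequently recalled nodes age more slowly: that is the stickiness term.
    let stickiness = 1.0 + (1.0 + recall_count as f32).ln();
    let effective_age = age_days / stickiness;
    // Half-life decay applied to the effective (stickiness-scaled) age.
    let recency = 0.5_f32.powf(effective_age / half_life_days);
    ((1.0 - time_weight) * similarity + time_weight * recency) * importance
}

fn main() {
    // Same similarity and age; the often-recalled node decays far less.
    let never_recalled = score(0.82, 30.0, 0, 1.0, 0.3, 14.0);
    let recalled_often = score(0.82, 30.0, 10, 1.0, 0.3, 14.0);
    println!("never recalled: {never_recalled:.3}, recalled often: {recalled_often:.3}");
}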
The Single-File Format
On disk, Feather DB is a single file. The format is a versioned binary container with three internal regions: a fixed header, a node table (vectors + payloads + decay state), and a graph region (CSR edge lists). The file is mmap'd on open. Cold-start latency for a 100k-node store is around 40 milliseconds — basically the cost of paging in the header.
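For intuition, a header along the following lines would be enough to address both regions by offset. The field names and sizes here are assumptions that mirror the prose, not the documented on-disk format:

// Illustrative header for the three-region container.
#[repr(C)]
struct Header {
    magic: [u8; 8],         // file signature
    version: u32,           // format version, bumped on incompatible changes
    dims: u32,              // vector dimensionality
    node_count: u64,        // entries in the node table
    node_table_offset: u64, // vectors + payloads + decay state
    graph_offset: u64,      // CSR edge lists
}

fn main() {
    // Open stays cheap because nothing is parsed up front: the engine maps the
    // file, reads this fixed-size header, and addresses both regions by offset.
    println!("header is {} bytes", std::mem::size_of::<Header>());
}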
Writes go through a write-ahead log adjacent to the main file. The WAL is compacted on close and on demand. There is no replication layer because there is nothing to replicate against — the file is the truth, and the file is portable.
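A minimal sketch of that WAL discipline, under assumed record framing (length-prefixed entries, one fsync per append), looks like this; the real log format and compaction policy are Feather's own:

use std::fs::OpenOptions;
use std::io::Write;
use std::path::Path;

// Append one length-prefixed record to a log file sitting next to the store.
fn wal_append(store: &Path, record: &[u8]) -> std::io::Result<()> {
    let wal_path = store.with_extension("wal");
    let mut wal = OpenOptions::new().create(true).append(true).open(wal_path)?;
    // Length prefix lets replay detect and drop a torn tail after a crash.
    wal.write_all(&(record.len() as u32).to_le_bytes())?;
    wal.write_all(record)?;
    // Durability point: once this returns, the record survives a process crash.
    wal.sync_data()
}

fn main() -> std::io::Result<()> {
    wal_append(Path::new("agent_42.feather"), b"insert node 99")?;
    // Compaction (on close or on demand) folds records into the main file and
    // truncates the log; that part is elided here.
    Ok(())
}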
Why Embedded + Single-File Is the Right Trade
The traditional database mindset says: more nodes, more durability, more replication, more isolation. The context engine mindset says: more agents, more memories, more boundaries, more portability. Embedded wins on the second axis.
An agent's memory should be cheap enough to spin up per session, durable enough to survive a restart, and portable enough to attach to a checkpoint. A single file gives you all three without operations. That is the architectural bet Feather DB is making — and the next four posts in this series unpack each subsystem in detail.
Continue with the architecture series: Adaptive Decay · HNSW + Typed Edges · Single-File Storage · The 768-Dimension Bet.