What Is HNSW Indexing? How Vector Search Actually Works

HNSW (Hierarchical Navigable Small World) is an approximate nearest-neighbor (ANN) algorithm that organizes vectors into a multi-layer proximity graph for fast, high-recall search. Instead of computing distance to every stored vector (O(n) brute force), HNSW uses a hierarchical graph structure to find approximate nearest neighbors in O(log n) time. Feather DB uses HNSW with AVX2/AVX512 SIMD acceleration to achieve 0.19ms p50 search on 500K vectors with 97.2% recall@10.

The problem HNSW solves

Finding the exact nearest neighbors of a query vector in a high-dimensional space requires comparing the query to every stored vector. At 500K vectors with 768-dimensional embeddings, that's 500,000 × 768 floating-point operations per query. At scale — 10M+ vectors — this becomes prohibitively slow for real-time applications.

HNSW trades a small amount of recall accuracy for massive speed improvement. At 97.2% recall@10, HNSW returns 9.72 of the 10 true nearest neighbors on average — missing 0.28 neighbors per query. For AI memory and semantic search use cases, this is an acceptable tradeoff: the practical difference between 97.2% and 100% recall is not meaningful for retrieval quality, but the speed difference is 100–1000×.

How HNSW works: the multi-layer graph

HNSW builds a hierarchical graph with multiple layers. Each layer is a proximity graph where nodes (vectors) are connected to their approximate nearest neighbors. The key design principle: higher layers are sparse (fewer nodes, longer-range connections) and lower layers are dense (all nodes, short-range connections).

Search procedure:

Enter at a random node in the highest (sparsest) layer
Greedily navigate to the node closest to the query vector in this layer — follow edges that decrease distance
When no neighbor is closer than the current node, drop to the next lower layer and repeat
Continue descending through layers until the bottom layer is reached
At the bottom layer, collect the k nearest neighbors found during traversal

This top-down greedy traversal finds approximate nearest neighbors in O(log n) time. The hierarchical structure ensures early layers navigate quickly to the right region of the space, and final layers find precise local neighbors.

Key HNSW parameters

Parameter	Description	Effect
`M`	Max edges per node per layer	Higher M = better recall, more memory, slower build
`ef_construction`	Search width during index build	Higher = better index quality, slower build time
`ef_search`	Search width during query	Higher = better recall, slower query time
Layers	Auto-computed from M	More layers = faster search on larger datasets

Typical production settings: M=16, ef_construction=200, ef_search=50–100. These produce 97%+ recall with sub-millisecond query times on datasets up to 10M vectors.

SIMD acceleration in Feather DB

The bottleneck in HNSW search is distance computation — calculating cosine similarity or L2 distance between the query vector and candidate vectors at each graph traversal step. With 768-dimensional vectors, each distance computation is 768 multiply-accumulate operations.

Feather DB uses AVX2/AVX512 SIMD (Single Instruction Multiple Data) to compute 8 or 16 floats simultaneously per instruction, rather than one at a time. AVX2 processes 256 bits (8 floats) per cycle; AVX512 processes 512 bits (16 floats) per cycle. For distance computation:

Scalar: 768 multiply-accumulate operations per distance
AVX2: ~96 SIMD operations (8× reduction)
AVX512: ~48 SIMD operations (16× reduction)

The runtime auto-detects CPU support and uses the widest available SIMD width. This is the primary reason Feather DB achieves 0.19ms p50 at 500K vectors — SIMD reduces per-distance-computation cost by 8–16×.

HNSW vs other ANN algorithms

Algorithm	Search time	Recall@10	Memory overhead	Build time
Brute force (exact)	O(n)	100%	None	None
IVF (Inverted File)	O(n/nprobe)	90–95%	Low	Fast (requires training)
HNSW	O(log n)	97–99%	Medium (~4× data size)	Moderate
ScaNN	O(log n) + quantization	95–98%	Low (quantized)	Slow

HNSW is preferred for AI memory use cases because it doesn't require training data (IVF needs a training phase), achieves high recall without quantization loss, and performs well on incremental insertions — critical for a memory system that ingests new facts continuously.

Hybrid search: HNSW + BM25 via RRF

Dense vector search (HNSW) retrieves by semantic meaning. BM25 keyword search retrieves by term frequency and inverse document frequency — better for exact strings, IDs, proper nouns, and cases where the query contains specific terms that should match exactly.

Feather DB combines both using Reciprocal Rank Fusion (RRF). Each retrieval method returns a ranked list; RRF combines the ranks using:

RRF_score(d) = sum(1 / (k + rank_in_list_i(d)))

where k=60 is a constant and the sum is over both the dense and BM25 rank lists. This produces a merged ranking that outperforms either method alone on queries that benefit from both semantic and keyword matching.

FAQ

What does HNSW stand for?

HNSW stands for Hierarchical Navigable Small World. "Small World" refers to the graph property where most nodes can be reached from any other node in a small number of hops — the same property that makes social networks highly connected. HNSW applies this to high-dimensional vector spaces for efficient nearest-neighbor search.

How accurate is HNSW compared to exact search?

HNSW with standard settings achieves 97–99% recall@10 — meaning it finds 97–99% of the true 10 nearest neighbors. Feather DB achieves 97.2% recall@10. The 2.8% miss rate does not meaningfully affect retrieval quality for AI memory or semantic search applications.

Does HNSW support incremental insertions?

Yes. Unlike IVF-based indexes that require re-training when the dataset grows significantly, HNSW supports incremental insertions without full rebuilds. New vectors are inserted into the existing graph, maintaining search quality as the index grows. This makes HNSW well-suited for agent memory systems that continuously ingest new facts.

What is the memory overhead of HNSW?

HNSW stores the graph edges in addition to the raw vectors. Overhead is approximately 3–5× the raw vector storage size depending on M (edges per node). For 500K vectors at 768 dimensions (float32), raw storage is ~1.5GB; HNSW graph adds ~0.5–1GB overhead.

What distance metrics does Feather DB support?

Feather DB supports cosine similarity (normalized dot product), L2 (Euclidean distance), and inner product. Cosine similarity is standard for text embeddings from models like OpenAI, Cohere, or sentence-transformers. L2 is preferred for some image embedding models.