Deploy · 8 min read · June 16, 2026

Feather DB Python Quickstart: From pip install to First Search

Everything you need to get from zero to a working semantic search in one Python file. Install, open, add, search, save — plus real embeddings, namespaces, batch ingestion, and a complete chatbot memory example.

Feather DB
Engineering

What you'll build

By the end of this guide you'll have a Python script that opens a .feather file, adds vectors with metadata, searches them with time-weighted adaptive scoring, and persists everything to disk. Then we'll extend it into a real chatbot memory backed by sentence-transformers or the Gemini API.

No server. No Docker. One file on disk.

1. Prerequisites

  • Python 3.9 or newer — check with python --version
  • pip — comes bundled with Python 3.9+
  • A terminal (macOS, Linux, or Windows WSL)

That's it. Feather DB ships a compiled C++/Rust core as a wheel, so there's nothing to build from source.

2. Install

pip install feather-db

The wheel includes the HNSW index, SIMD-optimised vector math (AVX2/AVX512 on x86, NEON on Apple Silicon), the BM25 hybrid layer, and the adaptive scoring engine. One command, no extra dependencies required for basic usage.

Verify the install:

import feather_db as fdb
print(fdb.__version__)  # e.g. 0.16.0

3. Basic usage: open, add, search, save

This example uses random float arrays instead of real embeddings so you can run it immediately without any API key.

import feather_db as fdb
import numpy as np

# Open (or create) a database file.
# dim= must match the dimension of your embedding vectors.
db = fdb.DB.open("quickstart.feather", dim=128)

# Add three vectors with integer IDs
rng = np.random.default_rng(42)

vec_a = rng.random(128).astype(np.float32)
vec_b = rng.random(128).astype(np.float32)
vec_c = rng.random(128).astype(np.float32)

db.add(id=1, vec=vec_a)
db.add(id=2, vec=vec_b)
db.add(id=3, vec=vec_c)

# Search: find the 2 nearest neighbours to a query vector
query = rng.random(128).astype(np.float32)
results = db.search(query, k=2)

for r in results:
    print(f"id={r.id}  score={r.score:.4f}")

# Persist to disk — the .feather file is now loadable in any future process
db.save()
db.close()

DB.open() creates the file if it doesn't exist, or loads it from disk if it does. db.save() writes the HNSW graph and all metadata atomically. db.close() flushes and releases the file handle.
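
One way to make the save-then-close sequence hard to forget is a small context manager. This is a minimal sketch, assuming only that the object exposes save() and close() as shown above. The helper name saved_on_exit and the FakeDB stand-in are hypothetical, and this guide does not say whether Feather DB objects support the with statement natively.

```python
from contextlib import contextmanager


@contextmanager
def saved_on_exit(db):
    """Yield db, call save() if the body succeeds, and always call close()."""
    try:
        yield db
        db.save()   # only reached when the with-body raised no exception
    finally:
        db.close()  # always release the file handle


class FakeDB:
    """Stand-in with the same save()/close() shape, used only for this demo."""

    def __init__(self):
        self.calls = []

    def save(self):
        self.calls.append("save")

    def close(self):
        self.calls.append("close")


fake = FakeDB()
with saved_on_exit(fake) as db:
    pass  # db.add(...) calls would go here
print(fake.calls)  # ['save', 'close']
```

With the real library, the same helper would wrap the handle returned by fdb.DB.open("quickstart.feather", dim=128). Skipping save() on an exception is a design choice here: it avoids persisting a half-finished batch.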

4. Real embeddings

Random vectors show the mechanics. Real embeddings make search semantically meaningful. Here are two drop-in options.

Option A — sentence-transformers (local, no API key)

pip install sentence-transformers

import feather_db as fdb
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")  # 384-dim, ~90 MB download

def embed(text: str) -> np.ndarray:
    return model.encode(text, normalize_embeddings=True).astype(np.float32)

db = fdb.DB.open("semantic.feather", dim=384)

facts = [
    (1, "The Eiffel Tower is in Paris."),
    (2, "Python is a high-level programming language."),
    (3, "Feather DB stores vectors in a single .feather file."),
    (4, "The speed of light is approximately 299,792 km/s."),
]

for id_, text in facts:
    db.add(id=id_, vec=embed(text))

results = db.search(embed("Where is the Eiffel Tower?"), k=2)
for r in results:
    print(f"id={r.id}  score={r.score:.4f}")
# id=1  score=0.8341  ← correct top hit

db.save()
db.close()

Option B — Gemini API (768-dim, free tier)

pip install google-generativeai

import os
import feather_db as fdb
import numpy as np
import google.generativeai as genai

genai.configure(api_key=os.environ["GOOGLE_API_KEY"])

def embed(text: str, task_type: str = "retrieval_document") -> np.ndarray:
    # Use task_type="retrieval_query" when embedding a search query.
    result = genai.embed_content(
        model="models/text-embedding-004",
        content=text,
        task_type=task_type
    )
    return np.array(result["embedding"], dtype=np.float32)

# Gemini text-embedding-004 outputs 768-dim vectors
db = fdb.DB.open("gemini_semantic.feather", dim=768)

db.add(id=1, vec=embed("User prefers concise answers."))
db.add(id=2, vec=embed("User is building a FastAPI backend."))
db.add(id=3, vec=embed("User's favourite language is Python."))

query_vec = embed("What programming language does the user like?", task_type="retrieval_query")
results = db.search(query_vec, k=2)
for r in results:
    print(f"id={r.id}  score={r.score:.4f}")

db.save()
db.close()

Both options produce np.float32 arrays. The only thing that changes is the dim= argument you pass to DB.open().
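
Because a mismatched dim= is an easy mistake when switching providers, a small guard can catch it before anything is written. The sketch below is illustrative only: EMBEDDING_DIMS and check_dim are hypothetical names, and the two dimensions are the ones quoted in this guide.

```python
# Output dimensions quoted in this guide for the two embedding options.
EMBEDDING_DIMS = {
    "all-MiniLM-L6-v2": 384,
    "models/text-embedding-004": 768,
}


def check_dim(vec, model_name):
    """Return the expected dimension, or raise ValueError if vec's length differs."""
    expected = EMBEDDING_DIMS[model_name]
    if len(vec) != expected:
        raise ValueError(f"{model_name} expects {expected} dims, got {len(vec)}")
    return expected


dim = check_dim([0.0] * 384, "all-MiniLM-L6-v2")
print(dim)  # 384
```

The returned value can be passed straight to DB.open(..., dim=dim), so the model name becomes the single source of truth for the dimension.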

5. Key API surface

Here's the core API you'll use in most Feather DB programs:

import feather_db as fdb
import numpy as np

# --- Open / close ---
db = fdb.DB.open("my.feather", dim=768)   # create or load
db.save()                                  # write to disk (call after mutations)
db.close()                                 # flush + release file handle

# --- Add a single vector ---
db.add(id=42, vec=np.zeros(768, dtype=np.float32))

# --- Add with metadata ---
meta = fdb.Metadata(importance=0.9)
db.add(id=43, vec=np.zeros(768, dtype=np.float32), meta=meta)

# --- Search ---
results = db.search(vec=np.zeros(768, dtype=np.float32), k=5)
for r in results:
    print(r.id, r.score)

# --- Count vectors in the index ---
print(db.count())

Every method is synchronous and thread-safe for concurrent reads. Writes serialize internally.

6. Metadata: Metadata(), set_attribute(), get_attribute()

Metadata attaches structured information to each vector. The importance field is a first-class multiplicative weight in the scoring formula: a higher value scales up a memory's final score, so an important memory can outrank newer or more similar ones.

import feather_db as fdb
import numpy as np

db = fdb.DB.open("meta_demo.feather", dim=128)
rng = np.random.default_rng(0)

# importance= is a float multiplier on the final score (default 1.0)
meta = fdb.Metadata(importance=1.5)

# set_attribute() stores arbitrary key-value string pairs
meta.set_attribute("text", "User explicitly stated they prefer dark mode.")
meta.set_attribute("source", "session-12")
meta.set_attribute("type", "preference")

db.add(id=1, vec=rng.random(128).astype(np.float32), meta=meta)
db.save()
db.close()  # release the handle before reopening the file

# Reload from disk and read back attributes
db2 = fdb.DB.open("meta_demo.feather", dim=128)
results = db2.search(rng.random(128).astype(np.float32), k=1)
node = results[0]

if node.meta:
    print(node.meta.importance)              # 1.5
    print(node.meta.get_attribute("text"))   # "User explicitly stated..."
    print(node.meta.get_attribute("source")) # "session-12"

db2.close()

Common mistake: don't use dict-style attribute assignment

# WRONG — silently does nothing due to pybind11 copy semantics
meta.attributes["key"] = "value"

# CORRECT — always use set_attribute()
meta.set_attribute("key", "value")

The meta.attributes dict is a Python-side copy of the underlying C++ map. Assigning into it doesn't write back to the C++ object. set_attribute() does. Use it every time.
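
The copy behaviour can be reproduced in pure Python. The class below is only an analogy, not the real binding: MetadataAnalogy is a hypothetical name, and returning None for a missing key is an assumption of this sketch rather than documented Feather DB behaviour.

```python
class MetadataAnalogy:
    """Pure-Python stand-in that mimics a property returning a copy of a map."""

    def __init__(self):
        self._store = {}  # plays the role of the C++ map

    @property
    def attributes(self):
        return dict(self._store)  # every read hands back a fresh copy

    def set_attribute(self, key, value):
        self._store[key] = value  # writes to the underlying store

    def get_attribute(self, key):
        return self._store.get(key)


meta = MetadataAnalogy()
meta.attributes["key"] = "lost"     # mutates a temporary copy, then discards it
print(meta.get_attribute("key"))    # None
meta.set_attribute("key", "value")  # reaches the real store
print(meta.get_attribute("key"))    # value
```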

7. Search parameters: k, half_life, time_weight

Feather DB's scoring formula is:

stickiness    = 1 + log(1 + recall_count)
effective_age = age_days / stickiness
recency       = 0.5 ** (effective_age / half_life)
score         = ((1 - time_weight) * similarity + time_weight * recency) * importance
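
To get a feel for the numbers, the formula can be transcribed directly into Python. This is a sketch, not the library's code: adaptive_score is a hypothetical name, log is assumed to be the natural logarithm, the similarity values are made up, and the defaults are the ones this guide quotes (half_life=14, time_weight=0.3, importance=1.0).

```python
import math


def adaptive_score(similarity, age_days, recall_count=0,
                   half_life=14.0, time_weight=0.3, importance=1.0):
    """Transcription of the scoring formula above (natural log assumed)."""
    stickiness = 1 + math.log(1 + recall_count)
    effective_age = age_days / stickiness
    recency = 0.5 ** (effective_age / half_life)
    return ((1 - time_weight) * similarity + time_weight * recency) * importance


fresh = adaptive_score(similarity=0.8, age_days=0)
two_weeks = adaptive_score(similarity=0.8, age_days=14)
recalled = adaptive_score(similarity=0.8, age_days=14, recall_count=3)

print(round(fresh, 2), round(two_weeks, 2))  # 0.86 0.71
print(two_weeks < recalled < fresh)          # True
```

A 14-day-old memory with the default half-life has a recency of exactly 0.5, so its score drops from 0.86 to 0.71. The same memory recalled three times ages more slowly and lands between the two.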

Three parameters control the retrieval behaviour:

  • k: how many results to return. Start at 5 for most applications.
  • half_life: days until an unrecalled memory's recency score halves. Default: 14. Use 7 for news agents, 30–60 for personal assistants, 180–365 for architecture decisions.
  • time_weight: fraction of the score allocated to recency rather than similarity. Default: 0.3. At 0.0 ranking depends only on similarity (scaled by importance), as in a standard vector DB. At 1.0 only freshness (scaled by importance) matters.

import feather_db as fdb
import numpy as np

db = fdb.DB.open("params_demo.feather", dim=128)
rng = np.random.default_rng(7)
for i in range(20):
    db.add(id=i, vec=rng.random(128).astype(np.float32))

query = rng.random(128).astype(np.float32)

# Pure similarity — ignore time entirely
results_sim = db.search(query, k=5, time_weight=0.0)

# Default — 30% recency, 14-day half-life
results_default = db.search(query, k=5, half_life=14, time_weight=0.3)

# News agent — heavy recency, 7-day half-life
results_news = db.search(query, k=5, half_life=7, time_weight=0.5)

# Long-term knowledge base — 180-day half-life, low time weight
results_kb = db.search(query, k=5, half_life=180, time_weight=0.1)

db.close()

Tune half_life first — it has the biggest effect on which memories surface. Adjust time_weight only if you need freshness to compete with or override similarity.
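
The effect of half_life is easiest to see by holding the age fixed and varying the half-life. The snippet below is a sketch that evaluates only the recency term of the scoring formula for a memory that has never been recalled. The function name recency is hypothetical, and the half-life values are the ones suggested in this section.

```python
def recency(age_days, half_life):
    """Recency term for a never-recalled memory (stickiness = 1)."""
    return 0.5 ** (age_days / half_life)


# Same 30-day-old memory under four different half-lives.
for half_life in (7, 14, 30, 180):
    value = recency(age_days=30, half_life=half_life)
    print(f"half_life={half_life:>3}  recency at 30 days = {value:.3f}")
```

With half_life=30 the term is exactly 0.5. With half_life=7 it falls to about 0.05, and with half_life=180 it stays near 0.89, which is why the same month-old memory is nearly forgotten by a news agent but still ranks well in a long-term knowledge base.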

8. Namespaces: default and custom

Namespaces partition the index. Memories in namespace "user-alice" are completely invisible to searches in namespace "user-bob". ANN traversal is scoped per namespace, so search latency scales with the number of memories for that user — not the total number of memories across all users.

import feather_db as fdb
import numpy as np

db = fdb.DB.open("multi_user.feather", dim=128)
rng = np.random.default_rng(99)

def rand_vec():
    return rng.random(128).astype(np.float32)

# Default namespace — no namespace= argument needed
db.add(id=1, vec=rand_vec())

# Custom namespaces — hard tenant boundaries
db.add(id=10, vec=rand_vec(), namespace="user-alice")
db.add(id=11, vec=rand_vec(), namespace="user-alice")
db.add(id=20, vec=rand_vec(), namespace="user-bob")

query = rand_vec()

# Default namespace search — never returns user-alice or user-bob results
default_results = db.search(query, k=5)

# alice's namespace — bob's memories never appear
alice_results = db.search(query, k=5, namespace="user-alice")

# bob's namespace — alice's memories never appear
bob_results = db.search(query, k=5, namespace="user-bob")

db.save()
db.close()

One .feather file can hold thousands of namespaces. Use namespace=user_id for multi-tenant SaaS, namespace=agent_id for multi-agent systems, or leave it unset for single-user applications.
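
The isolation guarantee can be pictured with a toy model in which each namespace owns its own candidate set. This is only an illustration with brute-force dot-product ranking, not Feather DB's HNSW implementation. ToyNamespacedIndex and the "default" namespace name are assumptions of the sketch.

```python
from collections import defaultdict


class ToyNamespacedIndex:
    """Toy partitioned index: a search only ever sees one namespace's vectors."""

    def __init__(self):
        self._partitions = defaultdict(dict)  # namespace -> {id: vector}

    def add(self, id, vec, namespace="default"):
        self._partitions[namespace][id] = vec

    def search(self, query, k, namespace="default"):
        candidates = self._partitions.get(namespace, {})

        def dot(a, b):
            return sum(x * y for x, y in zip(a, b))

        # Rank this namespace's IDs by dot product with the query, best first.
        ranked = sorted(candidates, key=lambda i: dot(query, candidates[i]), reverse=True)
        return ranked[:k]


index = ToyNamespacedIndex()
index.add(1, [1.0, 0.0])
index.add(10, [0.9, 0.1], namespace="user-alice")
index.add(11, [0.2, 0.8], namespace="user-alice")
index.add(20, [1.0, 0.0], namespace="user-bob")

print(index.search([1.0, 0.0], k=5, namespace="user-alice"))  # [10, 11]
print(index.search([1.0, 0.0], k=5, namespace="user-bob"))    # [20]
```

Even though bob's vector matches the query exactly, it never appears in alice's results, because the search only iterates over alice's partition.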

9. add_batch() for bulk ingestion

Sequential db.add() calls cross the Python/C++ boundary on every insert. For large corpora, use add_batch() — it releases the GIL and builds the HNSW graph in parallel across all available CPU cores.

import feather_db as fdb
import numpy as np

db = fdb.DB.open("corpus.feather", dim=768)

# Prepare data as numpy arrays
N = 50_000
ids  = list(range(N))
vecs = np.random.randn(N, 768).astype(np.float32)

# Optional: attach metadata to each vector
metas = []
for i in range(N):
    m = fdb.Metadata(importance=0.8)
    m.set_attribute("source", "batch_import")
    m.set_attribute("chunk_index", str(i))
    metas.append(m)

# Single parallel call — ~3.4× faster than a sequential add() loop
db.add_batch(ids, vecs, metas=metas)
db.save()
db.close()

Benchmark on a 4-core machine, 50k × 768-dim vectors:

Method                  Time   Speedup
Sequential add() loop   ~34s   1× (baseline)
add_batch()             ~10s   3.4×

On an 8-core machine expect 5–6×. Use add_batch() any time you're inserting more than ~1k vectors at once. Use sequential add() for real-time, single-item inserts (e.g. writing a new memory immediately after a conversation turn).
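
When a corpus is too large to embed in one pass, one option is to feed add_batch() in slices. The helper below is a sketch: chunk_ranges is a hypothetical name, the chunk size is arbitrary, and this guide does not say whether add_batch() itself benefits from chunking.

```python
def chunk_ranges(total, chunk_size):
    """Yield (start, stop) index pairs that cover range(total) in order."""
    for start in range(0, total, chunk_size):
        yield start, min(start + chunk_size, total)


spans = list(chunk_ranges(total=50_000, chunk_size=20_000))
print(spans)  # [(0, 20000), (20000, 40000), (40000, 50000)]
```

Each pair can then slice the inputs, for example db.add_batch(ids[start:stop], vecs[start:stop], metas=metas[start:stop]), followed by a db.save() after each slice if partial progress should survive a crash.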

10. v0.16.0 fast cold load: save() persists the graph, open() loads in 48ms

Before v0.16.0, DB.open() on a non-empty file rebuilt the HNSW graph from scratch. At 40k vectors that took ~7 seconds. v0.16.0 persists the compiled graph structure inside the .feather file so open() can memory-map it directly.

import time
import feather_db as fdb
import numpy as np

# First run: build and save (happens once)
db = fdb.DB.open("fast_load.feather", dim=768)
vecs = np.random.randn(40_000, 768).astype(np.float32)
db.add_batch(list(range(40_000)), vecs)
db.save()   # persists HNSW graph — subsequent opens skip rebuild
db.close()

# Every subsequent run: graph loads from disk in ~48ms
t0 = time.perf_counter()
db2 = fdb.DB.open("fast_load.feather", dim=768)
elapsed = (time.perf_counter() - t0) * 1000
print(f"Loaded {db2.count()} vectors in {elapsed:.0f}ms")
# → Loaded 40000 vectors in 48ms

db2.close()

This matters most for serverless functions, Kubernetes pods with frequent restarts, and development loops where you restart the process after every code change. Call db.save() after any batch of mutations to keep the on-disk graph current.

Combine with FEATHER_LOAD_THREADS for even faster loads on larger indexes:

import os
import feather_db as fdb

os.environ["FEATHER_LOAD_THREADS"] = "8"   # set before open(); parallel HNSW reconstruction
db = fdb.DB.open("large_corpus.feather", dim=768)

11. Common mistakes

meta.attributes[key] = val — silently does nothing

# WRONG
meta = fdb.Metadata(importance=1.0)
meta.attributes["text"] = "User prefers Python"  # no-op — pybind11 copy issue

# CORRECT
meta = fdb.Metadata(importance=1.0)
meta.set_attribute("text", "User prefers Python")  # writes to C++ object

Forgetting db.save() after mutations

db.add(id=1, vec=vec)
# Process crashes or exits here — the add() is lost

# Correct pattern: save after every logical batch
db.add(id=1, vec=vec)
db.add(id=2, vec=vec2)
db.save()  # atomic write — all-or-nothing

Mixing dim= across open() calls

# Created with dim=768
db = fdb.DB.open("memory.feather", dim=768)
db.add(id=1, vec=np.zeros(768, dtype=np.float32))
db.save()
db.close()

# Reopened with wrong dim — raises ValueError
db2 = fdb.DB.open("memory.feather", dim=1536)  # mismatch!

The dim= is stored in the file header. Always pass the same value that was used when the file was first created.

Complete example: chatbot memory

This is a minimal but complete persistent chatbot memory. It works with any embedding function: swap in sentence-transformers, Gemini, or OpenAI by replacing the embed() function and updating DIM to match the new model's output dimension.

"""
chatbot_memory.py — persistent memory for a chatbot using Feather DB.

Install:
    pip install feather-db sentence-transformers openai

Usage:
    python chatbot_memory.py
"""
import os
import time
import numpy as np
import feather_db as fdb
from openai import OpenAI

# --- Embedding setup (swap this block for any provider) ---
from sentence_transformers import SentenceTransformer
_model = SentenceTransformer("all-MiniLM-L6-v2")

def embed(text: str) -> np.ndarray:
    return _model.encode(text, normalize_embeddings=True).astype(np.float32)

DIM = 384   # all-MiniLM-L6-v2 output dimension

# --- Feather DB setup ---
db = fdb.DB.open("chatbot_memory.feather", dim=DIM)

# --- Memory helpers ---
def remember(text: str, importance: float = 0.5, source: str = "chat") -> int:
    """Store a piece of text in persistent memory."""
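    # Millisecond timestamp folded into 31 bits. Two calls in the same millisecond would get the same ID.
    node_id = int(time.time() * 1000) % (2 ** 31)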
    meta = fdb.Metadata(importance=importance)
    meta.set_attribute("text", text)
    meta.set_attribute("source", source)
    meta.set_attribute("ts", str(time.time()))
    db.add(id=node_id, vec=embed(text), meta=meta)
    return node_id

def recall(query: str, k: int = 5) -> list[str]:
    """Retrieve the k most relevant memories for a query."""
    results = db.search(
        vec=embed(query),
        k=k,
        half_life=30,       # memories decay over 30 days if not recalled
        time_weight=0.3     # 30% recency, 70% semantic similarity
    )
    texts = []
    for r in results:
        if r.meta:
            t = r.meta.get_attribute("text")
            if t:
                texts.append(t)
    return texts

# --- Chat loop ---
llm = OpenAI(api_key=os.environ["OPENAI_API_KEY"])

def chat(user_message: str) -> str:
    # 1. Retrieve relevant memories
    memories = recall(user_message, k=5)
    memory_block = "\n".join(f"- {m}" for m in memories)

    # 2. Call the LLM with memory context injected
    response = llm.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {
                "role": "system",
                "content": (
                    "You are a helpful assistant with persistent memory.\n"
                    "Relevant facts from past conversations:\n"
                    + (memory_block or "(no memories yet)")
                )
            },
            {"role": "user", "content": user_message}
        ]
    )
    reply = response.choices[0].message.content

    # 3. Write the exchange back to memory
    exchange = f"User: {user_message[:120]}  Assistant: {reply[:200]}"
    remember(exchange, importance=0.4, source="session")

    # 4. Save after every turn so nothing is lost if the process exits
    db.save()

    return reply

# Seed some initial facts about the user (run once)
if db.count() == 0:
    remember("User's name is Ashwath.", importance=0.9, source="identity")
    remember("User is building an AI agent for customer support.", importance=0.8, source="context")
    remember("User prefers concise answers, no filler phrases.", importance=0.85, source="preference")
    db.save()
    print("Memory seeded. Starting chat...\n")

# Interactive loop
while True:
    try:
        user_input = input("You: ").strip()
    except (EOFError, KeyboardInterrupt):
        break
    if not user_input or user_input.lower() in ("exit", "quit"):
        break
    print(f"Assistant: {chat(user_input)}\n")

db.close()
print("Bye. Memories saved to chatbot_memory.feather.")

What this does on each turn:

  1. Embeds the user message and searches the .feather file for the 5 most relevant memories, scored by a weighted blend of semantic similarity and recency, multiplied by importance.
  2. Injects those memories into the system prompt so the LLM knows what it needs to know — without sending the entire history every time.
  3. Writes the exchange back as a new memory node so knowledge accumulates across sessions.
  4. Calls db.save() so nothing is lost if the process exits.

Restart the script and the memories survive. The chatbot remembers across processes, sessions, and deployments — because it's just a file.
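
The example derives node IDs from the millisecond clock, so two remember() calls in the same millisecond would share an ID. A sketch of one way to avoid that within a single process is shown below. MonotonicIds is a hypothetical helper, and the 2**31 bound simply mirrors the modulo used in the example. Collisions across processes or restarts are not handled, and after the modulo wraps around, the bumped IDs can exceed that bound.

```python
import threading
import time

MAX_ID = 2 ** 31  # same bound the example applies with its modulo


class MonotonicIds:
    """Hand out strictly increasing IDs, even if the clock has not advanced."""

    def __init__(self, clock=time.time):
        self._clock = clock
        self._last = -1
        self._lock = threading.Lock()

    def next_id(self):
        with self._lock:
            candidate = int(self._clock() * 1000) % MAX_ID
            if candidate <= self._last:
                candidate = self._last + 1  # bump past the previous ID
            self._last = candidate
            return candidate


ids = MonotonicIds(clock=lambda: 1_000.0)  # frozen clock: worst case for collisions
print([ids.next_id() for _ in range(3)])   # [1000000, 1000001, 1000002]
```

In remember(), the node_id line would become a call to next_id() on a module-level MonotonicIds instance.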

What's next

Install: pip install feather-db