Namespace and Entity Design in Feather DB: Multi-Tenant Context Isolation

The isolation problem

When you build a multi-tenant AI product — a SaaS assistant where every user has their own memory, or a multi-agent system where each agent maintains its own context — you need strict isolation. User A's memories must not appear in User B's search results, even if they're semantically similar. And you need this isolation to be fast: a per-user ANN search must not slow down as the total number of users grows.

Feather DB solves this with a two-level isolation system: namespaces for tenant boundaries, and entities for topic grouping within a namespace.

Namespaces: tenant-level isolation

A namespace is a string key that partitions the index. Every memory add and every search can be scoped to a namespace. Memories in namespace "user-alice" are completely invisible to searches in namespace "user-bob".

import feather_db as fdb

# embed() stands for your own embedding function; it must return a 768-dimensional vector
db = fdb.DB.open("saas_memory.feather", dim=768)

# Add a memory scoped to Alice
alice_vec = embed("Alice prefers dark mode and compact layouts.")
db.add(alice_vec,
       text="Alice prefers dark mode and compact layouts.",
       namespace="user-alice")

# Add a memory scoped to Bob
bob_vec = embed("Bob works on a team of 15 engineers.")
db.add(bob_vec,
       text="Bob works on a team of 15 engineers.",
       namespace="user-bob")

# Search scoped to Alice — Bob's memories never appear
query_vec = embed("What does the user prefer for their UI?")
alice_results = db.search(query_vec, k=5, namespace="user-alice")

# Search scoped to Bob — Alice's memories never appear
bob_results = db.search(query_vec, k=5, namespace="user-bob")

Under the hood, Feather DB maintains a per-namespace HNSW subgraph. ANN search on namespace "user-alice" traverses only the nodes in that namespace's subgraph. Search latency scales with the number of memories for that user, not with the total number of memories across all users — which is the property that makes multi-tenant Feather DB practical at scale.
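
The partitioning idea can be illustrated with a toy index. This is a minimal sketch, not Feather DB's implementation: ToyNamespacedIndex is a hypothetical class, and brute-force cosine similarity stands in for the HNSW subgraph. What it shows is the structural property only: a search scans the vectors stored under its own namespace key and nothing else.

```python
import math
from collections import defaultdict


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class ToyNamespacedIndex:
    """Hypothetical stand-in: one independent partition per namespace."""

    def __init__(self):
        # namespace -> list of (vector, text); partitions never share entries
        self._partitions = defaultdict(list)

    def add(self, vec, text, namespace):
        self._partitions[namespace].append((vec, text))

    def search(self, query, k, namespace):
        # Only this namespace's partition is scanned, so the work done
        # depends on that tenant's memory count, not on the global total.
        candidates = self._partitions.get(namespace, [])
        ranked = sorted(candidates, key=lambda m: cosine(query, m[0]), reverse=True)
        return [text for _, text in ranked[:k]]


index = ToyNamespacedIndex()
index.add([1.0, 0.0], "Alice prefers dark mode.", namespace="user-alice")
index.add([0.9, 0.1], "Bob prefers light mode.", namespace="user-bob")

alice_hits = index.search([1.0, 0.0], k=5, namespace="user-alice")
bob_hits = index.search([1.0, 0.0], k=5, namespace="user-bob")
```

Even though Bob's vector is nearly identical to the query, it cannot appear in Alice's results, because her search never looks at his partition.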

Entities: topic grouping within a namespace

Within a namespace, you can further group memories by entity. An entity is a string that tags a memory as belonging to a logical object: a user profile, a conversation thread, a document, a project. Entities let you scope a search to one object, or act on all of that object's memories at once (for example, deleting them), instead of relying on semantic similarity alone.

# Within Alice's namespace, tag memories by topic
db.add(embed("Alice likes Python over JavaScript."),
       text="Alice likes Python over JavaScript.",
       namespace="user-alice",
       entity="preferences-language")

db.add(embed("Alice is working on a FastAPI backend for their startup."),
       text="Alice is working on a FastAPI backend for their startup.",
       namespace="user-alice",
       entity="work-context")

db.add(embed("Alice mentioned her cat is named Pixel."),
       text="Alice mentioned her cat is named Pixel.",
       namespace="user-alice",
       entity="personal-facts")

# Search within Alice's namespace, filtered to preferences only
results = db.search(query_vec, k=5,
                    namespace="user-alice",
                    entity="preferences-language")

Entities are also useful for deduplication and update. If you want to replace all memories for a given entity (e.g., after a user updates their profile), you can delete by entity and re-add:

# Delete all memories for an entity and rewrite
db.delete_by_entity(namespace="user-alice", entity="preferences-language")

# Re-add updated preferences
db.add(embed("Alice now prefers TypeScript over Python for backend."),
       text="Alice now prefers TypeScript over Python for backend.",
       namespace="user-alice",
       entity="preferences-language")

Attribute filters: secondary filtering within results

After namespace and entity scoping, you can apply attribute filters to narrow results further. Attributes are key-value metadata set on individual memories. Attribute filters do not modify the ANN graph traversal — they apply as an exact post-filter over the ANN candidates.

# Add a memory, then set attributes on the returned object
mem = db.add(embed("Alice's Q2 OKR: ship the API v2 by June."),
             text="Alice's Q2 OKR: ship the API v2 by June.",
             namespace="user-alice",
             entity="work-context")
mem.meta.set_attribute("type", "goal")
mem.meta.set_attribute("quarter", "Q2-2026")

# Filter to only goal-type memories in Q2
results = db.search(query_vec, k=5,
                    namespace="user-alice",
                    filter={"type": "goal", "quarter": "Q2-2026"})
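
Because the filter runs after candidate retrieval, a filtered search can come back with fewer than k results when few of the nearest candidates match. The sketch below illustrates post-filtering in general terms. The post_filter function and the candidate list are hypothetical and say nothing about Feather DB's internals.

```python
def post_filter(candidates, required, k):
    """Keep candidates whose attributes match every required key/value.

    `candidates` is assumed to be ordered best-first, as an ANN search
    would return them. The traversal cost is already paid by this point.
    """
    matches = [
        c for c in candidates
        if all(c["attributes"].get(key) == val for key, val in required.items())
    ]
    return matches[:k]


# Hypothetical ANN output for one namespace, ordered by similarity.
ann_candidates = [
    {"text": "Ship API v2 by June.", "attributes": {"type": "goal", "quarter": "Q2-2026"}},
    {"text": "Works on a FastAPI backend.", "attributes": {"type": "fact"}},
    {"text": "Hire two engineers.", "attributes": {"type": "goal", "quarter": "Q3-2026"}},
]

goals_q2 = post_filter(ann_candidates, {"type": "goal", "quarter": "Q2-2026"}, k=5)
```

Here k is 5 but only one candidate survives the filter. When filters are selective, a common workaround is to request a larger k and truncate after filtering.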

Note the correct API for setting attributes: use meta.set_attribute(key, val), not meta.attributes[key] = val. Assigning through meta.attributes does not persist, because pybind11 hands Python a copy of the underlying C++ map rather than a reference to it, so the write lands on a temporary dict.
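
The copy behaviour is easy to reproduce in pure Python. The class below is a hypothetical mock, not Feather DB's binding: its attributes property returns a fresh dict on every access, which mimics a binding that converts a C++ map to a Python dict by value.

```python
class MockMeta:
    """Hypothetical mock of a metadata object whose dict getter copies."""

    def __init__(self):
        self._store = {}  # stands in for the underlying C++ map

    @property
    def attributes(self):
        # Returns a new dict each time, like a by-value conversion.
        return dict(self._store)

    def set_attribute(self, key, val):
        # Writes through to the underlying storage.
        self._store[key] = val


meta = MockMeta()

meta.attributes["type"] = "goal"    # mutates a temporary copy, then discards it
lost = "type" in meta.attributes    # the write never reached _store

meta.set_attribute("type", "goal")  # writes to the real storage
kept = meta.attributes.get("type")
```

The direct assignment raises no error, which is what makes the bug easy to miss: the value simply is not there on the next read.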

The multi-tenant SaaS pattern

For a SaaS product serving many users, the recommended pattern is:

namespace = user_id — hard tenant boundary, enforced at the DB level
entity = topic_category — logical grouping ("preferences", "work", "personal", "current-project")
attributes — secondary metadata for filtering (type, source, confidence, created_at)

import feather_db as fdb
from datetime import datetime, timezone
from typing import Optional

db = fdb.DB.open("saas_memory.feather", dim=768)

def add_user_memory(user_id: str, text: str, category: str,
                    memory_type: str, importance: float = 1.0):
    vec = embed(text)
    mem = db.add(vec, text=text,
                 namespace=user_id,
                 entity=category)
    mem.meta.set_attribute("type", memory_type)
    # Timezone-aware UTC timestamp in ISO 8601 format
    mem.meta.set_attribute("created_at", datetime.now(timezone.utc).isoformat())
    mem.meta.set_attribute("importance", importance)
    return mem

def search_user_memory(user_id: str, query: str,
                       category: Optional[str] = None, k: int = 5):
    vec = embed(query)
    return db.search(vec, k=k,
                     namespace=user_id,
                     entity=category)  # None = search all entities

# Usage
add_user_memory("user-42", "User works at a fintech startup.",
                category="work-context",
                memory_type="fact",
                importance=1.5)

add_user_memory("user-42", "User mentioned anxiety about the Series A.",
                category="emotional-state",
                memory_type="observation",
                importance=0.8)

results = search_user_memory("user-42",
                              "What is the user working on professionally?")

Namespace vs entity vs attribute: when to use which

Isolation layer	Scope	Use for	Search impact
namespace	Hard partition	Tenant boundaries (user_id, agent_id)	ANN subgraph — only this namespace searched
entity	Logical group	Topic categories, document sources	ANN subgraph within namespace
attribute filter	Key-value match	Type, status, date, confidence	Post-filter over ANN candidates

Namespaces are the strongest guarantee: memories in different namespaces never mix. Entities subdivide within a namespace and also use subgraph traversal. Attribute filters are the most flexible but weakest isolation — they filter after ANN, so they don't reduce traversal cost, only result set size.

Operational considerations

A single .feather file can hold thousands of namespaces. The file format stores namespace metadata in the header; the HNSW subgraphs are stored contiguously per namespace. This means a 100-user deployment and a 10,000-user deployment use the same file format and the same client code — you don't need separate files per user or a separate database per tenant.

For very large deployments (100K+ namespaces), consider sharding by namespace hash across multiple .feather files with a routing layer. Each shard file still benefits from the parallel HNSW load (4.7× faster cold start) and int8 RAM quantization (1.7× less RAM).
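
A routing layer of this kind can be very small. The following is a minimal sketch under stated assumptions: shard_for, shard_path, and the file-naming scheme are hypothetical, and nothing here is part of Feather DB. It uses hashlib rather than Python's built-in hash(), because hash() on strings is randomized per process and would send the same namespace to different shards after a restart.

```python
import hashlib


def shard_for(namespace: str, num_shards: int) -> int:
    """Map a namespace to a shard index with a hash that is stable across runs."""
    digest = hashlib.sha256(namespace.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def shard_path(namespace: str, num_shards: int, prefix: str = "memory_shard") -> str:
    """Hypothetical naming scheme: memory_shard_0003.feather and so on."""
    return f"{prefix}_{shard_for(namespace, num_shards):04d}.feather"


# The same namespace always routes to the same file.
path_a = shard_path("user-42", num_shards=16)
path_b = shard_path("user-42", num_shards=16)
```

Plain modulo routing remaps most namespaces when the shard count changes, so either fix the shard count up front or use consistent hashing if you expect to add shards later.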

Install: pip install feather-db · GitHub: github.com/feather-store/feather