Architecture · 7 min read · June 16, 2026

What Is MCP (Model Context Protocol)? And Why It Needs a Context Engine

MCP is Anthropic's open protocol for wiring AI models to external tools and data. feather-serve turns Feather DB into an MCP server — giving Claude 14 memory tools, persistent semantic recall, and a context engine that survives across conversations.

Feather DB
Engineering

What MCP is

Model Context Protocol (MCP) is an open protocol published by Anthropic that standardizes how AI models communicate with external tools and data sources. Think of it as the USB-C of AI integrations: before MCP, every tool integration required its own bespoke API wrapper, auth flow, and message format. After MCP, any compliant host (Claude Desktop, Claude Code, Cursor, a custom agent loop) can connect to any compliant server using the same protocol.

MCP is model-agnostic by design. The protocol defines the handshake, the tool schema format, and the request/response envelope, and nothing in it is specific to Anthropic's models. Any host that implements MCP can connect its model to MCP servers, whether that model is Claude, GPT-4o, or Gemini.

The spec is open and available at modelcontextprotocol.io.

Why MCP exists

LLMs are stateless by default. They receive a prompt, generate a response, and stop. They have no native way to read a file, call an API, or remember what happened last Tuesday. Before MCP, connecting a model to external capabilities meant writing custom integration code for every tool and every model combination.

MCP standardizes this at the protocol level. An MCP server exposes a list of tools — callable functions with a name, description, and input schema. The host (Claude Desktop, for example) fetches the tool list at startup and makes it available to the model. When the model decides to use a tool, it emits a structured tool-call; the host routes it to the server; the server executes and returns a result; the model uses the result to construct its response.
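To make the tool model concrete, here is a minimal sketch of the server side: a tool descriptor with a name, a description, and a JSON Schema for its input, plus a dispatcher that routes a structured call to a handler. The echo_upper tool, the REGISTRY dict, and the helper names are hypothetical and exist only for illustration; a real MCP server would use an SDK rather than hand-rolled dicts.

```python
import json
from typing import Any, Callable

# Hypothetical tool descriptor in the shape a server advertises:
# name, human-readable description, and a JSON Schema for the input.
ECHO_TOOL = {
    "name": "echo_upper",  # illustrative tool, not part of any real server
    "description": "Return the given text in upper case.",
    "inputSchema": {
        "type": "object",
        "properties": {"text": {"type": "string"}},
        "required": ["text"],
    },
}


def echo_upper(arguments: dict[str, Any]) -> str:
    return arguments["text"].upper()


# Map each tool name to its descriptor and its handler.
REGISTRY: dict[str, tuple[dict, Callable[[dict], str]]] = {
    "echo_upper": (ECHO_TOOL, echo_upper),
}


def _error(message: str) -> dict:
    return {"isError": True, "content": [{"type": "text", "text": message}]}


def list_tools() -> list[dict]:
    """What the host fetches at startup and shows to the model."""
    return [descriptor for descriptor, _ in REGISTRY.values()]


def call_tool(name: str, arguments: dict[str, Any]) -> dict:
    """Route a structured tool call to its handler and wrap the result."""
    if name not in REGISTRY:
        return _error(f"unknown tool: {name}")
    descriptor, handler = REGISTRY[name]
    required = descriptor["inputSchema"].get("required", [])
    missing = [key for key in required if key not in arguments]
    if missing:
        return _error(f"missing arguments: {missing}")
    return {"isError": False, "content": [{"type": "text", "text": handler(arguments)}]}


print(json.dumps(call_tool("echo_upper", {"text": "hello"})))
```

The host never needs to know what echo_upper does internally. It only needs the descriptor to show the model and the dispatcher to route the call.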

The key insight: tool integration becomes a deployment concern, not a code concern. Once a server is MCP-compliant, any compliant host can use it without writing any glue code.

MCP transport: stdio, HTTP, SSE

The current MCP specification defines two standard transports, stdio and Streamable HTTP. SSE appears in practice as a third mode, so it is listed here as well:

  • stdio: The host spawns the MCP server as a subprocess and communicates over stdin/stdout. This is the default for local tool servers. Low latency, no network required, dies when the host process dies.
  • HTTP (Streamable HTTP): The MCP server runs as an HTTP service. The host sends requests and receives responses over standard HTTP. Works across machines and survives host restarts.
  • SSE (Server-Sent Events): The server keeps a connection open and pushes events to the host. Useful for long-running tool calls or real-time streaming results. In current spec revisions SSE is an optional streaming mode within Streamable HTTP; the standalone HTTP+SSE transport from earlier revisions is deprecated but still supported by some hosts and servers.

Feather DB's feather-serve uses HTTP/SSE transport, exposing its MCP endpoint at /mcp. This means the context engine can run locally on port 7700, on a remote server, or on Feather Cloud — and Claude connects to it the same way in all three cases.
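Whatever the transport, the payload is the same JSON-RPC 2.0 message. The sketch below shows the stdio framing as we understand it from the spec: one JSON object per line, with no embedded newlines. The write_message and read_messages helpers are hypothetical names, and an in-memory StringIO stands in for the subprocess pipe.

```python
import io
import json


def write_message(stream, message: dict) -> None:
    """Serialize one JSON-RPC message as a single newline-terminated line."""
    # json.dumps escapes newlines inside strings, so the frame stays on one line.
    stream.write(json.dumps(message, separators=(",", ":")) + "\n")


def read_messages(stream):
    """Yield one parsed message per non-empty line."""
    for line in stream:
        line = line.strip()
        if line:
            yield json.loads(line)


request = {"jsonrpc": "2.0", "id": 1, "method": "tools/list"}

pipe = io.StringIO()  # stand-in for the server's stdin
write_message(pipe, request)
write_message(pipe, {
    "jsonrpc": "2.0",
    "id": 2,
    "method": "tools/call",
    "params": {"name": "feather_health", "arguments": {}},
})
pipe.seek(0)

received = list(read_messages(pipe))
print(received)
```

Over HTTP the same objects travel in request and response bodies instead of lines on a pipe, which is why the tool layer does not change when the transport does.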

How Claude uses MCP: the request lifecycle

Here is the full lifecycle of an MCP-backed Claude conversation:

  1. Startup: Claude Desktop reads claude_desktop_config.json and connects to the configured MCP servers. After the initialization handshake, it sends each server a tools/list request to get the available tool schemas.
  2. Conversation start: The tool schemas are injected into Claude's context. Claude now "knows" what tools exist, what they accept, and what they return.
  3. Tool decision: During a conversation, Claude decides to use a tool. It emits a structured tool_use block with the tool name and arguments.
  4. Execution: The host (Claude Desktop) intercepts the tool_use block, routes the call to the appropriate MCP server, and waits for a result.
  5. Result injection: The MCP server returns a result. The host injects it back into the conversation as a tool_result block.
  6. Synthesis: Claude reads the tool result and continues generating its response, now grounded in the external data.

This cycle can repeat multiple times in a single turn. Claude might call feather_search to check what it already knows, read the results, call feather_context_chain to traverse connected memories, and then synthesize a response from both — all before producing a single line of visible text.
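The loop the host runs can be sketched in a few lines. Everything here is a stand-in: scripted_model plays the role of Claude, fake_search plays the role of an MCP server, and the message dicts are a simplified shape loosely modeled on tool_use and tool_result blocks, not the real API format.

```python
from typing import Callable


def fake_search(arguments: dict) -> str:
    """Stand-in for a remote tool; a real host would call the MCP server here."""
    return "User stack: FastAPI, PostgreSQL"


TOOLS: dict[str, Callable[[dict], str]] = {"feather_search": fake_search}


def scripted_model(messages: list[dict]) -> dict:
    """Stand-in for the model: request a tool once, then answer."""
    tool_results = [m for m in messages if m["type"] == "tool_result"]
    if not tool_results:
        return {
            "type": "tool_use",
            "id": "call_1",
            "name": "feather_search",
            "input": {"query": "user stack", "k": 5},
        }
    return {"type": "text", "text": f"Based on memory: {tool_results[-1]['content']}"}


def run_turn(user_text: str, max_steps: int = 5) -> tuple[str, list[dict]]:
    """Host loop: call the model, execute any tool call, feed the result back."""
    messages = [{"type": "user", "text": user_text}]
    for _ in range(max_steps):
        reply = scripted_model(messages)
        messages.append(reply)
        if reply["type"] != "tool_use":
            return reply["text"], messages  # final visible answer
        result = TOOLS[reply["name"]](reply["input"])
        messages.append(
            {"type": "tool_result", "tool_use_id": reply["id"], "content": result}
        )
    raise RuntimeError("tool loop did not finish within max_steps")


answer, transcript = run_turn("Help me debug a slow database query.")
print(answer)
```

The max_steps bound is a design choice for the sketch: since the cycle can repeat, a host needs some limit so a model that keeps requesting tools cannot loop forever.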

Feather DB's MCP server: feather-serve

Feather DB v0.14.0 ships feather-serve as a first-class MCP server. It exposes 14 tools covering the full lifecycle of a context engine: store, retrieve, traverse, link, inspect, and manage.

Tool                      What it does
feather_search            ANN semantic search over the memory store
feather_add               Store a new memory node (text embedded by feather-serve)
feather_context_chain     Semantic search + BFS graph traversal in one call
feather_link              Add a typed edge between two memory nodes
feather_get               Retrieve a node by ID
feather_delete            Remove a node and its edges
feather_keyword_search    BM25 keyword search
feather_batch_add         Add multiple nodes in parallel
feather_set_metadata      Update node attributes
feather_get_metadata      Read node attributes
feather_list_edges        List edges for a node
feather_graph_stats       Count nodes, edges, namespaces
feather_list_namespaces   Show all namespaces in the store
feather_health            Liveness check

The key distinction from a generic vector search tool: feather_context_chain is not a lookup. It seeds with ANN search, then traverses typed graph edges — surfacing not just the closest memory but its connected context. A stored memory about a bug fix can be linked to the original problem report, the architectural decision that caused it, and the preference that drove the decision. One call, all of it.
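The seed-then-traverse idea can be illustrated with a toy in-memory graph. This is not Feather DB's implementation: the NODES and EDGES data, the two-dimensional vectors, and the context_chain function are all hypothetical, and a brute-force cosine ranking stands in for the ANN index.

```python
import math
from collections import deque

# Toy memory store: node id -> (text, embedding). All values are made up.
NODES = {
    "bug": ("Slow query bug fix", [1.0, 0.0]),
    "report": ("Original problem report", [0.6, 0.8]),
    "decision": ("Architectural decision", [0.0, 1.0]),
    "unrelated": ("Lunch order", [-1.0, 0.0]),
}

# Typed, directed edges: node id -> [(edge_type, neighbor id)].
EDGES = {
    "bug": [("resolves", "report"), ("caused_by", "decision")],
    "report": [],
    "decision": [],
    "unrelated": [],
}


def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))


def context_chain(query_vec: list[float], k: int = 1, max_hops: int = 2) -> dict[str, int]:
    """Seed with the k nearest nodes, then BFS outward along edges.

    Returns node id -> hop distance from the nearest seed.
    """
    ranked = sorted(NODES, key=lambda n: cosine(query_vec, NODES[n][1]), reverse=True)
    seeds = ranked[:k]
    hops = dict.fromkeys(seeds, 0)
    queue = deque(seeds)
    while queue:
        node = queue.popleft()
        if hops[node] >= max_hops:
            continue  # do not expand past the hop limit
        for _edge_type, neighbor in EDGES.get(node, []):
            if neighbor not in hops:
                hops[neighbor] = hops[node] + 1
                queue.append(neighbor)
    return hops


print(context_chain([1.0, 0.1]))
```

A plain nearest-neighbor lookup with k=1 would return only the bug fix. The traversal also surfaces the report and the decision, even though their vectors are not the closest to the query.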

Starting feather-serve

Install Feather DB and start the MCP server with a real embedding provider:

pip install feather-db

# Gemini embeddings (768-dim, free tier available)
GOOGLE_API_KEY=your-key feather-serve ~/feather-memory/claude_memory.feather \
  --embed-provider gemini \
  --dim 768 \
  --port 7700

# OR: OpenAI text-embedding-3-small (1536-dim)
OPENAI_API_KEY=your-key feather-serve ~/feather-memory/claude_memory.feather \
  --embed-provider openai \
  --dim 1536 \
  --port 7700

# OR: fully offline with Ollama (nomic-embed-text produces 768-dim vectors)
feather-serve ~/feather-memory/claude_memory.feather \
  --embed-provider ollama \
  --ollama-model nomic-embed-text \
  --dim 768 \
  --port 7700

The server prints: Feather DB serving at http://localhost:7700 | MCP at http://localhost:7700/mcp

An admin UI is also available at http://localhost:7700/admin/ for inspecting stored memories, running manual searches, and viewing the graph.

Claude Desktop: claude_desktop_config.json

Once feather-serve is running, tell Claude Desktop about it by editing ~/Library/Application Support/Claude/claude_desktop_config.json (macOS) — or the equivalent on Windows (%APPDATA%\Claude\claude_desktop_config.json) and Linux (~/.config/Claude/claude_desktop_config.json):

{
  "mcpServers": {
    "feather-memory": {
      "url": "http://localhost:7700/mcp"
    }
  }
}

Restart Claude Desktop. The feather-memory server appears in the tools list (hammer icon in the message input area). Claude can now call any of the 14 Feather DB tools mid-conversation.
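If the config file already lists other servers, hand-editing risks clobbering them. A small script can merge the new entry instead. The add_mcp_server helper below is a hypothetical convenience, not part of Feather DB or Claude Desktop, and the demo writes to a temporary directory so it never touches a real config.

```python
import json
import tempfile
from pathlib import Path


def add_mcp_server(config_path: Path, name: str, url: str) -> dict:
    """Merge one URL-based server entry into the config, keeping existing ones."""
    text = config_path.read_text().strip() if config_path.exists() else ""
    config = json.loads(text) if text else {}
    config.setdefault("mcpServers", {})[name] = {"url": url}
    config_path.write_text(json.dumps(config, indent=2) + "\n")
    return config


# Demo against a throwaway file; point config_path at the real file to apply it.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = Path(tmp) / "claude_desktop_config.json"
    demo_path.write_text(json.dumps({"mcpServers": {"other": {"command": "some-server"}}}))
    merged = add_mcp_server(demo_path, "feather-memory", "http://localhost:7700/mcp")
    print(json.dumps(merged, indent=2))
```

Back up the real file before running anything like this against it, since a malformed existing config will make json.loads raise rather than silently overwrite.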

If you're running an older version of Claude Desktop that requires the SSE bridge pattern, use:

{
  "mcpServers": {
    "feather-memory": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-http-sse"],
      "env": {
        "MCP_SERVER_URL": "http://localhost:7700/mcp"
      }
    }
  }
}

What this enables: Claude remembers across conversations

Claude's context window resets on every new conversation. Without external memory, everything you told Claude last session is gone. With Feather DB as an MCP backend, the memory persists in the .feather file on disk, and Claude can retrieve it through a tool call as soon as a new conversation needs it.

A session with Feather DB MCP looks like this:

User: Remember that I'm building a FastAPI backend for a fintech startup. We use PostgreSQL, async/await throughout, and I prefer detailed code examples over high-level explanations.

[Claude calls feather_batch_add, then feather_link: three linked memory nodes stored with importance=0.9]

Claude: Stored. I'll keep that context for future sessions.

Next conversation, days later:

User: Help me debug a slow database query.

[Claude calls feather_context_chain — ANN search for "database query performance" seeds the graph, BFS traversal surfaces the FastAPI project node and its linked preferences node]

Claude: Based on what I know about your stack — FastAPI, PostgreSQL, async/await — here's what to check first...

No re-explanation. No context preamble. Claude retrieved it.

v0.14.0 MCP Remote Backend: feather-serve over the network

v0.14.0 made feather-serve network-native. You can run the context engine on a remote machine and have Claude connect over the network: the same 14 MCP tools, served from a different host.

# Remote server: start feather-serve pointing at Feather Cloud
feather-serve \
  --api-url https://your-cloud.example.com \
  --embed-provider gemini \
  --dim 768 \
  --port 8001

# claude_desktop_config.json — points at the remote server
{
  "mcpServers": {
    "feather-team": {
      "url": "https://your-remote-server.example.com:8001/mcp"
    }
  }
}

This is the pattern for teams: one shared Feather Cloud instance, multiple Claude Desktop users, each connecting via MCP. Shared memory, namespace-isolated per user or project.

v0.15.1 real embedders: --embed-provider

Before v0.15.1, feather_add and feather_search required pre-embedded vectors. Claude can't generate float32 embedding arrays — it would need an intermediate step. v0.15.1 solved this by moving embedding inside feather-serve.

With --embed-provider set, Claude passes raw text to feather_add and feather_search. feather-serve calls the embedding API, gets the vector, and proceeds. The tool schema accepts plain strings. No embedding step is visible anywhere in the MCP tool interface.

What Claude actually sends to feather_search:

{
  "tool": "feather_search",
  "arguments": {
    "query": "what programming language does the user prefer?",
    "k": 5
  }
}
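The payload above is a simplified view. On the wire, MCP frames tool calls as JSON-RPC 2.0 requests with the method tools/call, and the tool name and arguments sit under params. A minimal sketch of building that envelope follows; tools_call_request is a hypothetical helper name, and the id counter is just one simple way to keep request ids unique.

```python
import json
from itertools import count

_ids = count(1)  # JSON-RPC request ids must be unique per connection


def tools_call_request(name: str, arguments: dict) -> dict:
    """Wrap a tool name and its arguments in a JSON-RPC 2.0 tools/call request."""
    return {
        "jsonrpc": "2.0",
        "id": next(_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


request = tools_call_request(
    "feather_search",
    {"query": "what programming language does the user prefer?", "k": 5},
)
print(json.dumps(request, indent=2))
```

The host builds this envelope, not the model. Claude only produces the tool name and the arguments.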

What feather-serve does internally:

  1. Calls the embedding API with "what programming language does the user prefer?"
  2. Gets a 768-dim float32 vector back
  3. Runs ANN search against the HNSW index
  4. Returns ranked results with adaptive decay scores applied

Claude never touches a vector. The memory abstraction is complete.
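The four internal steps can be sketched end to end with toy parts. None of this is Feather DB's actual code: toy_embed is a hashed bag-of-words stand-in for an embedding API, a brute-force scan stands in for the HNSW index, and the decayed_score formula (exponential time decay whose half-life grows with recall count) is an assumption chosen only to illustrate the "frequently recalled = stickier" idea.

```python
import math
import zlib

DIM = 16  # toy dimension; the Gemini example in this article uses 768


def toy_embed(text: str) -> list[float]:
    """Hypothetical stand-in for an embedding API: hashed bag of words."""
    vec = [0.0] * DIM
    for word in text.lower().split():
        vec[zlib.crc32(word.encode()) % DIM] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


def decayed_score(similarity: float, age_days: float, recalls: int,
                  half_life_days: float = 30.0) -> float:
    """Assumed rule: the score halves every half-life, and each recall
    stretches the half-life so recalled memories fade more slowly."""
    effective_half_life = half_life_days * (1 + recalls)
    return similarity * 0.5 ** (age_days / effective_half_life)


def search(query: str, memories: list[dict], k: int = 5) -> list[tuple[float, str]]:
    """Embed the query, score every memory, and return the top k."""
    q = toy_embed(query)
    scored = []
    for memory in memories:
        similarity = sum(a * b for a, b in zip(q, toy_embed(memory["text"])))
        score = decayed_score(similarity, memory["age_days"], memory["recalls"])
        scored.append((score, memory["text"]))
    return sorted(scored, reverse=True)[:k]


memories = [
    {"text": "user prefers python", "age_days": 90, "recalls": 0},
    {"text": "user prefers python", "age_days": 90, "recalls": 5},
]
results = search("user prefers python", memories)
print(results)
```

Both memories have identical text and age, so their raw similarity is the same. The one recalled five times ranks higher only because its decay is slower.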

Why MCP tools are stateless — and why that's the problem

MCP is a protocol for tool calls. Each call is independent: the server receives a request, executes, and returns. MCP does track a connection-level session between host and server, but it has no concept of accumulated knowledge or memory of prior calls that outlasts that connection.

This is by design — statelessness makes MCP servers simple to build and reason about. But it creates a gap: if every tool call is independent, how does a Claude session accumulate knowledge over time? How does a Monday conversation inform a Friday one?

It doesn't — unless you add a context engine.

MCP provides the interface. Feather DB provides the state. Every feather_add call writes to the persistent .feather file. Every feather_search call reads from it, with adaptive decay applied so frequently-recalled memories rise and stale ones fade. The MCP protocol is stateless; the context engine behind it is emphatically not.
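The split between a stateless interface and a stateful store can be shown with two handlers that keep nothing in process memory. This is only an analogy: a JSON file stands in for the .feather file, keyword matching stands in for semantic search, and handle_add and handle_search are hypothetical names.

```python
import json
import tempfile
from pathlib import Path


def _load(store_path: Path) -> list[str]:
    return json.loads(store_path.read_text()) if store_path.exists() else []


def handle_add(store_path: Path, text: str) -> int:
    """Stateless handler: read the store, append, write it back."""
    records = _load(store_path)
    records.append(text)
    store_path.write_text(json.dumps(records))
    return len(records) - 1  # id of the new record


def handle_search(store_path: Path, keyword: str) -> list[str]:
    """Stateless handler: every call starts from the file, not from memory."""
    return [r for r in _load(store_path) if keyword.lower() in r.lower()]


with tempfile.TemporaryDirectory() as tmp:
    store = Path(tmp) / "memory.json"  # stand-in for a .feather file

    # "Monday" conversation: one independent call that writes.
    handle_add(store, "User builds a FastAPI backend with PostgreSQL")

    # "Friday" conversation: a separate call with no shared process state.
    friday_hits = handle_search(store, "postgresql")
    print(friday_hits)
```

Neither handler remembers the other ran. The continuity lives entirely in the file, which is the role the context engine plays behind the MCP interface.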

This is the architectural pairing: MCP is the standardized wire protocol between Claude and the outside world. Feather DB is the memory substrate that gives the outside world continuity. Without a context engine, MCP tools are sophisticated one-shot functions. With one, they become the access layer for a living knowledge store that gets more useful the longer it runs.

Diagram: Claude + MCP + Feather DB

┌─────────────────────────────────────────────────────────┐
│                    Claude Desktop                        │
│                                                         │
│   User input ──► Claude model                           │
│                      │                                  │
│              "I need context"                           │
│                      │                                  │
│              tool_use: feather_search                   │
│                      │                                  │
└──────────────────────┼──────────────────────────────────┘
                       │  MCP (HTTP/SSE)
                       ▼
┌─────────────────────────────────────────────────────────┐
│                   feather-serve                          │
│                  localhost:7700                          │
│                                                         │
│   /mcp ──► tool router                                  │
│                 │                                       │
│         feather_search("user's stack")                  │
│                 │                                       │
│         embed("user's stack")  ◄── embedding API       │
│                 │                                       │
│         HNSW ANN search                                 │
│                 │                                       │
│         adaptive decay scoring                          │
│                 │                                       │
└─────────────────┼───────────────────────────────────────┘
                  │
                  ▼
┌─────────────────────────────────────────────────────────┐
│              claude_memory.feather                       │
│                                                         │
│   [FastAPI project] ──linked_to──► [PostgreSQL pref]   │
│   [async/await pref]               [code style pref]   │
│   [bug #431 session]──resolves──►  [fix: PR #88]       │
│                                                         │
│   Adaptive decay: frequently recalled = stickier        │
└─────────────────────────────────────────────────────────┘

Getting started

# 1. Install
pip install feather-db

# 2. Start the context engine with semantic embeddings
GOOGLE_API_KEY=your-key feather-serve ~/claude_memory.feather \
  --embed-provider gemini --dim 768 --port 7700

# 3. Add to claude_desktop_config.json
# { "mcpServers": { "feather-memory": { "url": "http://localhost:7700/mcp" } } }

# 4. Restart Claude Desktop — done.

MCP gives Claude the interface. Feather DB gives it the memory. The combination is what makes a context engine actually work across sessions.

Install: pip install feather-db · GitHub: github.com/feather-store/feather · Docs: getfeather.store/docs/integrations