kb/openspec/changes/archive/2026-04-04-kb-v3-mcp-server/proposal.md
2026-04-04 22:50:19 +01:00

Why

The kb engine exposes a well-structured REST API, but agent integration today goes through a Claude Code skill that shells out to the Go CLI binary, parses JSON output, and re-synthesises results. This works but is indirect: subprocess overhead on every call, fragile output parsing, no streaming, and no composability with other MCP tools in the same session. As agents increasingly rely on kb for both document retrieval and memory storage, this friction compounds.

At the same time, there is no way to scope searches to "agent memory" vs "user documents" without careful manual tagging, and no way to update an existing note in place without delete + re-add. These gaps cause agents to accumulate stale duplicates and pollute the user's document index with internal memory notes.

kb v3 adds an MCP server as a new integration surface alongside the existing CLI, establishes collection tag conventions for scoped search, and adds note mutation to support the agent memory use case natively.

What Changes

1. MCP Server (new component)

A Model Context Protocol server that exposes kb operations as native MCP tools. Runs as a separate Docker container alongside the engine, using Streamable HTTP transport. Translates MCP tool calls into engine HTTP API calls.

MCP tool surface:

| MCP Tool | Maps to Engine API | Notes |
|---|---|---|
| kb_search | POST /api/v1/search | Query, top_n, tags, doc_type, collection, mode |
| kb_addnote | POST /api/v1/jobs | Body text, tags, collection (default: documents) |
| kb_upload_start | (MCP server internal) | Start chunked upload: filename, size, tags, collection → returns upload_id |
| kb_upload_chunk | (MCP server internal) | Append base64 chunk to staging: upload_id, data, chunk_index |
| kb_upload_finish | POST /api/v1/jobs | Reassemble chunks, decode, proxy as multipart upload → returns job_id |
| kb_update_note | PATCH /api/v1/notes/{id} | Replace note text, re-chunk and re-embed in place |
| kb_get | GET /api/v1/documents | Retrieve by document ID or source_path |
| kb_status | GET /api/v1/status | Index health, doc counts, model info, queue state |
| kb_jobs | GET /api/v1/jobs | Check ingestion queue status |
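
The three upload tools above could be backed by a small staging area inside the MCP server. The sketch below is one possible shape, not the proposed implementation: the class name UploadStaging and its method names are hypothetical, and it assumes each chunk is base64-encoded independently and that size means the decoded byte length.

```python
import base64
import uuid


class UploadStaging:
    """In-memory staging for chunked uploads (hypothetical helper)."""

    def __init__(self):
        self._uploads = {}

    def start(self, filename, size, tags=None, collection="documents"):
        # Corresponds to kb_upload_start: record metadata, hand back an upload_id.
        upload_id = uuid.uuid4().hex
        self._uploads[upload_id] = {
            "filename": filename,
            "size": size,  # assumed to be the decoded size in bytes
            "tags": list(tags or []) + [f"collection:{collection}"],
            "chunks": {},
        }
        return upload_id

    def add_chunk(self, upload_id, data, chunk_index):
        # Corresponds to kb_upload_chunk: keep the base64 text keyed by index,
        # so chunks may arrive in any order.
        self._uploads[upload_id]["chunks"][chunk_index] = data

    def finish(self, upload_id):
        # Corresponds to the first half of kb_upload_finish: reassemble and
        # decode. Proxying to POST /api/v1/jobs is left to the caller.
        upload = self._uploads[upload_id]
        chunks = upload["chunks"]
        if sorted(chunks) != list(range(len(chunks))):
            raise ValueError("chunk indexes must form a gap-free 0..N-1 sequence")
        payload = b"".join(base64.b64decode(chunks[i]) for i in range(len(chunks)))
        if len(payload) != upload["size"]:
            raise ValueError("reassembled size does not match declared size")
        # Only discard the staged data once validation has passed.
        del self._uploads[upload_id]
        return upload["filename"], payload, upload["tags"]
```

Keying chunks by chunk_index rather than appending in arrival order means a retried or reordered chunk call does not corrupt the reassembled file. A real server would also need expiry for abandoned uploads, which this sketch omits.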

The collection parameter on kb_search, kb_addnote, and kb_upload_start is translated by the MCP server into tags (on write) or tag filters (on search) using the convention collection:<name> (e.g. collection:memory). No engine changes are required for collections.
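
As a rough illustration of that translation, the helpers below turn a collection name into a collection:<name> tag. The function names are hypothetical, and treating a missing collection on search as "no scoping" is an assumption; the proposal only states a default (documents) for writes.

```python
def collection_tag(name):
    """Build the conventional tag for a collection, e.g. collection:memory."""
    return f"collection:{name}"


def tags_for_write(tags=None, collection="documents"):
    """Tags to attach when adding a note or file (default collection: documents)."""
    result = list(tags or [])
    tag = collection_tag(collection)
    if tag not in result:
        result.append(tag)
    return result


def tags_for_search(tags=None, collection=None):
    """Tag filters for a search; no collection means an unscoped search (assumption)."""
    result = list(tags or [])
    if collection is not None:
        tag = collection_tag(collection)
        if tag not in result:
            result.append(tag)
    return result
```

For example, tags_for_write(["project-x"], "memory") yields ["project-x", "collection:memory"], which the server would pass to the engine as ordinary tags.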

2. Collection Tag Conventions (no engine changes)

Scoped document organisation using existing tags with a naming convention.

  • Convention: collection:documents (default), collection:memory, collection:workspace
  • MCP tools accept a collection parameter and translate to tag operations
  • The Go CLI can use the same convention via --tags collection:memory
  • No new schema, no new API parameters on the engine — uses existing tag infrastructure

3. Note Mutation (engine extension)

Allow existing notes to be updated in place without delete + re-add.

  • PATCH /api/v1/notes/{id} endpoint — accepts new text, re-chunks and re-embeds
  • Preserves original created_at, updates updated_at
  • kb updatenote <id> "new text" CLI command
  • kb_update_note MCP tool
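
The timestamp behaviour described above can be sketched against SQLite as follows. This is a minimal illustration, not the engine's code: the function name update_note and the text column are assumptions, and the re-chunk and re-embed step is only marked by a comment.

```python
import sqlite3
from datetime import datetime, timezone


def update_note(conn, note_id, new_text, now=None):
    """Replace a note's text in place, leaving created_at untouched."""
    now = now or datetime.now(timezone.utc).isoformat()
    cur = conn.execute(
        # created_at is deliberately absent from the SET clause.
        "UPDATE documents SET text = ?, updated_at = ? WHERE id = ?",
        (new_text, now, note_id),
    )
    if cur.rowcount == 0:
        raise KeyError(note_id)
    # The real endpoint would re-chunk and re-embed the new text here.
    conn.commit()
    return conn.execute(
        "SELECT id, text, created_at, updated_at FROM documents WHERE id = ?",
        (note_id,),
    ).fetchone()
```

Because the row is updated rather than deleted and re-inserted, the document ID stays stable, so anything that refers to the note by ID keeps working.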

4. Agent-Side Search Patterns (no engine changes)

Query expansion and reranking are caller responsibilities, not engine features. The calling agent already has an LLM — adding one inside the engine would duplicate capability, introduce a cloud API dependency into a fully local system, and complicate testing.

Query expansion — the agent expands its query into 2-3 variant phrasings, makes multiple kb_search calls, and merges/deduplicates results in its own context. The MCP tool descriptions should document this as a recommended pattern for complex natural-language questions.
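
The merge and deduplicate step of that pattern might look like the sketch below. It is illustrative only: search stands in for a kb_search call, and the chunk_id and score fields, along with the idea that scores are comparable across queries, are assumptions about the result shape rather than documented behaviour.

```python
def merged_search(search, queries, top_n=5):
    """Run each query variant and merge hits, keeping the best score per chunk."""
    best = {}
    for query in queries:
        for hit in search(query, top_n):
            key = hit["chunk_id"]  # assumed unique identifier for a chunk
            if key not in best or hit["score"] > best[key]["score"]:
                best[key] = hit
    # Highest score first (assumes higher means more relevant).
    ranked = sorted(best.values(), key=lambda h: h["score"], reverse=True)
    return ranked[:top_n]
```

An agent would typically call this with the original question plus two or three rephrasings, then read the merged list before answering.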

Reranking — the agent reads the top N search results and applies its own judgement to reorder by relevance. This is what agents already do when synthesising answers from retrieved chunks.

These patterns should be documented in the MCP tool descriptions and the kb skill guidance, not implemented as engine features.

Capabilities

New Capabilities

  • mcp-server: MCP protocol server exposing kb tools (search, addnote, chunked file upload, update_note, get, status, jobs) for native agent integration. Runs as a Docker container with Streamable HTTP transport. Calls engine HTTP API internally. File uploads use a three-step chunked pattern (start → chunk × N → finish) to avoid message size limits, then proxy to the engine's existing upload endpoint.
  • note-mutation: In-place update of existing notes. New PATCH endpoint re-chunks and re-embeds while preserving document identity and creation timestamp.
  • agent-search-patterns: Documented patterns for agent-side query expansion (multi-query + merge) and reranking (LLM-based result reordering). No engine changes — these are caller responsibilities, documented in MCP tool descriptions and skill guidance.

Modified Capabilities

  • engine-api: New endpoint for note mutation (PATCH /api/v1/notes/{id}). documents table gains updated_at column.
  • go-client: New updatenote command.
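
One way the updated_at column could be added without disturbing existing rows is an idempotent check-then-alter migration. The sketch below assumes a hypothetical helper named ensure_updated_at; the actual migration logic in kb/database.py may differ.

```python
import sqlite3


def ensure_updated_at(conn):
    """Add documents.updated_at if it is missing; return True if it was added."""
    # PRAGMA table_info yields one row per column; index 1 is the column name.
    columns = {row[1] for row in conn.execute("PRAGMA table_info(documents)")}
    if "updated_at" in columns:
        return False
    # ADD COLUMN leaves existing rows intact; their updated_at is NULL.
    conn.execute("ALTER TABLE documents ADD COLUMN updated_at TEXT")
    conn.commit()
    return True
```

Running it on every startup is safe, since the second and later calls find the column and do nothing.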

Impact

  • Code — new: mcp/ directory — MCP server package. Thin adapter translating MCP tool calls to engine HTTP API calls, with base64 file upload decoding.
  • Code — engine: kb/database.py — add updated_at column, migration logic. New kb/routes/notes.py for PATCH endpoint.
  • Code — client: New cmd/updatenote.go. internal/api/client.go for new endpoint.
  • APIs: New PATCH /api/v1/notes/{id}.
  • Dependencies: MCP Python SDK (mcp package) and httpx for the MCP server.
  • Systems: MCP server added to Docker Compose stack. Agents connect to it via Streamable HTTP.
  • Data: SQLite schema migration — updated_at TEXT column on documents table. Non-destructive.
  • Versioning: Engine bumps to v3.0.0 (new endpoint + schema). Client bumps to v3.0.0 (new command). MIN_ENGINE_VERSION updated to v3.0.0.