kb/openspec/changes/archive/2026-04-04-kb-v3-mcp-server/proposal.md at 75e4a0cf730969bdae711a42e54a4b9a533b8126

steve/kb

steve 223ff2cf5d Latest changes all archived

2026-04-04 22:50:19 +01:00

Why

The kb engine exposes a well-structured REST API, but agent integration today goes through a Claude Code skill that shells out to the Go CLI binary, parses JSON output, and re-synthesises results. This works but is indirect: subprocess overhead on every call, fragile output parsing, no streaming, and no composability with other MCP tools in the same session. As agents increasingly rely on kb for both document retrieval and memory storage, this friction compounds.

At the same time, there is no way to scope searches to "agent memory" vs "user documents" without careful manual tagging, and no way to update an existing note in place without delete + re-add. These gaps cause agents to accumulate stale duplicates and pollute the user's document index with internal memory notes.

kb v3 adds an MCP server as a new integration surface alongside the existing CLI, establishes collection tag conventions for scoped search, and adds note mutation to support the agent memory use case natively.

What Changes

1. MCP Server (new component)

A Model Context Protocol server that exposes kb operations as native MCP tools. Runs as a separate Docker container alongside the engine, using Streamable HTTP transport. Translates MCP tool calls into engine HTTP API calls.

MCP tool surface:

| MCP Tool | Maps to Engine API | Notes |
|---|---|---|
| `kb_search` | `POST /api/v1/search` | Query, top_n, tags, doc_type, collection, mode |
| `kb_addnote` | `POST /api/v1/jobs` | Body text, tags, collection (default: `documents`) |
| `kb_upload_start` | (MCP server internal) | Start chunked upload: filename, size, tags, collection → returns upload_id |
| `kb_upload_chunk` | (MCP server internal) | Append base64 chunk to staging: upload_id, data, chunk_index |
| `kb_upload_finish` | `POST /api/v1/jobs` | Reassemble chunks, decode, proxy as multipart upload → returns job_id |
| `kb_update_note` | `PATCH /api/v1/notes/{id}` | Replace note text, re-chunk and re-embed in place |
| `kb_get` | `GET /api/v1/documents` | Retrieve by document ID or source_path |
| `kb_status` | `GET /api/v1/status` | Index health, doc counts, model info, queue state |
| `kb_jobs` | `GET /api/v1/jobs` | Check ingestion queue status |
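
The three upload tools in the table could be backed by a small staging area inside the MCP server. The following is a minimal sketch, not the actual implementation. It assumes the client base64-encodes the whole file once and splits the resulting string into chunks, so the server joins the chunk strings in index order before decoding. The class and method names (`UploadStaging`, `start`, `add_chunk`, `finish`) are hypothetical, and staging is kept in memory for brevity.

```python
import base64
import uuid
from dataclasses import dataclass, field


@dataclass
class PendingUpload:
    filename: str
    size: int        # expected size of the decoded file in bytes
    tags: list
    chunks: dict = field(default_factory=dict)  # chunk_index -> base64 text


class UploadStaging:
    """In-memory staging area (assumption: a real server may stage to disk)."""

    def __init__(self):
        self._uploads = {}

    def start(self, filename, size, tags):
        """kb_upload_start: register an upload and return its upload_id."""
        upload_id = uuid.uuid4().hex
        self._uploads[upload_id] = PendingUpload(filename, size, list(tags))
        return upload_id

    def add_chunk(self, upload_id, chunk_index, data):
        """kb_upload_chunk: store one base64 chunk; arrival order is free."""
        self._uploads[upload_id].chunks[chunk_index] = data

    def finish(self, upload_id):
        """kb_upload_finish: reassemble, decode, and hand back the file.

        The caller would then proxy the bytes to the engine as a
        multipart upload and return the resulting job_id.
        """
        upload = self._uploads.pop(upload_id)
        indices = sorted(upload.chunks)
        if indices != list(range(len(indices))):
            raise ValueError("missing chunk")
        encoded = "".join(upload.chunks[i] for i in indices)
        payload = base64.b64decode(encoded)
        if len(payload) != upload.size:
            raise ValueError("decoded size does not match declared size")
        return upload.filename, payload, upload.tags
```

Checking the declared size at the end is a cheap way to catch a dropped or truncated chunk before anything reaches the engine.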

The collection parameter on the search, addnote, and upload tools is translated by the MCP server into tags using the convention collection:<name> (e.g. collection:memory): a tag filter on search, an added tag on write. No engine changes are required for collections.
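
The translation described above amounts to a few lines of tag handling. This is a sketch under assumptions: the helper names are hypothetical, and the request body keys (`query`, `top_n`, `tags`) mirror the tool parameters listed in the table rather than a confirmed engine schema. Search applies no collection filter unless one is given; addnote falls back to `documents`, as the table states.

```python
from typing import List, Optional

COLLECTION_PREFIX = "collection:"
DEFAULT_WRITE_COLLECTION = "documents"


def collection_tag(name: str) -> str:
    """Build the conventional tag for a collection name."""
    return COLLECTION_PREFIX + name


def apply_collection(tags: List[str], collection: Optional[str]) -> List[str]:
    """Return tags with the collection tag added.

    Any collection tag already present is replaced, so a document or
    query never ends up carrying two different collections.
    """
    if collection is None:
        return list(tags)
    others = [t for t in tags if not t.startswith(COLLECTION_PREFIX)]
    return others + [collection_tag(collection)]


def search_body(query: str, top_n: int = 5,
                tags: Optional[List[str]] = None,
                collection: Optional[str] = None) -> dict:
    """Hypothetical JSON body for POST /api/v1/search."""
    return {
        "query": query,
        "top_n": top_n,
        "tags": apply_collection(tags or [], collection),
    }


def addnote_tags(tags: Optional[List[str]] = None,
                 collection: Optional[str] = None) -> List[str]:
    """Tags for a new note; writes default to the documents collection."""
    return apply_collection(tags or [], collection or DEFAULT_WRITE_COLLECTION)
```

For example, `search_body("deploy notes", collection="memory")` yields a body whose `tags` list is `["collection:memory"]`.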

2. Collection Tag Conventions (no engine changes)

Scoped document organisation using existing tags with a naming convention.

Convention: collection:documents (default), collection:memory, collection:workspace
MCP tools accept a collection parameter and translate to tag operations
The Go CLI can use the same convention via --tags collection:memory
No new schema, no new API parameters on the engine — uses existing tag infrastructure

3. Note Mutation (engine extension)

Allow existing notes to be updated in place without delete + re-add.

PATCH /api/v1/notes/{id} endpoint — accepts new text, re-chunks and re-embeds
Preserves original created_at, updates updated_at
kb updatenote <id> "new text" CLI command
kb_update_note MCP tool
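
The storage side of an in-place update might look like the sketch below. It is illustrative only: the schema (a `body` column, a `chunks` table keyed by `document_id`) is hypothetical, and the fixed-size `chunk_text` helper stands in for the engine's real chunking and embedding pipeline, which this proposal does not describe.

```python
import sqlite3
from datetime import datetime, timezone

# Hypothetical schema for illustration; the engine's real tables may differ.
SCHEMA = """
CREATE TABLE documents (
    id INTEGER PRIMARY KEY,
    body TEXT NOT NULL,
    created_at TEXT NOT NULL,
    updated_at TEXT
);
CREATE TABLE chunks (
    document_id INTEGER NOT NULL REFERENCES documents(id),
    position INTEGER NOT NULL,
    text TEXT NOT NULL
);
"""


def chunk_text(text, size=200):
    """Naive fixed-size chunker, a placeholder for real chunking."""
    return [text[i:i + size] for i in range(0, len(text), size)]


def update_note(conn, doc_id, text):
    """Replace a note's text and chunks; return False if the id is unknown.

    created_at is never touched, so document identity and creation time
    survive the update. Only updated_at moves forward.
    """
    now = datetime.now(timezone.utc).isoformat()
    with conn:  # one transaction: old chunks never coexist with new text
        cur = conn.execute(
            "UPDATE documents SET body = ?, updated_at = ? WHERE id = ?",
            (text, now, doc_id),
        )
        if cur.rowcount == 0:
            return False
        conn.execute("DELETE FROM chunks WHERE document_id = ?", (doc_id,))
        conn.executemany(
            "INSERT INTO chunks (document_id, position, text) VALUES (?, ?, ?)",
            [(doc_id, i, c) for i, c in enumerate(chunk_text(text))],
        )
    return True
```

A PATCH handler built on this would map a `False` return to a 404 response and otherwise queue or run re-embedding for the new chunks.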

4. Agent-Side Search Patterns (no engine changes)

Query expansion and reranking are caller responsibilities, not engine features. The calling agent already has an LLM — adding one inside the engine would duplicate capability, introduce a cloud API dependency into a fully local system, and complicate testing.

Query expansion — the agent expands its query into 2-3 variant phrasings, makes multiple kb_search calls, and merges/deduplicates results in its own context. The MCP tool descriptions should document this as a recommended pattern for complex natural-language questions.

Reranking — the agent reads the top N search results and applies its own judgement to reorder by relevance. This is what agents already do when synthesising answers from retrieved chunks.

These patterns should be documented in the MCP tool descriptions and the kb skill guidance, not implemented as engine features.
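
As an illustration of the query expansion pattern, the merge step on the agent side can be sketched as follows. The `search` callable stands in for a `kb_search` call, and the result shape (dicts with `chunk_id` and `score`, higher score meaning more relevant) is an assumption made for the example, not the engine's documented response format.

```python
from typing import Callable, Dict, Iterable, List


def merged_search(search: Callable[[str], List[dict]],
                  queries: Iterable[str],
                  top_n: int = 5) -> List[dict]:
    """Run each query variant, deduplicate hits, keep the best score.

    A chunk returned by several variants appears once, with the highest
    score any variant gave it.
    """
    best: Dict[str, dict] = {}
    for query in queries:
        for hit in search(query):
            key = hit["chunk_id"]
            if key not in best or hit["score"] > best[key]["score"]:
                best[key] = hit
    ranked = sorted(best.values(), key=lambda h: h["score"], reverse=True)
    return ranked[:top_n]
```

Scores from different query phrasings are only roughly comparable, which is one reason the agent's own reranking pass over the merged list remains useful.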

Capabilities

New Capabilities

mcp-server: MCP protocol server exposing kb tools (search, addnote, chunked file upload, update_note, get, status, jobs) for native agent integration. Runs as a Docker container with Streamable HTTP transport. Calls engine HTTP API internally. File uploads use a three-step chunked pattern (start → chunk × N → finish) to avoid message size limits, then proxy to the engine's existing upload endpoint.
note-mutation: In-place update of existing notes. New PATCH endpoint re-chunks and re-embeds while preserving document identity and creation timestamp.
agent-search-patterns: Documented patterns for agent-side query expansion (multi-query + merge) and reranking (LLM-based result reordering). No engine changes — these are caller responsibilities, documented in MCP tool descriptions and skill guidance.

Modified Capabilities

engine-api: New endpoint for note mutation (PATCH /api/v1/notes/{id}). documents table gains updated_at column.
go-client: New updatenote command.

Impact

Code — new: mcp/ directory — MCP server package. Thin adapter translating MCP tool calls to engine HTTP API calls, with base64 file upload decoding.
Code — engine: kb/database.py — add updated_at column, migration logic. New kb/routes/notes.py for PATCH endpoint.
Code — client: New cmd/updatenote.go. internal/api/client.go for new endpoint.
APIs: New PATCH /api/v1/notes/{id}.
Dependencies: MCP Python SDK (mcp package) and httpx for the MCP server.
Systems: MCP server added to Docker Compose stack. Agents connect to it via Streamable HTTP.
Data: SQLite schema migration — updated_at TEXT column on documents table. Non-destructive.
Versioning: Engine bumps to v3.0.0 (new endpoint + schema). Client bumps to v3.0.0 (new command). MIN_ENGINE_VERSION updated to v3.0.0.
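
The non-destructive schema migration listed under Data could be done idempotently, so it is safe to run on every engine start. The function name below is hypothetical, and the check relies only on standard SQLite features (`PRAGMA table_info` and `ALTER TABLE ... ADD COLUMN`).

```python
import sqlite3


def ensure_updated_at(conn: sqlite3.Connection) -> bool:
    """Add documents.updated_at if it is missing; return True if added.

    Existing rows are left untouched and get NULL in the new column.
    """
    # PRAGMA table_info yields one row per column; index 1 is the name.
    columns = {row[1] for row in conn.execute("PRAGMA table_info(documents)")}
    if "updated_at" in columns:
        return False
    with conn:
        conn.execute("ALTER TABLE documents ADD COLUMN updated_at TEXT")
    return True
```

Leaving `updated_at` as NULL for documents that have never been modified keeps the migration cheap and makes "never updated" distinguishable from "updated at creation time".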
