Why
The kb engine exposes a well-structured REST API, but agent integration today goes through a Claude Code skill that shells out to the Go CLI binary, parses JSON output, and re-synthesises results. This works but is indirect: subprocess overhead on every call, fragile output parsing, no streaming, and no composability with other MCP tools in the same session. As agents increasingly rely on kb for both document retrieval and memory storage, this friction compounds.
At the same time, there is no way to scope searches to "agent memory" vs "user documents" without careful manual tagging, and no way to update an existing note in place without delete + re-add. These gaps cause agents to accumulate stale duplicates and pollute the user's document index with internal memory notes.
kb v3 adds an MCP server as a new integration surface alongside the existing CLI, establishes collection tag conventions for scoped search, and adds note mutation to support the agent memory use case natively.
What Changes
1. MCP Server (new component)
A Model Context Protocol server that exposes kb operations as native MCP tools. Runs as a separate Docker container alongside the engine, using Streamable HTTP transport. Translates MCP tool calls into engine HTTP API calls.
MCP tool surface:
| MCP Tool | Maps to Engine API | Notes |
|---|---|---|
| `kb_search` | `POST /api/v1/search` | Query, top_n, tags, doc_type, collection, mode |
| `kb_addnote` | `POST /api/v1/jobs` | Body text, tags, collection (default: documents) |
| `kb_upload_start` | (MCP server internal) | Start chunked upload: filename, size, tags, collection → returns upload_id |
| `kb_upload_chunk` | (MCP server internal) | Append base64 chunk to staging: upload_id, data, chunk_index |
| `kb_upload_finish` | `POST /api/v1/jobs` | Reassemble chunks, decode, proxy as multipart upload → returns job_id |
| `kb_update_note` | `PATCH /api/v1/notes/{id}` | Replace note text, re-chunk and re-embed in place |
| `kb_get` | `GET /api/v1/documents` | Retrieve by document ID or source_path |
| `kb_status` | `GET /api/v1/status` | Index health, doc counts, model info, queue state |
| `kb_jobs` | `GET /api/v1/jobs` | Check ingestion queue status |
The `collection` parameter on `kb_search`, `kb_addnote`, and `kb_upload_start` is translated by the MCP server into a tag using the convention `collection:<name>` (e.g. `collection:memory`): a tag filter on search, and a tag applied to the document on ingestion. No engine changes are required for collections.
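The translation described above can be sketched as a pair of small helpers. This is a minimal illustration, not the server's actual code: the function names are hypothetical, and it assumes that ingestion falls back to the `documents` collection while a search with no `collection` argument stays unscoped.

```python
COLLECTION_PREFIX = "collection:"
DEFAULT_COLLECTION = "documents"  # default taken from the kb_addnote row above


def tags_for_ingest(tags=None, collection=None):
    """Return the tag list to store, carrying exactly one collection tag."""
    name = collection or DEFAULT_COLLECTION
    # Assumption: the collection parameter wins over any collection:* tag
    # the caller also passed in the plain tag list.
    kept = [t for t in (tags or []) if not t.startswith(COLLECTION_PREFIX)]
    return kept + [COLLECTION_PREFIX + name]


def tags_for_search(tags=None, collection=None):
    """Return the tag filter for a search; unscoped when collection is None."""
    kept = list(tags or [])
    if collection is None:
        return kept
    kept = [t for t in kept if not t.startswith(COLLECTION_PREFIX)]
    return kept + [COLLECTION_PREFIX + collection]
```

Keeping the mapping in one place means the CLI convention (`--tags collection:memory`) and the MCP parameter produce the same stored tag.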
2. Collection Tag Conventions (no engine changes)
Scoped document organisation using existing tags with a naming convention.
- Convention: `collection:documents` (default), `collection:memory`, `collection:workspace`
- MCP tools accept a `collection` parameter and translate to tag operations
- The Go CLI can use the same convention via `--tags collection:memory`
- No new schema, no new API parameters on the engine; uses existing tag infrastructure
3. Note Mutation (engine extension)
Allow existing notes to be updated in place without delete + re-add.
- `PATCH /api/v1/notes/{id}` endpoint: accepts new text, re-chunks and re-embeds
- Preserves original `created_at`, updates `updated_at`
- `kb updatenote <id> "new text"` CLI command
- `kb_update_note` MCP tool
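The timestamp behaviour of the PATCH endpoint could look roughly like the sketch below. It assumes a hypothetical `documents` table with `id`, `text`, `created_at`, and `updated_at` columns (only `updated_at` and the table name come from this proposal) and leaves out re-chunking and re-embedding entirely.

```python
import sqlite3
from datetime import datetime, timezone


def update_note(conn, doc_id, new_text):
    """Replace a note's text in place; return False if the id is unknown."""
    now = datetime.now(timezone.utc).isoformat()
    # created_at is deliberately not in the SET clause, so it is preserved.
    cur = conn.execute(
        "UPDATE documents SET text = ?, updated_at = ? WHERE id = ?",
        (new_text, now, doc_id),
    )
    conn.commit()
    # rowcount is 0 when no row matched, which a route could map to a 404.
    return cur.rowcount == 1
```

Because the row keeps its id, existing references to the note stay valid, which is what avoids the stale duplicates produced by delete + re-add.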
4. Agent-Side Search Patterns (no engine changes)
Query expansion and reranking are caller responsibilities, not engine features. The calling agent already has an LLM — adding one inside the engine would duplicate capability, introduce a cloud API dependency into a fully local system, and complicate testing.
Query expansion — the agent expands its query into 2-3 variant phrasings, makes multiple kb_search calls, and merges/deduplicates results in its own context. The MCP tool descriptions should document this as a recommended pattern for complex natural-language questions.
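The merge/deduplicate step might be sketched as follows. The result shape is an assumption: each hit is taken to be a dict with a `chunk_id` and a `score` where higher is better; the actual `kb_search` response fields may differ.

```python
def merge_results(result_lists, top_n=10):
    """Merge ranked result lists from several queries, one entry per chunk."""
    best = {}
    for results in result_lists:
        for hit in results:
            key = hit["chunk_id"]  # hypothetical field name
            # Keep the highest-scoring occurrence of each chunk.
            if key not in best or hit["score"] > best[key]["score"]:
                best[key] = hit
    ranked = sorted(best.values(), key=lambda h: h["score"], reverse=True)
    return ranked[:top_n]
```

Taking the maximum score per chunk is one simple choice; if scores are not comparable across query variants, a rank-based merge would be a reasonable alternative.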
Reranking — the agent reads the top N search results and applies its own judgement to reorder by relevance. This is what agents already do when synthesising answers from retrieved chunks.
These patterns should be documented in the MCP tool descriptions and the kb skill guidance, not implemented as engine features.
Capabilities
New Capabilities
- `mcp-server`: MCP protocol server exposing kb tools (search, addnote, chunked file upload, update_note, get, status, jobs) for native agent integration. Runs as a Docker container with Streamable HTTP transport. Calls engine HTTP API internally. File uploads use a three-step chunked pattern (start → chunk × N → finish) to avoid message size limits, then proxy to the engine's existing upload endpoint.
- `note-mutation`: In-place update of existing notes. New PATCH endpoint re-chunks and re-embeds while preserving document identity and creation timestamp.
- `agent-search-patterns`: Documented patterns for agent-side query expansion (multi-query + merge) and reranking (LLM-based result reordering). No engine changes; these are caller responsibilities, documented in MCP tool descriptions and skill guidance.
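The three-step upload pattern (start → chunk × N → finish) could be staged along these lines. This is a sketch under stated assumptions: the class and method names are hypothetical, staging is held in memory, each chunk is assumed to be base64-encoded independently, and `size` is assumed to be the decoded byte count. The final proxy call to the engine is not shown.

```python
import base64
import uuid


class UploadStaging:
    """In-memory staging area for chunked uploads (hypothetical helper)."""

    def __init__(self):
        self._uploads = {}

    def start(self, filename, size):
        """Register an upload and return its upload_id."""
        upload_id = uuid.uuid4().hex
        self._uploads[upload_id] = {"filename": filename, "size": size, "chunks": {}}
        return upload_id

    def add_chunk(self, upload_id, chunk_index, data):
        """Stage one base64 chunk; chunks may arrive in any order."""
        self._uploads[upload_id]["chunks"][chunk_index] = data

    def finish(self, upload_id):
        """Reassemble and decode; the staging entry is dropped either way."""
        upload = self._uploads.pop(upload_id)
        chunks = upload["chunks"]
        if sorted(chunks) != list(range(len(chunks))):
            raise ValueError("missing or non-contiguous chunk indices")
        payload = b"".join(
            base64.b64decode(chunks[i], validate=True) for i in range(len(chunks))
        )
        if len(payload) != upload["size"]:
            raise ValueError("reassembled size does not match declared size")
        return upload["filename"], payload
```

If chunks were instead slices of a single base64 string, the strings would need to be joined before one decode. A real server would also want expiry for abandoned uploads, which this sketch omits.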
Modified Capabilities
- `engine-api`: New endpoint for note mutation (`PATCH /api/v1/notes/{id}`). `documents` table gains `updated_at` column.
- `go-client`: New `updatenote` command.
Impact
- Code (new): `mcp/` directory, the MCP server package. Thin adapter translating MCP tool calls to engine HTTP API calls, with base64 file upload decoding.
- Code (engine): `kb/database.py`: add `updated_at` column, migration logic. New `kb/routes/notes.py` for PATCH endpoint.
- Code (client): New `cmd/updatenote.go`. `internal/api/client.go` updated for the new endpoint.
- APIs: New `PATCH /api/v1/notes/{id}`.
- Dependencies: MCP Python SDK (`mcp` package) and `httpx` for the MCP server.
- Systems: MCP server added to Docker Compose stack. Agents connect to it via Streamable HTTP.
- Data: SQLite schema migration adding an `updated_at TEXT` column on the `documents` table. Non-destructive.
- Versioning: Engine bumps to v3.0.0 (new endpoint + schema). Client bumps to v3.0.0 (new command). MIN_ENGINE_VERSION updated to v3.0.0.
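The schema change listed under Data could be applied idempotently with something like the following. The function name is hypothetical and the real migration logic in `kb/database.py` may be organised differently; the sketch only shows why the change is non-destructive: it adds a nullable column and touches no existing rows.

```python
import sqlite3


def ensure_updated_at(conn):
    """Add documents.updated_at if missing; return True if it was added."""
    # PRAGMA table_info yields one row per column; index 1 is the column name.
    columns = [row[1] for row in conn.execute("PRAGMA table_info(documents)")]
    if "updated_at" in columns:
        return False
    conn.execute("ALTER TABLE documents ADD COLUMN updated_at TEXT")
    conn.commit()
    return True
```

Checking for the column first lets the migration run on every startup without failing on databases that have already been upgraded.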