Add MCP server, note mutation endpoint, and updated_at tracking (v3.0.0)

New MCP server (mcp/) exposes kb operations as native MCP tools over Streamable HTTP with Bearer token auth. Supports collections via tag conventions, chunked file uploads, and agent-side search patterns. Engine gains PATCH /api/v1/notes/{id} for in-place note updates with transactional re-chunk/re-embed, and updated_at column on documents. Go client adds updatenote command and Patch HTTP method.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-02 21:34:55 +01:00
parent adeba21712
commit e7136a4a20
32 changed files with 1679 additions and 8 deletions
@@ -0,0 +1,35 @@
+# Agent-Side Search Patterns
+
+## Purpose
+
+This spec documents recommended patterns for agent-side query expansion and reranking, which are caller responsibilities rather than engine features. The patterns are communicated to agents through MCP tool descriptions.
+
+## Requirements
+
+### Requirement: Query expansion guidance in tool description
+
+The `kb_search` MCP tool description SHALL include guidance on query expansion as a recommended pattern for complex queries.
+
+#### Scenario: Tool description includes expansion pattern
+- **WHEN** an agent reads the `kb_search` tool description
+- **THEN** the description SHALL include guidance such as: "For complex queries, consider expanding into 2-3 variant phrasings and calling this tool multiple times, then deduplicating results by chunk_id"
+
+---
+
+### Requirement: Reranking guidance in tool description
+
+The `kb_search` MCP tool description SHALL include guidance on agent-side reranking as a recommended pattern for improving precision.
+
+#### Scenario: Tool description includes reranking pattern
+- **WHEN** an agent reads the `kb_search` tool description
+- **THEN** the description SHALL include guidance such as: "For precision, rerank the returned results using your own judgement based on relevance to the original question"
+
+---
+
+### Requirement: No engine-side LLM dependency
+
+The engine SHALL NOT require or use any external LLM API for search operations. Query expansion and reranking SHALL remain entirely agent-side concerns.
+
+#### Scenario: Engine has no LLM dependency
+- **WHEN** the engine is deployed without any `ANTHROPIC_API_KEY` or similar LLM API configuration
+- **THEN** all search operations SHALL function fully, with no degraded results or missing features
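
The expansion pattern described in the `kb_search` guidance above can be sketched on the agent side as follows. This is a minimal illustration, not part of the commit: `expand_and_merge` and `fake_search` are hypothetical names, and the `score` field name is an assumption (the spec only says results carry a `chunk_id` and scores).

```python
from typing import Callable, Iterable


def expand_and_merge(
    search: Callable[[str], list[dict]],
    queries: Iterable[str],
) -> list[dict]:
    """Call `search` once per variant phrasing and deduplicate by chunk_id.

    When several phrasings return the same chunk, the copy with the highest
    score is kept. The merged list is sorted by score, best first.
    """
    best: dict[object, dict] = {}
    for query in queries:
        for hit in search(query):
            chunk_id = hit["chunk_id"]
            if chunk_id not in best or hit["score"] > best[chunk_id]["score"]:
                best[chunk_id] = hit
    return sorted(best.values(), key=lambda hit: hit["score"], reverse=True)


# Hypothetical stand-in for the kb_search tool, returning canned results.
def fake_search(query: str) -> list[dict]:
    canned = {
        "pension revaluation": [
            {"chunk_id": 1, "score": 0.80, "text": "a"},
            {"chunk_id": 2, "score": 0.55, "text": "b"},
        ],
        "how are pensions revalued": [
            {"chunk_id": 2, "score": 0.70, "text": "b"},
            {"chunk_id": 3, "score": 0.40, "text": "c"},
        ],
    }
    return canned.get(query, [])


merged = expand_and_merge(
    fake_search, ["pension revaluation", "how are pensions revalued"]
)
```

Sorting by the engine's score is only a starting point. The reranking guidance leaves the final ordering to the agent's own judgement of relevance to the original question.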
@@ -0,0 +1,79 @@
+# Engine API (Delta)
+
+## ADDED Requirements
+
+### Requirement: Note mutation endpoint
+
+The engine SHALL provide a `PATCH /api/v1/notes/{id}` endpoint for updating existing notes in place. See the `note-mutation` spec for full details.
+
+#### Scenario: Note update endpoint exists
+- **WHEN** a client sends `PATCH /api/v1/notes/42` with body `{"text": "new content"}`
+- **THEN** the engine SHALL process the update synchronously and return the updated document
+
+---
+
+### Requirement: Document updated_at tracking
+
+The engine SHALL track when documents are modified via an `updated_at` column. This column SHALL be NULL for documents that have never been updated.
+
+#### Scenario: New document has no updated_at
+- **WHEN** a document is first ingested
+- **THEN** `updated_at` SHALL be NULL and `created_at` SHALL be set to the ingestion timestamp
+
+#### Scenario: Note update sets updated_at
+- **WHEN** a note is updated via `PATCH /api/v1/notes/{id}`
+- **THEN** `updated_at` SHALL be set to the current timestamp
+
+#### Scenario: Tag change sets updated_at
+- **WHEN** tags are modified via `PUT /api/v1/documents/{id}/tags`
+- **THEN** `updated_at` SHALL be set to the current timestamp
+
+#### Scenario: Schema migration for updated_at
+- **WHEN** the engine starts against a v2 database without an `updated_at` column
+- **THEN** the engine SHALL automatically run `ALTER TABLE documents ADD COLUMN updated_at TEXT`, and all existing documents SHALL have `updated_at = NULL`
+
+## MODIFIED Requirements
+
+### Requirement: Document management
+
+The engine SHALL provide endpoints to list, inspect, remove, and download original files for ingested documents.
+
+#### Scenario: List documents
+- **WHEN** a client sends `GET /api/v1/documents`
+- **THEN** the engine SHALL return a JSON array of documents with `id`, `title`, `doc_type`, `tags`, `chunk_count`, `created_at`, and `updated_at`
+
+#### Scenario: List documents with filters
+- **WHEN** a client sends `GET /api/v1/documents?type=pdf&tags=manual`
+- **THEN** the engine SHALL return only documents matching all specified filters
+
+#### Scenario: List documents sorted by most recent
+- **WHEN** a client requests documents sorted by date
+- **THEN** the engine SHALL use `COALESCE(updated_at, created_at)` for ordering, so un-mutated documents sort by creation time and mutated documents sort by their last update
+
+#### Scenario: Get document details
+- **WHEN** a client sends `GET /api/v1/documents/{id}`
+- **THEN** the engine SHALL return the full document record including all chunks, their text content, `updated_at`, and whether the original file is available (`has_file: true/false`)
+
+#### Scenario: Download original file
+- **WHEN** a client sends `GET /api/v1/documents/{id}/file`
+- **THEN** the engine SHALL return the original file with appropriate Content-Type and `Content-Disposition: attachment; filename="{original_filename}"` headers, or HTTP 404 if the file is not available
+
+#### Scenario: Remove a document
+- **WHEN** a client sends `DELETE /api/v1/documents/{id}`
+- **THEN** the engine SHALL delete the document, all its chunks, associated embeddings, tag associations, and the stored original file from disk, and return HTTP 200 with a confirmation
+
+#### Scenario: Remove non-existent document
+- **WHEN** a client sends `DELETE /api/v1/documents/{id}` with a non-existent ID
+- **THEN** the engine SHALL return HTTP 404
+
+### Requirement: Engine status and reindex
+
+The engine SHALL provide status information and support re-embedding all chunks. The `version` field in the status response SHALL always be present and SHALL reflect the engine's release version as read from the `VERSION` file. This field is the contract used by clients for compatibility checking.
+
+#### Scenario: Get engine status
+- **WHEN** a client sends `GET /api/v1/status`
+- **THEN** the engine SHALL return JSON with `version` (string, from the `VERSION` file), `model_name`, `embedding_dim`, GPU device info, database stats (document count by type, total chunks, DB size), and queue stats (queued/processing job count)
+
+#### Scenario: Trigger reindex
+- **WHEN** a client sends `POST /api/v1/reindex`
+- **THEN** the engine SHALL re-embed all existing chunks using the `enriched_text` column and the currently loaded model, and return progress information. This operation SHALL NOT block search queries.
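
The `updated_at` migration and the `COALESCE(updated_at, created_at)` ordering from the scenarios above can be tried against an in-memory SQLite database. This is a sketch under assumptions: the table is reduced to three columns, timestamps are stored as ISO-8601 text (which sorts correctly as strings), and `ensure_updated_at` is a hypothetical helper name, not the engine's own.

```python
import sqlite3


def ensure_updated_at(conn: sqlite3.Connection) -> None:
    """Add documents.updated_at if an older schema lacks it."""
    # PRAGMA table_info rows are (cid, name, type, notnull, dflt_value, pk).
    columns = {row[1] for row in conn.execute("PRAGMA table_info(documents)")}
    if "updated_at" not in columns:
        conn.execute("ALTER TABLE documents ADD COLUMN updated_at TEXT")


conn = sqlite3.connect(":memory:")
# Assumption: a simplified v2-style table with no updated_at column yet.
conn.execute(
    "CREATE TABLE documents (id INTEGER PRIMARY KEY, title TEXT, created_at TEXT)"
)
conn.executemany(
    "INSERT INTO documents (id, title, created_at) VALUES (?, ?, ?)",
    [
        (1, "old note", "2026-01-01T00:00:00"),
        (2, "newer note", "2026-02-01T00:00:00"),
    ],
)

ensure_updated_at(conn)
ensure_updated_at(conn)  # second call finds the column and does nothing

# Document 1 is mutated after document 2 was created.
conn.execute(
    "UPDATE documents SET updated_at = ? WHERE id = 1", ("2026-03-01T00:00:00",)
)

recent_first = [
    row[0]
    for row in conn.execute(
        "SELECT id FROM documents ORDER BY COALESCE(updated_at, created_at) DESC"
    )
]
```

With this data, document 1 sorts ahead of document 2 because its update is more recent than document 2's creation, while document 2 keeps `updated_at = NULL` and falls back to `created_at`.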
@@ -0,0 +1,57 @@
+# Go Client (Delta)
+
+## ADDED Requirements
+
+### Requirement: Update note command
+
+The client SHALL provide a `kb updatenote <id> <text>` command that updates an existing note's content via the engine's `PATCH /api/v1/notes/{id}` endpoint.
+
+#### Scenario: Update a note
+- **WHEN** the user runs `kb updatenote 42 "Updated note content"`
+- **THEN** the client SHALL send `PATCH /api/v1/notes/42` with body `{"text": "Updated note content"}` and display the result
+
+#### Scenario: Update a note with JSON output
+- **WHEN** the user runs `kb updatenote 42 "new content" --format json`
+- **THEN** the client SHALL output the raw JSON response from the engine
+
+#### Scenario: Update a non-existent document
+- **WHEN** the user runs `kb updatenote 999 "text"` and the engine returns HTTP 404
+- **THEN** the client SHALL display an error indicating the document was not found and exit with a non-zero code
+
+#### Scenario: Update a non-note document
+- **WHEN** the user runs `kb updatenote 42 "text"` and the engine returns HTTP 422
+- **THEN** the client SHALL display an error indicating that only notes can be updated and exit with a non-zero code
+
+#### Scenario: Missing arguments
+- **WHEN** the user runs `kb updatenote` or `kb updatenote 42` with insufficient arguments
+- **THEN** the client SHALL display usage help indicating that both document ID and text are required
+
+## MODIFIED Requirements
+
+### Requirement: Engine version compatibility check
+
+The client SHALL verify that the connected engine meets a minimum version requirement before executing any API command. The minimum required engine version SHALL be embedded in the client binary at build time. If the engine version is below the minimum, the client SHALL print an error message and exit with a non-zero code. There SHALL be no flag to skip or suppress this check.
+
+#### Scenario: Compatible engine version
+- **WHEN** the client connects to an engine reporting version `3.0.0` and `MinEngineVersion` is `3.0.0`
+- **THEN** the client SHALL proceed with the command normally
+
+#### Scenario: Incompatible engine version
+- **WHEN** the client connects to an engine reporting version `2.1.0` and `MinEngineVersion` is `3.0.0`
+- **THEN** the client SHALL print to stderr: `Error: kb client vX.Y.Z requires engine v3.0.0+ (connected engine is v2.1.0)` followed by an upgrade hint, and exit with code 1
+
+#### Scenario: Engine unreachable during version check
+- **WHEN** the client cannot reach the engine's `/api/v1/status` endpoint
+- **THEN** the client SHALL skip the version check and proceed with the original command (the actual API call will surface the connectivity error)
+
+#### Scenario: Version check is cached per invocation
+- **WHEN** the client has already verified engine compatibility during the current invocation
+- **THEN** subsequent API calls within the same invocation SHALL NOT repeat the version check
+
+#### Scenario: Client version command does not check engine
+- **WHEN** the user runs `kb --version`
+- **THEN** the client SHALL print the client version without contacting the engine
+
+#### Scenario: MinEngineVersion not set
+- **WHEN** the client binary has `MinEngineVersion` set to empty string or `dev`
+- **THEN** the client SHALL skip the version check entirely (development builds)
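
The compatibility rule in the scenarios above reduces to a numeric comparison with a development-build escape hatch. The sketch below is written in Python for illustration only (the client itself is Go); `parse_version` and `check_engine_version` are hypothetical names, and it assumes plain three-part `X.Y.Z` versions with no pre-release suffixes.

```python
def parse_version(version: str) -> tuple[int, ...]:
    """Turn '3.0.0' or 'v3.0.0' into (3, 0, 0) for numeric comparison."""
    return tuple(int(part) for part in version.lstrip("v").split("."))


def check_engine_version(engine_version: str, min_engine_version: str) -> bool:
    """Return True when the command may proceed."""
    if min_engine_version in ("", "dev"):
        return True  # development build: the check is skipped entirely
    return parse_version(engine_version) >= parse_version(min_engine_version)
```

Comparing integer tuples rather than raw strings matters once a component reaches two digits: as strings, `"2.10.0"` would sort below `"2.9.0"`.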
@@ -0,0 +1,205 @@
+# MCP Server
+
+## Purpose
+
+The MCP server provides a Model Context Protocol interface to the kb engine, exposing knowledge base operations as native MCP tools over Streamable HTTP transport. It runs as a separate Docker container alongside the engine, translating MCP tool calls into engine HTTP API calls.
+
+## Requirements
+
+### Requirement: MCP server transport and deployment
+
+The MCP server SHALL expose tools via Streamable HTTP transport. It SHALL run as a Docker container, configured to connect to the kb engine's HTTP API. It SHALL read `KB_ENGINE_URL` and `KB_API_KEY` from environment variables to connect to the engine.
+
+#### Scenario: MCP server starts and connects to engine
+- **WHEN** the MCP server container starts with `KB_ENGINE_URL=http://engine:8000` and `KB_API_KEY=secret`
+- **THEN** it SHALL begin accepting MCP connections over Streamable HTTP and use the configured URL and API key for all engine API calls
+
+#### Scenario: Engine unreachable at startup
+- **WHEN** the MCP server starts but cannot reach the engine at `KB_ENGINE_URL`
+- **THEN** it SHALL start and accept connections, but tool calls SHALL return errors indicating the engine is unreachable
+
+#### Scenario: Docker Compose deployment
+- **WHEN** the MCP server is deployed via Docker Compose alongside the engine
+- **THEN** it SHALL connect to the engine via the Docker network using the service name (e.g. `http://engine:8000`)
+
+---
+
+### Requirement: MCP server authentication
+
+The MCP server SHALL require Bearer token authentication from calling agents via the `KB_MCP_API_KEY` environment variable. This is independent of the engine's `KB_API_KEY`.
+
+#### Scenario: Valid MCP API key
+- **WHEN** `KB_MCP_API_KEY` is set and a calling agent provides a matching Bearer token
+- **THEN** the MCP server SHALL process the request normally
+
+#### Scenario: Missing MCP API key when required
+- **WHEN** `KB_MCP_API_KEY` is set and a calling agent connects without a Bearer token
+- **THEN** the MCP server SHALL reject the connection with an authentication error
+
+#### Scenario: Invalid MCP API key
+- **WHEN** `KB_MCP_API_KEY` is set and a calling agent provides a non-matching Bearer token
+- **THEN** the MCP server SHALL reject the connection with an authentication error
+
+#### Scenario: MCP auth disabled
+- **WHEN** `KB_MCP_API_KEY` is not set
+- **THEN** the MCP server SHALL accept all connections without authentication
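
The four authentication scenarios above map onto one small decision function. This is a sketch, not the server's code: `is_authorized` is a hypothetical name, and treating an empty `KB_MCP_API_KEY` the same as an unset one is an assumption the spec does not state.

```python
import hmac
from typing import Optional


def is_authorized(
    authorization_header: Optional[str], expected_key: Optional[str]
) -> bool:
    """Decide whether a request may proceed, given its Authorization header."""
    if not expected_key:
        return True  # KB_MCP_API_KEY not set: authentication is disabled
    if not authorization_header:
        return False  # key required but no Bearer token supplied
    scheme, _, token = authorization_header.partition(" ")
    if scheme.lower() != "bearer":
        return False
    # Constant-time comparison avoids leaking how much of the token matched.
    return hmac.compare_digest(token.strip().encode(), expected_key.encode())
```

In a deployment the second argument would typically come from `os.environ.get("KB_MCP_API_KEY")`.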
+
+---
+
+### Requirement: Search tool
+
+The MCP server SHALL expose a `kb_search` tool that queries the knowledge base via the engine's search API.
+
+#### Scenario: Basic search
+- **WHEN** an agent calls `kb_search` with `{"query": "pension revaluation", "top": 5}`
+- **THEN** the MCP server SHALL POST to the engine's `/api/v1/search` endpoint and return the results with chunk text, scores, document metadata, and tags
+
+#### Scenario: Search with collection filter
+- **WHEN** an agent calls `kb_search` with `{"query": "email preferences", "collection": "memory"}`
+- **THEN** the MCP server SHALL add `collection:memory` to the tags filter and POST to the engine's search endpoint
+
+#### Scenario: Search with tags and collection
+- **WHEN** an agent calls `kb_search` with `{"query": "feedback", "tags": ["email"], "collection": "memory"}`
+- **THEN** the MCP server SHALL combine the explicit tags with `collection:memory` in the tag filter
+
+#### Scenario: Search results strip collection tags
+- **WHEN** the engine returns search results containing tags `["collection:memory", "feedback", "email"]`
+- **THEN** the MCP server SHALL strip `collection:*` tags from the `tags` array and add a separate `collection` field, returning `{"collection": "memory", "tags": ["feedback", "email"], ...}`
+
+#### Scenario: Search with mode override
+- **WHEN** an agent calls `kb_search` with `{"query": "error log", "fts_only": true}`
+- **THEN** the MCP server SHALL pass `fts_only: true` to the engine search endpoint
+
+---
+
+### Requirement: Add note tool
+
+The MCP server SHALL expose a `kb_addnote` tool that submits a text note to the engine for ingestion.
+
+#### Scenario: Add a note with default collection
+- **WHEN** an agent calls `kb_addnote` with `{"text": "User prefers concise responses"}`
+- **THEN** the MCP server SHALL submit the note to the engine's `POST /api/v1/jobs` endpoint with the tag `collection:documents` and return the job ID
+
+#### Scenario: Add a note to a specific collection
+- **WHEN** an agent calls `kb_addnote` with `{"text": "User prefers concise responses", "collection": "memory", "tags": ["feedback"]}`
+- **THEN** the MCP server SHALL submit the note with tags `["collection:memory", "feedback"]` to the engine
+
+#### Scenario: Explicit collection replaces the default collection tag
+- **WHEN** an agent calls `kb_addnote` with `{"text": "some note", "collection": "memory"}` and the note is ingested
+- **THEN** the resulting document SHALL have exactly one `collection:*` tag: `collection:memory`
+
+---
+
+### Requirement: Chunked file upload tools
+
+The MCP server SHALL expose a three-step chunked file upload pattern for transferring files from remote agents to the engine.
+
+#### Scenario: Start an upload
+- **WHEN** an agent calls `kb_upload_start` with `{"filename": "report.pdf", "total_size": 5242880, "tags": ["insurance"], "collection": "documents"}`
+- **THEN** the MCP server SHALL create a staging entry, generate a UUID `upload_id`, and return `{"upload_id": "<uuid>"}`
+
+#### Scenario: Upload a chunk
+- **WHEN** an agent calls `kb_upload_chunk` with `{"upload_id": "<uuid>", "data": "<base64-encoded-data>", "chunk_index": 0}`
+- **THEN** the MCP server SHALL decode the base64 data and write it to the staging area for the given upload
+
+#### Scenario: Upload multiple chunks in sequence
+- **WHEN** an agent calls `kb_upload_chunk` multiple times with sequential `chunk_index` values for the same `upload_id`
+- **THEN** the MCP server SHALL store each chunk and track the sequence
+
+#### Scenario: Finish an upload
+- **WHEN** an agent calls `kb_upload_finish` with `{"upload_id": "<uuid>"}`
+- **THEN** the MCP server SHALL reassemble the chunks in order, forward the complete file as a multipart upload to the engine's `POST /api/v1/jobs` endpoint with the tags from `kb_upload_start` (including `collection:<name>`), and return the job ID
+
+#### Scenario: Upload with invalid upload_id
+- **WHEN** an agent calls `kb_upload_chunk` or `kb_upload_finish` with an `upload_id` that does not exist
+- **THEN** the MCP server SHALL return an error indicating the upload ID is not found
+
+#### Scenario: Abandoned upload cleanup
+- **WHEN** an agent starts an upload but does not call `kb_upload_finish` within 10 minutes
+- **THEN** the MCP server SHALL clean up the staged chunks and remove the upload tracking entry
+
+#### Scenario: MCP server restart during upload
+- **WHEN** the MCP server container restarts while an upload is in progress
+- **THEN** the in-progress upload SHALL be lost and the agent SHALL need to restart from `kb_upload_start`
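
The start/chunk/finish sequence above can be modelled with an in-memory staging area. This is a minimal sketch under assumptions: `UploadStaging` is a hypothetical class, chunks are held in memory rather than on disk, and the 10-minute cleanup and the forwarding of the assembled file to the engine are left out.

```python
import base64
import uuid


class UploadStaging:
    """In-memory staging area keyed by upload_id (hypothetical helper)."""

    def __init__(self) -> None:
        self._uploads: dict[str, dict] = {}

    def start(self, filename: str, total_size: int) -> str:
        upload_id = str(uuid.uuid4())
        self._uploads[upload_id] = {
            "filename": filename,
            "total_size": total_size,
            "chunks": {},
        }
        return upload_id

    def add_chunk(self, upload_id: str, chunk_index: int, data_b64: str) -> None:
        upload = self._get(upload_id)
        upload["chunks"][chunk_index] = base64.b64decode(data_b64)

    def finish(self, upload_id: str) -> bytes:
        upload = self._get(upload_id)
        chunks = upload["chunks"]
        # Reassemble in chunk_index order, regardless of arrival order.
        payload = b"".join(chunks[i] for i in sorted(chunks))
        del self._uploads[upload_id]
        return payload

    def _get(self, upload_id: str) -> dict:
        if upload_id not in self._uploads:
            raise KeyError(f"upload ID not found: {upload_id}")
        return self._uploads[upload_id]


staging = UploadStaging()
upload_id = staging.start("report.pdf", total_size=11)
# Chunks deliberately arrive out of order.
staging.add_chunk(upload_id, 1, base64.b64encode(b" world").decode())
staging.add_chunk(upload_id, 0, base64.b64encode(b"hello").decode())
payload = staging.finish(upload_id)
```

Because the state lives only in the process, a restart discards it, which matches the restart scenario: the agent has to begin again from `kb_upload_start`.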
+
+---
+
+### Requirement: Update note tool
+
+The MCP server SHALL expose a `kb_update_note` tool that updates an existing note in place via the engine's note mutation endpoint.
+
+#### Scenario: Update an existing note
+- **WHEN** an agent calls `kb_update_note` with `{"document_id": 42, "text": "Updated preference: user prefers bullet points"}`
+- **THEN** the MCP server SHALL send `PATCH /api/v1/notes/42` to the engine and return the updated document
+
+#### Scenario: Update a non-existent document
+- **WHEN** an agent calls `kb_update_note` with a `document_id` that does not exist
+- **THEN** the MCP server SHALL return an error indicating the document was not found
+
+#### Scenario: Update a non-note document
+- **WHEN** an agent calls `kb_update_note` with a `document_id` that refers to a PDF
+- **THEN** the MCP server SHALL return an error indicating that only notes can be updated
+
+---
+
+### Requirement: Get document tool
+
+The MCP server SHALL expose a `kb_get` tool that retrieves document details from the engine.
+
+#### Scenario: Get by document ID
+- **WHEN** an agent calls `kb_get` with `{"document_id": 42}`
+- **THEN** the MCP server SHALL fetch `GET /api/v1/documents/42` and return the document details with chunks
+
+#### Scenario: Get by source path
+- **WHEN** an agent calls `kb_get` with `{"source_path": "memory/feedback_testing.md"}`
+- **THEN** the MCP server SHALL query the engine's documents endpoint filtered by source path and return matching documents
+
+#### Scenario: Get results strip collection tags
+- **WHEN** the engine returns document details with tags including `collection:memory`
+- **THEN** the MCP server SHALL strip `collection:*` from tags and present a separate `collection` field
+
+---
+
+### Requirement: Status tool
+
+The MCP server SHALL expose a `kb_status` tool that returns engine health and statistics.
+
+#### Scenario: Get engine status
+- **WHEN** an agent calls `kb_status` with no parameters
+- **THEN** the MCP server SHALL fetch `GET /api/v1/status` and return engine version, model info, device info, document counts, and queue state
+
+---
+
+### Requirement: Jobs tool
+
+The MCP server SHALL expose a `kb_jobs` tool that returns ingestion job status.
+
+#### Scenario: List recent jobs
+- **WHEN** an agent calls `kb_jobs` with no parameters
+- **THEN** the MCP server SHALL fetch `GET /api/v1/jobs` and return the list of recent jobs
+
+#### Scenario: Filter jobs by status
+- **WHEN** an agent calls `kb_jobs` with `{"status": "failed"}`
+- **THEN** the MCP server SHALL fetch `GET /api/v1/jobs?status=failed` and return matching jobs
+
+---
+
+### Requirement: Collection management via tags
+
+The MCP server SHALL manage collections using tag conventions. The MCP server SHALL enforce exclusive collection membership — a document SHALL belong to exactly one collection.
+
+#### Scenario: Default collection on addnote
+- **WHEN** an agent calls `kb_addnote` without specifying a collection
+- **THEN** the MCP server SHALL apply the tag `collection:documents`
+
+#### Scenario: Explicit collection on addnote
+- **WHEN** an agent calls `kb_addnote` with `{"collection": "memory"}`
+- **THEN** the MCP server SHALL apply the tag `collection:memory`
+
+#### Scenario: Exclusive collection enforcement
+- **WHEN** a document already has the tag `collection:documents` and an operation changes its collection to `memory`
+- **THEN** the MCP server SHALL first remove `collection:documents` via the engine's tag API, then add `collection:memory`
+
+#### Scenario: Collection field in search results
+- **WHEN** search results include documents with `collection:*` tags
+- **THEN** the MCP server SHALL present the collection as a top-level `collection` field and exclude `collection:*` from the `tags` array
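
The tag convention above amounts to two translations: adding exactly one `collection:*` tag on the way into the engine, and splitting it back out on the way to the agent. The helper names below (`to_engine_tags`, `from_engine_tags`) are hypothetical, chosen for illustration.

```python
from typing import Optional

COLLECTION_PREFIX = "collection:"
DEFAULT_COLLECTION = "documents"


def to_engine_tags(tags: Optional[list[str]], collection: Optional[str]) -> list[str]:
    """Build the tag list sent to the engine, with exactly one collection:* tag."""
    plain = [t for t in (tags or []) if not t.startswith(COLLECTION_PREFIX)]
    return [COLLECTION_PREFIX + (collection or DEFAULT_COLLECTION)] + plain


def from_engine_tags(tags: list[str]) -> dict:
    """Split engine tags into a top-level collection field and plain tags."""
    collection = None
    plain = []
    for tag in tags:
        if tag.startswith(COLLECTION_PREFIX):
            collection = tag[len(COLLECTION_PREFIX):]
        else:
            plain.append(tag)
    return {"collection": collection, "tags": plain}
```

For example, `from_engine_tags(["collection:memory", "feedback", "email"])` yields `{"collection": "memory", "tags": ["feedback", "email"]}`, the shape given in the search-result scenario.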
@@ -0,0 +1,43 @@
+# Note Mutation
+
+## Purpose
+
+Note mutation allows existing notes to be updated in place without requiring delete and re-add, preserving document identity (ID, creation timestamp) while updating content, embeddings, and the full-text index.
+
+## Requirements
+
+### Requirement: Note update endpoint
+
+The engine SHALL provide a `PATCH /api/v1/notes/{id}` endpoint that accepts new text for an existing note, re-chunks and re-embeds it, and returns the updated document.
+
+#### Scenario: Update an existing note
+- **WHEN** a client sends `PATCH /api/v1/notes/42` with body `{"text": "Updated note content"}`
+- **THEN** the engine SHALL delete existing chunks and embeddings for document 42, run the new text through the note chunking pipeline, generate embeddings for each chunk, insert new chunks and embeddings, update the document's `content_hash` and `updated_at`, and return the updated document with HTTP 200
+
+#### Scenario: Update preserves document identity
+- **WHEN** a note is updated via PATCH
+- **THEN** the document SHALL retain its original `id` and `created_at` values, and `updated_at` SHALL be set to the current timestamp
+
+#### Scenario: Update with long text that produces multiple chunks
+- **WHEN** a client sends `PATCH /api/v1/notes/42` with text longer than the embedding model's token window
+- **THEN** the engine SHALL chunk the text using the same note chunking pipeline as ingestion, producing multiple chunks, and embed each chunk separately
+
+#### Scenario: Update a non-existent document
+- **WHEN** a client sends `PATCH /api/v1/notes/999` and document 999 does not exist
+- **THEN** the engine SHALL return HTTP 404
+
+#### Scenario: Update a non-note document
+- **WHEN** a client sends `PATCH /api/v1/notes/42` and document 42 has `doc_type = 'pdf'`
+- **THEN** the engine SHALL return HTTP 422 with an error indicating that only notes can be updated via this endpoint
+
+#### Scenario: Embedding failure during update
+- **WHEN** a client sends `PATCH /api/v1/notes/42` but the embedding step fails
+- **THEN** the engine SHALL roll back the entire transaction, preserving the original note content, chunks, and embeddings, and return HTTP 500
+
+#### Scenario: FTS5 index updated on note mutation
+- **WHEN** a note is updated via PATCH
+- **THEN** the FTS5 virtual table SHALL be updated via the existing chunk triggers (`chunks_ad` for deletes, `chunks_ai` for inserts), keeping the full-text index consistent with the new content
+
+#### Scenario: Tags preserved on update
+- **WHEN** a note with tags `["feedback", "collection:memory"]` is updated via PATCH
+- **THEN** the document's tags SHALL be unchanged — only the text content, chunks, and embeddings are replaced
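
The rollback behaviour in the embedding-failure scenario can be demonstrated with SQLite's transaction handling. This sketch makes several assumptions: the schema is reduced to the columns needed, `chunk_text` is a toy fixed-width chunker standing in for the real note chunking pipeline, `content_hash` handling is omitted, and `update_note` is a hypothetical function rather than the engine's handler.

```python
import sqlite3
from typing import Callable


def chunk_text(text: str, size: int = 20) -> list[str]:
    """Toy fixed-width chunker (assumption; not the real pipeline)."""
    return [text[i:i + size] for i in range(0, len(text), size)] or [""]


def update_note(
    conn: sqlite3.Connection,
    doc_id: int,
    text: str,
    embed: Callable[[str], bytes],
    now: str,
) -> None:
    # `with conn` commits on success and rolls back if any step raises.
    with conn:
        conn.execute("DELETE FROM chunks WHERE document_id = ?", (doc_id,))
        for index, piece in enumerate(chunk_text(text)):
            conn.execute(
                "INSERT INTO chunks (document_id, chunk_index, text, embedding) "
                "VALUES (?, ?, ?, ?)",
                (doc_id, index, piece, embed(piece)),
            )
        conn.execute(
            "UPDATE documents SET updated_at = ? WHERE id = ?", (now, doc_id)
        )


conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE documents (id INTEGER PRIMARY KEY, created_at TEXT, updated_at TEXT);
    CREATE TABLE chunks (
        document_id INTEGER, chunk_index INTEGER, text TEXT, embedding BLOB
    );
    INSERT INTO documents (id, created_at) VALUES (42, '2026-01-01T00:00:00');
    INSERT INTO chunks VALUES (42, 0, 'original content', x'00');
    """
)
conn.commit()


def failing_embed(text: str) -> bytes:
    raise RuntimeError("embedding backend unavailable")


try:
    update_note(conn, 42, "Updated note content", failing_embed, "2026-04-02T21:00:00")
except RuntimeError:
    pass  # an HTTP layer would map this to a 500 response

surviving = [
    row[0] for row in conn.execute("SELECT text FROM chunks WHERE document_id = 42")
]
```

After the failed call, `surviving` still holds the original chunk text and `updated_at` is still NULL, because the delete was rolled back together with everything else in the transaction.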