Add MCP auth status to kb_status and update server instructions

- kb_status now returns authenticated: true/false so clients can verify auth
- Server instructions mention Bearer token auth requirement
- Add .env, .venv/, test_mcp_client.py to .gitignore

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Split MCP docs into MCP.md with AI tool setup examples
2026-04-04 12:04:12 +01:00 · 2026-04-02 22:03:41 +01:00 · 2026-04-02 21:52:24 +01:00 · 2026-04-02 21:45:31 +01:00 · 2026-04-02 21:34:55 +01:00 · 2026-04-02 16:18:06 +01:00
42 changed files with 2152 additions and 95 deletions
@@ -1,3 +1,9 @@
 examples/
 .claude/
-__pycache__/
+__pycache__/
 engine/data/
 TMP/
 .env
 .venv/
 test_mcp_client.py
@@ -0,0 +1,174 @@
 # MCP Server (Agent Integration)
 The MCP server exposes kb operations as native MCP tools, so agents can search, add notes, upload files, and manage documents without shelling out to the CLI.
 ## Start the MCP server
 The compose files include a `kb-mcp` service alongside the engine. Set `KB_MCP_API_KEY` to require Bearer token auth from connecting agents:
 ```bash
 KB_API_KEY=your-engine-key KB_MCP_API_KEY=your-agent-key \
  docker compose -f engine/compose.nvidia.yaml up -d
 ```
 Or run the MCP server standalone:
 ```bash
 docker run -d --name kb-mcp \
  -p 3000:3000 \
  -e KB_ENGINE_URL=http://your-engine-host:8000 \
  -e KB_API_KEY=your-engine-key \
  -e KB_MCP_API_KEY=your-agent-key \
  --restart unless-stopped \
  docker.dcglab.co.uk/dcg/kb/mcp:latest
 ```
 ## MCP tools
 | Tool | Description |
 |---|---|
 | `kb_search` | Hybrid search with optional collection/tag/type filters |
 | `kb_addnote` | Add a text note (queued for async ingestion) |
 | `kb_update_note` | Update an existing note in place |
 | `kb_get` | Get document details by ID or source path |
 | `kb_status` | Engine health and statistics |
 | `kb_jobs` | Ingestion queue status |
 | `kb_upload_start` | Start a chunked file upload |
 | `kb_upload_chunk` | Upload a base64-encoded file chunk |
 | `kb_upload_finish` | Finish upload and submit for ingestion |
 ## Collections
 The MCP server supports **collections** — scoped document namespaces implemented via tag conventions. Use these to separate agent memory from user documents:
 - `documents` (default) — user-facing documents
 - `memory` — agent memory and preferences
 - `workspace` — working context
 Tools accept a `collection` parameter. The MCP server translates this to `collection:<name>` tags on the engine, and strips them from responses so agents see a clean `"collection": "memory"` field.
 ## MCP server configuration
 | Variable | Default | Description |
 |---|---|---|
 | `KB_ENGINE_URL` | `http://localhost:8000` | Engine API URL |
 | `KB_API_KEY` | (none) | Engine API key |
 | `KB_MCP_API_KEY` | (none) | Bearer token required from agents (disabled if unset) |
 | `KB_MCP_PORT` | `3000` | Port to listen on |
 ## Connecting AI coding tools
 The kb MCP server uses **Streamable HTTP** transport at `http://your-host:3000/mcp`. Below are configuration examples for popular AI coding tools.
 ### Claude Code (CLI / Desktop / Web)
 Add the server to your project or user settings:
 ```bash
 claude mcp add kb-server --transport http http://localhost:3000/mcp
 ```
 Or add it manually to `.claude/settings.json` (project) or `~/.claude/settings.json` (global):
 ```json
 {
  "mcpServers": {
    "kb-server": {
      "type": "http",
      "url": "http://localhost:3000/mcp",
      "headers": {
        "Authorization": "Bearer your-agent-key"
      }
    }
  }
 }
 ```
 ### VS Code (GitHub Copilot)
 Add to your `.vscode/settings.json` (workspace) or user settings:
 ```json
 {
  "mcp": {
    "servers": {
      "kb-server": {
        "type": "http",
        "url": "http://localhost:3000/mcp",
        "headers": {
          "Authorization": "Bearer your-agent-key"
        }
      }
    }
  }
 }
 ```
 Or add to `.vscode/mcp.json` in your workspace:
 ```json
 {
  "servers": {
    "kb-server": {
      "type": "http",
      "url": "http://localhost:3000/mcp",
      "headers": {
        "Authorization": "Bearer your-agent-key"
      }
    }
  }
 }
 ```
 ### Cursor
 Add to `.cursor/mcp.json` in your project root:
 ```json
 {
  "mcpServers": {
    "kb-server": {
      "type": "streamable-http",
      "url": "http://localhost:3000/mcp",
      "headers": {
        "Authorization": "Bearer your-agent-key"
      }
    }
  }
 }
 ```
 ### Windsurf
 Add to `~/.codeium/windsurf/mcp_config.json`:
 ```json
 {
  "mcpServers": {
    "kb-server": {
      "serverUrl": "http://localhost:3000/mcp",
      "headers": {
        "Authorization": "Bearer your-agent-key"
      }
    }
  }
 }
 ```
 ### JetBrains IDEs (IntelliJ, WebStorm, PyCharm, etc.)
 Add to `.junie/mcp.json` in your project root, or configure via **Settings > Tools > AI Assistant > MCP Servers**:
 ```json
 {
  "servers": {
    "kb-server": {
      "type": "http",
      "url": "http://localhost:3000/mcp",
      "headers": {
        "Authorization": "Bearer your-agent-key"
      }
    }
  }
 }
 ```
@@ -2,16 +2,19 @@
 Personal knowledge base with hybrid search (full-text + semantic vector search).
-v2 uses a client-server architecture: a **FastAPI engine** running in Docker (with GPU acceleration) and a lightweight **Go CLI client** that talks to it over HTTP.
+Client-server architecture: a **FastAPI engine** running in Docker (with optional GPU acceleration), a lightweight **Go CLI client**, and an **MCP server** for native agent integration.
 ## Architecture
 ```
 Go CLI (kb) ──HTTP──▶ FastAPI Engine (Docker) ──▶ SQLite + GPU
                            ▲
 MCP Agents  ──MCP/HTTP──▶ MCP Server (Docker) ──┘
 ```
-- **Engine**: Keeps the embedding model warm in GPU memory. Handles search, ingestion, and document management via REST API. Runs in Docker with NVIDIA or AMD GPU support.
+- **Engine**: Keeps the embedding model warm in memory. Handles search, ingestion, document management, and note mutation via REST API. Runs in Docker with NVIDIA GPU, AMD GPU (ROCm), or CPU-only support.
 - **Client**: Single static Go binary. No Python, no ML dependencies, instant startup. Talks to the engine over HTTP.
 - **MCP Server**: Exposes kb operations as native MCP tools over Streamable HTTP. Runs as a separate Docker container alongside the engine. Supports collections for scoping agent memory vs user documents.
 - **Storage**: Single SQLite database with FTS5 (keyword search) and sqlite-vec (vector search). Portable via bind mount — just copy the data directory between hosts.
 ## Quick start
@@ -43,49 +46,33 @@ docker run -d --name kb-engine \
  -e KB_API_KEY=your-secret-key \
  --restart unless-stopped \
  docker.dcglab.co.uk/dcg/kb/engine:latest-rocm
 # CPU only (no GPU required — smaller image)
 docker run -d --name kb-engine \
  -p 8000:8000 \
  -v ~/kb-data:/data \
  -e KB_MODEL=all-MiniLM-L6-v2 \
  -e KB_API_KEY=your-secret-key \
  --restart unless-stopped \
  docker.dcglab.co.uk/dcg/kb/engine:latest-cpu
 ```
-Or use a compose file — create `compose.yaml`:
+Or use a compose file from the repo:
 ```yaml
 services:
  kb-engine:
    image: docker.dcglab.co.uk/dcg/kb/engine:latest-nvidia  # or latest-rocm
    runtime: nvidia  # remove for ROCm
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: 1
              capabilities: [gpu]
    # For ROCm, replace the above runtime/deploy block with:
    # devices:
    #   - "/dev/kfd"
    #   - "/dev/dri"
    # group_add:
    #   - "video"
    ports:
      - "${KB_PORT:-8000}:8000"
    volumes:
      - ${KB_DATA_PATH:-./data}:/data
    environment:
      - KB_MODEL=${KB_MODEL:-all-MiniLM-L6-v2}
      - KB_DEVICE=${KB_DEVICE:-auto}
      - KB_INGEST_DEVICE=${KB_INGEST_DEVICE:-auto}
      - KB_API_KEY=${KB_API_KEY:-}
      - KB_SEARCH_THRESHOLD=${KB_SEARCH_THRESHOLD:-0.01}
      - HF_HUB_OFFLINE=${HF_HUB_OFFLINE:-}
    restart: unless-stopped
 ```
 ```bash
-KB_DATA_PATH=~/kb-data docker compose up -d
+# NVIDIA GPU
 KB_DATA_PATH=~/kb-data docker compose -f engine/compose.nvidia.yaml up -d
 # AMD GPU (ROCm)
 KB_DATA_PATH=~/kb-data docker compose -f engine/compose.rocm.yaml up -d
 # CPU only
 KB_DATA_PATH=~/kb-data docker compose -f engine/compose.cpu.yaml up -d
 ```
 See [DEVELOPER.md](DEVELOPER.md) to run the engine from source.
-The engine will download the embedding model on first start (~90MB) and load it onto the GPU. Check readiness:
+The engine will download the embedding model on first start (~90MB) and load it into memory (GPU or CPU). Check readiness:
 ```bash
 curl http://localhost:8000/api/v1/health
@@ -100,7 +87,7 @@ Check [releases](https://gitea.dcglab.co.uk/steve/kb/releases) for the latest cl
 ```bash
 # Set the version tag
-TAG=client-v2.1.0
+TAG=client-v3.0.0
 # Linux (amd64)
 curl -L -o kb https://gitea.dcglab.co.uk/steve/kb/releases/download/${TAG}/kb-linux-amd64
@@ -151,6 +138,9 @@ kb jobs
 kb search "how to install git"
 kb search "deploy process" --tags ops --type pdf
 # Update a note in place
 kb updatenote 42 "revised note content"
 # Manage
 kb list
 kb info 1
@@ -196,8 +186,14 @@ rsync -a ~/kb-data/ user@target:/home/user/kb-data/
 KB_DATA_PATH=~/kb-data docker compose -f compose.nvidia.yaml up -d
 ```
-Data is GPU-vendor-agnostic — you can ingest on NVIDIA and serve from AMD (or vice versa) with the same data directory.
+Data is device-agnostic — you can ingest on NVIDIA and serve from AMD or CPU (or any combination) with the same data directory.
-## Claude Code skill
+## MCP server (agent integration)
-This tool is designed to be wrapped as a Claude Code skill. See `SKILL.md` for the skill definition.
+The MCP server exposes kb operations as native MCP tools over Streamable HTTP, so agents can search, add notes, upload files, and manage documents without shelling out to the CLI. Includes setup guides for Claude Code, VS Code, Cursor, Windsurf, and JetBrains IDEs.
 See **[MCP.md](MCP.md)** for full details — server setup, available tools, collections, configuration, and client examples.
 ## Agent skill
 If you cannot use the MCP server, or simply prefer agent skills, see `SKILL.md` for the skill definition.
@@ -79,6 +79,12 @@ kb jobs --status failed --format json    # filter by status
 kb jobs <job_id> --format json           # job details
 ```
 ## Examples
 ```bash
 kb examples                              # show common usage examples
 ```
 ## Engine status and maintenance
 ```bash
@@ -102,18 +108,14 @@ All commands support:
    {
      "chunk_id": 1423,
      "score": 0.031,
      "score_breakdown": {"fts": 0.016, "vector": 0.015},
      "text": "To install the latest version of git from source...",
-      "source": {
-        "document_id": 42,
-        "title": "Git Admin Guide",
-        "path": "/home/user/docs/git-admin.pdf",
-        "type": "pdf",
-        "page": 12,
-        "chunk_index": 3,
-        "total_chunks": 28,
-        "tags": ["git", "admin"]
-      }
+      "chunk_index": 3,
+      "chunk_metadata": {"page": 12},
+      "title": "Git Admin Guide",
+      "doc_type": "pdf",
+      "source_path": "/home/user/docs/git-admin.pdf",
+      "created_at": "2026-03-15T10:30:00",
+      "tags": ["git", "admin"]
    }
  ],
  "total_matches": 47,
@@ -156,11 +158,35 @@ Use filters when the question implies a specific domain:
 - From a specific topic → `--tags <topic>`
 - Check available tags first: `kb tags --format json`
 ## Updating notes
 ```bash
 kb updatenote 42 "revised note content"           # update note by ID
 ```
 Updates the text of an existing note in place, preserving its ID, creation timestamp, and tags. Re-chunks and re-embeds the new text.
 ## MCP server (agent integration)
 For agent-to-agent integration, kb provides an MCP server alongside the CLI. The MCP server
 exposes the same operations as native MCP tools over Streamable HTTP transport, which agents
 can connect to directly without subprocess overhead.
 **MCP tools:** `kb_search`, `kb_addnote`, `kb_update_note`, `kb_get`, `kb_status`, `kb_jobs`,
 `kb_upload_start`, `kb_upload_chunk`, `kb_upload_finish`.
 The MCP server supports **collections** — scoped document namespaces (e.g. `memory`, `documents`,
 `workspace`) implemented via tag conventions. This is the recommended way for agents to separate
 their memory from user documents.
 If the kb engine is already running via Docker Compose, add the MCP server by deploying the
 `kb-mcp` service from the same compose file. Agents connect to it on port 3000 (default).
 ## Important notes
 - Always use `--format json` for machine parsing
 - The `score` field is relative, not absolute — compare scores within a result set
-- `source.page` is only present for PDF documents
+- `chunk_metadata.page` is only present for PDF documents
-- `source.section_header` is only present for markdown documents with headers
+- `chunk_metadata.section_header` is only present for markdown documents with headers
 - Results are already ranked by relevance (hybrid FTS + vector search)
 - Duplicate files are detected at upload time (HTTP 409) — the client handles this gracefully
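The flattened result shape documented in this hunk puts per-chunk details under `chunk_metadata` instead of a nested `source` object. A consumer of `--format json` output might read the location like this. This is a sketch under that assumption: `describe_location` is a hypothetical helper, and the sample values are the ones from the diff.

```python
import json


def describe_location(result: dict) -> str:
    """Build a 'Page N / Section' string from a search result's chunk_metadata."""
    meta = result.get("chunk_metadata") or {}
    parts = []
    if meta.get("page") is not None:  # page is only present for PDFs
        parts.append(f"Page {meta['page']}")
    section = meta.get("section_header")  # only present for markdown with headers
    if isinstance(section, str) and section:
        parts.append(section)
    return " / ".join(parts)


sample = json.loads("""
{
  "chunk_id": 1423,
  "score": 0.031,
  "chunk_index": 3,
  "chunk_metadata": {"page": 12},
  "title": "Git Admin Guide",
  "doc_type": "pdf",
  "tags": ["git", "admin"]
}
""")
location = describe_location(sample)
```

Treating a missing or null `chunk_metadata` as an empty dict keeps the helper safe for notes and other documents that carry no page or section information.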
@@ -1 +1 @@
-2.0.0
+3.0.0
@@ -1 +1 @@
-2.2.0
+3.0.0
@@ -23,6 +23,9 @@ Search:
  kb search "how to restart nginx"
  kb search "deploy" --tags ops --top 5
 Update notes:
  kb updatenote 42 "revised note content"
 Manage documents:
  kb list --type pdf
  kb info 3
@@ -67,15 +67,12 @@ func runSearch(cmd *cobra.Command, args []string) error {
 	var result struct {
 		Results []struct {
-			Score    float64 `json:"score"`
-			Document struct {
-				Title string `json:"title"`
-				Type  string `json:"doc_type"`
-				Tags  []string `json:"tags"`
-			} `json:"document"`
-			Page    interface{} `json:"page"`
-			Section string      `json:"section"`
-			Text    string      `json:"text"`
+			Score         float64                `json:"score"`
+			Title         string                 `json:"title"`
+			DocType       string                 `json:"doc_type"`
+			Tags          []string               `json:"tags"`
+			ChunkMetadata map[string]interface{} `json:"chunk_metadata"`
+			Text          string                 `json:"text"`
 		} `json:"results"`
 	}
@@ -103,26 +100,28 @@ func runSearch(cmd *cobra.Command, args []string) error {
 			snippet = snippet[:200] + "..."
 		}
-		fmt.Printf("\n%d. [%.4f] %s\n", i+1, r.Score, r.Document.Title)
+		fmt.Printf("\n%d. [%.4f] %s\n", i+1, r.Score, r.Title)
 		location := ""
-		if r.Page != nil {
-			location = fmt.Sprintf("Page %v", r.Page)
+		if page, ok := r.ChunkMetadata["page"]; ok && page != nil {
+			location = fmt.Sprintf("Page %v", page)
 		}
-		if r.Section != "" {
-			if location != "" {
-				location += " / "
-			}
-			location += r.Section
+		if section, ok := r.ChunkMetadata["section_header"]; ok && section != nil {
+			if s, ok := section.(string); ok && s != "" {
+				if location != "" {
+					location += " / "
+				}
+				location += s
+			}
 		}
 		if location != "" {
 			fmt.Printf("   Location: %s\n", location)
 		}
-		if r.Document.Type != "" {
+		if r.DocType != "" {
-			fmt.Printf("   Type: %s\n", r.Document.Type)
+			fmt.Printf("   Type: %s\n", r.DocType)
 		}
-		if len(r.Document.Tags) > 0 {
+		if len(r.Tags) > 0 {
-			fmt.Printf("   Tags: %s\n", joinStrings(r.Document.Tags))
+			fmt.Printf("   Tags: %s\n", joinStrings(r.Tags))
 		}
 		fmt.Printf("   %s\n", snippet)
 	}
@@ -0,0 +1,61 @@
 package cmd
 import (
 	"fmt"
 	"os"
 	"strconv"
 	"github.com/kb-search/kb/internal/api"
 	"github.com/kb-search/kb/internal/output"
 	"github.com/spf13/cobra"
 )
 var updatenoteCmd = &cobra.Command{
 	Use:   "updatenote <id> <text>",
 	Short: "Update an existing note's content",
 	Args: func(cmd *cobra.Command, args []string) error {
 		if len(args) < 2 {
 			return fmt.Errorf("requires document ID and text arguments\n\n  Usage: kb updatenote 42 \"updated note text\"")
 		}
 		if _, err := strconv.Atoi(args[0]); err != nil {
 			return fmt.Errorf("document ID must be an integer, got %q", args[0])
 		}
 		return nil
 	},
 	RunE: runUpdatenote,
 }
 func init() {
 	rootCmd.AddCommand(updatenoteCmd)
 }
 func runUpdatenote(cmd *cobra.Command, args []string) error {
 	docID := args[0]
 	text := args[1]
 	client := api.NewClient()
 	body := map[string]string{"text": text}
 	resp, err := client.Patch(fmt.Sprintf("/api/v1/notes/%s", docID), body)
 	if err != nil {
 		fmt.Fprintln(os.Stderr, err)
 		os.Exit(1)
 	}
 	if err := api.CheckError(resp); err != nil {
 		fmt.Fprintln(os.Stderr, err)
 		os.Exit(1)
 	}
 	var result interface{}
 	if err := api.DecodeJSON(resp, &result); err != nil {
 		return fmt.Errorf("failed to decode response: %w", err)
 	}
 	if output.IsJSON() {
 		output.PrintJSON(result)
 	} else {
 		fmt.Printf("Updated note %s\n", docID)
 	}
 	return nil
 }
@@ -94,6 +94,10 @@ func (c *Client) checkEngineVersion() {
 	}
 	defer resp.Body.Close()
 	if resp.StatusCode != http.StatusOK {
 		return // auth error or other issue — let the actual request surface it
 	}
 	var status struct {
 		Version string `json:"version"`
 	}
@@ -217,6 +221,20 @@ func (c *Client) Put(path string, body interface{}) (*http.Response, error) {
 	return c.do(req)
 }
 // Patch performs a PATCH request with a JSON body.
 func (c *Client) Patch(path string, body interface{}) (*http.Response, error) {
 	data, err := json.Marshal(body)
 	if err != nil {
 		return nil, fmt.Errorf("failed to marshal request body: %w", err)
 	}
 	req, err := c.newRequest(http.MethodPatch, path, bytes.NewReader(data))
 	if err != nil {
 		return nil, err
 	}
 	req.Header.Set("Content-Type", "application/json")
 	return c.do(req)
 }
 // DecodeJSON reads the response body and decodes it into target.
 func DecodeJSON(resp *http.Response, target interface{}) error {
 	defer resp.Body.Close()
@@ -0,0 +1,36 @@
 FROM ubuntu:24.04
 ENV DEBIAN_FRONTEND=noninteractive
 RUN apt-get update && apt-get install -y --no-install-recommends \
    python3.12 python3.12-venv python3.12-dev python3-pip \
    libpoppler-cpp-dev poppler-utils \
    libgl1 libglib2.0-0 \
    build-essential curl \
    && rm -rf /var/lib/apt/lists/*
 COPY --from=ghcr.io/astral-sh/uv:latest /uv /usr/local/bin/uv
 WORKDIR /app
 COPY pyproject.toml ./
 COPY kb/ kb/
 COPY main.py ./
 COPY VERSION ./
 RUN uv venv .venv && \
    . .venv/bin/activate && \
    uv pip install -e . && \
    uv pip install "sentence-transformers[onnx]" && \
    uv pip install --reinstall torch torchvision --index-url https://download.pytorch.org/whl/cpu
 ENV PATH="/app/.venv/bin:$PATH"
 ENV VIRTUAL_ENV="/app/.venv"
 ENV KB_DEVICE=cpu
 ENV KB_INGEST_DEVICE=cpu
 ENV KB_DATA_DIR=/data
 EXPOSE 8000
 VOLUME ["/data"]
 CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8000"]
@@ -1 +1 @@
-2.1.0
+3.0.0
@@ -0,0 +1,31 @@
 services:
  kb-engine:
    build:
      context: .
      dockerfile: Dockerfile.cpu
    ports:
      - "${KB_PORT:-8000}:8000"
    volumes:
      - ${KB_DATA_PATH:-./data}:/data
    environment:
      - KB_MODEL=${KB_MODEL:-all-MiniLM-L6-v2}
      - KB_DEVICE=cpu
      - KB_INGEST_DEVICE=cpu
      - KB_API_KEY=${KB_API_KEY:-}
      - KB_SEARCH_THRESHOLD=${KB_SEARCH_THRESHOLD:-0.01}
      - HF_HUB_OFFLINE=${HF_HUB_OFFLINE:-}
    restart: unless-stopped
  kb-mcp:
    build:
      context: ../mcp
      dockerfile: Dockerfile
    ports:
      - "${KB_MCP_PORT:-3000}:3000"
    environment:
      - KB_ENGINE_URL=http://kb-engine:8000
      - KB_API_KEY=${KB_API_KEY:-}
      - KB_MCP_API_KEY=${KB_MCP_API_KEY:-}
    depends_on:
      - kb-engine
    restart: unless-stopped
@@ -23,3 +23,17 @@ services:
      - KB_SEARCH_THRESHOLD=${KB_SEARCH_THRESHOLD:-0.01}
      - HF_HUB_OFFLINE=${HF_HUB_OFFLINE:-}
    restart: unless-stopped
  kb-mcp:
    build:
      context: ../mcp
      dockerfile: Dockerfile
    ports:
      - "${KB_MCP_PORT:-3000}:3000"
    environment:
      - KB_ENGINE_URL=http://kb-engine:8000
      - KB_API_KEY=${KB_API_KEY:-}
      - KB_MCP_API_KEY=${KB_MCP_API_KEY:-}
    depends_on:
      - kb-engine
    restart: unless-stopped
@@ -20,3 +20,17 @@ services:
      - KB_SEARCH_THRESHOLD=${KB_SEARCH_THRESHOLD:-0.01}
      - HF_HUB_OFFLINE=${HF_HUB_OFFLINE:-}
    restart: unless-stopped
  kb-mcp:
    build:
      context: ../mcp
      dockerfile: Dockerfile
    ports:
      - "${KB_MCP_PORT:-3000}:3000"
    environment:
      - KB_ENGINE_URL=http://kb-engine:8000
      - KB_API_KEY=${KB_API_KEY:-}
      - KB_MCP_API_KEY=${KB_MCP_API_KEY:-}
    depends_on:
      - kb-engine
    restart: unless-stopped
@@ -185,6 +185,10 @@ def init_schema(conn: sqlite3.Connection, embedding_dim: int) -> None:
        _backfill_enriched_text(conn)
        _rebuild_fts(conn)
    # Migrate: add updated_at to documents if missing (v3.0.0)
    if "updated_at" not in doc_cols:
        conn.execute("ALTER TABLE documents ADD COLUMN updated_at TEXT")
    conn.commit()
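The migration in this hunk adds `updated_at` only when the column is absent, which makes `init_schema` safe to run on every start. The same pattern can be sketched in isolation. This assumes, as the name `doc_cols` suggests, that the existing columns are read with `PRAGMA table_info`; `add_column_if_missing` is a hypothetical helper, not part of the engine.

```python
import sqlite3


def add_column_if_missing(conn: sqlite3.Connection, table: str,
                          column: str, decl: str) -> bool:
    """Add a column unless it already exists. Returns True if it was added.

    Identifiers are interpolated directly, so pass only trusted names.
    """
    # PRAGMA table_info rows are (cid, name, type, notnull, dflt_value, pk).
    existing = {row[1] for row in conn.execute(f"PRAGMA table_info({table})")}
    if column in existing:
        return False
    conn.execute(f"ALTER TABLE {table} ADD COLUMN {column} {decl}")
    conn.commit()
    return True
```

Existing rows get `NULL` in the new column, which is why the listing query elsewhere in this change falls back to `created_at`.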
@@ -1 +1 @@
-from kb.routes import health, search, jobs, documents, tags, status, reindex, auth
+from kb.routes import health, search, jobs, documents, tags, status, reindex, auth, notes
@@ -26,7 +26,7 @@ async def list_documents(
        sql = """
            SELECT d.id, d.title, d.doc_type,
                   (SELECT COUNT(*) FROM chunks c WHERE c.document_id = d.id) AS chunk_count,
-                   d.created_at
+                   d.created_at, d.updated_at
            FROM documents d
        """
        joins: list[str] = []
@@ -50,7 +50,7 @@ async def list_documents(
        if where:
            sql += " WHERE " + " AND ".join(where)
-        sql += " ORDER BY d.created_at DESC"
+        sql += " ORDER BY COALESCE(d.updated_at, d.created_at) DESC"
        rows = conn.execute(sql, params).fetchall()
@@ -74,6 +74,7 @@ async def list_documents(
                "tags": [t["name"] for t in tag_rows],
                "chunk_count": row["chunk_count"],
                "created_at": row["created_at"],
                "updated_at": row["updated_at"],
            })
        return results
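The new `ORDER BY COALESCE(d.updated_at, d.created_at) DESC` sorts documents by last activity, falling back to the creation time for rows that have never been updated. A small in-memory sketch of the effect follows. The table is reduced to the three relevant columns and the timestamps are made up for illustration.

```python
import sqlite3


def list_by_recent_activity(conn: sqlite3.Connection) -> list[int]:
    """Return document IDs ordered by updated_at, else created_at, newest first."""
    rows = conn.execute(
        "SELECT id FROM documents "
        "ORDER BY COALESCE(updated_at, created_at) DESC"
    ).fetchall()
    return [row[0] for row in rows]


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE documents (id INTEGER PRIMARY KEY, created_at TEXT, updated_at TEXT)"
)
conn.executemany(
    "INSERT INTO documents VALUES (?, ?, ?)",
    [
        (1, "2026-03-01 10:00:00", None),                   # never updated
        (2, "2026-02-01 10:00:00", "2026-04-01 09:00:00"),  # oldest, but updated last
        (3, "2026-03-15 10:00:00", None),                   # never updated
    ],
)
conn.commit()
ordered = list_by_recent_activity(conn)
```

Document 2 is the oldest by creation date but moves to the top because its update is the most recent event.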
@@ -0,0 +1,120 @@
 """Note mutation endpoint — update existing notes in place."""
 import hashlib
 import logging
 from fastapi import HTTPException
 from pydantic import BaseModel
 from main import app
 from kb.config import cfg
 from kb.database import (
    get_connection,
    build_enriched_text,
    insert_chunk,
    insert_embedding,
 )
 from kb.embeddings import embed_texts
 from kb.ingest.note import chunk_note
 logger = logging.getLogger("kb.routes.notes")
 class NoteUpdateRequest(BaseModel):
    text: str
@app.patch("/api/v1/notes/{doc_id}")
 async def update_note(doc_id: int, req: NoteUpdateRequest):
    conn = get_connection(cfg.db_path)
    try:
        doc = conn.execute(
            "SELECT id, title, doc_type FROM documents WHERE id = ?", (doc_id,)
        ).fetchone()
        if not doc:
            raise HTTPException(status_code=404, detail="Document not found.")
        if doc["doc_type"] != "note":
            raise HTTPException(
                status_code=422,
                detail="Only notes can be updated via this endpoint.",
            )
        title = doc["title"]
        # Delete existing chunks and their embeddings
        chunk_ids = conn.execute(
            "SELECT id FROM chunks WHERE document_id = ?", (doc_id,)
        ).fetchall()
        for row in chunk_ids:
            conn.execute("DELETE FROM chunks_vec WHERE chunk_id = ?", (row["id"],))
        conn.execute("DELETE FROM chunks WHERE document_id = ?", (doc_id,))
        # Run note chunking pipeline on new text
        chunks = chunk_note(req.text)
        chunk_texts = [c["text"] for c in chunks]
        chunk_metas = [
            {k: v for k, v in c.items() if k != "text"} or None for c in chunks
        ]
        enriched_texts = [
            build_enriched_text(title, ct, cm)
            for ct, cm in zip(chunk_texts, chunk_metas)
        ]
        # Embed — if this fails, the transaction rolls back
        vectors = embed_texts(enriched_texts)
        for idx, (chunk_text, enriched, vector) in enumerate(
            zip(chunk_texts, enriched_texts, vectors)
        ):
            chunk_id = insert_chunk(
                conn,
                document_id=doc_id,
                chunk_index=idx,
                text=chunk_text,
                enriched_text=enriched,
                metadata=chunk_metas[idx],
            )
            insert_embedding(conn, chunk_id, vector)
        # Update content_hash and updated_at
        content_hash = hashlib.sha256(req.text.encode("utf-8")).hexdigest()
        conn.execute(
            "UPDATE documents SET content_hash = ?, updated_at = current_timestamp WHERE id = ?",
            (content_hash, doc_id),
        )
        conn.commit()
        # Return updated document
        updated_doc = conn.execute(
            "SELECT * FROM documents WHERE id = ?", (doc_id,)
        ).fetchone()
        new_chunks = conn.execute(
            "SELECT * FROM chunks WHERE document_id = ? ORDER BY chunk_index",
            (doc_id,),
        ).fetchall()
        tag_rows = conn.execute(
            """
            SELECT t.name FROM tags t
            JOIN document_tags dt ON t.id = dt.tag_id
            WHERE dt.document_id = ?
            ORDER BY t.name
            """,
            (doc_id,),
        ).fetchall()
        return {
            **dict(updated_doc),
            "tags": [t["name"] for t in tag_rows],
            "chunks": [dict(c) for c in new_chunks],
        }
    except HTTPException:
        raise
    except Exception:
        conn.rollback()
        logger.exception("Failed to update note %d", doc_id)
        raise HTTPException(status_code=500, detail="Failed to update note.")
    finally:
        conn.close()
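The endpoint above deletes the old chunks before embedding the new text and relies on `conn.rollback()` to restore them if embedding fails. The sketch below isolates that behaviour with the standard `sqlite3` module. The `chunks` table is simplified, and `replace_chunks` and its `embed` callback are hypothetical stand-ins for the engine's helpers.

```python
import sqlite3
from typing import Callable


def replace_chunks(conn: sqlite3.Connection, doc_id: int, new_texts: list[str],
                   embed: Callable[[list[str]], list]) -> None:
    """Swap a document's chunks for new ones, restoring the old ones on failure."""
    try:
        # sqlite3 opens a transaction implicitly before the first DELETE/INSERT.
        conn.execute("DELETE FROM chunks WHERE document_id = ?", (doc_id,))
        embed(new_texts)  # may raise; nothing has been committed yet
        for idx, text in enumerate(new_texts):
            conn.execute(
                "INSERT INTO chunks (document_id, chunk_index, text) VALUES (?, ?, ?)",
                (doc_id, idx, text),
            )
        conn.commit()
    except Exception:
        conn.rollback()  # the deleted rows reappear
        raise
```

Because the delete and the inserts share one transaction, a reader on another connection sees either the old chunks or the new ones, never an empty note.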
@@ -48,6 +48,13 @@ async def update_document_tags(doc_id: int, req: TagUpdateRequest):
        if req.remove:
            untag_document(conn, doc_id, req.remove)
        if req.add or req.remove:
            conn.execute(
                "UPDATE documents SET updated_at = current_timestamp WHERE id = ?",
                (doc_id,),
            )
            conn.commit()
        tag_rows = conn.execute(
            """
            SELECT t.name FROM tags t
@@ -62,7 +62,7 @@ async def lifespan(app: FastAPI):
 app = FastAPI(title="kb-engine", version=__version__, lifespan=lifespan)
 # Import routes after app is created
-from kb.routes import health, search, jobs, documents, tags, status, reindex, auth  # noqa: E402, F401
+from kb.routes import health, search, jobs, documents, tags, status, reindex, auth, notes  # noqa: E402, F401
 if __name__ == "__main__":
    import uvicorn
@@ -0,0 +1,17 @@
 FROM python:3.12-slim
 WORKDIR /app
 COPY requirements.txt ./
 RUN pip install --no-cache-dir -r requirements.txt
 COPY *.py ./
 ENV KB_ENGINE_URL=http://engine:8000
 ENV KB_API_KEY=
 ENV KB_MCP_API_KEY=
 ENV KB_MCP_PORT=3000
 EXPOSE 3000
 CMD ["python", "server.py"]
@@ -0,0 +1,9 @@
 """Configuration from environment variables."""
 import os
 KB_ENGINE_URL = os.environ.get("KB_ENGINE_URL", "http://localhost:8000")
 KB_API_KEY = os.environ.get("KB_API_KEY", "")
 KB_MCP_API_KEY = os.environ.get("KB_MCP_API_KEY", "")
 KB_MCP_PORT = int(os.environ.get("KB_MCP_PORT", "3000"))
@@ -0,0 +1,121 @@
 """HTTP client for the kb engine API."""
 import httpx
 from config import KB_ENGINE_URL, KB_API_KEY
 def _auth_headers() -> dict[str, str]:
    h: dict[str, str] = {}
    if KB_API_KEY:
        h["Authorization"] = f"Bearer {KB_API_KEY}"
    return h
 def _client() -> httpx.Client:
    return httpx.Client(base_url=KB_ENGINE_URL, headers=_auth_headers(), timeout=60.0)
 def search(query: str, top: int = 10, tags: list[str] | None = None,
           doc_type: str | None = None, fts_only: bool = False,
           vec_only: bool = False, threshold: float | None = None) -> dict:
    body: dict = {"query": query, "top": top}
    if tags:
        body["tags"] = tags
    if doc_type:
        body["doc_type"] = doc_type
    if fts_only:
        body["fts_only"] = True
    if vec_only:
        body["vec_only"] = True
    if threshold is not None:
        body["threshold"] = threshold
    with _client() as c:
        r = c.post("/api/v1/search", json=body)
        r.raise_for_status()
        return r.json()
 def add_note(text: str, tags: list[str] | None = None,
             title: str | None = None) -> dict:
    fields = {"note": text}
    if tags:
        fields["tags"] = ",".join(tags)
    if title:
        fields["title"] = title
    with _client() as c:
        r = c.post("/api/v1/jobs", data=fields)
        r.raise_for_status()
        return r.json()
 def update_note(doc_id: int, text: str) -> dict:
    with _client() as c:
        r = c.patch(f"/api/v1/notes/{doc_id}", json={"text": text})
        r.raise_for_status()
        return r.json()
 def get_document(doc_id: int) -> dict:
    with _client() as c:
        r = c.get(f"/api/v1/documents/{doc_id}")
        r.raise_for_status()
        return r.json()
 def list_documents(doc_type: str | None = None,
                   tags: str | None = None) -> list[dict]:
    params: dict = {}
    if doc_type:
        params["type"] = doc_type
    if tags:
        params["tags"] = tags
    with _client() as c:
        r = c.get("/api/v1/documents", params=params)
        r.raise_for_status()
        return r.json()
 def get_status() -> dict:
    with _client() as c:
        r = c.get("/api/v1/status")
        r.raise_for_status()
        return r.json()
 def list_jobs(status: str | None = None) -> list[dict]:
    params: dict = {}
    if status:
        params["status"] = status
    with _client() as c:
        r = c.get("/api/v1/jobs", params=params)
        r.raise_for_status()
        return r.json()
 def update_tags(doc_id: int, add: list[str] | None = None,
                remove: list[str] | None = None) -> dict:
    body: dict = {}
    if add:
        body["add"] = add
    if remove:
        body["remove"] = remove
    with _client() as c:
        r = c.put(f"/api/v1/documents/{doc_id}/tags", json=body)
        r.raise_for_status()
        return r.json()
 def upload_file(filename: str, file_bytes: bytes,
                tags: list[str] | None = None) -> dict:
    fields: dict = {}
    if tags:
        fields["tags"] = ",".join(tags)
    with _client() as c:
        r = c.post(
            "/api/v1/jobs",
            data=fields,
            files={"file": (filename, file_bytes)},
        )
        r.raise_for_status()
        return r.json()
@@ -0,0 +1,4 @@
 mcp>=1.9.0
 httpx>=0.27
 uvicorn>=0.30
 starlette>=0.38
@@ -0,0 +1,384 @@
 """kb MCP server — exposes knowledge base operations as MCP tools."""
 import asyncio
 import json
 import logging
 from mcp.server.fastmcp import FastMCP
 from starlette.applications import Starlette
 from starlette.middleware import Middleware
 from starlette.middleware.base import BaseHTTPMiddleware
 from starlette.requests import Request
 from starlette.responses import JSONResponse
 from starlette.routing import Mount
 import config
 import engine
 import uploads
 logging.basicConfig(level=logging.INFO, format="%(asctime)s %(levelname)s %(message)s")
 logger = logging.getLogger("kb.mcp")
 # ---------------------------------------------------------------------------
 # Collection helpers
 # ---------------------------------------------------------------------------
 COLLECTION_TAG_PREFIX = "collection:"
 DEFAULT_COLLECTION = "documents"
 def _collection_tag(collection: str | None) -> str:
    return f"{COLLECTION_TAG_PREFIX}{collection or DEFAULT_COLLECTION}"
 def _strip_collection_tags(tags: list[str]) -> tuple[str | None, list[str]]:
    """Split tags into (collection, remaining_tags)."""
    collection = None
    remaining = []
    for t in tags:
        if t.startswith(COLLECTION_TAG_PREFIX):
            collection = t[len(COLLECTION_TAG_PREFIX):]
        else:
            remaining.append(t)
    return collection, remaining
 def _process_document(doc: dict) -> dict:
    """Strip collection tags from a document dict and add collection field."""
    tags = doc.get("tags", [])
    collection, clean_tags = _strip_collection_tags(tags)
    doc["tags"] = clean_tags
    doc["collection"] = collection
    return doc
 def _process_search_results(results: list[dict]) -> list[dict]:
    """Strip collection tags from search result dicts."""
    for r in results:
        if "tags" in r:
            collection, clean_tags = _strip_collection_tags(r["tags"])
            r["tags"] = clean_tags
            r["collection"] = collection
        if "document" in r and "tags" in r["document"]:
            collection, clean_tags = _strip_collection_tags(r["document"]["tags"])
            r["document"]["tags"] = clean_tags
            r["document"]["collection"] = collection
    return results
 async def _ensure_exclusive_collection(doc_id: int, collection: str) -> None:
    """Remove existing collection tags and apply the new one."""
    doc = engine.get_document(doc_id)
    existing_collection_tags = [
        t for t in doc.get("tags", [])
        if t.startswith(COLLECTION_TAG_PREFIX)
    ]
    new_tag = _collection_tag(collection)
    if existing_collection_tags == [new_tag]:
        return
    if existing_collection_tags:
        engine.update_tags(doc_id, remove=existing_collection_tags)
    engine.update_tags(doc_id, add=[new_tag])
 # ---------------------------------------------------------------------------
 # FastMCP server
 # ---------------------------------------------------------------------------
 mcp = FastMCP(
    "kb",
    instructions=(
        "Knowledge base MCP server. Provides tools for searching, adding, and "
        "managing documents and notes. This server requires Bearer token "
        "authentication — all requests are authenticated via the Authorization "
        "header at the HTTP transport layer."
    ),
 )
@mcp.tool()
 async def kb_search(
    query: str,
    top: int = 10,
    tags: list[str] | None = None,
    doc_type: str | None = None,
    collection: str | None = None,
    fts_only: bool = False,
 ) -> str:
    """Search the knowledge base for relevant documents and notes.
    Returns ranked chunks matching the query, with text content, relevance scores,
    and document metadata.
    Args:
        query: The search query. Can be a natural language question or keywords.
        top: Maximum number of results to return (default 10).
        tags: Filter results to documents with ALL of these tags.
        doc_type: Filter by document type (e.g. "note", "pdf", "markdown", "code").
        collection: Filter by collection name (e.g. "documents", "memory", "workspace").
        fts_only: If true, use only full-text search (no vector similarity).
    Tips for complex queries:
    - Consider expanding into 2-3 variant phrasings and calling this tool multiple
      times, then deduplicating results by chunk_id. For example, search for both
      "pension revaluation rules" and "how are pensions revalued" to cast a wider net.
    - For precision, rerank the returned results using your own judgement based on
      relevance to the original question.
    """
    search_tags = list(tags) if tags else []
    if collection:
        search_tags.append(_collection_tag(collection))
    result = engine.search(
        query=query,
        top=top,
        tags=search_tags or None,
        doc_type=doc_type,
        fts_only=fts_only,
    )
    results_list = result if isinstance(result, list) else result.get("results", [])
    processed = _process_search_results(results_list)
    return json.dumps(processed, indent=2)
@mcp.tool()
 async def kb_addnote(
    text: str,
    collection: str | None = None,
    tags: list[str] | None = None,
    title: str | None = None,
 ) -> str:
    """Add a text note to the knowledge base for indexing and search.
    The note is queued for ingestion — it will be chunked, embedded, and made
    searchable. Use kb_jobs to check ingestion status.
    Args:
        text: The note text content.
        collection: Collection to add the note to (default "documents").
            Standard collections: "documents", "memory", "workspace".
        tags: Additional tags to apply to the note.
        title: Optional title (auto-derived from first line if omitted).
    """
    all_tags = list(tags) if tags else []
    all_tags.append(_collection_tag(collection))
    result = engine.add_note(text=text, tags=all_tags, title=title)
    return json.dumps(result, indent=2)
@mcp.tool()
 async def kb_update_note(
    document_id: int,
    text: str,
 ) -> str:
    """Update an existing note's content in place.
    Replaces the note text, re-chunks, and re-embeds while preserving the
    document ID, creation timestamp, and tags. Only works on documents with
    doc_type "note".
    Args:
        document_id: The ID of the note document to update.
        text: The new text content for the note.
    """
    result = engine.update_note(document_id, text)
    return json.dumps(_process_document(result), indent=2)
@mcp.tool()
 async def kb_get(
    document_id: int | None = None,
    source_path: str | None = None,
 ) -> str:
    """Retrieve document details from the knowledge base.
    Look up a document by its ID or source path. Returns full document metadata,
    tags, and chunk contents.
    Args:
        document_id: The numeric document ID.
        source_path: The document's source path (alternative to document_id).
    """
    if document_id is not None:
        result = engine.get_document(document_id)
        return json.dumps(_process_document(result), indent=2)
    elif source_path is not None:
        docs = engine.list_documents()
        matches = [d for d in docs if d.get("source_path") == source_path]
        if not matches:
            return json.dumps({"error": "No document found with that source_path"})
        doc = engine.get_document(matches[0]["id"])
        return json.dumps(_process_document(doc), indent=2)
    else:
        return json.dumps({"error": "Provide either document_id or source_path"})
@mcp.tool()
 async def kb_status() -> str:
    """Get knowledge base engine status.
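     Returns engine version, embedding model info, device info, document counts,
     database size, and ingestion queue state, plus an "authenticated" flag that
     is true when KB_MCP_API_KEY is set and Bearer token auth is enforced.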
    """
    result = engine.get_status()
    result["authenticated"] = bool(config.KB_MCP_API_KEY)
    return json.dumps(result, indent=2)
@mcp.tool()
 async def kb_jobs(
    status: str | None = None,
 ) -> str:
    """List ingestion jobs and their status.
    Returns recent jobs showing what has been queued, is processing, completed,
    or failed.
    Args:
        status: Filter by job status ("queued", "processing", "done", "failed", "skipped").
    """
    result = engine.list_jobs(status=status)
    return json.dumps(result, indent=2)
@mcp.tool()
 async def kb_upload_start(
    filename: str,
    total_size: int,
    tags: list[str] | None = None,
    collection: str | None = None,
 ) -> str:
    """Start a chunked file upload to the knowledge base.
    Use this for uploading files from a remote agent. The upload process is:
    1. Call kb_upload_start to get an upload_id
    2. Call kb_upload_chunk repeatedly with base64-encoded file chunks (recommended ~1MB each)
    3. Call kb_upload_finish to submit the file for ingestion
    Example for a 3MB file:
        upload = kb_upload_start(filename="report.pdf", total_size=3145728, collection="documents")
        kb_upload_chunk(upload_id=upload["upload_id"], data="<base64 chunk 0>", chunk_index=0)
        kb_upload_chunk(upload_id=upload["upload_id"], data="<base64 chunk 1>", chunk_index=1)
        kb_upload_chunk(upload_id=upload["upload_id"], data="<base64 chunk 2>", chunk_index=2)
        result = kb_upload_finish(upload_id=upload["upload_id"])
    Args:
        filename: Original filename (used for type detection).
        total_size: Total file size in bytes.
        tags: Additional tags to apply.
        collection: Collection name (default "documents").
    """
    all_tags = list(tags) if tags else []
    all_tags.append(_collection_tag(collection))
    upload_id = uploads.start_upload(filename, total_size, all_tags)
    return json.dumps({"upload_id": upload_id})
@mcp.tool()
 async def kb_upload_chunk(
    upload_id: str,
    data: str,
    chunk_index: int,
 ) -> str:
    """Upload a base64-encoded chunk of a file.
    Part of the chunked upload flow started by kb_upload_start.
    Args:
        upload_id: The upload ID from kb_upload_start.
        data: Base64-encoded file data for this chunk.
        chunk_index: Zero-based index of this chunk.
    """
    try:
        uploads.add_chunk(upload_id, data, chunk_index)
        return json.dumps({"status": "ok", "chunk_index": chunk_index})
     except (KeyError, ValueError) as e:
         # ValueError covers binascii.Error, raised by base64.b64decode on malformed input
         return json.dumps({"error": str(e)})
@mcp.tool()
 async def kb_upload_finish(
    upload_id: str,
 ) -> str:
    """Finish a chunked upload and submit the file for ingestion.
    Reassembles all uploaded chunks and forwards the complete file to the
    engine for processing. Returns the ingestion job ID.
    Args:
        upload_id: The upload ID from kb_upload_start.
    """
    try:
        filename, file_bytes, tags = uploads.finish_upload(upload_id)
        result = engine.upload_file(filename, file_bytes, tags)
        return json.dumps(result, indent=2)
    except KeyError as e:
        return json.dumps({"error": str(e)})
 # ---------------------------------------------------------------------------
 # Auth middleware
 # ---------------------------------------------------------------------------
 class BearerAuthMiddleware(BaseHTTPMiddleware):
    async def dispatch(self, request: Request, call_next):
        if not config.KB_MCP_API_KEY:
            return await call_next(request)
        auth_header = request.headers.get("authorization", "")
        if auth_header.startswith("Bearer ") and auth_header[7:] == config.KB_MCP_API_KEY:
            return await call_next(request)
        return JSONResponse(
            status_code=401,
            content={"error": "Unauthorized"},
        )
 # ---------------------------------------------------------------------------
 # ASGI app assembly
 # ---------------------------------------------------------------------------
 def create_app():
    """Create the ASGI app with auth middleware wrapping the MCP server."""
    from contextlib import asynccontextmanager
    mcp_app = mcp.streamable_http_app()
    @asynccontextmanager
    async def lifespan(app):
        uploads.start_cleanup_task()
        logger.info("Upload cleanup task started")
        # Delegate to the MCP app's lifespan if it has one
        if hasattr(mcp_app, 'router') and hasattr(mcp_app.router, 'lifespan_context'):
            async with mcp_app.router.lifespan_context(app):
                yield
        else:
            yield
    app = Starlette(
        routes=[Mount("/", app=mcp_app)],
        middleware=[Middleware(BearerAuthMiddleware)],
        lifespan=lifespan,
    )
    return app
 # ---------------------------------------------------------------------------
 # Entry point
 # ---------------------------------------------------------------------------
 if __name__ == "__main__":
    import uvicorn
    logger.info(
        "Starting kb MCP server on port %d, engine=%s",
        config.KB_MCP_PORT,
        config.KB_ENGINE_URL,
    )
    app = create_app()
    uvicorn.run(app, host="0.0.0.0", port=config.KB_MCP_PORT)
@@ -0,0 +1,96 @@
 """Chunked upload staging management."""
 import asyncio
 import base64
 import logging
 import shutil
 import tempfile
 import time
 import uuid
 from dataclasses import dataclass, field
 from pathlib import Path
 logger = logging.getLogger("kb.mcp.uploads")
 UPLOAD_TIMEOUT_SECONDS = 600  # 10 minutes
@dataclass
 class StagedUpload:
    upload_id: str
    filename: str
    total_size: int
    tags: list[str]
    staging_dir: Path
    created_at: float = field(default_factory=time.time)
    chunks: dict[int, Path] = field(default_factory=dict)
 _uploads: dict[str, StagedUpload] = {}
 _cleanup_task: asyncio.Task | None = None
 def start_upload(filename: str, total_size: int, tags: list[str]) -> str:
    upload_id = str(uuid.uuid4())
    staging_dir = Path(tempfile.mkdtemp(prefix=f"kb_upload_{upload_id[:8]}_"))
    _uploads[upload_id] = StagedUpload(
        upload_id=upload_id,
        filename=filename,
        total_size=total_size,
        tags=tags,
        staging_dir=staging_dir,
    )
    logger.info("Started upload %s for %s (%d bytes)", upload_id, filename, total_size)
    return upload_id
 def add_chunk(upload_id: str, data_b64: str, chunk_index: int) -> None:
    upload = _uploads.get(upload_id)
    if upload is None:
        raise KeyError(f"Upload ID not found: {upload_id}")
    chunk_bytes = base64.b64decode(data_b64)
    chunk_path = upload.staging_dir / f"chunk_{chunk_index:06d}"
    chunk_path.write_bytes(chunk_bytes)
    upload.chunks[chunk_index] = chunk_path
    logger.info("Added chunk %d to upload %s (%d bytes)", chunk_index, upload_id, len(chunk_bytes))
 def finish_upload(upload_id: str) -> tuple[str, bytes, list[str]]:
    """Reassemble chunks and return (filename, file_bytes, tags)."""
    upload = _uploads.get(upload_id)
    if upload is None:
        raise KeyError(f"Upload ID not found: {upload_id}")
    try:
        parts = []
        for idx in sorted(upload.chunks.keys()):
            parts.append(upload.chunks[idx].read_bytes())
        file_bytes = b"".join(parts)
        return upload.filename, file_bytes, upload.tags
    finally:
        _cleanup_upload(upload_id)
 def _cleanup_upload(upload_id: str) -> None:
    upload = _uploads.pop(upload_id, None)
    if upload and upload.staging_dir.exists():
        shutil.rmtree(upload.staging_dir, ignore_errors=True)
 async def cleanup_abandoned_uploads() -> None:
    """Background task that removes uploads older than the timeout."""
    while True:
        await asyncio.sleep(60)
        now = time.time()
        expired = [
            uid for uid, u in _uploads.items()
            if now - u.created_at > UPLOAD_TIMEOUT_SECONDS
        ]
        for uid in expired:
            logger.warning("Cleaning up abandoned upload %s", uid)
            _cleanup_upload(uid)
 def start_cleanup_task() -> None:
    global _cleanup_task
    if _cleanup_task is None or _cleanup_task.done():
        _cleanup_task = asyncio.create_task(cleanup_abandoned_uploads())
@@ -0,0 +1,2 @@
 schema: spec-driven
 created: 2026-04-02
@@ -0,0 +1,41 @@
 ## Context
 The engine's `/api/v1/search` endpoint returns flat result objects:
 ```json
 {
  "chunk_id": 123,
  "score": 0.031,
  "text": "...",
  "chunk_index": 3,
  "chunk_metadata": {"page": 12, "section_header": "Installation"},
  "title": "Git Admin Guide",
  "doc_type": "pdf",
  "source_path": "/home/user/docs/git-admin.pdf",
  "created_at": "2026-03-15T10:30:00",
  "tags": ["git", "admin"]
 }
 ```
 The Go client's human-mode struct in `client/cmd/search.go` incorrectly expects a nested `document` object and top-level `page`/`section` fields. This causes all metadata to display as zero values.
 ## Goals / Non-Goals
 **Goals:**
 - Fix the search result struct to match the flat engine response
 - Extract `page` and `section_header` from `chunk_metadata` for human display
 - Maintain identical JSON output (already passes through raw response)
 **Non-Goals:**
 - Changing the engine API response format
 - Adding new display fields beyond what was originally intended
 ## Decisions
 **Flatten the struct to match API response.** The result struct will have `Title`, `DocType`, `Tags` as top-level fields (matching `title`, `doc_type`, `tags` JSON keys). `ChunkMetadata` will be decoded as `map[string]interface{}` to extract `page` and `section_header` dynamically, since its contents vary by document type.
 **Why not a typed ChunkMetadata struct?** The metadata keys depend on the ingestion pipeline (PDFs have `page`, markdown has `section_header`, code may have others in future). A map is more resilient to engine-side additions.
 ## Risks / Trade-offs
 - [Minimal risk] If the engine adds new top-level fields, the Go struct silently ignores them — this is existing behavior and acceptable for human-mode display.
@@ -0,0 +1,24 @@
 ## Why
 The Go client's human-mode search output struct expects a nested `document` object and top-level `page`/`section` fields, but the engine API returns flat results with `title`, `doc_type`, `tags` at the result level and `page`/`section_header` inside `chunk_metadata`. This means human-mode display shows empty values for title, type, tags, page, and section.
 ## What Changes
 - Fix the Go client search result struct to match the flat engine API response format
 - Extract `page` and `section_header` from the `chunk_metadata` map instead of expecting them as top-level fields
 - Human-mode output will correctly display document title, type, tags, page number, and section header
 ## Capabilities
 ### New Capabilities
 (none)
 ### Modified Capabilities
 - `go-client`: Fix search result parsing to match actual engine API response shape
 ## Impact
 - `client/cmd/search.go` — struct definition and display logic
 - No API changes, no breaking changes — this is a bug fix aligning the client with the existing API contract
@@ -0,0 +1,40 @@
 ## MODIFIED Requirements
 ### Requirement: Search command
 The client SHALL provide a `kb search <query>` command that sends the query to the engine and displays results.
 #### Scenario: Human-readable search output
 - **WHEN** the user runs `kb search "how to change oil"`
 - **THEN** the client SHALL POST to `/api/v1/search` and display results in a human-readable format showing rank, score, document title, page/section, doc type, tags, and a text snippet
 - **THEN** the client SHALL parse search results as flat objects with top-level `title`, `doc_type`, `tags`, `score`, `text`, `chunk_index` fields
 - **THEN** the client SHALL extract `page` from `chunk_metadata` when present (PDF documents)
 - **THEN** the client SHALL extract `section_header` from `chunk_metadata` when present (markdown documents)
 #### Scenario: JSON search output
 - **WHEN** the user runs `kb search "query" --format json`
 - **THEN** the client SHALL output the raw JSON response from the engine
 #### Scenario: Search with filters
 - **WHEN** the user runs `kb search "brakes" --tags maintenance --type pdf --top 3`
 - **THEN** the client SHALL include the filters in the API request body
 #### Scenario: Search mode flags
 - **WHEN** the user runs `kb search "error" --fts-only`
 - **THEN** the client SHALL set `fts_only: true` in the request body
 #### Scenario: PDF result with page number
 - **WHEN** a search result has `chunk_metadata` containing `{"page": 12}`
 - **THEN** the human output SHALL display "Page 12" in the location line
 #### Scenario: Markdown result with section header
 - **WHEN** a search result has `chunk_metadata` containing `{"section_header": "Installation > Prerequisites"}`
 - **THEN** the human output SHALL display "Installation > Prerequisites" in the location line
 #### Scenario: Result with both page and section
 - **WHEN** a search result has `chunk_metadata` containing both `page` and `section_header`
 - **THEN** the human output SHALL display both separated by " / "
 #### Scenario: Result with no location metadata
 - **WHEN** a search result has empty `chunk_metadata` or no page/section keys
 - **THEN** the human output SHALL omit the location line entirely
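The four location scenarios above reduce to one small rule. The sketch below expresses that rule in Python for brevity, although the real client is Go. `location_line` is a hypothetical name, and the `Page N` label and the ` / ` separator are taken from the scenarios.

```python
from typing import Any, Optional


def location_line(chunk_metadata: Optional[dict[str, Any]]) -> Optional[str]:
    """Build the human-mode location line, or None when there is nothing to show.

    Hypothetical helper: it mirrors the spec scenarios, not the Go client's code.
    """
    meta = chunk_metadata or {}
    parts: list[str] = []
    page = meta.get("page")
    if page is not None:
        parts.append(f"Page {page}")
    section = meta.get("section_header")
    if section:
        parts.append(str(section))
    # Page and section are joined with " / "; no parts means "omit the line".
    return " / ".join(parts) if parts else None
```

Returning `None` rather than an empty string lets the caller skip the line entirely, which matches the last scenario.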
@@ -0,0 +1,14 @@
 ## 1. Fix search result struct
 - [x] 1.1 Replace nested `Document` struct with flat fields (`Title`, `DocType`, `Tags`) matching engine JSON keys
 - [x] 1.2 Add `ChunkMetadata map[string]interface{}` field to capture `chunk_metadata`
 ## 2. Fix display logic
 - [x] 2.1 Update title/type/tags references in the display loop to use the new flat fields
 - [x] 2.2 Extract `page` from `ChunkMetadata` map (replacing top-level `Page` field)
 - [x] 2.3 Extract `section_header` from `ChunkMetadata` map (replacing top-level `Section` field)
 ## 3. Verify
 - [x] 3.1 Build the client and verify it compiles cleanly
@@ -0,0 +1,145 @@
 ## Context
 kb v2 is a client-server knowledge base: a Python FastAPI engine (SQLite + FTS5 + sqlite-vec, sentence-transformers embeddings) serving a Go CLI client over HTTP. Agent integration currently works via a Claude Code skill that shells out to the Go binary and parses JSON output.
 The engine runs in Docker (NVIDIA/ROCm/CPU variants), keeps the embedding model warm in memory, and handles async ingestion via a background worker. The data model has documents, chunks, embeddings, tags, and jobs — but no concept of collections or note mutation.
 This design covers three changes: adding an MCP server as a new integration surface, adding collection-scoped search via tag conventions, and adding in-place note updates.
 ## Goals / Non-Goals
 **Goals:**
 - Expose kb as native MCP tools so agents interact with it directly, not via shell subprocess
 - Separate agent memory from user documents via collection tags
 - Allow notes to be updated in place, preserving document identity
 - Support file upload from remote agents via the MCP server
 - Keep the engine fully local — no cloud API dependencies
 - Maintain backward compatibility: existing CLI, API, and data all continue to work
 **Non-Goals:**
 - Query expansion or LLM reranking inside the engine (agent-side responsibility)
 - File-watching / inotify for auto-reindexing (useful but separate concern)
 - Collection-level access control or permissions
 - New schema columns for collections (use existing tags)
 - Stdio MCP transport (Streamable HTTP only)
 ## Decisions
 ### D1 — MCP server as a separate container, Streamable HTTP transport, with its own auth
 The MCP server runs as its own Docker container alongside the engine, exposed via Streamable HTTP. It is not embedded into the FastAPI engine app. It requires its own Bearer token (`KB_MCP_API_KEY`) from calling agents.
 **Why:** The engine and MCP server have different concerns — the engine manages embeddings, search, and ingestion; the MCP server translates MCP protocol to engine API calls. Keeping them separate means either can be updated independently. Both run as long-lived containers in a Docker Compose stack.
 Streamable HTTP (not stdio) because the MCP server is a network service that remote agents connect to, not a subprocess spawned by a local agent. This matches the deployment model: engine + MCP server run on an infrastructure host, agents connect over the network.
 The MCP server must have its own authentication because it is HTTP-exposed. Without it, anyone who discovers the endpoint could reach the engine through the MCP server, which forwards requests using its own `KB_API_KEY`. The MCP server validates the agent's Bearer token (`KB_MCP_API_KEY`) before proxying requests to the engine.
 **Alternatives considered:** Embedding MCP into the FastAPI app as additional routes was rejected because it couples the MCP SDK lifecycle to the engine, and the engine shouldn't need to know about MCP protocol details. Stdio transport was rejected because it requires the agent and MCP server to share a host. Relying solely on the engine's `KB_API_KEY` for auth was rejected because the MCP server is a separate network surface and must authenticate its own callers.
 **Implementation:** Separate Python package/directory (`mcp/` at repo root). Uses the `mcp` Python SDK with Streamable HTTP transport. Reads engine URL and engine API key from environment variables (`KB_ENGINE_URL`, `KB_API_KEY`). Reads its own auth token from `KB_MCP_API_KEY`. Makes HTTP calls to the engine using `httpx`. Docker Compose file adds the MCP server as a service alongside the engine.
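The token check described in D1 can be sketched as a pure function. This is a minimal sketch rather than the middleware in this change: `is_authorized` is a hypothetical helper, and it swaps a plain `==` comparison for `hmac.compare_digest` from the standard library, a common hardening step for network-exposed token checks.

```python
import hmac


def is_authorized(auth_header: str, expected_token: str) -> bool:
    """Return True when the Authorization header carries the expected Bearer token.

    Hypothetical helper. An empty expected_token means auth is disabled,
    mirroring the case where KB_MCP_API_KEY is unset.
    """
    if not expected_token:
        return True
    scheme, _, token = auth_header.partition(" ")
    if scheme != "Bearer":
        return False
    # compare_digest is designed to resist timing analysis of the comparison.
    return hmac.compare_digest(token.encode(), expected_token.encode())
```

Encoding both sides to bytes keeps `compare_digest` usable even if a token contains non-ASCII characters.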
 ### D2 — Collections via tag conventions, with MCP-enforced exclusivity
 Collections are implemented using the existing tag system with a naming convention: `collection:documents`, `collection:memory`, `collection:workspace`.
 **Why:** Tags already exist, already filter search, and are already mutable via the API. A dedicated `collection` column would add a schema migration, new API parameters, and new CLI flags — all duplicating what tags can do.
 **Exclusive membership:** The MCP server enforces one collection per document. When adding a document to a collection, the MCP server first removes any existing `collection:*` tags via the engine's tag API, then applies the new one. This prevents a document from appearing in multiple collections and keeps search results clean.
 **Tag stripping in MCP responses:** The MCP server strips `collection:*` tags from the `tags` array in search results and presents the collection as a separate `collection` field. Agents see a clean interface: `{"collection": "memory", "tags": ["feedback", "email"]}` rather than raw `collection:memory` mixed in with user tags.
 **Implementation:** The MCP tools accept a `collection` parameter (e.g. `"memory"`). The MCP server translates this to tag operations:
 - On search: adds `collection:<name>` to the tag filter
 - On addnote/addfile: removes any existing `collection:*` tags, then applies `collection:<name>`
 - On results: strips `collection:*` from tags, adds a `collection` field
 The engine is unchanged. The Go CLI can use the same convention manually via `--tags collection:memory`.
 **Convention:** `collection:documents` is the default. Standard names: `documents`, `memory`, `workspace`. The MCP tool descriptions document these.
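The exclusivity rule can be expressed as a pure planning step that decides which tags to remove and which to add. A minimal sketch, assuming the `collection:` prefix and the `documents` default described above; `plan_collection_change` is a hypothetical helper, not a function of the MCP server.

```python
from typing import Optional

COLLECTION_TAG_PREFIX = "collection:"
DEFAULT_COLLECTION = "documents"


def plan_collection_change(
    existing_tags: list[str], collection: Optional[str]
) -> tuple[list[str], list[str]]:
    """Return (tags_to_remove, tags_to_add) so the document ends in one collection."""
    new_tag = f"{COLLECTION_TAG_PREFIX}{collection or DEFAULT_COLLECTION}"
    current = [t for t in existing_tags if t.startswith(COLLECTION_TAG_PREFIX)]
    if current == [new_tag]:
        return [], []  # already exclusively in the target collection
    # Drop every other collection tag; add the new tag only if it is missing.
    to_remove = [t for t in current if t != new_tag]
    to_add = [] if new_tag in current else [new_tag]
    return to_remove, to_add
```

Keeping the decision separate from the engine calls makes the rule easy to test without a running engine; user tags are never touched.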
 ### D3 — Note mutation via dedicated PATCH endpoint, with full chunking support
 Note updates go through a new synchronous `PATCH /api/v1/notes/{id}` endpoint, not through the async job queue. The endpoint uses the same chunking logic as the ingestion pipeline, not a hardcoded single-chunk assumption.
 **Why:** Most notes are short and produce a single chunk. But if an agent updates a note with text that exceeds the embedding model's token window (~256 tokens for MiniLM), a single-chunk approach would silently embed only a portion of the text. Using the standard note chunking pipeline (which today produces one chunk for typical notes) means the endpoint naturally handles longer notes without silent data loss.
 **Alternatives considered:** Truncating long notes and returning a warning was rejected because silent data loss, or a warning the agent might ignore, is worse than just doing the right thing. Reusing the job queue for consistency was rejected because the queue's value is async processing of heavy workloads, which notes don't need.
 **Implementation:** The PATCH endpoint:
 1. Validates the document exists and is `doc_type = 'note'`
 2. Deletes existing chunks, FTS entries, and vector embeddings for that document
 3. Runs the new text through the note chunking pipeline (same as ingestion)
 4. Embeds each chunk and inserts into chunks_vec
 5. Updates the document's `content_hash` and `updated_at`
 6. Returns the updated document
 All within a single transaction. FTS5 triggers keep the full-text index in sync automatically (existing `chunks_au` and `chunks_ad` triggers handle this). If embedding fails, the transaction rolls back and the old note is preserved.
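The replace-in-one-transaction behaviour can be sketched with the standard `sqlite3` module. Everything here is an assumption made for illustration: the two-table schema is heavily simplified (no FTS5 triggers, no `chunks_vec`), and `chunk_text` and the `embed` callable are stand-ins for the real chunking pipeline and embedding model.

```python
import hashlib
import sqlite3
from typing import Callable


def make_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with a simplified, assumed schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE documents (
            id INTEGER PRIMARY KEY, doc_type TEXT, content_hash TEXT,
            created_at TEXT DEFAULT CURRENT_TIMESTAMP, updated_at TEXT
        );
        CREATE TABLE chunks (
            document_id INTEGER, chunk_index INTEGER, text TEXT, embedding TEXT
        );
        """
    )
    return conn


def chunk_text(text: str, max_words: int = 50) -> list[str]:
    """Stand-in for the note chunking pipeline: split on a fixed word count."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def update_note(
    conn: sqlite3.Connection,
    doc_id: int,
    text: str,
    embed: Callable[[str], list[float]],
) -> None:
    """Replace a note's chunks atomically; any failure leaves the old note intact."""
    row = conn.execute(
        "SELECT doc_type FROM documents WHERE id = ?", (doc_id,)
    ).fetchone()
    if row is None or row[0] != "note":
        raise ValueError("document does not exist or is not a note")
    with conn:  # commits on success, rolls back if the block raises
        conn.execute("DELETE FROM chunks WHERE document_id = ?", (doc_id,))
        for index, piece in enumerate(chunk_text(text)):
            vector = embed(piece)  # an exception here triggers the rollback
            conn.execute(
                "INSERT INTO chunks VALUES (?, ?, ?, ?)",
                (doc_id, index, piece, repr(vector)),
            )
        conn.execute(
            "UPDATE documents SET content_hash = ?, updated_at = CURRENT_TIMESTAMP "
            "WHERE id = ?",
            (hashlib.sha256(text.encode()).hexdigest(), doc_id),
        )
```

If `embed` raises partway through, the `DELETE` is undone along with any inserted chunks, so a search issued afterwards still sees the previous note.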
 ### D4 — `updated_at` column on documents, set only on mutation
 A new `updated_at TEXT` column on `documents`, initially NULL for all existing documents. Set to `current_timestamp` only when a document is modified (note update, tag change).
 **Why:** Distinguishes "created" from "last modified". The agent memory use case needs to know when a memory was last updated, not just when it was first created. NULL means "never updated" — cleaner than duplicating `created_at`.
 **Date sorting:** Any query that sorts or filters by "most recent" must use `COALESCE(updated_at, created_at)` to ensure un-mutated documents don't disappear from recent lists. This applies to the documents list endpoint and any future "recent" views.
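The `COALESCE` ordering can be checked against a toy table. A minimal sketch, assuming a cut-down `documents` table with only the columns needed here; `recent_document_ids` and `demo` are hypothetical names.

```python
import sqlite3


def recent_document_ids(conn: sqlite3.Connection, limit: int = 10) -> list[int]:
    """List document IDs by last activity, newest first."""
    rows = conn.execute(
        "SELECT id FROM documents "
        "ORDER BY COALESCE(updated_at, created_at) DESC LIMIT ?",
        (limit,),
    ).fetchall()
    return [row[0] for row in rows]


def demo() -> list[int]:
    """Two never-updated documents and one old document that was updated recently."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE documents (id INTEGER PRIMARY KEY, created_at TEXT, updated_at TEXT)"
    )
    conn.executemany(
        "INSERT INTO documents VALUES (?, ?, ?)",
        [
            (1, "2026-01-01 10:00:00", None),
            (2, "2026-01-02 10:00:00", None),
            (3, "2025-12-01 10:00:00", "2026-01-03 10:00:00"),
        ],
    )
    return recent_document_ids(conn)
```

Here `demo()` returns `[3, 2, 1]`: the updated document comes first, and the two never-updated documents still rank by their creation time. Sorting on `updated_at` alone would push every never-updated document to the end, since SQLite orders NULL below all other values.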
 ### D5 — File upload via chunked base64, proxied to engine's existing upload API
 The MCP server supports file uploads from remote agents using a three-step chunked upload pattern:
 1. `kb_upload_start(filename, total_size, tags, collection)` — creates a temporary staging entry on the MCP server, returns a server-generated UUID `upload_id`
 2. `kb_upload_chunk(upload_id, data, chunk_index)` — appends a base64-encoded chunk to the staging entry. Called N times.
 3. `kb_upload_finish(upload_id)` — reassembles chunks, decodes from base64, and forwards the complete file as a multipart upload to the engine's existing `POST /api/v1/jobs` endpoint. Returns the job ID.
 **Why:** The MCP server is remote from the calling agent, so file paths are meaningless. The agent reads the file locally, splits it into chunks, base64-encodes each chunk, and sends them as individual tool calls. No single MCP message needs to carry the entire file, avoiding message size limits regardless of file size.
 The engine's existing upload pipeline handles everything from there: staging, type detection, chunking, embedding. No new engine code needed for file transfer.
 **Alternatives considered:** A single-message base64 upload (`kb_addfile` with full file content) was rejected because it works for small files but hits practical MCP message size limits on larger PDFs. A separate file transfer service (SFTP container) was rejected because it adds operational complexity for no benefit over the chunked approach. A plain HTTP upload endpoint on the MCP server was rejected because it adds a second protocol surface the agent needs to interact with. A single-call shortcut for small files was rejected because one path for all files is simpler for agents to learn, and the overhead of 3 calls vs 1 is negligible for an LLM.
 **Upload ID:** Server-generated UUID, returned by `kb_upload_start`. Prevents collision and is unpredictable (important since the MCP server is network-exposed).
 **Chunk size:** Recommended 1MB raw per chunk (about 1.33MB once base64-encoded). A 10MB PDF takes ~10 `kb_upload_chunk` calls. The MCP server holds chunks in a temporary directory and cleans them up on finish or after a timeout (e.g. 10 minutes for abandoned uploads).
 **Staging cleanup:** The MCP server tracks active uploads in memory. Chunks are written to a temporary directory. On `kb_upload_finish`, chunks are assembled and forwarded. On timeout or error, the temporary files are cleaned up. No persistent state needed — abandoned uploads are simply garbage collected. The temp directory does not need to survive container restarts; if the MCP server restarts mid-upload, the agent retries from `kb_upload_start`.
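On the agent side, the splitting and encoding step might look like the sketch below. It only prepares the payloads; actually sending them through `kb_upload_chunk` depends on the agent's MCP client and is left out. `iter_base64_chunks` and `reassemble` are hypothetical helpers, and `reassemble` only imitates what the finish step is described as doing.

```python
import base64
from typing import Iterator

CHUNK_SIZE = 1024 * 1024  # 1 MiB of raw bytes per chunk, as recommended above


def iter_base64_chunks(
    file_bytes: bytes, chunk_size: int = CHUNK_SIZE
) -> Iterator[tuple[int, str]]:
    """Yield (chunk_index, base64_text) pairs covering file_bytes in order."""
    for index, start in enumerate(range(0, len(file_bytes), chunk_size)):
        raw = file_bytes[start:start + chunk_size]
        yield index, base64.b64encode(raw).decode("ascii")


def reassemble(chunks: dict[int, str]) -> bytes:
    """Decode chunks and join them in index order, as the finish step would."""
    return b"".join(base64.b64decode(chunks[i]) for i in sorted(chunks))
```

Each chunk is encoded independently, so chunks can be decoded one at a time and concatenated. A full 1 MiB chunk encodes to 1,398,104 base64 characters, which is where the roughly 1.33x overhead comes from.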
 ### D6 — MCP tool descriptions include agent-side search patterns
 The MCP tool descriptions for `kb_search` include guidance on query expansion and reranking as documented patterns, not as engine parameters.
 **Why:** The calling agent has an LLM. Expanding queries (call search N times with variant phrasings, merge results) and reranking (read top results, reorder by relevance) are better done in the agent's context. This keeps the engine deterministic and local.
 **Implementation:** The `kb_search` tool description includes a note like: *"For complex queries, consider expanding into 2-3 variant phrasings and calling this tool multiple times, then deduplicating results by chunk_id. For precision, rerank the returned results using your own judgement."*
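The expansion pattern could be sketched on the agent side as follows. `expanded_search` is a hypothetical helper, and `search` stands for whatever callable the agent uses to invoke `kb_search` and parse its JSON. The sketch assumes each hit carries `chunk_id` and `score`, and that scores from different phrasings are comparable, which may not hold for every ranking scheme.

```python
from typing import Callable


def expanded_search(
    search: Callable[[str], list[dict]],
    queries: list[str],
    top: int = 10,
) -> list[dict]:
    """Run several query phrasings and merge the hits, deduplicated by chunk_id."""
    best: dict[int, dict] = {}
    for query in queries:
        for hit in search(query):
            seen = best.get(hit["chunk_id"])
            # Keep the highest-scoring copy of each chunk across phrasings.
            if seen is None or hit["score"] > seen["score"]:
                best[hit["chunk_id"]] = hit
    return sorted(best.values(), key=lambda h: h["score"], reverse=True)[:top]
```

Reranking by the agent's own judgement would then operate on the merged list rather than on any single result set.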
 ### D7 — Version bump to 3.0.0 for both engine and client
 Engine and client both bump to v3.0.0. MIN_ENGINE_VERSION updates to v3.0.0.
 **Why:** The `updated_at` column is a schema addition and the new `PATCH /api/v1/notes/{id}` endpoint is a new API surface. The new client command (`updatenote`) requires the new engine. A major version bump signals this clearly. The clean break is worth it given the MCP server is a new integration paradigm.
 ## Risks / Trade-offs
 **MCP SDK maturity** — The `mcp` Python SDK is relatively new. Breaking changes in the SDK could require MCP server updates. Mitigation: the MCP server is a thin adapter, so updating it is low cost. Pin the SDK version.
 **Tag convention enforcement** — Collection tags are a convention, not a constraint at the engine level. Typos create new collections silently (e.g. `collection:memeory`). Mitigation: the MCP server enforces exclusivity (removes old `collection:*` tags before applying new) and validates collection names against a known list. The Go CLI does not enforce this — it's a convention for manual users. Direct engine API users can still create arbitrary tags.
 **Note mutation with long text** — The PATCH endpoint uses the standard note chunking pipeline, so long notes are chunked correctly. However, a note that grows very large (thousands of tokens) will produce many chunks and embeddings, making the synchronous PATCH slower. Mitigation: for the agent memory use case, notes are typically short. If a note grows large enough for this to matter, the agent should consider splitting it into multiple notes.
 **Chunked upload complexity** — The three-step upload pattern (start/chunk/finish) is more complex than a single tool call. An agent must make N+2 calls to upload a file. Mitigation: the pattern is deterministic and easily scripted by agents. The MCP tool descriptions will include a clear usage example. Abandoned uploads (agent crashes mid-upload) are cleaned up by a timeout on the MCP server — no permanent state leaks.
 **MCP server as HTTP client** — The MCP server calls the engine over HTTP, adding a network hop. For a compose deployment (both containers on the same Docker network) this adds sub-millisecond latency per call. Acceptable.
 ## Migration Plan
 1. **Engine schema migration** — runs automatically on startup (same pattern as existing migrations in `init_schema`):
   - `ALTER TABLE documents ADD COLUMN updated_at TEXT`
 2. **New engine endpoint** — `PATCH /api/v1/notes/{id}` for note mutation
 3. **Engine version bump** — update `engine/VERSION` to `3.0.0`
 4. **Client updates** — new `updatenote` command, version bump to `3.0.0`, `MIN_ENGINE_VERSION` to `3.0.0`
 5. **MCP server** — new `mcp/` directory, Dockerfile, added to Docker Compose
 6. **Rollback** — the schema change is additive (one new column). Rolling back to v2 engine code works fine — v2 ignores `updated_at`. Rolling back the client is a binary swap. Removing the MCP server container has no effect on engine or CLI.
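The additive migration in step 1 could look roughly like the following. This is a sketch using the standard-library `sqlite3` module; the function name `ensure_updated_at` is hypothetical and the real `init_schema` may check for the column differently.

```python
import sqlite3


def ensure_updated_at(conn: sqlite3.Connection) -> bool:
    """Add documents.updated_at if it is missing.

    Returns True when the column was added, False when it already existed.
    """
    # PRAGMA table_info returns one row per column; index 1 is the column name.
    columns = {row[1] for row in conn.execute("PRAGMA table_info(documents)")}
    if "updated_at" in columns:
        return False
    conn.execute("ALTER TABLE documents ADD COLUMN updated_at TEXT")
    conn.commit()
    return True
```

Because the column is added without a default, existing rows read back `NULL` for `updated_at`, and running the check on every startup is harmless.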
@@ -0,0 +1,81 @@
 ## Why
 The kb engine exposes a well-structured REST API, but agent integration today goes through a Claude Code skill that shells out to the Go CLI binary, parses JSON output, and re-synthesises results. This works but is indirect: subprocess overhead on every call, fragile output parsing, no streaming, and no composability with other MCP tools in the same session. As agents increasingly rely on kb for both document retrieval and memory storage, this friction compounds.
 At the same time, there is no way to scope searches to "agent memory" vs "user documents" without careful manual tagging, and no way to update an existing note in place without delete + re-add. These gaps cause agents to accumulate stale duplicates and pollute the user's document index with internal memory notes.
 kb v3 adds an MCP server as a new integration surface alongside the existing CLI, establishes collection tag conventions for scoped search, and adds note mutation to support the agent memory use case natively.
 ## What Changes
 ### 1. MCP Server (new component)
 A Model Context Protocol server that exposes kb operations as native MCP tools. Runs as a separate Docker container alongside the engine, using Streamable HTTP transport. Translates MCP tool calls into engine HTTP API calls.
 **MCP tool surface:**
 | MCP Tool | Maps to Engine API | Notes |
 |---|---|---|
 | `kb_search` | `POST /api/v1/search` | Query, top_n, tags, doc_type, collection, mode |
 | `kb_addnote` | `POST /api/v1/jobs` | Body text, tags, collection (default: `documents`) |
 | `kb_upload_start` | _(MCP server internal)_ | Start chunked upload: filename, size, tags, collection → returns upload_id |
 | `kb_upload_chunk` | _(MCP server internal)_ | Append base64 chunk to staging: upload_id, data, chunk_index |
 | `kb_upload_finish` | `POST /api/v1/jobs` | Reassemble chunks, decode, proxy as multipart upload → returns job_id |
 | `kb_update_note` | `PATCH /api/v1/notes/{id}` | Replace note text, re-chunk and re-embed in place |
 | `kb_get` | `GET /api/v1/documents/{id}` or `GET /api/v1/documents` | Retrieve by document ID or source_path |
 | `kb_status` | `GET /api/v1/status` | Index health, doc counts, model info, queue state |
 | `kb_jobs` | `GET /api/v1/jobs` | Check ingestion queue status |
 The `collection` parameter on `kb_search`, `kb_addnote`, and `kb_upload_start` is translated by the MCP server into tag operations using the convention `collection:<name>` (e.g. `collection:memory`): a tag filter on search, an applied tag on ingestion. No engine changes required for collections.
 ### 2. Collection Tag Conventions (no engine changes)
 Scoped document organisation using existing tags with a naming convention.
 - Convention: `collection:documents` (default), `collection:memory`, `collection:workspace`
 - MCP tools accept a `collection` parameter and translate to tag operations
 - The Go CLI can use the same convention via `--tags collection:memory`
 - No new schema, no new API parameters on the engine — uses existing tag infrastructure
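One way the `collection` parameter might be translated into tags is sketched below. The helper name `with_collection` is hypothetical; the `collection:` prefix and the `documents` default come from the convention above.

```python
from typing import Iterable, List, Optional

COLLECTION_PREFIX = "collection:"
DEFAULT_COLLECTION = "documents"


def with_collection(tags: Optional[Iterable[str]], collection: Optional[str] = None) -> List[str]:
    """Return tags carrying exactly one collection:<name> tag."""
    name = collection or DEFAULT_COLLECTION
    # Drop any collection tag the caller passed so membership stays exclusive.
    kept = [t for t in (tags or []) if not t.startswith(COLLECTION_PREFIX)]
    return [COLLECTION_PREFIX + name] + kept
```

For example, `with_collection(["feedback"], "memory")` gives `["collection:memory", "feedback"]`, the same list a CLI user would pass by hand with `--tags`.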
 ### 3. Note Mutation (engine extension)
 Allow existing notes to be updated in place without delete + re-add.
 - `PATCH /api/v1/notes/{id}` endpoint — accepts new text, re-chunks and re-embeds
 - Preserves original `created_at`, updates `updated_at`
 - `kb updatenote <id> "new text"` CLI command
 - `kb_update_note` MCP tool
 ### 4. Agent-Side Search Patterns (no engine changes)
 Query expansion and reranking are **caller responsibilities**, not engine features. The calling agent already has an LLM — adding one inside the engine would duplicate capability, introduce a cloud API dependency into a fully local system, and complicate testing.
 **Query expansion** — the agent expands its query into 2-3 variant phrasings, makes multiple `kb_search` calls, and merges/deduplicates results in its own context. The MCP tool descriptions should document this as a recommended pattern for complex natural-language questions.
 **Reranking** — the agent reads the top N search results and applies its own judgement to reorder by relevance. This is what agents already do when synthesising answers from retrieved chunks.
 These patterns should be documented in the MCP tool descriptions and the kb skill guidance, not implemented as engine features.
 ## Capabilities
 ### New Capabilities
 - `mcp-server`: MCP protocol server exposing kb tools (search, addnote, chunked file upload, update_note, get, status, jobs) for native agent integration. Runs as a Docker container with Streamable HTTP transport. Calls engine HTTP API internally. File uploads use a three-step chunked pattern (start → chunk × N → finish) to avoid message size limits, then proxy to the engine's existing upload endpoint.
 - `note-mutation`: In-place update of existing notes. New PATCH endpoint re-chunks and re-embeds while preserving document identity and creation timestamp.
 - `agent-search-patterns`: Documented patterns for agent-side query expansion (multi-query + merge) and reranking (LLM-based result reordering). No engine changes — these are caller responsibilities, documented in MCP tool descriptions and skill guidance.
 ### Modified Capabilities
 - `engine-api`: New endpoint for note mutation (`PATCH /api/v1/notes/{id}`). `documents` table gains `updated_at` column.
 - `go-client`: New `updatenote` command.
 ## Impact
 - **Code — new**: `mcp/` directory — MCP server package. Thin adapter translating MCP tool calls to engine HTTP API calls, with base64 file upload decoding.
 - **Code — engine**: `kb/database.py` — add `updated_at` column, migration logic. New `kb/routes/notes.py` for PATCH endpoint.
 - **Code — client**: New `cmd/updatenote.go`. `internal/api/client.go` for new endpoint.
 - **APIs**: New `PATCH /api/v1/notes/{id}`.
 - **Dependencies**: MCP Python SDK (`mcp` package) and `httpx` for the MCP server.
 - **Systems**: MCP server added to Docker Compose stack. Agents connect to it via Streamable HTTP.
 - **Data**: SQLite schema migration — `updated_at TEXT` column on `documents` table. Non-destructive.
 - **Versioning**: Engine bumps to v3.0.0 (new endpoint + schema). Client bumps to v3.0.0 (new command). MIN_ENGINE_VERSION updated to v3.0.0.
@@ -0,0 +1,35 @@
 # Agent-Side Search Patterns
 ## Purpose
 Documents recommended patterns for agent-side query expansion and reranking, which are caller responsibilities rather than engine features. These patterns are communicated via MCP tool descriptions.
 ## Requirements
 ### Requirement: Query expansion guidance in tool description
 The `kb_search` MCP tool description SHALL include guidance on query expansion as a recommended pattern for complex queries.
 #### Scenario: Tool description includes expansion pattern
 - **WHEN** an agent reads the `kb_search` tool description
 - **THEN** the description SHALL include guidance such as: "For complex queries, consider expanding into 2-3 variant phrasings and calling this tool multiple times, then deduplicating results by chunk_id"
 ---
 ### Requirement: Reranking guidance in tool description
 The `kb_search` MCP tool description SHALL include guidance on agent-side reranking as a recommended pattern for improving precision.
 #### Scenario: Tool description includes reranking pattern
 - **WHEN** an agent reads the `kb_search` tool description
 - **THEN** the description SHALL include guidance such as: "For precision, rerank the returned results using your own judgement based on relevance to the original question"
 ---
 ### Requirement: No engine-side LLM dependency
 The engine SHALL NOT require or use any external LLM API for search operations. Query expansion and reranking SHALL remain entirely agent-side concerns.
 #### Scenario: Engine has no LLM dependency
 - **WHEN** the engine is deployed without any `ANTHROPIC_API_KEY` or similar LLM API configuration
 - **THEN** all search operations SHALL function fully, with no degraded results or missing features
@@ -0,0 +1,79 @@
 # Engine API (Delta)
 ## ADDED Requirements
 ### Requirement: Note mutation endpoint
 The engine SHALL provide a `PATCH /api/v1/notes/{id}` endpoint for updating existing notes in place. See the `note-mutation` spec for full details.
 #### Scenario: Note update endpoint exists
 - **WHEN** a client sends `PATCH /api/v1/notes/42` with body `{"text": "new content"}`
 - **THEN** the engine SHALL process the update synchronously and return the updated document
 ---
 ### Requirement: Document updated_at tracking
 The engine SHALL track when documents are modified via an `updated_at` column. This column SHALL be NULL for documents that have never been updated.
 #### Scenario: New document has no updated_at
 - **WHEN** a document is first ingested
 - **THEN** `updated_at` SHALL be NULL and `created_at` SHALL be set to the ingestion timestamp
 #### Scenario: Note update sets updated_at
 - **WHEN** a note is updated via `PATCH /api/v1/notes/{id}`
 - **THEN** `updated_at` SHALL be set to the current timestamp
 #### Scenario: Tag change sets updated_at
 - **WHEN** tags are modified via `PUT /api/v1/documents/{id}/tags`
 - **THEN** `updated_at` SHALL be set to the current timestamp
 #### Scenario: Schema migration for updated_at
 - **WHEN** the engine starts against a v2 database without an `updated_at` column
 - **THEN** the engine SHALL automatically run `ALTER TABLE documents ADD COLUMN updated_at TEXT` and all existing documents SHALL have `updated_at = NULL`
 ## MODIFIED Requirements
 ### Requirement: Document management
 The engine SHALL provide endpoints to list, inspect, remove, and download original files for ingested documents.
 #### Scenario: List documents
 - **WHEN** a client sends `GET /api/v1/documents`
 - **THEN** the engine SHALL return a JSON array of documents with id, title, doc_type, tags, chunk_count, created_at, and updated_at
 #### Scenario: List documents with filters
 - **WHEN** a client sends `GET /api/v1/documents?type=pdf&tags=manual`
 - **THEN** the engine SHALL return only documents matching all specified filters
 #### Scenario: List documents sorted by most recent
 - **WHEN** a client requests documents sorted by date
 - **THEN** the engine SHALL use `COALESCE(updated_at, created_at)` for ordering, so un-mutated documents sort by creation time and mutated documents sort by their last update
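The `COALESCE` ordering in this scenario can be tried against an in-memory SQLite database. The table layout and the ISO-8601 text timestamps below are assumptions made for the demonstration, not the engine's actual schema.

```python
import sqlite3


def make_demo_db() -> sqlite3.Connection:
    """Build a throwaway database with one edited and two unedited documents."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE documents (id INTEGER PRIMARY KEY, title TEXT, created_at TEXT, updated_at TEXT)"
    )
    conn.executemany(
        "INSERT INTO documents (id, title, created_at, updated_at) VALUES (?, ?, ?, ?)",
        [
            (1, "old note, edited recently", "2026-01-01T00:00:00", "2026-04-01T00:00:00"),
            (2, "newer note, never edited", "2026-03-01T00:00:00", None),
            (3, "oldest, never edited", "2025-12-01T00:00:00", None),
        ],
    )
    conn.commit()
    return conn


def list_by_most_recent(conn: sqlite3.Connection) -> list:
    """Order by last update when present, otherwise by creation time."""
    rows = conn.execute(
        "SELECT id FROM documents ORDER BY COALESCE(updated_at, created_at) DESC"
    )
    return [row[0] for row in rows]
```

Document 1 sorts first even though it was created before document 2, because its `updated_at` is the most recent timestamp of the three.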
 #### Scenario: Get document details
 - **WHEN** a client sends `GET /api/v1/documents/{id}`
 - **THEN** the engine SHALL return the full document record including all chunks, their text content, `updated_at`, and whether the original file is available (`has_file: true/false`)
 #### Scenario: Download original file
 - **WHEN** a client sends `GET /api/v1/documents/{id}/file`
 - **THEN** the engine SHALL return the original file with appropriate Content-Type and `Content-Disposition: attachment; filename="{original_filename}"` headers, or HTTP 404 if the file is not available
 #### Scenario: Remove a document
 - **WHEN** a client sends `DELETE /api/v1/documents/{id}`
 - **THEN** the engine SHALL delete the document, all its chunks, associated embeddings, tag associations, and the stored original file from disk, and return HTTP 200 with a confirmation
 #### Scenario: Remove non-existent document
 - **WHEN** a client sends `DELETE /api/v1/documents/{id}` with a non-existent ID
 - **THEN** the engine SHALL return HTTP 404
 ### Requirement: Engine status and reindex
 The engine SHALL provide status information and support re-embedding all chunks. The `version` field in the status response SHALL always be present and SHALL reflect the engine's release version as read from the `VERSION` file. This field is the contract used by clients for compatibility checking.
 #### Scenario: Get engine status
 - **WHEN** a client sends `GET /api/v1/status`
 - **THEN** the engine SHALL return JSON with `version` (string, from VERSION file), model_name, embedding_dim, GPU device info, database stats (document count by type, total chunks, DB size), and queue stats (queued/processing job count)
 #### Scenario: Trigger reindex
 - **WHEN** a client sends `POST /api/v1/reindex`
 - **THEN** the engine SHALL re-embed all existing chunks using the `enriched_text` column and the currently loaded model, and return progress information. This operation SHALL NOT block search queries.
@@ -0,0 +1,57 @@
 # Go Client (Delta)
 ## ADDED Requirements
 ### Requirement: Update note command
 The client SHALL provide a `kb updatenote <id> <text>` command that updates an existing note's content via the engine's `PATCH /api/v1/notes/{id}` endpoint.
 #### Scenario: Update a note
 - **WHEN** the user runs `kb updatenote 42 "Updated note content"`
 - **THEN** the client SHALL send `PATCH /api/v1/notes/42` with body `{"text": "Updated note content"}` and display the result
 #### Scenario: Update a note with JSON output
 - **WHEN** the user runs `kb updatenote 42 "new content" --format json`
 - **THEN** the client SHALL output the raw JSON response from the engine
 #### Scenario: Update a non-existent document
 - **WHEN** the user runs `kb updatenote 999 "text"` and the engine returns HTTP 404
 - **THEN** the client SHALL display an error indicating the document was not found and exit with a non-zero code
 #### Scenario: Update a non-note document
 - **WHEN** the user runs `kb updatenote 42 "text"` and the engine returns HTTP 422
 - **THEN** the client SHALL display an error indicating that only notes can be updated and exit with a non-zero code
 #### Scenario: Missing arguments
 - **WHEN** the user runs `kb updatenote` or `kb updatenote 42` with insufficient arguments
 - **THEN** the client SHALL display usage help indicating that both document ID and text are required
 ## MODIFIED Requirements
 ### Requirement: Engine version compatibility check
 The client SHALL verify that the connected engine meets a minimum version requirement before executing any API command. The minimum required engine version SHALL be embedded in the client binary at build time. If the engine version is below the minimum, the client SHALL print an error message and exit with a non-zero code. There SHALL be no flag to skip or suppress this check.
 #### Scenario: Compatible engine version
 - **WHEN** the client connects to an engine reporting version `3.0.0` and `MinEngineVersion` is `3.0.0`
 - **THEN** the client SHALL proceed with the command normally
 #### Scenario: Incompatible engine version
 - **WHEN** the client connects to an engine reporting version `2.1.0` and `MinEngineVersion` is `3.0.0`
 - **THEN** the client SHALL print to stderr: `Error: kb client vX.Y.Z requires engine v3.0.0+ (connected engine is v2.1.0)` followed by an upgrade hint, and exit with code 1
 #### Scenario: Engine unreachable during version check
 - **WHEN** the client cannot reach the engine's `/api/v1/status` endpoint
 - **THEN** the client SHALL skip the version check and proceed with the original command (the actual API call will surface the connectivity error)
 #### Scenario: Version check is cached per invocation
 - **WHEN** the client has already verified engine compatibility during the current invocation
 - **THEN** subsequent API calls within the same invocation SHALL NOT repeat the version check
 #### Scenario: Client version command does not check engine
 - **WHEN** the user runs `kb --version`
 - **THEN** the client SHALL print the client version without contacting the engine
 #### Scenario: MinEngineVersion not set
 - **WHEN** the client binary has `MinEngineVersion` set to empty string or `dev`
 - **THEN** the client SHALL skip the version check entirely (development builds)
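The comparison logic in these scenarios is sketched below in Python for illustration only; the real client is written in Go and its implementation may differ. The function names are hypothetical, and versions are assumed to be plain dotted integers with no pre-release suffix.

```python
from typing import Optional, Tuple


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn "3.0.0" or "v3.0.0" into (3, 0, 0). Assumes dotted integers only."""
    return tuple(int(part) for part in text.lstrip("v").split("."))


def version_error(client: str, engine: str, minimum: str) -> Optional[str]:
    """Return the error message for an incompatible engine, or None if compatible."""
    # Development builds carry no minimum and skip the check entirely.
    if minimum in ("", "dev"):
        return None
    if parse_version(engine) >= parse_version(minimum):
        return None
    return (
        f"Error: kb client v{client} requires engine v{minimum}+ "
        f"(connected engine is v{engine})"
    )
```

Comparing integer tuples rather than strings matters once a component reaches two digits: `2.10.0` is newer than `2.9.0`, but sorts lower as text.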
@@ -0,0 +1,205 @@
 # MCP Server
 ## Purpose
 The MCP server provides a Model Context Protocol interface to the kb engine, exposing knowledge base operations as native MCP tools over Streamable HTTP transport. It runs as a separate Docker container alongside the engine, translating MCP tool calls into engine HTTP API calls.
 ## Requirements
 ### Requirement: MCP server transport and deployment
 The MCP server SHALL expose tools via Streamable HTTP transport. It SHALL run as a Docker container, configured to connect to the kb engine's HTTP API. It SHALL read `KB_ENGINE_URL` and `KB_API_KEY` from environment variables to connect to the engine.
 #### Scenario: MCP server starts and connects to engine
 - **WHEN** the MCP server container starts with `KB_ENGINE_URL=http://engine:8000` and `KB_API_KEY=secret`
 - **THEN** it SHALL begin accepting MCP connections over Streamable HTTP and use the configured URL and API key for all engine API calls
 #### Scenario: Engine unreachable at startup
 - **WHEN** the MCP server starts but cannot reach the engine at `KB_ENGINE_URL`
 - **THEN** it SHALL start and accept connections, but tool calls SHALL return errors indicating the engine is unreachable
 #### Scenario: Docker Compose deployment
 - **WHEN** the MCP server is deployed via Docker Compose alongside the engine
 - **THEN** it SHALL connect to the engine via the Docker network using the service name (e.g. `http://engine:8000`)
 ---
 ### Requirement: MCP server authentication
 The MCP server SHALL require Bearer token authentication from calling agents via the `KB_MCP_API_KEY` environment variable. This is independent of the engine's `KB_API_KEY`.
 #### Scenario: Valid MCP API key
 - **WHEN** `KB_MCP_API_KEY` is set and a calling agent provides a matching Bearer token
 - **THEN** the MCP server SHALL process the request normally
 #### Scenario: Missing MCP API key when required
 - **WHEN** `KB_MCP_API_KEY` is set and a calling agent connects without a Bearer token
 - **THEN** the MCP server SHALL reject the connection with an authentication error
 #### Scenario: Invalid MCP API key
 - **WHEN** `KB_MCP_API_KEY` is set and a calling agent provides a non-matching Bearer token
 - **THEN** the MCP server SHALL reject the connection with an authentication error
 #### Scenario: MCP auth disabled
 - **WHEN** `KB_MCP_API_KEY` is not set
 - **THEN** the MCP server SHALL accept all connections without authentication
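The four authentication scenarios reduce to a small decision function. This sketch is independent of any particular web framework or of the `mcp` SDK; `is_authorized` is a hypothetical helper that takes the raw `Authorization` header value and the configured `KB_MCP_API_KEY`.

```python
import hmac
from typing import Optional


def is_authorized(authorization_header: Optional[str], expected_key: Optional[str]) -> bool:
    """Decide whether a request may proceed under Bearer token auth."""
    if not expected_key:
        return True  # KB_MCP_API_KEY not set: auth is disabled
    if not authorization_header:
        return False  # key required but no header supplied
    scheme, _, token = authorization_header.partition(" ")
    if scheme.lower() != "bearer" or not token:
        return False
    # Constant-time comparison avoids leaking the key through timing differences.
    return hmac.compare_digest(token.strip().encode(), expected_key.encode())
```

A caller would typically pass `os.environ.get("KB_MCP_API_KEY")` as `expected_key` and reject the connection when the function returns `False`.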
 ---
 ### Requirement: Search tool
 The MCP server SHALL expose a `kb_search` tool that queries the knowledge base via the engine's search API.
 #### Scenario: Basic search
 - **WHEN** an agent calls `kb_search` with `{"query": "pension revaluation", "top": 5}`
 - **THEN** the MCP server SHALL POST to the engine's `/api/v1/search` endpoint and return the results with chunk text, scores, document metadata, and tags
 #### Scenario: Search with collection filter
 - **WHEN** an agent calls `kb_search` with `{"query": "email preferences", "collection": "memory"}`
 - **THEN** the MCP server SHALL add `collection:memory` to the tags filter and POST to the engine's search endpoint
 #### Scenario: Search with tags and collection
 - **WHEN** an agent calls `kb_search` with `{"query": "feedback", "tags": ["email"], "collection": "memory"}`
 - **THEN** the MCP server SHALL combine the explicit tags with `collection:memory` in the tag filter
 #### Scenario: Search results strip collection tags
 - **WHEN** the engine returns search results containing tags `["collection:memory", "feedback", "email"]`
 - **THEN** the MCP server SHALL strip `collection:*` tags from the `tags` array and add a separate `collection` field, returning `{"collection": "memory", "tags": ["feedback", "email"], ...}`
 #### Scenario: Search with mode override
 - **WHEN** an agent calls `kb_search` with `{"query": "error log", "fts_only": true}`
 - **THEN** the MCP server SHALL pass `fts_only: true` to the engine search endpoint
 ---
 ### Requirement: Add note tool
 The MCP server SHALL expose a `kb_addnote` tool that submits a text note to the engine for ingestion.
 #### Scenario: Add a note with default collection
 - **WHEN** an agent calls `kb_addnote` with `{"text": "User prefers concise responses"}`
 - **THEN** the MCP server SHALL submit the note to the engine's `POST /api/v1/jobs` endpoint with the tag `collection:documents` and return the job ID
 #### Scenario: Add a note to a specific collection
 - **WHEN** an agent calls `kb_addnote` with `{"text": "User prefers concise responses", "collection": "memory", "tags": ["feedback"]}`
 - **THEN** the MCP server SHALL submit the note with tags `["collection:memory", "feedback"]` to the engine
 #### Scenario: Adding a note to a collection yields exactly one collection tag
 - **WHEN** an agent calls `kb_addnote` with `{"text": "some note", "collection": "memory"}` and the note is ingested
 - **THEN** the resulting document SHALL have exactly one `collection:*` tag: `collection:memory`
 ---
 ### Requirement: Chunked file upload tools
 The MCP server SHALL expose a three-step chunked file upload pattern for transferring files from remote agents to the engine.
 #### Scenario: Start an upload
 - **WHEN** an agent calls `kb_upload_start` with `{"filename": "report.pdf", "total_size": 5242880, "tags": ["insurance"], "collection": "documents"}`
 - **THEN** the MCP server SHALL create a staging entry, generate a UUID `upload_id`, and return `{"upload_id": "<uuid>"}`
 #### Scenario: Upload a chunk
 - **WHEN** an agent calls `kb_upload_chunk` with `{"upload_id": "<uuid>", "data": "<base64-encoded-data>", "chunk_index": 0}`
 - **THEN** the MCP server SHALL decode the base64 data and write it to the staging area for the given upload
 #### Scenario: Upload multiple chunks in sequence
 - **WHEN** an agent calls `kb_upload_chunk` multiple times with sequential `chunk_index` values for the same `upload_id`
 - **THEN** the MCP server SHALL store each chunk and track the sequence
 #### Scenario: Finish an upload
 - **WHEN** an agent calls `kb_upload_finish` with `{"upload_id": "<uuid>"}`
 - **THEN** the MCP server SHALL reassemble the chunks in order, forward the complete file as a multipart upload to the engine's `POST /api/v1/jobs` endpoint with the tags from `kb_upload_start` (including `collection:<name>`), and return the job ID
 #### Scenario: Upload with invalid upload_id
 - **WHEN** an agent calls `kb_upload_chunk` or `kb_upload_finish` with an `upload_id` that does not exist
 - **THEN** the MCP server SHALL return an error indicating the upload ID is not found
 #### Scenario: Abandoned upload cleanup
 - **WHEN** an agent starts an upload but does not call `kb_upload_finish` within 10 minutes
 - **THEN** the MCP server SHALL clean up the staged chunks and remove the upload tracking entry
 #### Scenario: MCP server restart during upload
 - **WHEN** the MCP server container restarts while an upload is in progress
 - **THEN** the in-progress upload SHALL be lost and the agent SHALL need to restart from `kb_upload_start`
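Server-side staging for the start/chunk/finish pattern might be structured as below. This is a sketch under assumptions: the class `UploadStaging`, its method names, and the zero-padded `.part` file naming are all hypothetical, and forwarding the assembled file to the engine is left out.

```python
import base64
import shutil
import tempfile
import time
import uuid
from pathlib import Path
from typing import Dict, Optional

UPLOAD_TIMEOUT_SECONDS = 600  # 10 minutes, as in the abandoned-upload scenario


class UploadStaging:
    """Tracks active uploads in memory; chunk bytes live in temp directories."""

    def __init__(self) -> None:
        self._uploads: Dict[str, dict] = {}

    def start(self, filename: str, tags: list, now: Optional[float] = None) -> str:
        upload_id = str(uuid.uuid4())
        self._uploads[upload_id] = {
            "filename": filename,
            "tags": tags,
            "dir": Path(tempfile.mkdtemp(prefix="kb-upload-")),
            "started": time.time() if now is None else now,
        }
        return upload_id

    def add_chunk(self, upload_id: str, chunk_index: int, data_b64: str) -> None:
        entry = self._get(upload_id)
        # Zero-padded names make lexical order equal to chunk order.
        part = entry["dir"] / f"{chunk_index:08d}.part"
        part.write_bytes(base64.b64decode(data_b64))

    def finish(self, upload_id: str) -> bytes:
        entry = self._get(upload_id)
        parts = sorted(entry["dir"].glob("*.part"))
        payload = b"".join(p.read_bytes() for p in parts)
        self._discard(upload_id)
        return payload

    def cleanup_expired(self, now: Optional[float] = None) -> int:
        """Remove uploads older than the timeout; returns how many were removed."""
        now = time.time() if now is None else now
        expired = [
            uid for uid, entry in self._uploads.items()
            if now - entry["started"] > UPLOAD_TIMEOUT_SECONDS
        ]
        for uid in expired:
            self._discard(uid)
        return len(expired)

    def _get(self, upload_id: str) -> dict:
        if upload_id not in self._uploads:
            raise KeyError(f"upload ID not found: {upload_id}")
        return self._uploads[upload_id]

    def _discard(self, upload_id: str) -> None:
        entry = self._uploads.pop(upload_id)
        shutil.rmtree(entry["dir"], ignore_errors=True)
```

Since the tracking dictionary lives only in process memory, a container restart drops every in-progress upload, which matches the restart scenario above.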
 ---
 ### Requirement: Update note tool
 The MCP server SHALL expose a `kb_update_note` tool that updates an existing note in place via the engine's note mutation endpoint.
 #### Scenario: Update an existing note
 - **WHEN** an agent calls `kb_update_note` with `{"document_id": 42, "text": "Updated preference: user prefers bullet points"}`
 - **THEN** the MCP server SHALL send `PATCH /api/v1/notes/42` to the engine and return the updated document
 #### Scenario: Update a non-existent document
 - **WHEN** an agent calls `kb_update_note` with a `document_id` that does not exist
 - **THEN** the MCP server SHALL return an error indicating the document was not found
 #### Scenario: Update a non-note document
 - **WHEN** an agent calls `kb_update_note` with a `document_id` that refers to a PDF
 - **THEN** the MCP server SHALL return an error indicating that only notes can be updated
 ---
 ### Requirement: Get document tool
 The MCP server SHALL expose a `kb_get` tool that retrieves document details from the engine.
 #### Scenario: Get by document ID
 - **WHEN** an agent calls `kb_get` with `{"document_id": 42}`
 - **THEN** the MCP server SHALL fetch `GET /api/v1/documents/42` and return the document details with chunks
 #### Scenario: Get by source path
 - **WHEN** an agent calls `kb_get` with `{"source_path": "memory/feedback_testing.md"}`
 - **THEN** the MCP server SHALL query the engine's documents endpoint filtered by source path and return matching documents
 #### Scenario: Get results strip collection tags
 - **WHEN** the engine returns document details with tags including `collection:memory`
 - **THEN** the MCP server SHALL strip `collection:*` from tags and present a separate `collection` field
 ---
 ### Requirement: Status tool
 The MCP server SHALL expose a `kb_status` tool that returns engine health and statistics.
 #### Scenario: Get engine status
 - **WHEN** an agent calls `kb_status` with no parameters
 - **THEN** the MCP server SHALL fetch `GET /api/v1/status` and return engine version, model info, device info, document counts, and queue state
 ---
 ### Requirement: Jobs tool
 The MCP server SHALL expose a `kb_jobs` tool that returns ingestion job status.
 #### Scenario: List recent jobs
 - **WHEN** an agent calls `kb_jobs` with no parameters
 - **THEN** the MCP server SHALL fetch `GET /api/v1/jobs` and return the list of recent jobs
 #### Scenario: Filter jobs by status
 - **WHEN** an agent calls `kb_jobs` with `{"status": "failed"}`
 - **THEN** the MCP server SHALL fetch `GET /api/v1/jobs?status=failed` and return matching jobs
 ---
 ### Requirement: Collection management via tags
 The MCP server SHALL manage collections using tag conventions. The MCP server SHALL enforce exclusive collection membership — a document SHALL belong to exactly one collection.
 #### Scenario: Default collection on addnote
 - **WHEN** an agent calls `kb_addnote` without specifying a collection
 - **THEN** the MCP server SHALL apply the tag `collection:documents`
 #### Scenario: Explicit collection on addnote
 - **WHEN** an agent calls `kb_addnote` with `{"collection": "memory"}`
 - **THEN** the MCP server SHALL apply the tag `collection:memory`
 #### Scenario: Exclusive collection enforcement
 - **WHEN** a document already has the tag `collection:documents` and an operation changes its collection to `memory`
 - **THEN** the MCP server SHALL first remove `collection:documents` via the engine's tag API, then add `collection:memory`
 #### Scenario: Collection field in search results
 - **WHEN** search results include documents with `collection:*` tags
 - **THEN** the MCP server SHALL present the collection as a top-level `collection` field and exclude `collection:*` from the `tags` array
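Stripping `collection:*` tags from a result and surfacing the collection as its own field could be done along these lines. The helper name `split_collection` is hypothetical.

```python
from typing import Iterable, List, Optional, Tuple

COLLECTION_PREFIX = "collection:"


def split_collection(tags: Iterable[str]) -> Tuple[Optional[str], List[str]]:
    """Separate the collection name from ordinary tags.

    Returns (collection, remaining_tags); collection is None when no
    collection:* tag is present.
    """
    collection = None
    remaining = []
    for tag in tags:
        if tag.startswith(COLLECTION_PREFIX):
            collection = tag[len(COLLECTION_PREFIX):]
        else:
            remaining.append(tag)
    return collection, remaining
```

Applied to `["collection:memory", "feedback", "email"]`, this returns `("memory", ["feedback", "email"])`, which maps directly onto the `collection` and `tags` fields of a result.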
@@ -0,0 +1,43 @@
 # Note Mutation
 ## Purpose
 Note mutation allows existing notes to be updated in place without requiring delete and re-add, preserving document identity (ID, creation timestamp) while updating content, embeddings, and the full-text index.
 ## Requirements
 ### Requirement: Note update endpoint
 The engine SHALL provide a `PATCH /api/v1/notes/{id}` endpoint that accepts new text for an existing note, re-chunks and re-embeds it, and returns the updated document.
 #### Scenario: Update an existing note
 - **WHEN** a client sends `PATCH /api/v1/notes/42` with body `{"text": "Updated note content"}`
 - **THEN** the engine SHALL delete existing chunks and embeddings for document 42, run the new text through the note chunking pipeline, generate embeddings for each chunk, insert new chunks and embeddings, update the document's `content_hash` and `updated_at`, and return the updated document with HTTP 200
 #### Scenario: Update preserves document identity
 - **WHEN** a note is updated via PATCH
 - **THEN** the document SHALL retain its original `id` and `created_at` values, and `updated_at` SHALL be set to the current timestamp
 #### Scenario: Update with long text that produces multiple chunks
 - **WHEN** a client sends `PATCH /api/v1/notes/42` with text longer than the embedding model's token window
 - **THEN** the engine SHALL chunk the text using the same note chunking pipeline as ingestion, producing multiple chunks, and embed each chunk separately
 #### Scenario: Update a non-existent document
 - **WHEN** a client sends `PATCH /api/v1/notes/999` and document 999 does not exist
 - **THEN** the engine SHALL return HTTP 404
 #### Scenario: Update a non-note document
 - **WHEN** a client sends `PATCH /api/v1/notes/42` and document 42 has `doc_type = 'pdf'`
 - **THEN** the engine SHALL return HTTP 422 with an error indicating that only notes can be updated via this endpoint
 #### Scenario: Embedding failure during update
 - **WHEN** a client sends `PATCH /api/v1/notes/42` but the embedding step fails
 - **THEN** the engine SHALL roll back the entire transaction, preserving the original note content, chunks, and embeddings, and return HTTP 500
 #### Scenario: FTS5 index updated on note mutation
 - **WHEN** a note is updated via PATCH
 - **THEN** the FTS5 virtual table SHALL be updated via the existing chunk triggers (`chunks_ad` for deletes, `chunks_ai` for inserts), keeping the full-text index consistent with the new content
 #### Scenario: Tags preserved on update
 - **WHEN** a note with tags `["feedback", "collection:memory"]` is updated via PATCH
 - **THEN** the document's tags SHALL be unchanged — only the text content, chunks, and embeddings are replaced
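The all-or-nothing behaviour described in these scenarios can be illustrated with `sqlite3` and a single transaction. The schema, the `update_note` signature, and the `chunker` and `embed` callables are hypothetical stand-ins; the 404/422 validation and the FTS5 triggers are omitted.

```python
import hashlib
import sqlite3
from typing import Callable, List

# Hypothetical minimal schema, for illustration only.
SCHEMA = """
CREATE TABLE documents (
    id INTEGER PRIMARY KEY, doc_type TEXT, content_hash TEXT,
    created_at TEXT, updated_at TEXT
);
CREATE TABLE chunks (
    id INTEGER PRIMARY KEY, document_id INTEGER, position INTEGER,
    text TEXT, embedding BLOB
);
"""


def update_note(
    conn: sqlite3.Connection,
    doc_id: int,
    text: str,
    chunker: Callable[[str], List[str]],
    embed: Callable[[str], bytes],
) -> None:
    """Replace a note's chunks and embeddings atomically."""
    # The connection context manager commits on success and rolls back
    # if anything inside the block raises, including the embed call.
    with conn:
        conn.execute("DELETE FROM chunks WHERE document_id = ?", (doc_id,))
        for position, chunk in enumerate(chunker(text)):
            conn.execute(
                "INSERT INTO chunks (document_id, position, text, embedding) VALUES (?, ?, ?, ?)",
                (doc_id, position, chunk, embed(chunk)),
            )
        conn.execute(
            "UPDATE documents SET content_hash = ?, updated_at = datetime('now') WHERE id = ?",
            (hashlib.sha256(text.encode("utf-8")).hexdigest(), doc_id),
        )
```

Tags and `created_at` are never written here, so they survive the update unchanged. An endpoint wrapping this function would translate an exception into HTTP 500 after the rollback.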
@@ -0,0 +1,90 @@
 ## 1. Engine: Schema Migration & updated_at
 - [x] 1.1 Add `updated_at TEXT` column migration to `init_schema()` in `kb/database.py` (same pattern as existing `ALTER TABLE` migrations)
 - [x] 1.2 Update `insert_document()` to include `updated_at` in returned/stored fields
 - [x] 1.3 Update document list endpoint (`GET /api/v1/documents`) to include `updated_at` in response and use `COALESCE(updated_at, created_at)` for date sorting
 - [x] 1.4 Update document detail endpoint (`GET /api/v1/documents/{id}`) to include `updated_at` in response
 - [x] 1.5 Update tag management endpoint (`PUT /api/v1/documents/{id}/tags`) to set `updated_at = current_timestamp` on tag changes
 ## 2. Engine: Note Mutation Endpoint
 - [x] 2.1 Create `kb/routes/notes.py` with `PATCH /api/v1/notes/{id}` endpoint
 - [x] 2.2 Implement validation: document must exist and have `doc_type = 'note'` (404 / 422 on failure)
 - [x] 2.3 Implement note update logic: delete old chunks/embeddings, run note chunking pipeline, re-embed, insert new chunks, update `content_hash` and `updated_at` — all in a single transaction
 - [x] 2.4 Register the notes router in `engine/main.py`
 - [x] 2.5 Test: update a note and verify chunks, embeddings, FTS index, and `updated_at` are all correctly updated
 - [x] 2.6 Test: verify rollback on embedding failure preserves original note
 ## 3. Engine: Version Bump
 - [x] 3.1 Update `engine/VERSION` to `3.0.0`
 ## 4. Go Client: Update Note Command
 - [x] 4.1 Add `PATCH /api/v1/notes/{id}` method to `internal/api/client.go`
 - [x] 4.2 Create `cmd/updatenote.go` — takes document ID and text as positional args, calls PATCH endpoint, formats output (human/json)
 - [x] 4.3 Handle error cases: 404 (not found), 422 (not a note), missing arguments
 - [x] 4.4 Update `cmd/examples.go` to include `updatenote` usage
 ## 5. Go Client: Version Bump
 - [x] 5.1 Update `client/VERSION` to `3.0.0`
 - [x] 5.2 Update `client/MIN_ENGINE_VERSION` to `3.0.0`
 ## 6. MCP Server: Project Setup
 - [x] 6.1 Create `mcp/` directory at repo root with Python package structure
 - [x] 6.2 Add `mcp` SDK and `httpx` as dependencies (requirements.txt or pyproject.toml)
 - [x] 6.3 Implement config: read `KB_ENGINE_URL`, `KB_API_KEY`, `KB_MCP_API_KEY` from environment
 - [x] 6.4 Implement Streamable HTTP transport setup using `mcp` SDK
 - [x] 6.5 Implement Bearer token authentication for incoming agent connections (`KB_MCP_API_KEY`)
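
Task 6.5 could be approached roughly as follows. This is a sketch under assumptions: the function name `is_authorized` is hypothetical, and treating an unset `KB_MCP_API_KEY` as "auth disabled" is inferred from test 12.5 rather than confirmed by the server code. It uses the standard library's `hmac.compare_digest` for a constant-time comparison.

```python
import hmac


def is_authorized(authorization_header, expected_key):
    """Check an incoming Authorization header against the configured key.

    expected_key would come from KB_MCP_API_KEY; an empty or missing key
    is assumed here to mean authentication is disabled.
    """
    if not expected_key:
        return True
    if not authorization_header:
        return False
    scheme, _, token = authorization_header.partition(" ")
    if scheme.lower() != "bearer":
        return False
    # Constant-time comparison avoids leaking key contents through timing.
    return hmac.compare_digest(token.strip().encode(), expected_key.encode())
```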
 ## 7. MCP Server: Core Tools
 - [x] 7.1 Implement `kb_search` tool — proxy to engine search API, translate `collection` param to `collection:*` tag filter, strip `collection:*` tags from results and add `collection` field
 - [x] 7.2 Implement `kb_addnote` tool — proxy to engine jobs API, apply `collection:<name>` tag (default `collection:documents`)
 - [x] 7.3 Implement `kb_update_note` tool — proxy to engine `PATCH /api/v1/notes/{id}`
 - [x] 7.4 Implement `kb_get` tool — proxy to engine documents API, support lookup by ID or source_path, strip collection tags from response
 - [x] 7.5 Implement `kb_status` tool — proxy to engine status API
 - [x] 7.6 Implement `kb_jobs` tool — proxy to engine jobs API with optional status filter
 ## 8. MCP Server: Chunked File Upload
 - [x] 8.1 Implement `kb_upload_start` tool — generate UUID, create temp staging directory, store upload metadata (filename, tags, collection) in memory
 - [x] 8.2 Implement `kb_upload_chunk` tool — validate upload_id exists, decode base64, write chunk to staging directory by chunk_index
 - [x] 8.3 Implement `kb_upload_finish` tool — reassemble chunks in order, forward as multipart upload to engine `POST /api/v1/jobs` with tags (including `collection:*`), return job ID, clean up staging
 - [x] 8.4 Implement abandoned upload cleanup — background task that removes uploads older than 10 minutes
 - [x] 8.5 Test: upload a multi-chunk file and verify it arrives at the engine correctly
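
The start, chunk, and finish steps in section 8 can be sketched as below. To stay self-contained, this version keeps chunks in memory rather than in a temp staging directory, omits the 10-minute cleanup task, and uses a hypothetical `UploadStore` class; it is not the MCP server's implementation. It shows the core idea: chunks are keyed by `chunk_index`, so they can arrive in any order and are reassembled in index order on finish.

```python
import base64
import uuid


class UploadStore:
    """In-memory stand-in for the upload staging area."""

    def __init__(self):
        self._uploads = {}

    def start(self, filename):
        """Register a new upload and return its generated ID."""
        upload_id = str(uuid.uuid4())
        self._uploads[upload_id] = {"filename": filename, "chunks": {}}
        return upload_id

    def add_chunk(self, upload_id, chunk_index, data_b64):
        """Decode one base64 chunk and store it under its index."""
        if upload_id not in self._uploads:
            raise KeyError(f"unknown upload_id: {upload_id}")
        self._uploads[upload_id]["chunks"][chunk_index] = base64.b64decode(data_b64)

    def finish(self, upload_id):
        """Reassemble chunks in index order and drop the staged upload."""
        upload = self._uploads[upload_id]
        chunks = upload["chunks"]
        # With n stored chunks, the indices must be exactly 0..n-1.
        missing = [i for i in range(len(chunks)) if i not in chunks]
        if missing:
            raise ValueError(f"missing chunk indices: {missing}")
        del self._uploads[upload_id]
        payload = b"".join(chunks[i] for i in range(len(chunks)))
        return upload["filename"], payload
```

In the real server, the reassembled payload would then be forwarded to the engine's `POST /api/v1/jobs` as a multipart upload, as task 8.3 describes.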
 ## 9. MCP Server: Collection Management
 - [x] 9.1 Implement exclusive collection enforcement — on addnote/addfile, query document tags, remove any existing `collection:*` tags via engine tag API before applying new one
 - [x] 9.2 Implement collection tag stripping in all tool responses (search results, document details)
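
The collection conventions in sections 7 and 9 come down to two small tag transformations, sketched here with hypothetical helper names (`strip_collection`, `apply_collection`). The real server performs the exclusivity step through the engine's tag API; this sketch only shows the list manipulation.

```python
COLLECTION_PREFIX = "collection:"


def strip_collection(tags):
    """Split tags into (collection name or None, remaining tags)."""
    collection = None
    rest = []
    for tag in tags:
        if tag.startswith(COLLECTION_PREFIX):
            collection = tag[len(COLLECTION_PREFIX):]
        else:
            rest.append(tag)
    return collection, rest


def apply_collection(tags, name="documents"):
    """Enforce exclusivity: drop any existing collection tag, add the new one."""
    _, rest = strip_collection(tags)
    return rest + [f"{COLLECTION_PREFIX}{name}"]
```

For example, `strip_collection(["feedback", "collection:memory"])` yields the collection `"memory"` and the remaining tags `["feedback"]`, which is the shape a tool response would expose as a separate `collection` field.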
 ## 10. MCP Server: Tool Descriptions
 - [x] 10.1 Write `kb_search` tool description including query expansion and reranking guidance
 - [x] 10.2 Write descriptions for all other tools with parameter documentation and usage examples
 - [x] 10.3 Include chunked upload usage example in `kb_upload_start` description
 ## 11. MCP Server: Docker & Compose
 - [x] 11.1 Create `mcp/Dockerfile` — Python base image, install dependencies, run MCP server
 - [x] 11.2 Add MCP server service to Docker Compose file(s) — connect to engine via Docker network, expose Streamable HTTP port
 - [x] 11.3 Document environment variables (`KB_ENGINE_URL`, `KB_API_KEY`, `KB_MCP_API_KEY`) in compose file
 ## 12. Integration Testing
 - [x] 12.1 Test: MCP search with collection filter returns only matching documents
 - [x] 12.2 Test: MCP addnote with collection applies correct tag and enforces exclusivity
 - [x] 12.3 Test: MCP update note preserves document ID and tags, updates content and `updated_at`
 - [x] 12.4 Test: chunked file upload end-to-end (start → chunk × N → finish → verify job created)
 - [x] 12.5 Test: MCP server rejects unauthenticated connections when `KB_MCP_API_KEY` is set
 ## 13. Release
 - [x] 13.1 Build and tag engine Docker images (`engine-v3.0.0-*`)
 - [x] 13.2 Build and tag MCP server Docker image
 - [x] 13.3 Build Go client binaries for all platforms
 - [x] 13.4 Create git tags: `engine-v3.0.0`, `client-v3.0.0`
 - [x] 13.5 Update SKILL.md to reference MCP server as primary agent integration path
@@ -53,6 +53,9 @@ The client SHALL provide a `kb search <query>` command that sends the query to t
 #### Scenario: Human-readable search output
 - **WHEN** the user runs `kb search "how to change oil"`
 - **THEN** the client SHALL POST to `/api/v1/search` and display results in a human-readable format showing rank, score, document title, page/section, doc type, tags, and a text snippet
 - **THEN** the client SHALL parse search results as flat objects with top-level `title`, `doc_type`, `tags`, `score`, `text`, `chunk_index` fields
 - **THEN** the client SHALL extract `page` from `chunk_metadata` when present (PDF documents)
 - **THEN** the client SHALL extract `section_header` from `chunk_metadata` when present (markdown documents)
 #### Scenario: JSON search output
 - **WHEN** the user runs `kb search "query" --format json`
@@ -66,6 +69,22 @@ The client SHALL provide a `kb search <query>` command that sends the query to t
 - **WHEN** the user runs `kb search "error" --fts-only`
 - **THEN** the client SHALL set `fts_only: true` in the request body
 #### Scenario: PDF result with page number
 - **WHEN** a search result has `chunk_metadata` containing `{"page": 12}`
 - **THEN** the human output SHALL display "Page 12" in the location line
 #### Scenario: Markdown result with section header
 - **WHEN** a search result has `chunk_metadata` containing `{"section_header": "Installation > Prerequisites"}`
 - **THEN** the human output SHALL display "Installation > Prerequisites" in the location line
 #### Scenario: Result with both page and section
 - **WHEN** a search result has `chunk_metadata` containing both `page` and `section_header`
 - **THEN** the human output SHALL display both separated by " / "
 #### Scenario: Result with no location metadata
 - **WHEN** a search result has empty `chunk_metadata` or no page/section keys
 - **THEN** the human output SHALL omit the location line entirely
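
The four location scenarios above reduce to one small formatting rule. The actual client is written in Go; the Python sketch below only illustrates the rule, with a hypothetical function name. Placing the page before the section header when both are present is an assumption, since the spec only states that the two are separated by " / ".

```python
def location_line(chunk_metadata):
    """Build the location line for human output, or None to omit it."""
    meta = chunk_metadata or {}
    parts = []
    if meta.get("page") is not None:
        parts.append(f"Page {meta['page']}")
    if meta.get("section_header"):
        parts.append(meta["section_header"])
    # No page and no section header: the caller omits the line entirely.
    return " / ".join(parts) if parts else None
```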
 ---
 ### Requirement: Add note command
@@ -111,9 +111,11 @@ else
    echo "==> Engine version: $VERSION (no increment)"
 fi
-TAG="engine-v${VERSION}"
+GIT_TAG="engine-v${VERSION}"
 DOCKER_TAG="v${VERSION}"
-echo "    Tag:       $TAG"
+echo "    Git tag:   $GIT_TAG"
 echo "    Image tag: $DOCKER_TAG"
 echo "    Registry:  $IMAGE_BASE"
 echo "    Forge CLI: $FORGE"
 echo "    Dry run:   $DRY_RUN"
@@ -125,8 +127,8 @@ echo ""
 echo "==> Pre-flight checks"
 if [[ "$DRY_RUN" == false ]]; then
-    if git -C "$SCRIPT_DIR" rev-parse "$TAG" &>/dev/null; then
+    if git -C "$SCRIPT_DIR" rev-parse "$GIT_TAG" &>/dev/null; then
-        echo "Error: tag $TAG already exists"
+        echo "Error: tag $GIT_TAG already exists"
        exit 1
    fi
 fi
@@ -148,29 +150,48 @@ fi
 #──────────────────────────────────────────────────────────────────────
 echo "==> Building Docker engine images ($VERSION)"
-NVIDIA_IMAGE="${IMAGE_BASE}/engine:${TAG}-nvidia"
+NVIDIA_IMAGE="${IMAGE_BASE}/engine:${DOCKER_TAG}-nvidia"
-ROCM_IMAGE="${IMAGE_BASE}/engine:${TAG}-rocm"
+ROCM_IMAGE="${IMAGE_BASE}/engine:${DOCKER_TAG}-rocm"
 CPU_IMAGE="${IMAGE_BASE}/engine:${DOCKER_TAG}-cpu"
 NVIDIA_LATEST="${IMAGE_BASE}/engine:latest-nvidia"
 ROCM_LATEST="${IMAGE_BASE}/engine:latest-rocm"
 CPU_LATEST="${IMAGE_BASE}/engine:latest-cpu"
 run docker build -t "$NVIDIA_IMAGE" -t "$NVIDIA_LATEST" -f "$ENGINE_DIR/Dockerfile.nvidia" "$ENGINE_DIR"
 run docker build -t "$ROCM_IMAGE" -t "$ROCM_LATEST" -f "$ENGINE_DIR/Dockerfile.rocm" "$ENGINE_DIR"
 run docker build -t "$CPU_IMAGE" -t "$CPU_LATEST" -f "$ENGINE_DIR/Dockerfile.cpu" "$ENGINE_DIR"
 echo ""
 #──────────────────────────────────────────────────────────────────────
 # 3b. Build Docker MCP server image
 #──────────────────────────────────────────────────────────────────────
 MCP_DIR="$SCRIPT_DIR/mcp"
 if [[ -f "$MCP_DIR/Dockerfile" ]]; then
    echo "==> Building Docker MCP server image ($VERSION)"
    MCP_IMAGE="${IMAGE_BASE}/mcp:${DOCKER_TAG}"
    MCP_LATEST="${IMAGE_BASE}/mcp:latest"
    run docker build -t "$MCP_IMAGE" -t "$MCP_LATEST" -f "$MCP_DIR/Dockerfile" "$MCP_DIR"
    echo ""
 fi
 #──────────────────────────────────────────────────────────────────────
 # 4. Commit, tag, and push
 #──────────────────────────────────────────────────────────────────────
-echo "==> Committing and tagging $TAG"
+echo "==> Committing and tagging $GIT_TAG"
 if [[ "$INCREMENT" == true ]]; then
    run git -C "$SCRIPT_DIR" add "$VERSION_FILE"
    run git -C "$SCRIPT_DIR" commit -m "Bump engine version to $VERSION"
 fi
-run git -C "$SCRIPT_DIR" tag -a "$TAG" -m "Release $TAG"
+run git -C "$SCRIPT_DIR" tag -a "$GIT_TAG" -m "Release $GIT_TAG"
 run git -C "$SCRIPT_DIR" push origin HEAD
-run git -C "$SCRIPT_DIR" push origin "$TAG"
+run git -C "$SCRIPT_DIR" push origin "$GIT_TAG"
 echo ""
@@ -179,7 +200,7 @@ echo ""
 #──────────────────────────────────────────────────────────────────────
 echo "==> Creating release via $FORGE"
-RELEASE_TITLE="Engine $TAG"
+RELEASE_TITLE="Engine $GIT_TAG"
 RELEASE_NOTES="## Docker images
 \`\`\`bash
@@ -188,16 +209,25 @@ docker pull ${NVIDIA_IMAGE}
 # AMD GPU (ROCm)
 docker pull ${ROCM_IMAGE}
 # CPU only
 docker pull ${CPU_IMAGE}
 \`\`\`
 ## MCP server
 \`\`\`bash
 docker pull ${MCP_IMAGE:-${IMAGE_BASE}/mcp:${DOCKER_TAG}}
 \`\`\`"
 if [[ "$FORGE" == "gh" ]]; then
-    run gh release create "$TAG" \
+    run gh release create "$GIT_TAG" \
        --title "$RELEASE_TITLE" \
        --notes "$RELEASE_NOTES"
 elif [[ "$FORGE" == "tea" ]]; then
    run tea release create \
-        --tag "$TAG" \
+        --tag "$GIT_TAG" \
        --title "$RELEASE_TITLE" \
        --note "$RELEASE_NOTES"
 fi
@@ -213,10 +243,21 @@ run docker push "$NVIDIA_IMAGE"
 run docker push "$NVIDIA_LATEST"
 run docker push "$ROCM_IMAGE"
 run docker push "$ROCM_LATEST"
 run docker push "$CPU_IMAGE"
 run docker push "$CPU_LATEST"
 if [[ -n "${MCP_IMAGE:-}" ]]; then
    run docker push "$MCP_IMAGE"
    run docker push "$MCP_LATEST"
 fi
 echo ""
-echo "==> Release $TAG complete!"
+echo "==> Release $GIT_TAG complete!"
 echo ""
 echo "    Images:"
 echo "      $NVIDIA_IMAGE"
 echo "      $ROCM_IMAGE"
 echo "      $CPU_IMAGE"
 if [[ -n "${MCP_IMAGE:-}" ]]; then
    echo "      $MCP_IMAGE"
 fi
Author	SHA1	Message	Date
steve	e39e00a2c0	Add MCP auth status to kb_status and update server instructions - kb_status now returns authenticated: true/false so clients can verify auth - Server instructions mention Bearer token auth requirement - Add .env, .venv/, test_mcp_client.py to .gitignore Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-04 12:04:12 +01:00
steve	d078af9ad3	Split MCP docs into MCP.md with AI tool setup examples Move MCP server documentation from README into dedicated MCP.md. Add configuration examples for Claude Code, VS Code, Cursor, Windsurf, and JetBrains IDEs. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-02 22:03:41 +01:00
steve	b3dce188e1	Fix version check failing on non-200 status responses When the engine returns 401 (auth required) or other non-200 responses, the version check was parsing the error body, getting an empty version string, and fatally exiting. Now skips the check on non-200 responses and lets the actual API call surface the real error. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-02 21:52:24 +01:00
steve	0dc3065979	Update README for v3.0.0 — add MCP server docs, updatenote, fix version refs Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-02 21:45:31 +01:00
steve	e7136a4a20	Add MCP server, note mutation endpoint, and updated_at tracking (v3.0.0) New MCP server (mcp/) exposes kb operations as native MCP tools over Streamable HTTP with Bearer token auth. Supports collections via tag conventions, chunked file uploads, and agent-side search patterns. Engine gains PATCH /api/v1/notes/{id} for in-place note updates with transactional re-chunk/re-embed, and updated_at column on documents. Go client adds updatenote command and Patch HTTP method. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-02 21:34:55 +01:00
steve	adeba21712	Bump client version to 2.2.1	2026-04-02 16:18:06 +01:00
steve	2d179af557	Fix search human-mode output to match engine API response The Go client struct expected a nested document object and top-level page/section fields, but the engine returns flat results with metadata in chunk_metadata. This caused empty display for title, type, tags, page, and section in human output mode. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-02 16:17:35 +01:00
steve	a6bab5e55e	Add CPU-only Docker image and fix release tag naming - Add Dockerfile.cpu and compose.cpu.yaml for CPU-only deployments - Use sentence-transformers[onnx] + CPU-only torch for ~4x smaller image - Fix release script: separate git tags (engine-v) from Docker tags (v) - Add CPU image to release build/push pipeline - Update README with CPU deployment instructions Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-02 16:02:00 +01:00
@@ -1 +1 @@
-from kb.routes import health, search, jobs, documents, tags, status, reindex, auth
+from kb.routes import health, search, jobs, documents, tags, status, reindex, auth, notes