Bump client version to 2.2.1

Fix search human-mode output to match engine API response
The Go client struct expected a nested document object and top-level page/section fields, but the engine returns flat results with metadata in chunk_metadata. This caused empty display for title, type, tags, page, and section in human output mode. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-02 16:18:06 +01:00 · 2026-04-02 16:17:35 +01:00 · 2026-04-02 16:02:00 +01:00
13 changed files with 274 additions and 85 deletions
@@ -2,7 +2,7 @@
 Personal knowledge base with hybrid search (full-text + semantic vector search).
-v2 uses a client-server architecture: a **FastAPI engine** running in Docker (with GPU acceleration) and a lightweight **Go CLI client** that talks to it over HTTP.
+v2 uses a client-server architecture: a **FastAPI engine** running in Docker (with optional GPU acceleration) and a lightweight **Go CLI client** that talks to it over HTTP.
 ## Architecture
@@ -10,7 +10,7 @@ v2 uses a client-server architecture: a **FastAPI engine** running in Docker (wi
 Go CLI (kb) ──HTTP──▶ FastAPI Engine (Docker) ──▶ SQLite + GPU
 ```
- **Engine**: Keeps the embedding model warm in GPU memory. Handles search, ingestion, and document management via REST API. Runs in Docker with NVIDIA or AMD GPU support.
+- **Engine**: Keeps the embedding model warm in memory. Handles search, ingestion, and document management via REST API. Runs in Docker with NVIDIA GPU, AMD GPU (ROCm), or CPU-only support.
 - **Client**: Single static Go binary. No Python, no ML dependencies, instant startup. Talks to the engine over HTTP.
 - **Storage**: Single SQLite database with FTS5 (keyword search) and sqlite-vec (vector search). Portable via bind mount — just copy the data directory between hosts.
@@ -43,49 +43,33 @@ docker run -d --name kb-engine \
  -e KB_API_KEY=your-secret-key \
  --restart unless-stopped \
  docker.dcglab.co.uk/dcg/kb/engine:latest-rocm
 # CPU only (no GPU required — smaller image)
 docker run -d --name kb-engine \
  -p 8000:8000 \
  -v ~/kb-data:/data \
  -e KB_MODEL=all-MiniLM-L6-v2 \
  -e KB_API_KEY=your-secret-key \
  --restart unless-stopped \
  docker.dcglab.co.uk/dcg/kb/engine:latest-cpu
 ```
-Or use a compose file — create `compose.yaml`:
+Or use a compose file from the repo:
 ```yaml
 services:
  kb-engine:
    image: docker.dcglab.co.uk/dcg/kb/engine:latest-nvidia  # or latest-rocm
    runtime: nvidia  # remove for ROCm
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: 1
              capabilities: [gpu]
    # For ROCm, replace the above runtime/deploy block with:
    # devices:
    #   - "/dev/kfd"
    #   - "/dev/dri"
    # group_add:
    #   - "video"
    ports:
      - "${KB_PORT:-8000}:8000"
    volumes:
      - ${KB_DATA_PATH:-./data}:/data
    environment:
      - KB_MODEL=${KB_MODEL:-all-MiniLM-L6-v2}
      - KB_DEVICE=${KB_DEVICE:-auto}
      - KB_INGEST_DEVICE=${KB_INGEST_DEVICE:-auto}
      - KB_API_KEY=${KB_API_KEY:-}
      - KB_SEARCH_THRESHOLD=${KB_SEARCH_THRESHOLD:-0.01}
      - HF_HUB_OFFLINE=${HF_HUB_OFFLINE:-}
    restart: unless-stopped
 ```
 ```bash
-KB_DATA_PATH=~/kb-data docker compose up -d
+# NVIDIA GPU
 KB_DATA_PATH=~/kb-data docker compose -f engine/compose.nvidia.yaml up -d
 # AMD GPU (ROCm)
 KB_DATA_PATH=~/kb-data docker compose -f engine/compose.rocm.yaml up -d
 # CPU only
 KB_DATA_PATH=~/kb-data docker compose -f engine/compose.cpu.yaml up -d
 ```
 See [DEVELOPER.md](DEVELOPER.md) to run the engine from source.
-The engine will download the embedding model on first start (~90MB) and load it onto the GPU. Check readiness:
+The engine will download the embedding model on first start (~90MB) and load it into memory (GPU or CPU). Check readiness:
 ```bash
 curl http://localhost:8000/api/v1/health
@@ -196,7 +180,7 @@ rsync -a ~/kb-data/ user@target:/home/user/kb-data/
 KB_DATA_PATH=~/kb-data docker compose -f compose.nvidia.yaml up -d
 ```
-Data is GPU-vendor-agnostic — you can ingest on NVIDIA and serve from AMD (or vice versa) with the same data directory.
+Data is device-agnostic — you can ingest on NVIDIA and serve from AMD or CPU (or any combination) with the same data directory.
 ## Claude Code skill
@@ -79,6 +79,12 @@ kb jobs --status failed --format json    # filter by status
 kb jobs <job_id> --format json           # job details
 ```
 ## Examples
 ```bash
 kb examples                              # show common usage examples
 ```
 ## Engine status and maintenance
 ```bash
@@ -102,19 +108,15 @@ All commands support:
    {
      "chunk_id": 1423,
      "score": 0.031,
      "score_breakdown": {"fts": 0.016, "vector": 0.015},
      "text": "To install the latest version of git from source...",
      "source": {
        "document_id": 42,
        "title": "Git Admin Guide",
        "path": "/home/user/docs/git-admin.pdf",
        "type": "pdf",
        "page": 12,
      "chunk_index": 3,
-        "total_chunks": 28,
+      "chunk_metadata": {"page": 12},
      "title": "Git Admin Guide",
      "doc_type": "pdf",
      "source_path": "/home/user/docs/git-admin.pdf",
      "created_at": "2026-03-15T10:30:00",
      "tags": ["git", "admin"]
    }
    }
  ],
  "total_matches": 47,
  "returned": 10
@@ -160,7 +162,7 @@ Use filters when the question implies a specific domain:
 - Always use `--format json` for machine parsing
 - The `score` field is relative, not absolute — compare scores within a result set
- `source.page` is only present for PDF documents
+- `chunk_metadata.page` is only present for PDF documents
- `source.section_header` is only present for markdown documents with headers
+- `chunk_metadata.section_header` is only present for markdown documents with headers
 - Results are already ranked by relevance (hybrid FTS + vector search)
 - Duplicate files are detected at upload time (HTTP 409) — the client handles this gracefully
@@ -1 +1 @@
-2.2.0
+2.2.1
@@ -68,13 +68,10 @@ func runSearch(cmd *cobra.Command, args []string) error {
 	var result struct {
 		Results []struct {
 			Score         float64                `json:"score"`
 			Document struct {
 			Title         string                 `json:"title"`
-				Type  string `json:"doc_type"`
+			DocType       string                 `json:"doc_type"`
 			Tags          []string               `json:"tags"`
-			} `json:"document"`
+			ChunkMetadata map[string]interface{} `json:"chunk_metadata"`
 			Page    interface{} `json:"page"`
 			Section string      `json:"section"`
 			Text          string                 `json:"text"`
 		} `json:"results"`
 	}
@@ -103,26 +100,28 @@ func runSearch(cmd *cobra.Command, args []string) error {
 			snippet = snippet[:200] + "..."
 		}
-		fmt.Printf("\n%d. [%.4f] %s\n", i+1, r.Score, r.Document.Title)
+		fmt.Printf("\n%d. [%.4f] %s\n", i+1, r.Score, r.Title)
 		location := ""
-		if r.Page != nil {
+		if page, ok := r.ChunkMetadata["page"]; ok && page != nil {
-			location = fmt.Sprintf("Page %v", r.Page)
+			location = fmt.Sprintf("Page %v", page)
 		}
-		if r.Section != "" {
+		if section, ok := r.ChunkMetadata["section_header"]; ok && section != nil {
 			if s, ok := section.(string); ok && s != "" {
 				if location != "" {
 					location += " / "
 				}
-			location += r.Section
+				location += s
 			}
 		}
 		if location != "" {
 			fmt.Printf("   Location: %s\n", location)
 		}
-		if r.Document.Type != "" {
+		if r.DocType != "" {
-			fmt.Printf("   Type: %s\n", r.Document.Type)
+			fmt.Printf("   Type: %s\n", r.DocType)
 		}
-		if len(r.Document.Tags) > 0 {
+		if len(r.Tags) > 0 {
-			fmt.Printf("   Tags: %s\n", joinStrings(r.Document.Tags))
+			fmt.Printf("   Tags: %s\n", joinStrings(r.Tags))
 		}
 		fmt.Printf("   %s\n", snippet)
 	}
@@ -0,0 +1,36 @@
 FROM ubuntu:24.04
 ENV DEBIAN_FRONTEND=noninteractive
 RUN apt-get update && apt-get install -y --no-install-recommends \
    python3.12 python3.12-venv python3.12-dev python3-pip \
    libpoppler-cpp-dev poppler-utils \
    libgl1 libglib2.0-0 \
    build-essential curl \
    && rm -rf /var/lib/apt/lists/*
 COPY --from=ghcr.io/astral-sh/uv:latest /uv /usr/local/bin/uv
 WORKDIR /app
 COPY pyproject.toml ./
 COPY kb/ kb/
 COPY main.py ./
 COPY VERSION ./
 RUN uv venv .venv && \
    . .venv/bin/activate && \
    uv pip install -e . && \
    uv pip install "sentence-transformers[onnx]" && \
    uv pip install --reinstall torch torchvision --index-url https://download.pytorch.org/whl/cpu
 ENV PATH="/app/.venv/bin:$PATH"
 ENV VIRTUAL_ENV="/app/.venv"
 ENV KB_DEVICE=cpu
 ENV KB_INGEST_DEVICE=cpu
 ENV KB_DATA_DIR=/data
 EXPOSE 8000
 VOLUME ["/data"]
 CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8000"]
@@ -0,0 +1,17 @@
 services:
  kb-engine:
    build:
      context: .
      dockerfile: Dockerfile.cpu
    ports:
      - "${KB_PORT:-8000}:8000"
    volumes:
      - ${KB_DATA_PATH:-./data}:/data
    environment:
      - KB_MODEL=${KB_MODEL:-all-MiniLM-L6-v2}
      - KB_DEVICE=cpu
      - KB_INGEST_DEVICE=cpu
      - KB_API_KEY=${KB_API_KEY:-}
      - KB_SEARCH_THRESHOLD=${KB_SEARCH_THRESHOLD:-0.01}
      - HF_HUB_OFFLINE=${HF_HUB_OFFLINE:-}
    restart: unless-stopped
@@ -0,0 +1,2 @@
 schema: spec-driven
 created: 2026-04-02
@@ -0,0 +1,41 @@
 ## Context
 The engine's `/api/v1/search` endpoint returns flat result objects:
 ```json
 {
  "chunk_id": 123,
  "score": 0.031,
  "text": "...",
  "chunk_index": 3,
  "chunk_metadata": {"page": 12, "section_header": "Installation"},
  "title": "Git Admin Guide",
  "doc_type": "pdf",
  "source_path": "/home/user/docs/git-admin.pdf",
  "created_at": "2026-03-15T10:30:00",
  "tags": ["git", "admin"]
 }
 ```
 The Go client's human-mode struct in `client/cmd/search.go` incorrectly expects a nested `document` object and top-level `page`/`section` fields. This causes all metadata to display as zero values.
 ## Goals / Non-Goals
 **Goals:**
 - Fix the search result struct to match the flat engine response
 - Extract `page` and `section_header` from `chunk_metadata` for human display
 - Maintain identical JSON output (already passes through raw response)
 **Non-Goals:**
 - Changing the engine API response format
 - Adding new display fields beyond what was originally intended
 ## Decisions
 **Flatten the struct to match API response.** The result struct will have `Title`, `DocType`, `Tags` as top-level fields (matching `title`, `doc_type`, `tags` JSON keys). `ChunkMetadata` will be decoded as `map[string]interface{}` to extract `page` and `section_header` dynamically, since its contents vary by document type.
 **Why not a typed ChunkMetadata struct?** The metadata keys depend on the ingestion pipeline (PDFs have `page`, markdown has `section_header`, code may have others in future). A map is more resilient to engine-side additions.
 ## Risks / Trade-offs
 - [Minimal risk] If the engine adds new top-level fields, the Go struct silently ignores them — this is existing behavior and acceptable for human-mode display.
@@ -0,0 +1,24 @@
 ## Why
 The Go client's human-mode search output struct expects a nested `document` object and top-level `page`/`section` fields, but the engine API returns flat results with `title`, `doc_type`, `tags` at the result level and `page`/`section_header` inside `chunk_metadata`. This means human-mode display shows empty values for title, type, tags, page, and section.
 ## What Changes
 - Fix the Go client search result struct to match the flat engine API response format
 - Extract `page` and `section_header` from the `chunk_metadata` map instead of expecting them as top-level fields
 - Human-mode output will correctly display document title, type, tags, page number, and section header
 ## Capabilities
 ### New Capabilities
 (none)
 ### Modified Capabilities
 - `go-client`: Fix search result parsing to match actual engine API response shape
 ## Impact
 - `client/cmd/search.go` — struct definition and display logic
 - No API changes, no breaking changes — this is a bug fix aligning the client with the existing API contract
@@ -0,0 +1,40 @@
 ## MODIFIED Requirements
 ### Requirement: Search command
 The client SHALL provide a `kb search <query>` command that sends the query to the engine and displays results.
 #### Scenario: Human-readable search output
 - **WHEN** the user runs `kb search "how to change oil"`
 - **THEN** the client SHALL POST to `/api/v1/search`, and display results in a human-readable format showing rank, score, document title, page/section, doc type, tags, and a text snippet
 - **THEN** the client SHALL parse search results as flat objects with top-level `title`, `doc_type`, `tags`, `score`, `text`, `chunk_index` fields
 - **THEN** the client SHALL extract `page` from `chunk_metadata` when present (PDF documents)
 - **THEN** the client SHALL extract `section_header` from `chunk_metadata` when present (markdown documents)
 #### Scenario: JSON search output
 - **WHEN** the user runs `kb search "query" --format json`
 - **THEN** the client SHALL output the raw JSON response from the engine
 #### Scenario: Search with filters
 - **WHEN** the user runs `kb search "brakes" --tags maintenance --type pdf --top 3`
 - **THEN** the client SHALL include the filters in the API request body
 #### Scenario: Search mode flags
 - **WHEN** the user runs `kb search "error" --fts-only`
 - **THEN** the client SHALL set `fts_only: true` in the request body
 #### Scenario: PDF result with page number
 - **WHEN** a search result has `chunk_metadata` containing `{"page": 12}`
 - **THEN** the human output SHALL display "Page 12" in the location line
 #### Scenario: Markdown result with section header
 - **WHEN** a search result has `chunk_metadata` containing `{"section_header": "Installation > Prerequisites"}`
 - **THEN** the human output SHALL display "Installation > Prerequisites" in the location line
 #### Scenario: Result with both page and section
 - **WHEN** a search result has `chunk_metadata` containing both `page` and `section_header`
 - **THEN** the human output SHALL display both separated by " / "
 #### Scenario: Result with no location metadata
 - **WHEN** a search result has empty `chunk_metadata` or no page/section keys
 - **THEN** the human output SHALL omit the location line entirely
@@ -0,0 +1,14 @@
 ## 1. Fix search result struct
 - [x] 1.1 Replace nested `Document` struct with flat fields (`Title`, `DocType`, `Tags`) matching engine JSON keys
 - [x] 1.2 Add `ChunkMetadata map[string]interface{}` field to capture `chunk_metadata`
 ## 2. Fix display logic
 - [x] 2.1 Update title/type/tags references in the display loop to use the new flat fields
 - [x] 2.2 Extract `page` from `ChunkMetadata` map (replacing top-level `Page` field)
 - [x] 2.3 Extract `section_header` from `ChunkMetadata` map (replacing top-level `Section` field)
 ## 3. Verify
 - [x] 3.1 Build the client and verify it compiles cleanly
@@ -53,6 +53,9 @@ The client SHALL provide a `kb search <query>` command that sends the query to t
 #### Scenario: Human-readable search output
 - **WHEN** the user runs `kb search "how to change oil"`
 - **THEN** the client SHALL POST to `/api/v1/search`, and display results in a human-readable format showing rank, score, document title, page/section, doc type, tags, and a text snippet
 - **THEN** the client SHALL parse search results as flat objects with top-level `title`, `doc_type`, `tags`, `score`, `text`, `chunk_index` fields
 - **THEN** the client SHALL extract `page` from `chunk_metadata` when present (PDF documents)
 - **THEN** the client SHALL extract `section_header` from `chunk_metadata` when present (markdown documents)
 #### Scenario: JSON search output
 - **WHEN** the user runs `kb search "query" --format json`
@@ -66,6 +69,22 @@ The client SHALL provide a `kb search <query>` command that sends the query to t
 - **WHEN** the user runs `kb search "error" --fts-only`
 - **THEN** the client SHALL set `fts_only: true` in the request body
 #### Scenario: PDF result with page number
 - **WHEN** a search result has `chunk_metadata` containing `{"page": 12}`
 - **THEN** the human output SHALL display "Page 12" in the location line
 #### Scenario: Markdown result with section header
 - **WHEN** a search result has `chunk_metadata` containing `{"section_header": "Installation > Prerequisites"}`
 - **THEN** the human output SHALL display "Installation > Prerequisites" in the location line
 #### Scenario: Result with both page and section
 - **WHEN** a search result has `chunk_metadata` containing both `page` and `section_header`
 - **THEN** the human output SHALL display both separated by " / "
 #### Scenario: Result with no location metadata
 - **WHEN** a search result has empty `chunk_metadata` or no page/section keys
 - **THEN** the human output SHALL omit the location line entirely
 ---
 ### Requirement: Add note command
@@ -111,9 +111,11 @@ else
    echo "==> Engine version: $VERSION (no increment)"
 fi
-TAG="engine-v${VERSION}"
+GIT_TAG="engine-v${VERSION}"
 DOCKER_TAG="v${VERSION}"
-echo "    Tag:       $TAG"
+echo "    Git tag:   $GIT_TAG"
 echo "    Image tag: $DOCKER_TAG"
 echo "    Registry:  $IMAGE_BASE"
 echo "    Forge CLI: $FORGE"
 echo "    Dry run:   $DRY_RUN"
@@ -125,8 +127,8 @@ echo ""
 echo "==> Pre-flight checks"
 if [[ "$DRY_RUN" == false ]]; then
-    if git -C "$SCRIPT_DIR" rev-parse "$TAG" &>/dev/null; then
+    if git -C "$SCRIPT_DIR" rev-parse "$GIT_TAG" &>/dev/null; then
-        echo "Error: tag $TAG already exists"
+        echo "Error: tag $GIT_TAG already exists"
        exit 1
    fi
 fi
@@ -148,29 +150,32 @@ fi
 #──────────────────────────────────────────────────────────────────────
 echo "==> Building Docker engine images ($VERSION)"
-NVIDIA_IMAGE="${IMAGE_BASE}/engine:${TAG}-nvidia"
+NVIDIA_IMAGE="${IMAGE_BASE}/engine:${DOCKER_TAG}-nvidia"
-ROCM_IMAGE="${IMAGE_BASE}/engine:${TAG}-rocm"
+ROCM_IMAGE="${IMAGE_BASE}/engine:${DOCKER_TAG}-rocm"
 CPU_IMAGE="${IMAGE_BASE}/engine:${DOCKER_TAG}-cpu"
 NVIDIA_LATEST="${IMAGE_BASE}/engine:latest-nvidia"
 ROCM_LATEST="${IMAGE_BASE}/engine:latest-rocm"
 CPU_LATEST="${IMAGE_BASE}/engine:latest-cpu"
 run docker build -t "$NVIDIA_IMAGE" -t "$NVIDIA_LATEST" -f "$ENGINE_DIR/Dockerfile.nvidia" "$ENGINE_DIR"
 run docker build -t "$ROCM_IMAGE" -t "$ROCM_LATEST" -f "$ENGINE_DIR/Dockerfile.rocm" "$ENGINE_DIR"
 run docker build -t "$CPU_IMAGE" -t "$CPU_LATEST" -f "$ENGINE_DIR/Dockerfile.cpu" "$ENGINE_DIR"
 echo ""
 #──────────────────────────────────────────────────────────────────────
 # 4. Commit, tag, and push
 #──────────────────────────────────────────────────────────────────────
-echo "==> Committing and tagging $TAG"
+echo "==> Committing and tagging $GIT_TAG"
 if [[ "$INCREMENT" == true ]]; then
    run git -C "$SCRIPT_DIR" add "$VERSION_FILE"
    run git -C "$SCRIPT_DIR" commit -m "Bump engine version to $VERSION"
 fi
-run git -C "$SCRIPT_DIR" tag -a "$TAG" -m "Release $TAG"
+run git -C "$SCRIPT_DIR" tag -a "$GIT_TAG" -m "Release $GIT_TAG"
 run git -C "$SCRIPT_DIR" push origin HEAD
-run git -C "$SCRIPT_DIR" push origin "$TAG"
+run git -C "$SCRIPT_DIR" push origin "$GIT_TAG"
 echo ""
@@ -179,7 +184,7 @@ echo ""
 #──────────────────────────────────────────────────────────────────────
 echo "==> Creating release via $FORGE"
-RELEASE_TITLE="Engine $TAG"
+RELEASE_TITLE="Engine $GIT_TAG"
 RELEASE_NOTES="## Docker images
 \`\`\`bash
@@ -188,16 +193,19 @@ docker pull ${NVIDIA_IMAGE}
 # AMD GPU (ROCm)
 docker pull ${ROCM_IMAGE}
 # CPU only
 docker pull ${CPU_IMAGE}
 \`\`\`"
 if [[ "$FORGE" == "gh" ]]; then
-    run gh release create "$TAG" \
+    run gh release create "$GIT_TAG" \
        --title "$RELEASE_TITLE" \
        --notes "$RELEASE_NOTES"
 elif [[ "$FORGE" == "tea" ]]; then
    run tea release create \
-        --tag "$TAG" \
+        --tag "$GIT_TAG" \
        --title "$RELEASE_TITLE" \
        --note "$RELEASE_NOTES"
 fi
@@ -213,10 +221,13 @@ run docker push "$NVIDIA_IMAGE"
 run docker push "$NVIDIA_LATEST"
 run docker push "$ROCM_IMAGE"
 run docker push "$ROCM_LATEST"
 run docker push "$CPU_IMAGE"
 run docker push "$CPU_LATEST"
 echo ""
-echo "==> Release $TAG complete!"
+echo "==> Release $GIT_TAG complete!"
 echo ""
 echo "    Images:"
 echo "      $NVIDIA_IMAGE"
 echo "      $ROCM_IMAGE"
 echo "      $CPU_IMAGE"