Bump engine version to 2.1.0

Reject single bare word as implicit note shorthand
Single unrecognized words now print an error with a usage hint instead of being submitted as a note. This prevents typos from creating junk notes.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-29 21:06:04 +01:00 · 2026-03-29 21:03:52 +01:00 · 2026-03-29 21:03:48 +01:00 · 2026-03-29 14:03:35 +01:00 · 2026-03-29 13:58:42 +01:00 · 2026-03-29 13:58:04 +01:00
58 changed files with 2681 additions and 271 deletions
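
The single-word rule described in the commit message can be sketched in isolation. This is an illustration only, not the client's code: `classifyArgs` is a hypothetical helper, and the sketch assumes that a quoted multi-word note reaches the program as one argument, so it counts words rather than arguments.

```go
package main

import (
	"fmt"
	"strings"
)

// classifyArgs is a hypothetical helper (not part of the kb client).
// It joins the positional arguments and accepts them as a note only
// when the result contains at least two whitespace-separated words.
func classifyArgs(args []string) (note string, ok bool) {
	joined := strings.Join(args, " ")
	if len(strings.Fields(joined)) < 2 {
		return "", false // a lone word is more likely a mistyped command
	}
	return joined, true
}

func main() {
	inputs := [][]string{
		{"infow"},                           // typo of a subcommand
		{"remember to update DNS records"},  // quoted: one argument, several words
		{"remember", "to", "update", "dns"}, // unquoted: several arguments
	}
	for _, args := range inputs {
		note, ok := classifyArgs(args)
		fmt.Printf("%q -> note=%q accepted=%v\n", args, note, ok)
	}
}
```

Counting words instead of arguments keeps the quoted and unquoted forms equivalent, which matters because the shell strips the quotes before the program sees them.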
@@ -18,6 +18,73 @@ Go CLI (kb) ──HTTP──▶ FastAPI Engine (Docker) ──▶ SQLite + GPU

 ### 1. Start the engine

+**From pre-built images** (recommended):
+
+```bash
+# NVIDIA GPU
+docker run -d --name kb-engine \
+  --gpus all \
+  -p 8000:8000 \
+  -v ~/kb-data:/data \
+  -e KB_MODEL=all-MiniLM-L6-v2 \
+  -e KB_DEVICE=auto \
+  -e KB_API_KEY=your-secret-key \
+  --restart unless-stopped \
+  docker.dcglab.co.uk/dcg/kb/engine:latest-nvidia
+
+# AMD GPU (ROCm)
+docker run -d --name kb-engine \
+  --device /dev/kfd --device /dev/dri \
+  --group-add video \
+  -p 8000:8000 \
+  -v ~/kb-data:/data \
+  -e KB_MODEL=all-MiniLM-L6-v2 \
+  -e KB_DEVICE=auto \
+  -e KB_API_KEY=your-secret-key \
+  --restart unless-stopped \
+  docker.dcglab.co.uk/dcg/kb/engine:latest-rocm
+```
+
+Or use a compose file — create `compose.yaml`:
+
+```yaml
+services:
+  kb-engine:
+    image: docker.dcglab.co.uk/dcg/kb/engine:latest-nvidia  # or latest-rocm
+    runtime: nvidia  # remove for ROCm
+    deploy:
+      resources:
+        reservations:
+          devices:
+            - driver: nvidia
+              count: 1
+              capabilities: [gpu]
+    # For ROCm, replace the above runtime/deploy block with:
+    # devices:
+    #   - "/dev/kfd"
+    #   - "/dev/dri"
+    # group_add:
+    #   - "video"
+    ports:
+      - "${KB_PORT:-8000}:8000"
+    volumes:
+      - ${KB_DATA_PATH:-./data}:/data
+    environment:
+      - KB_MODEL=${KB_MODEL:-all-MiniLM-L6-v2}
+      - KB_DEVICE=${KB_DEVICE:-auto}
+      - KB_INGEST_DEVICE=${KB_INGEST_DEVICE:-auto}
+      - KB_API_KEY=${KB_API_KEY:-}
+      - KB_SEARCH_THRESHOLD=${KB_SEARCH_THRESHOLD:-0.01}
+      - HF_HUB_OFFLINE=${HF_HUB_OFFLINE:-}
+    restart: unless-stopped
+```
+
+```bash
+KB_DATA_PATH=~/kb-data docker compose up -d
+```
+
+**From source** (for development):
+
 ```bash
 cd engine

@@ -37,17 +104,37 @@ curl http://localhost:8000/api/v1/health

 ### 2. Install the client

-Build from source:
+**From a release** (recommended):
+
+Check [releases](https://gitea.dcglab.co.uk/steve/kb/releases) for the latest client tag, then:
+
+```bash
+# Set the version tag
+TAG=client-v2.1.0
+
+# Linux (amd64)
+curl -L -o kb https://gitea.dcglab.co.uk/steve/kb/releases/download/${TAG}/kb-linux-amd64
+
+# Linux (arm64)
+curl -L -o kb https://gitea.dcglab.co.uk/steve/kb/releases/download/${TAG}/kb-linux-arm64
+
+# macOS (Apple Silicon)
+curl -L -o kb https://gitea.dcglab.co.uk/steve/kb/releases/download/${TAG}/kb-darwin-arm64
+
+# macOS (Intel)
+curl -L -o kb https://gitea.dcglab.co.uk/steve/kb/releases/download/${TAG}/kb-darwin-amd64
+
+# Then install
+chmod +x kb
+sudo mv kb /usr/local/bin/
+```
+
+**From source** (for development):

 ```bash
 cd client
 make build    # produces ./kb binary
-```
-
-Or cross-compile for all platforms:
-
-```bash
-make all      # produces dist/kb-{os}-{arch} binaries
+make all      # or cross-compile for all platforms: produces dist/kb-{os}-{arch} binaries
 ```

 ### 3. Configure the client
@@ -65,10 +152,13 @@ Override via environment variables (`KB_ENGINE_URL`, `KB_API_KEY`) or CLI flags
 ### 4. Use it

 ```bash
-# Add documents (async — uploads and exits immediately)
-kb add ~/docs/manual.pdf --tags admin
-kb add ~/notes/ --recursive
-kb add --note "Always restart nginx after config changes" --tags ops
+# Quick notes (shorthand — no subcommand needed)
+kb "Always restart nginx after config changes"
+kb "Server room is building 3, floor 2" --tags ops
+
+# Add files (async — uploads and exits immediately)
+kb addfile ~/docs/manual.pdf --tags admin
+kb addfile ~/notes/ --recursive

 # Check ingestion progress
 kb jobs
@@ -82,6 +172,7 @@ kb list
 kb info 1
 kb tags
 kb tag 1 --add important
+kb export 1 -o manual.pdf    # download original file
 kb remove 3 --yes
 kb status
 ```
@@ -100,12 +191,14 @@ The engine is configured via environment variables (set in the compose file or v
 |---|---|---|
 | `KB_DATA_DIR` | `/data` | Data directory inside the container (bind-mounted) |
 | `KB_MODEL` | `all-MiniLM-L6-v2` | HuggingFace embedding model name |
-| `KB_DEVICE` | `auto` | Embedding device: `auto`, `cpu`, or `cuda` |
-| `KB_INGEST_DEVICE` | `auto` | Docling layout detection device |
+| `KB_DEVICE` | `auto` | Embedding/search device: `auto`, `cpu`, or `cuda` |
+| `KB_INGEST_DEVICE` | `auto` | Docling layout detection device: `auto`, `cpu`, or `cuda` |
 | `KB_API_KEY` | (none) | Optional Bearer token for API authentication |
 | `KB_SEARCH_THRESHOLD` | `0.01` | Minimum score for search results (filters noise) |
 | `KB_PORT` | `8000` | Port to expose |
-| `KB_DATA_PATH` | `./data` | Host path for bind mount (compose variable) |
+| `KB_HOST` | `0.0.0.0` | Host to bind to |
+| `HF_HUB_OFFLINE` | (none) | Set to `1` to prevent model downloads (use cached only) |
+| `KB_DATA_PATH` | `./data` | Host path for bind mount (compose variable, not used by engine) |

 ## Data portability

@@ -134,7 +227,8 @@ All endpoints are under `/api/v1/`. Requires `Authorization: Bearer <key>` heade
 | `GET` | `/jobs/{id}` | Job details |
 | `GET` | `/documents` | List documents |
 | `GET` | `/documents/{id}` | Document details with chunks |
-| `DELETE` | `/documents/{id}` | Remove a document |
+| `GET` | `/documents/{id}/file` | Download original file |
+| `DELETE` | `/documents/{id}` | Remove a document (and stored file) |
 | `PUT` | `/documents/{id}/tags` | Add/remove tags |
 | `GET` | `/tags` | List all tags |
 | `GET` | `/status` | Engine status, GPU info, DB stats |
@@ -142,26 +236,31 @@ All endpoints are under `/api/v1/`. Requires `Authorization: Bearer <key>` heade

 ## Building and releasing

-Versioning is managed via `client/VERSION` and `engine/VERSION` files. The release script bumps these, builds all artifacts, tags, and publishes in one step.
+Client and engine are versioned independently via `client/VERSION` and `engine/VERSION`. Each has its own release script and git tag prefix.

-### Release
+### Release client

 ```bash
-./release.sh --gitea              # patch bump (e.g. 2.0.0 → 2.0.1), release via Gitea
-./release.sh --github --minor     # minor bump (e.g. 2.0.1 → 2.1.0), release via GitHub
-./release.sh --gitea --major      # major bump (e.g. 2.1.0 → 3.0.0)
-./release.sh --gitea --no-increment  # release current version as-is
-./release.sh --gitea --dry-run    # preview without doing anything
+./release-client.sh --gitea              # patch bump, release via Gitea
+./release-client.sh --github --minor     # minor bump, release via GitHub
+./release-client.sh --gitea --no-increment  # release current version as-is
+./release-client.sh --gitea --dry-run    # preview without doing anything
 ```

-The script will:
+Creates tag `client-vX.Y.Z`, builds Go binaries for all platforms, and creates a Gitea/GitHub release with binaries attached.

-1. Bump the version in both `client/VERSION` and `engine/VERSION` (unless `--no-increment`)
-2. Build Go client binaries for all platforms (linux/darwin/windows, amd64/arm64)
-3. Build Docker engine images for NVIDIA and ROCm
-4. Commit the version bump, create an annotated git tag, and push
-5. Create a release (with client binaries attached) via `tea` or `gh`
-6. Push Docker images to the registry
+The client embeds a `MinEngineVersion` (from `client/MIN_ENGINE_VERSION`) and exits with an error if the connected engine reports an older version.
+
+### Release engine
+
+```bash
+./release-engine.sh --gitea              # patch bump, release via Gitea
+./release-engine.sh --github --minor     # minor bump, release via GitHub
+./release-engine.sh --gitea --no-increment  # release current version as-is
+./release-engine.sh --gitea --dry-run    # preview without doing anything
+```
+
+Creates tag `engine-vX.Y.Z`, builds NVIDIA and ROCm Docker images, creates a Gitea/GitHub release, and pushes images to the registry.

 ### Checking versions

@@ -177,13 +276,13 @@ curl http://localhost:8000/api/v1/status | jq .version

 Images are pushed to `docker.dcglab.co.uk/dcg/kb/engine` with tags:

-- `v2.1.0-nvidia` / `v2.1.0-rocm` — versioned
+- `engine-v2.0.6-nvidia` / `engine-v2.0.6-rocm` — versioned
 - `latest-nvidia` / `latest-rocm` — latest release

 Override the registry and org via environment variables:

 ```bash
-REGISTRY=ghcr.io IMAGE_ORG=myorg ./release.sh --github
+REGISTRY=ghcr.io IMAGE_ORG=myorg ./release-engine.sh --github
 ```

 ## Future: ROCm runtime migration
@@ -1,6 +1,6 @@
 # kb-search skill

-Search the user's personal knowledge base containing PDFs, markdown documents, code snippets, and text notes.
+Search, manage, and add to the user's personal knowledge base containing PDFs, Word docs, HTML, markdown, code files, and text notes.

 ## When to use

@@ -8,10 +8,18 @@ Search the user's personal knowledge base containing PDFs, markdown documents, c
 - User explicitly says "check my notes", "search kb", "look in my knowledge base", "what do my docs say about..."
 - User references documents or notes they've previously stored
 - User asks "how do I..." style questions that their knowledge base likely covers
+- User wants to save a note, add a file, or manage their knowledge base

-## Available commands
+## Quick notes

-### Search (primary)
+```bash
+kb "remember to update DNS records"                # add a note
+kb "server room is building 3, floor 2" --tags ops # add a tagged note
+```
+
+Bare text of two or more words without a subcommand is treated as a note and submitted for ingestion. A single bare word is rejected as an unknown command.
+
+## Search (primary use case)

 ```bash
 kb search "<query>" --top 10 --format json
@@ -20,25 +28,72 @@ kb search "<query>" --top 10 --format json
 Returns JSON with ranked results combining full-text and semantic search.

 **Flags:**
-- `--top N` — number of results (default: 10)
+- `-n, --top N` — number of results (default: 10)
 - `--tags tag1,tag2` — filter by tags (AND logic)
 - `--type pdf|markdown|code|note` — filter by document type
-- `--format json|human` — output format (always use json)
+- `--format json|human` — output format (always use json for parsing)
 - `--fts-only` — keyword search only (skip semantic)
 - `--vec-only` — semantic search only (skip keyword)
 - `--threshold FLOAT` — minimum score cutoff

-### Other useful commands
+## Adding files

 ```bash
-kb list --format json                    # List all documents
-kb list --type pdf --format json         # List only PDFs
-kb tags --format json                    # List tags with counts
-kb info <doc_id> --format json           # Document details
-kb status --format json                  # DB stats
+kb addfile report.pdf                           # single file
+kb addfile report.pdf --tags admin,reference    # with tags
+kb addfile ~/docs/ --recursive                  # directory (recursive)
+kb addfile ~/docs/ --recursive --tags reference # directory with tags
 ```

-## Output format (search)
+Supported file types: `.pdf`, `.docx`, `.html`, `.md`, `.txt`, `.py`, `.sh`, `.go`. Unsupported extensions are rejected before upload.
+
+**Flags:**
+- `--tags tag1,tag2` — tags (comma-separated)
+- `-r, --recursive` — recursively add directory contents
+
+## Document management
+
+```bash
+kb list --format json                    # list all documents
+kb list --type pdf --format json         # filter by type
+kb list --tags admin --format json       # filter by tags
+kb info <doc_id> --format json           # document details with chunks
+kb export <doc_id> -o file.pdf           # download original file
+kb remove <doc_id>                       # remove (prompts for confirmation)
+kb remove <doc_id> --yes                 # remove without confirmation
+```
+
+## Tag management
+
+```bash
+kb tags --format json                    # list all tags with counts
+kb tag <doc_id> --add important,ops      # add tags to a document
+kb tag <doc_id> --remove draft           # remove tags from a document
+```
+
+## Jobs (ingestion queue)
+
+```bash
+kb jobs --format json                    # list recent jobs
+kb jobs --status failed --format json    # filter by status
+kb jobs <job_id> --format json           # job details
+```
+
+## Engine status and maintenance
+
+```bash
+kb status --format json                  # engine status, GPU info, DB stats
+kb reindex --yes                         # re-embed all chunks (skip confirmation)
+```
+
+## Global flags
+
+All commands support:
+- `--format json|human` — output format (always use `json` for machine parsing)
+- `--engine <url>` — engine API URL (default: http://localhost:8000)
+- `--api-key <key>` — API key for authentication
+
+## Search output format

 ```json
 {
@@ -66,7 +121,7 @@ kb status --format json                  # DB stats
 }
 ```

-## How to answer
+## How to answer search queries

 1. Run `kb search "<query>" --top 10 --format json`
 2. Read the returned chunks
@@ -93,7 +148,7 @@ Query 2: kb search "git merge explanation" --top 5 --format json
 Query 3: kb search "git rebase vs merge" --top 5 --format json
 ```

-## Filtering
+## Filtering tips

 Use filters when the question implies a specific domain:

@@ -108,3 +163,4 @@ Use filters when the question implies a specific domain:
 - `source.page` is only present for PDF documents
 - `source.section_header` is only present for markdown documents with headers
 - Results are already ranked by relevance (hybrid FTS + vector search)
+- Duplicate files are detected at upload time (HTTP 409); the client reports the existing document or job ID rather than failing
@@ -0,0 +1 @@
+2.0.0
@@ -1,5 +1,6 @@
 VERSION ?= $(shell cat VERSION 2>/dev/null || echo "dev")
-LDFLAGS := -ldflags "-s -w -X github.com/kb-search/kb/cmd.Version=$(VERSION)"
+MIN_ENGINE_VERSION ?= $(shell cat MIN_ENGINE_VERSION 2>/dev/null || echo "dev")
+LDFLAGS := -ldflags "-s -w -X github.com/kb-search/kb/cmd.Version=$(VERSION) -X github.com/kb-search/kb/cmd.MinEngineVersion=$(MIN_ENGINE_VERSION)"

 PLATFORMS := linux/amd64 linux/arm64 darwin/amd64 darwin/arm64 windows/amd64

@@ -1 +1 @@
-2.0.4
+2.1.0
@@ -6,6 +6,7 @@ import (
 	"net/http"
 	"os"
 	"path/filepath"
+	"sort"
 	"strings"

 	"github.com/kb-search/kb/internal/api"
@@ -39,93 +40,25 @@ var supportedExts = map[string]bool{
 	".go":   true,
 }

-var addCmd = &cobra.Command{
-	Use:   "add <path>",
-	Short: "Add a document or directory to the knowledge base",
-	Args:  cobra.MaximumNArgs(1),
-	RunE:  runAdd,
+var addfileCmd = &cobra.Command{
+	Use:   "addfile <path>",
+	Short: "Upload a file or directory to the knowledge base",
+	Args:  cobra.ExactArgs(1),
+	RunE:  runAddfile,
 }

 func init() {
-	addCmd.Flags().String("tags", "", "tags (comma-separated)")
-	addCmd.Flags().String("type", "", "document type")
-	addCmd.Flags().BoolP("recursive", "r", false, "recursively add directory contents")
-	addCmd.Flags().String("note", "", "add a text note instead of a file")
-	addCmd.Flags().String("title", "", "title for the note")
-	rootCmd.AddCommand(addCmd)
+	addfileCmd.Flags().String("tags", "", "tags (comma-separated)")
+	addfileCmd.Flags().BoolP("recursive", "r", false, "recursively add directory contents")
+	rootCmd.AddCommand(addfileCmd)
 }

-func runAdd(cmd *cobra.Command, args []string) error {
+func runAddfile(cmd *cobra.Command, args []string) error {
 	tags, _ := cmd.Flags().GetString("tags")
-	docType, _ := cmd.Flags().GetString("type")
 	recursive, _ := cmd.Flags().GetBool("recursive")
-	note, _ := cmd.Flags().GetString("note")
-	title, _ := cmd.Flags().GetString("title")

 	client := api.NewClient()

-	// Note mode
-	if note != "" {
-		fields := map[string]string{
-			"note": note,
-		}
-		if title != "" {
-			fields["title"] = title
-		}
-		if tags != "" {
-			fields["tags"] = tags
-		}
-		if docType != "" {
-			fields["type"] = docType
-		}
-
-		resp, err := client.PostMultipart("/api/v1/jobs", fields, nil)
-		if err != nil {
-			fmt.Fprintln(os.Stderr, err)
-			os.Exit(1)
-		}
-
-		if resp.StatusCode == http.StatusConflict {
-			var result interface{}
-			if err := api.DecodeJSON(resp, &result); err != nil {
-				return fmt.Errorf("failed to decode response: %w", err)
-			}
-			if output.IsJSON() {
-				output.PrintJSON(result)
-			} else {
-				if m, ok := result.(map[string]interface{}); ok {
-					if docID, ok := m["document_id"].(float64); ok {
-						fmt.Printf("Already imported: %s (doc ID: %.0f)\n", m["title"], docID)
-					} else if jobID, ok := m["job_id"].(float64); ok {
-						fmt.Printf("Already queued: %s (job ID: %.0f)\n", m["title"], jobID)
-					}
-				}
-			}
-			return nil
-		}
-
-		if err := api.CheckError(resp); err != nil {
-			fmt.Fprintln(os.Stderr, err)
-			os.Exit(1)
-		}
-
-		var result interface{}
-		if err := api.DecodeJSON(resp, &result); err != nil {
-			return fmt.Errorf("failed to decode response: %w", err)
-		}
-
-		if output.IsJSON() {
-			output.PrintJSON(result)
-		} else {
-			fmt.Println("Queued: note")
-		}
-		return nil
-	}
-
-	if len(args) == 0 {
-		return fmt.Errorf("path argument is required (or use --note)")
-	}
-
 	path := args[0]
 	info, err := os.Stat(path)
 	if err != nil {
@@ -133,8 +66,19 @@ func runAdd(cmd *cobra.Command, args []string) error {
 	}

 	if !info.IsDir() {
+		// Validate extension
+		ext := strings.ToLower(filepath.Ext(path))
+		if !supportedExts[ext] {
+			supported := make([]string, 0, len(supportedExts))
+			for e := range supportedExts {
+				supported = append(supported, e)
+			}
+			sort.Strings(supported)
+			return fmt.Errorf("unsupported file type %q — supported: %s", ext, strings.Join(supported, ", "))
+		}
+
 		// Single file upload
-		result, err := uploadFile(client, path, tags, docType)
+		result, err := uploadFile(client, path, tags)
 		if err != nil {
 			fmt.Fprintln(os.Stderr, err)
 			os.Exit(1)
@@ -177,7 +121,7 @@ func runAdd(cmd *cobra.Command, args []string) error {
 	queued := 0
 	duplicates := 0
 	for _, f := range files {
-		result, err := uploadFile(client, f, tags, docType)
+		result, err := uploadFile(client, f, tags)
 		if err != nil {
 			fmt.Fprintf(os.Stderr, "Error uploading %s: %v\n", f, err)
 			continue
@@ -206,7 +150,7 @@ func runAdd(cmd *cobra.Command, args []string) error {
 	return nil
 }

-func uploadFile(client *api.Client, path, tags, docType string) (*uploadResult, error) {
+func uploadFile(client *api.Client, path, tags string) (*uploadResult, error) {
 	f, err := os.Open(path)
 	if err != nil {
 		return nil, fmt.Errorf("cannot open %s: %w", path, err)
@@ -217,9 +161,6 @@ func uploadFile(client *api.Client, path, tags, docType string) (*uploadResult,
 	if tags != "" {
 		fields["tags"] = tags
 	}
-	if docType != "" {
-		fields["type"] = docType
-	}

 	upload := &api.FileUpload{
 		FieldName: "file",
@@ -264,3 +205,54 @@ func uploadFile(client *api.Client, path, tags, docType string) (*uploadResult,
 	}
 	return &uploadResult{Raw: result}, nil
 }
+
+func submitNote(client *api.Client, note, tags string) error {
+	fields := map[string]string{
+		"note": note,
+	}
+	if tags != "" {
+		fields["tags"] = tags
+	}
+
+	resp, err := client.PostMultipart("/api/v1/jobs", fields, nil)
+	if err != nil {
+		fmt.Fprintln(os.Stderr, err)
+		os.Exit(1)
+	}
+
+	if resp.StatusCode == http.StatusConflict {
+		var result interface{}
+		if err := api.DecodeJSON(resp, &result); err != nil {
+			return fmt.Errorf("failed to decode response: %w", err)
+		}
+		if output.IsJSON() {
+			output.PrintJSON(result)
+		} else {
+			if m, ok := result.(map[string]interface{}); ok {
+				if docID, ok := m["document_id"].(float64); ok {
+					fmt.Printf("Already imported: %s (doc ID: %.0f)\n", m["title"], docID)
+				} else if jobID, ok := m["job_id"].(float64); ok {
+					fmt.Printf("Already queued: %s (job ID: %.0f)\n", m["title"], jobID)
+				}
+			}
+		}
+		return nil
+	}
+
+	if err := api.CheckError(resp); err != nil {
+		fmt.Fprintln(os.Stderr, err)
+		os.Exit(1)
+	}
+
+	var result interface{}
+	if err := api.DecodeJSON(resp, &result); err != nil {
+		return fmt.Errorf("failed to decode response: %w", err)
+	}
+
+	if output.IsJSON() {
+		output.PrintJSON(result)
+	} else {
+		fmt.Println("Queued: note")
+	}
+	return nil
+}
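
The extension check added to `runAddfile` can be exercised on its own. The sketch below is a standalone approximation: `validateExt` is a hypothetical name, and the extension set is copied from the supported-types list in the skill document in this diff rather than from the full `supportedExts` map, which is only partly visible here.

```go
package main

import (
	"fmt"
	"path/filepath"
	"sort"
	"strings"
)

// Assumed to mirror the client's supportedExts map; taken from the
// documented list of supported file types.
var supportedExts = map[string]bool{
	".pdf": true, ".docx": true, ".html": true, ".md": true,
	".txt": true, ".py": true, ".sh": true, ".go": true,
}

// validateExt (hypothetical helper) returns nil for a supported
// extension and an error listing the supported ones otherwise.
func validateExt(path string) error {
	ext := strings.ToLower(filepath.Ext(path))
	if supportedExts[ext] {
		return nil
	}
	supported := make([]string, 0, len(supportedExts))
	for e := range supportedExts {
		supported = append(supported, e)
	}
	sort.Strings(supported) // map iteration order is random, so sort for a stable message
	return fmt.Errorf("unsupported file type %q, supported: %s", ext, strings.Join(supported, ", "))
}

func main() {
	for _, p := range []string{"report.PDF", "photo.jpeg", "Makefile"} {
		if err := validateExt(p); err != nil {
			fmt.Println(p, "->", err)
			continue
		}
		fmt.Println(p, "-> ok")
	}
}
```

Lower-casing the extension before the lookup means `report.PDF` is accepted, while a file with no extension such as `Makefile` is rejected with an empty extension in the message.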
@@ -0,0 +1,37 @@
+package cmd
+
+import (
+	"fmt"
+
+	"github.com/spf13/cobra"
+)
+
+var examplesCmd = &cobra.Command{
+	Use:   "examples",
+	Short: "Show common usage examples",
+	Args:  cobra.NoArgs,
+	Run: func(cmd *cobra.Command, args []string) {
+		fmt.Print(`Quick notes:
+  kb "Remember to update DNS records"
+  kb "Server room is building 3" --tags ops
+
+Add files:
+  kb addfile report.pdf
+  kb addfile ~/docs/ --recursive --tags reference
+
+Search:
+  kb search "how to restart nginx"
+  kb search "deploy" --tags ops --top 5
+
+Manage documents:
+  kb list --type pdf
+  kb info 3
+  kb tag 3 --add important,ops
+  kb remove 3 --yes
+`)
+	},
+}
+
+func init() {
+	rootCmd.AddCommand(examplesCmd)
+}
@@ -0,0 +1,74 @@
+package cmd
+
+import (
+	"fmt"
+	"io"
+	"mime"
+	"os"
+	"path/filepath"
+
+	"github.com/kb-search/kb/internal/api"
+	"github.com/spf13/cobra"
+)
+
+var exportCmd = &cobra.Command{
+	Use:   "export <id>",
+	Short: "Download original document file",
+	Args:  cobra.ExactArgs(1),
+	RunE:  runExport,
+}
+
+func init() {
+	exportCmd.Flags().StringP("output", "o", "", "output file path, or - for stdout (default: original filename in the current directory)")
+	rootCmd.AddCommand(exportCmd)
+}
+
+func runExport(cmd *cobra.Command, args []string) error {
+	client := api.NewClient()
+	resp, err := client.Get("/api/v1/documents/" + args[0] + "/file")
+	if err != nil {
+		fmt.Fprintln(os.Stderr, err)
+		os.Exit(1)
+	}
+	if err := api.CheckError(resp); err != nil {
+		fmt.Fprintln(os.Stderr, err)
+		os.Exit(1)
+	}
+	defer resp.Body.Close()
+
+	outPath, _ := cmd.Flags().GetString("output")
+
+	if outPath == "" {
+		// Try to get filename from Content-Disposition header
+		cd := resp.Header.Get("Content-Disposition")
+		if cd != "" {
+			_, params, err := mime.ParseMediaType(cd)
+			if err == nil && params["filename"] != "" {
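+				outPath = filepath.Base(params["filename"]) // keep only the final path element so a server-supplied name cannot leave the current directory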
+			}
+		}
+		if outPath == "" {
+			outPath = "document-" + args[0]
+		}
+	}
+
+	if outPath == "-" {
+		_, err := io.Copy(os.Stdout, resp.Body)
+		return err
+	}
+
+	outPath = filepath.Clean(outPath)
+	f, err := os.Create(outPath)
+	if err != nil {
+		return fmt.Errorf("failed to create output file: %w", err)
+	}
+	defer f.Close()
+
+	n, err := io.Copy(f, resp.Body)
+	if err != nil {
+		return fmt.Errorf("failed to write file: %w", err)
+	}
+
+	fmt.Fprintf(os.Stderr, "Saved %s (%d bytes)\n", outPath, n)
+	return nil
+}
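
The way `runExport` picks a default output name from the `Content-Disposition` header can be isolated for testing. The following is a sketch under assumptions: `outputName` is a hypothetical helper, and it reduces the header's filename to its final path element, which is stricter than simply cleaning the path.

```go
package main

import (
	"fmt"
	"mime"
	"path/filepath"
)

// outputName (hypothetical helper) derives a local file name from a
// Content-Disposition header value, falling back to "document-<id>".
func outputName(contentDisposition, docID string) string {
	fallback := "document-" + docID
	if contentDisposition == "" {
		return fallback
	}
	_, params, err := mime.ParseMediaType(contentDisposition)
	if err != nil || params["filename"] == "" {
		return fallback
	}
	// Drop any directory components supplied by the server.
	name := filepath.Base(params["filename"])
	if name == "." || name == ".." || name == string(filepath.Separator) {
		return fallback
	}
	return name
}

func main() {
	headers := []string{
		`attachment; filename="manual.pdf"`,
		`attachment; filename="../../etc/passwd"`,
		``,
	}
	for _, h := range headers {
		fmt.Printf("%q -> %s\n", h, outputName(h, "7"))
	}
}
```

Treating the header as untrusted input keeps an export from writing outside the current directory even if the engine returns an unexpected filename.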
@@ -0,0 +1,83 @@
+package cmd
+
+import (
+	"bufio"
+	"fmt"
+	"os"
+	"strings"
+
+	"github.com/kb-search/kb/internal/api"
+	"github.com/kb-search/kb/internal/output"
+	"github.com/spf13/cobra"
+)
+
+var reindexCmd = &cobra.Command{
+	Use:   "reindex",
+	Short: "Re-embed all chunks with the current engine model",
+	Args:  cobra.NoArgs,
+	RunE:  runReindex,
+}
+
+func init() {
+	reindexCmd.Flags().BoolP("yes", "y", false, "skip confirmation prompt")
+	rootCmd.AddCommand(reindexCmd)
+}
+
+func runReindex(cmd *cobra.Command, args []string) error {
+	yes, _ := cmd.Flags().GetBool("yes")
+
+	client := api.NewClient()
+
+	if !yes {
+		// Fetch model name from engine status
+		modelName := "current"
+		statusResp, err := client.Get("/api/v1/status")
+		if err == nil && api.CheckError(statusResp) == nil {
+			var status struct {
+				ModelName string `json:"model_name"`
+			}
+			if api.DecodeJSON(statusResp, &status) == nil && status.ModelName != "" {
+				modelName = status.ModelName
+			}
+		}
+
+		fmt.Printf("Reindex all chunks? This will re-embed everything with the %s model. [y/N] ", modelName)
+		reader := bufio.NewReader(os.Stdin)
+		answer, _ := reader.ReadString('\n')
+		answer = strings.TrimSpace(strings.ToLower(answer))
+		if answer != "y" && answer != "yes" {
+			fmt.Println("Cancelled.")
+			return nil
+		}
+	}
+	resp, err := client.Post("/api/v1/reindex", nil)
+	if err != nil {
+		fmt.Fprintln(os.Stderr, err)
+		os.Exit(1)
+	}
+	if err := api.CheckError(resp); err != nil {
+		fmt.Fprintln(os.Stderr, err)
+		os.Exit(1)
+	}
+
+	var result struct {
+		ChunksReindexed int    `json:"chunks_reindexed"`
+		Model           string `json:"model"`
+	}
+
+	if output.IsJSON() {
+		var raw interface{}
+		if err := api.DecodeJSON(resp, &raw); err != nil {
+			fmt.Fprintln(os.Stderr, "Failed to parse response:", err)
+			os.Exit(1)
+		}
+		output.PrintJSON(raw)
+	} else {
+		if err := api.DecodeJSON(resp, &result); err != nil {
+			fmt.Fprintln(os.Stderr, "Failed to parse response:", err)
+			os.Exit(1)
+		}
+		fmt.Printf("Reindexed %d chunks (model: %s)\n", result.ChunksReindexed, result.Model)
+	}
+	return nil
+}
@@ -3,7 +3,9 @@ package cmd
 import (
 	"fmt"
 	"os"
+	"strings"

+	"github.com/kb-search/kb/internal/api"
 	"github.com/kb-search/kb/internal/config"
 	"github.com/spf13/cobra"
 )
@@ -11,6 +13,9 @@ import (
 // Version is set at build time via -ldflags.
 var Version = "dev"

+// MinEngineVersion is set at build time via -ldflags.
+var MinEngineVersion = "dev"
+
 var (
 	flagEngine string
 	flagFormat string
@@ -18,9 +23,10 @@ var (
 )

 var rootCmd = &cobra.Command{
-	Use:   "kb",
+	Use:   "kb [\"note text\" | command]",
 	Short: "kb-search CLI client",
-	Long:  "A CLI client for the kb-search v2 engine API.",
+	Long:  "A CLI client for the kb-search v2 engine API.\nRun 'kb examples' for common usage patterns.",
+	Args:  cobra.ArbitraryArgs,
 	PersistentPreRunE: func(cmd *cobra.Command, args []string) error {
 		if err := config.Load(); err != nil {
 			return err
@@ -28,13 +34,44 @@ var rootCmd = &cobra.Command{
 		config.ApplyFlags(flagEngine, flagFormat, flagAPIKey)
 		return nil
 	},
+	RunE: func(cmd *cobra.Command, args []string) error {
+		if len(args) == 0 {
+			return cmd.Help()
+		}
+		if len(args) == 1 && len(strings.Fields(args[0])) < 2 { // a quoted multi-word note arrives as a single argument
+			return fmt.Errorf("unknown command %q\nTo add a note, use: kb \"%s ...\" or pass multiple words", args[0], args[0])
+		}
+		note := strings.Join(args, " ")
+		tags, _ := cmd.Flags().GetString("tags")
+		client := api.NewClient()
+		return submitNote(client, note, tags)
+	},
 }

 func init() {
+	api.SetVersionInfo(Version, MinEngineVersion)
 	rootCmd.Version = Version
+	rootCmd.SetUsageTemplate(`Quick note taking (must be more than one word):
+  kb "note text here" [flags]
+
+Normal usage:
+  kb [command] [flags]{{if .HasAvailableSubCommands}}
+
+Available Commands:{{range .Commands}}{{if (or .IsAvailableCommand (eq .Name "help"))}}
+  {{rpad .Name .NamePadding }} {{.Short}}{{end}}{{end}}{{end}}{{if .HasAvailableLocalFlags}}
+
+Flags:
+{{.LocalFlags.FlagUsages | trimTrailingWhitespaces}}{{end}}{{if .HasAvailableInheritedFlags}}
+
+Global Flags:
+{{.InheritedFlags.FlagUsages | trimTrailingWhitespaces}}{{end}}
+
+Use "{{.CommandPath}} [command] --help" for more information about a command.
+`)
 	rootCmd.PersistentFlags().StringVar(&flagEngine, "engine", "", "engine API URL")
 	rootCmd.PersistentFlags().StringVar(&flagFormat, "format", "", "output format (human|json)")
 	rootCmd.PersistentFlags().StringVar(&flagAPIKey, "api-key", "", "API key for authentication")
+	rootCmd.Flags().String("tags", "", "tags for note shorthand (comma-separated)")
 }

 // Execute runs the root command.
@@ -0,0 +1,54 @@
+package cmd
+
+import (
+	"bytes"
+	"strings"
+	"testing"
+)
+
+func TestRootCmd_SingleWordRejected(t *testing.T) {
+	rootCmd.SetArgs([]string{"infow"})
+
+	var stderr bytes.Buffer
+	rootCmd.SetErr(&stderr)
+
+	err := rootCmd.Execute()
+	if err == nil {
+		t.Fatal("expected error for single bare word, got nil")
+	}
+
+	errMsg := err.Error()
+	if !strings.Contains(errMsg, `unknown command "infow"`) {
+		t.Errorf("expected error to mention unknown command, got: %s", errMsg)
+	}
+	if !strings.Contains(errMsg, "multiple words") {
+		t.Errorf("expected error to suggest multiple words, got: %s", errMsg)
+	}
+}
+
+func TestRootCmd_MultipleWordsNotRejected(t *testing.T) {
+	rootCmd.SetArgs([]string{"remember", "to", "update", "dns"})
+
+	err := rootCmd.Execute()
+	// Will fail at API call (no server), but should NOT be the "unknown command" error
+	if err != nil && strings.Contains(err.Error(), "unknown command") {
+		t.Errorf("multi-word input should not be rejected as unknown command, got: %s", err.Error())
+	}
+}
+
+func TestRootCmd_NoArgs_ShowsHelp(t *testing.T) {
+	rootCmd.SetArgs([]string{})
+
+	var stdout bytes.Buffer
+	rootCmd.SetOut(&stdout)
+
+	err := rootCmd.Execute()
+	if err != nil {
+		t.Fatalf("expected no error for zero args, got: %v", err)
+	}
+
+	output := stdout.String()
+	if !strings.Contains(output, "Available Commands") {
+		t.Errorf("expected help output, got: %s", output)
+	}
+}
@@ -7,6 +7,9 @@ import (
 	"io"
 	"mime/multipart"
 	"net/http"
+	"os"
+	"strconv"
+	"strings"

 	"github.com/kb-search/kb/internal/config"
 )
@@ -18,11 +21,25 @@ type FileUpload struct {
 	Reader    io.Reader
 }

+// Package-level version info, set once by cmd.init via SetVersionInfo.
+var (
+	clientVersion    string
+	minEngineVersion string
+)
+
+// SetVersionInfo configures the client and minimum engine version for compatibility checking.
+// Called once from cmd package initialization.
+func SetVersionInfo(cv, minEV string) {
+	clientVersion = cv
+	minEngineVersion = minEV
+}
+
 // Client is an HTTP client for the kb-search engine API.
 type Client struct {
-	baseURL    string
-	apiKey     string
-	httpClient *http.Client
+	baseURL        string
+	apiKey         string
+	httpClient     *http.Client
+	versionChecked bool
 }

 // NewClient creates a Client from the current configuration.
@@ -48,6 +65,7 @@ func (c *Client) newRequest(method, path string, body io.Reader) (*http.Request,
 }

 func (c *Client) do(req *http.Request) (*http.Response, error) {
+	c.checkEngineVersion()
 	resp, err := c.httpClient.Do(req)
 	if err != nil {
 		return nil, fmt.Errorf("Cannot reach engine at %s: %v", c.baseURL, err)
@@ -55,6 +73,71 @@ func (c *Client) do(req *http.Request) (*http.Response, error) {
 	return resp, nil
 }

+func (c *Client) checkEngineVersion() {
+	if c.versionChecked {
+		return
+	}
+	c.versionChecked = true
+
+	minVer := minEngineVersion
+	if minVer == "" || minVer == "dev" {
+		return
+	}
+
+	statusReq, err := c.newRequest(http.MethodGet, "/api/v1/status", nil)
+	if err != nil {
+		return
+	}
+	resp, err := c.httpClient.Do(statusReq)
+	if err != nil {
+		return // unreachable — let the actual request surface the error
+	}
+	defer resp.Body.Close()
+
+	var status struct {
+		Version string `json:"version"`
+	}
+	if err := json.NewDecoder(resp.Body).Decode(&status); err != nil {
+		return
+	}
+
+	if !semverAtLeast(status.Version, minVer) {
+		fmt.Fprintf(os.Stderr, "Error: kb client v%s requires engine v%s+ (connected engine is v%s)\nUpdate your engine image to engine-v%s or later.\n",
+			clientVersion, minVer, status.Version, minVer)
+		os.Exit(1)
+	}
+}
+
+// semverAtLeast returns true if version >= minimum, comparing major.minor.patch.
+func semverAtLeast(version, minimum string) bool {
+	parse := func(s string) (int, int, int) {
+		s = strings.TrimPrefix(s, "v")
+		parts := strings.SplitN(s, ".", 3)
+		var major, minor, patch int
+		if len(parts) >= 1 {
+			major, _ = strconv.Atoi(parts[0])
+		}
+		if len(parts) >= 2 {
+			minor, _ = strconv.Atoi(parts[1])
+		}
+		if len(parts) >= 3 {
+			patch, _ = strconv.Atoi(parts[2])
+		}
+		return major, minor, patch
+	}
+
+	vMaj, vMin, vPat := parse(version)
+	mMaj, mMin, mPat := parse(minimum)
+
+	if vMaj != mMaj {
+		return vMaj > mMaj
+	}
+	if vMin != mMin {
+		return vMin > mMin
+	}
+	return vPat >= mPat
+}
+
 // Get performs a GET request to the given path.
 func (c *Client) Get(path string) (*http.Response, error) {
 	req, err := c.newRequest(http.MethodGet, path, nil)
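
`semverAtLeast` in the hunk above ignores `strconv.Atoi` errors, so a malformed or suffixed component such as `0-rc1` is silently treated as `0`. A stricter variant is sketched below for comparison; `parseVersion` and `atLeast` are hypothetical names, and rejecting anything other than `major.minor.patch` is an assumption of this sketch, not behavior the client has.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseVersion (hypothetical) parses "2.1.0" or "v2.1.0" into three
// non-negative integers and rejects every other shape.
func parseVersion(s string) ([3]int, error) {
	var out [3]int
	parts := strings.Split(strings.TrimPrefix(s, "v"), ".")
	if len(parts) != 3 {
		return out, fmt.Errorf("version %q: want major.minor.patch", s)
	}
	for i, p := range parts {
		n, err := strconv.Atoi(p)
		if err != nil || n < 0 {
			return out, fmt.Errorf("version %q: bad component %q", s, p)
		}
		out[i] = n
	}
	return out, nil
}

// atLeast (hypothetical) reports whether version >= minimum, comparing
// major, then minor, then patch.
func atLeast(version, minimum string) (bool, error) {
	v, err := parseVersion(version)
	if err != nil {
		return false, err
	}
	m, err := parseVersion(minimum)
	if err != nil {
		return false, err
	}
	for i := range v {
		if v[i] != m[i] {
			return v[i] > m[i], nil
		}
	}
	return true, nil // all three components are equal
}

func main() {
	pairs := [][2]string{{"2.1.0", "2.0.0"}, {"2.0.9", "2.1.0"}, {"2.1.0-rc1", "2.1.0"}}
	for _, p := range pairs {
		ok, err := atLeast(p[0], p[1])
		fmt.Printf("%s >= %s: %v (err: %v)\n", p[0], p[1], ok, err)
	}
}
```

Whether a parse failure should block the client or be skipped, as the current check does for an unreachable engine, is a design decision this sketch leaves to the caller.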
@@ -0,0 +1,136 @@
+package api
+
+import (
+	"encoding/json"
+	"net/http"
+	"net/http/httptest"
+	"testing"
+)
+
+func TestSemverAtLeast(t *testing.T) {
+	tests := []struct {
+		version  string
+		minimum  string
+		expected bool
+	}{
+		{"2.1.0", "2.0.0", true},
+		{"2.0.0", "2.0.0", true},
+		{"2.0.5", "2.0.0", true},
+		{"2.1.5", "2.1.0", true},
+		{"2.0.9", "2.1.0", false},
+		{"1.9.9", "2.0.0", false},
+		{"3.0.0", "2.9.9", true},
+		{"2.0.0", "2.0.1", false},
+	}
+
+	for _, tt := range tests {
+		t.Run(tt.version+">="+tt.minimum, func(t *testing.T) {
+			got := semverAtLeast(tt.version, tt.minimum)
+			if got != tt.expected {
+				t.Errorf("semverAtLeast(%q, %q) = %v, want %v", tt.version, tt.minimum, got, tt.expected)
+			}
+		})
+	}
+}
+
+func TestCheckEngineVersion_Compatible(t *testing.T) {
+	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
+		json.NewEncoder(w).Encode(map[string]string{"version": "2.1.0"})
+	}))
+	defer srv.Close()
+
+	clientVersion = "2.2.0"
+	minEngineVersion = "2.1.0"
+	defer func() { clientVersion = ""; minEngineVersion = "" }()
+
+	c := &Client{
+		baseURL:    srv.URL,
+		httpClient: &http.Client{},
+	}
+
+	// Should not panic or exit
+	c.checkEngineVersion()
+
+	if !c.versionChecked {
+		t.Error("versionChecked should be true after check")
+	}
+}
+
+func TestCheckEngineVersion_SkipsWhenDev(t *testing.T) {
+	clientVersion = "dev"
+	minEngineVersion = "dev"
+	defer func() { clientVersion = ""; minEngineVersion = "" }()
+
+	c := &Client{
+		baseURL:    "http://localhost:99999",
+		httpClient: &http.Client{},
+	}
+
+	// Should not attempt connection
+	c.checkEngineVersion()
+
+	if !c.versionChecked {
+		t.Error("versionChecked should be true after skipping")
+	}
+}
+
+func TestCheckEngineVersion_SkipsWhenEmpty(t *testing.T) {
+	clientVersion = "1.0.0"
+	minEngineVersion = ""
+	defer func() { clientVersion = ""; minEngineVersion = "" }()
+
+	c := &Client{
+		baseURL:    "http://localhost:99999",
+		httpClient: &http.Client{},
+	}
+
+	c.checkEngineVersion()
+
+	if !c.versionChecked {
+		t.Error("versionChecked should be true after skipping")
+	}
+}
+
+func TestCheckEngineVersion_SkipsWhenUnreachable(t *testing.T) {
+	clientVersion = "2.0.0"
+	minEngineVersion = "2.0.0"
+	defer func() { clientVersion = ""; minEngineVersion = "" }()
+
+	c := &Client{
+		baseURL:    "http://localhost:99999",
+		httpClient: &http.Client{},
+	}
+
+	// Should not panic — just skip
+	c.checkEngineVersion()
+
+	if !c.versionChecked {
+		t.Error("versionChecked should be true even when unreachable")
+	}
+}
+
+func TestCheckEngineVersion_CachedAfterFirstCall(t *testing.T) {
+	callCount := 0
+	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
+		callCount++
+		json.NewEncoder(w).Encode(map[string]string{"version": "2.1.0"})
+	}))
+	defer srv.Close()
+
+	clientVersion = "2.1.0"
+	minEngineVersion = "2.0.0"
+	defer func() { clientVersion = ""; minEngineVersion = "" }()
+
+	c := &Client{
+		baseURL:    srv.URL,
+		httpClient: &http.Client{},
+	}
+
+	c.checkEngineVersion()
+	c.checkEngineVersion()
+	c.checkEngineVersion()
+
+	if callCount != 1 {
+		t.Errorf("expected 1 status call, got %d", callCount)
+	}
+}
@@ -1 +1 @@
-2.0.4
+2.1.0
@@ -21,4 +21,5 @@ services:
      - KB_INGEST_DEVICE=${KB_INGEST_DEVICE:-auto}
      - KB_API_KEY=${KB_API_KEY:-}
      - KB_SEARCH_THRESHOLD=${KB_SEARCH_THRESHOLD:-0.01}
+      - HF_HUB_OFFLINE=${HF_HUB_OFFLINE:-}
    restart: unless-stopped
@@ -18,4 +18,5 @@ services:
      - KB_INGEST_DEVICE=${KB_INGEST_DEVICE:-auto}
      - KB_API_KEY=${KB_API_KEY:-}
      - KB_SEARCH_THRESHOLD=${KB_SEARCH_THRESHOLD:-0.01}
+      - HF_HUB_OFFLINE=${HF_HUB_OFFLINE:-}
    restart: unless-stopped
@@ -35,10 +35,15 @@ class Config:
    def staging_dir(self) -> Path:
        return self.data_dir / "staging"

+    @property
+    def documents_dir(self) -> Path:
+        return self.data_dir / "documents"
+
    def ensure_dirs(self):
        self.data_dir.mkdir(parents=True, exist_ok=True)
        self.hf_cache.mkdir(exist_ok=True)
        self.staging_dir.mkdir(exist_ok=True)
+        self.documents_dir.mkdir(exist_ok=True)


 cfg = Config()
@@ -10,6 +10,60 @@ import struct
 from typing import Any, Optional


+def build_enriched_text(title: str, chunk_text: str, metadata: dict | None = None) -> str:
+    """Build enriched text by prepending document title and optional section header.
+
+    Format: "{title} > {section_header}\\n\\n{chunk_text}" or "{title}\\n\\n{chunk_text}".
+    """
+    section_header = (metadata or {}).get("section_header")
+    if section_header:
+        return f"{title} > {section_header}\n\n{chunk_text}"
+    return f"{title}\n\n{chunk_text}"
+
+
+def _backfill_enriched_text(conn: sqlite3.Connection) -> None:
+    """Backfill enriched_text for all existing chunks."""
+    rows = conn.execute(
+        "SELECT c.id, c.text, c.metadata, d.title "
+        "FROM chunks c JOIN documents d ON c.document_id = d.id"
+    ).fetchall()
+    for row in rows:
+        metadata = json.loads(row["metadata"]) if row["metadata"] else None
+        enriched = build_enriched_text(row["title"], row["text"], metadata)
+        conn.execute("UPDATE chunks SET enriched_text = ? WHERE id = ?", (enriched, row["id"]))
+
+
+def _rebuild_fts(conn: sqlite3.Connection) -> None:
+    """Drop and recreate chunks_fts to index enriched_text, with updated triggers."""
+    conn.executescript("""
+        DROP TRIGGER IF EXISTS chunks_ai;
+        DROP TRIGGER IF EXISTS chunks_ad;
+        DROP TRIGGER IF EXISTS chunks_au;
+        DROP TABLE IF EXISTS chunks_fts;
+
+        CREATE VIRTUAL TABLE chunks_fts USING fts5(
+            text,
+            content=chunks,
+            content_rowid=id
+        );
+
+        CREATE TRIGGER chunks_ai AFTER INSERT ON chunks BEGIN
+            INSERT INTO chunks_fts(rowid, text) VALUES (new.id, new.enriched_text);
+        END;
+
+        CREATE TRIGGER chunks_ad AFTER DELETE ON chunks BEGIN
+            INSERT INTO chunks_fts(chunks_fts, rowid, text) VALUES ('delete', old.id, old.enriched_text);
+        END;
+
+        CREATE TRIGGER chunks_au AFTER UPDATE ON chunks BEGIN
+            INSERT INTO chunks_fts(chunks_fts, rowid, text) VALUES ('delete', old.id, old.enriched_text);
+            INSERT INTO chunks_fts(rowid, text) VALUES (new.id, new.enriched_text);
+        END;
+    """)
+    # Repopulate FTS from existing enriched_text
+    conn.execute("INSERT INTO chunks_fts(rowid, text) SELECT id, enriched_text FROM chunks")
+
+
 def get_connection(db_path: str) -> sqlite3.Connection:
    """Return a sqlite3 connection with WAL mode, Row factory, and foreign keys enabled."""
    import sqlite_vec
@@ -34,6 +88,8 @@ def init_schema(conn: sqlite3.Connection, embedding_dim: int) -> None:
            content_hash TEXT UNIQUE,
            doc_type TEXT,
            language TEXT,
+            stored_path TEXT,
+            original_filename TEXT,
            created_at TEXT DEFAULT current_timestamp
        );

@@ -42,6 +98,7 @@ def init_schema(conn: sqlite3.Connection, embedding_dim: int) -> None:
            document_id INTEGER REFERENCES documents(id) ON DELETE CASCADE,
            chunk_index INTEGER,
            text TEXT,
+            enriched_text TEXT,
            token_count INTEGER,
            metadata TEXT DEFAULT '{{}}',
            UNIQUE(document_id, chunk_index)
@@ -53,18 +110,18 @@ def init_schema(conn: sqlite3.Connection, embedding_dim: int) -> None:
            content_rowid=id
        );

-        -- Triggers to keep FTS index in sync with chunks table
+        -- Triggers to keep FTS index in sync with chunks table (using enriched_text)
        CREATE TRIGGER IF NOT EXISTS chunks_ai AFTER INSERT ON chunks BEGIN
-            INSERT INTO chunks_fts(rowid, text) VALUES (new.id, new.text);
+            INSERT INTO chunks_fts(rowid, text) VALUES (new.id, new.enriched_text);
        END;

        CREATE TRIGGER IF NOT EXISTS chunks_ad AFTER DELETE ON chunks BEGIN
-            INSERT INTO chunks_fts(chunks_fts, rowid, text) VALUES ('delete', old.id, old.text);
+            INSERT INTO chunks_fts(chunks_fts, rowid, text) VALUES ('delete', old.id, old.enriched_text);
        END;

        CREATE TRIGGER IF NOT EXISTS chunks_au AFTER UPDATE ON chunks BEGIN
-            INSERT INTO chunks_fts(chunks_fts, rowid, text) VALUES ('delete', old.id, old.text);
-            INSERT INTO chunks_fts(rowid, text) VALUES (new.id, new.text);
+            INSERT INTO chunks_fts(chunks_fts, rowid, text) VALUES ('delete', old.id, old.enriched_text);
+            INSERT INTO chunks_fts(rowid, text) VALUES (new.id, new.enriched_text);
        END;

        CREATE TABLE IF NOT EXISTS tags (
@@ -114,6 +171,20 @@ def init_schema(conn: sqlite3.Connection, embedding_dim: int) -> None:
    if "content_hash" not in cols:
        conn.execute("ALTER TABLE jobs ADD COLUMN content_hash TEXT")

+    # Migrate: add stored_path and original_filename to documents if missing
+    doc_cols = {row[1] for row in conn.execute("PRAGMA table_info(documents)").fetchall()}
+    if "stored_path" not in doc_cols:
+        conn.execute("ALTER TABLE documents ADD COLUMN stored_path TEXT")
+    if "original_filename" not in doc_cols:
+        conn.execute("ALTER TABLE documents ADD COLUMN original_filename TEXT")
+
+    # Migrate: add enriched_text to chunks and rebuild FTS to index it
+    chunk_cols = {row[1] for row in conn.execute("PRAGMA table_info(chunks)").fetchall()}
+    if "enriched_text" not in chunk_cols:
+        conn.execute("ALTER TABLE chunks ADD COLUMN enriched_text TEXT")
+        _backfill_enriched_text(conn)
+        _rebuild_fts(conn)
+
    conn.commit()
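
Review note: the `enriched_text` migration above follows a check-then-alter pattern that is safe to run on every startup. The sketch below reproduces that pattern against an in-memory database, assuming a cut-down schema. `migrate_enriched_text` is a hypothetical helper written for illustration, not a function from this commit, and the FTS rebuild step is left out.

```python
import json
import sqlite3


def migrate_enriched_text(conn: sqlite3.Connection) -> bool:
    """Add chunks.enriched_text if missing and backfill it. Returns True if it ran."""
    cols = {row[1] for row in conn.execute("PRAGMA table_info(chunks)")}
    if "enriched_text" in cols:
        return False  # already migrated, nothing to do
    conn.execute("ALTER TABLE chunks ADD COLUMN enriched_text TEXT")
    rows = conn.execute(
        "SELECT c.id, c.text, c.metadata, d.title "
        "FROM chunks c JOIN documents d ON c.document_id = d.id"
    ).fetchall()
    for chunk_id, text, metadata, title in rows:
        # Same format as build_enriched_text: "title > section" or just "title".
        header = (json.loads(metadata) if metadata else {}).get("section_header")
        prefix = f"{title} > {header}" if header else title
        conn.execute(
            "UPDATE chunks SET enriched_text = ? WHERE id = ?",
            (f"{prefix}\n\n{text}", chunk_id),
        )
    conn.commit()
    return True


# Hypothetical pre-migration database with two chunks of one document.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE documents (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE chunks (
        id INTEGER PRIMARY KEY,
        document_id INTEGER REFERENCES documents(id),
        text TEXT,
        metadata TEXT
    );
    INSERT INTO documents(id, title) VALUES (1, 'Engine Manual');
    INSERT INTO chunks(document_id, text, metadata)
        VALUES (1, 'Check the oil level.', '{"section_header": "Maintenance"}');
    INSERT INTO chunks(document_id, text, metadata) VALUES (1, 'Index.', NULL);
""")

first_run = migrate_enriched_text(conn)   # adds the column and backfills
second_run = migrate_enriched_text(conn)  # column exists, so this is a no-op
enriched = [r[0] for r in conn.execute("SELECT enriched_text FROM chunks ORDER BY id")]
```

Because the column check gates both the `ALTER TABLE` and the backfill, a second startup does no work.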


@@ -196,6 +267,7 @@ def insert_chunk(
    document_id: int,
    chunk_index: int,
    text: str,
+    enriched_text: str | None = None,
    token_count: Optional[int] = None,
    metadata: Any = None,
 ) -> int:
@@ -208,8 +280,8 @@ def insert_chunk(
        metadata_str = str(metadata)

    cur = conn.execute(
-        "INSERT INTO chunks(document_id, chunk_index, text, token_count, metadata) VALUES (?, ?, ?, ?, ?)",
-        (document_id, chunk_index, text, token_count, metadata_str),
+        "INSERT INTO chunks(document_id, chunk_index, text, enriched_text, token_count, metadata) VALUES (?, ?, ?, ?, ?, ?)",
+        (document_id, chunk_index, text, enriched_text or text, token_count, metadata_str),
    )
    conn.commit()
    return cur.lastrowid
@@ -1,14 +1,20 @@
 """Document management endpoints — list, view, and delete documents."""

 import json
+import logging
+import mimetypes
+from pathlib import Path
 from typing import Optional

 from fastapi import HTTPException, Query
+from fastapi.responses import FileResponse

 from main import app
 from kb.config import cfg
 from kb.database import get_connection

+logger = logging.getLogger("kb.routes.documents")
+

@app.get("/api/v1/documents")
 async def list_documents(
@@ -100,8 +106,12 @@ async def get_document(doc_id: int):
            (doc_id,),
        ).fetchall()

+        stored_path = doc["stored_path"]
+        has_file = bool(stored_path and Path(stored_path).exists())
+
        return {
            **dict(doc),
+            "has_file": has_file,
            "tags": [t["name"] for t in tag_rows],
            "chunks": [dict(c) for c in chunks],
        }
@@ -109,12 +119,53 @@ async def get_document(doc_id: int):
        conn.close()


+@app.get("/api/v1/documents/{doc_id}/file")
+async def download_document_file(doc_id: int):
+    conn = get_connection(cfg.db_path)
+    try:
+        doc = conn.execute(
+            "SELECT id, title, stored_path, original_filename FROM documents WHERE id = ?",
+            (doc_id,),
+        ).fetchone()
+        if not doc:
+            raise HTTPException(status_code=404, detail="Document not found.")
+
+        stored_path = doc["stored_path"]
+        if not stored_path:
+            raise HTTPException(
+                status_code=404,
+                detail="Original file not available - ingested before document storage was enabled.",
+            )
+
+        file_path = Path(stored_path)
+        if not file_path.exists():
+            raise HTTPException(
+                status_code=404,
+                detail="Stored file not found on disk.",
+            )
+
+        original_filename = doc["original_filename"]
+        if not original_filename:
+            ext = file_path.suffix
+            original_filename = (doc["title"] or "document") + ext
+
+        media_type = mimetypes.guess_type(original_filename)[0] or "application/octet-stream"
+
+        return FileResponse(
+            path=str(file_path),
+            media_type=media_type,
+            filename=original_filename,
+        )
+    finally:
+        conn.close()
+
+
@app.delete("/api/v1/documents/{doc_id}")
 async def delete_document(doc_id: int):
    conn = get_connection(cfg.db_path)
    try:
        doc = conn.execute(
-            "SELECT id, title FROM documents WHERE id = ?", (doc_id,)
+            "SELECT id, title, stored_path FROM documents WHERE id = ?", (doc_id,)
        ).fetchone()
        if not doc:
            raise HTTPException(status_code=404, detail="Document not found.")
@@ -134,6 +185,19 @@ async def delete_document(doc_id: int):
        conn.execute("DELETE FROM documents WHERE id = ?", (doc_id,))
        conn.commit()

+        # Delete stored file from disk
+        stored_path = doc["stored_path"]
+        if stored_path:
+            try:
+                file_path = Path(stored_path)
+                if file_path.exists():
+                    file_path.unlink()
+                    logger.info("Deleted stored file: %s", stored_path)
+                else:
+                    logger.warning("Stored file already missing: %s", stored_path)
+            except OSError as exc:
+                logger.warning("Failed to delete stored file %s: %s", stored_path, exc)
+
        return {
            "status": "deleted",
            "document_id": doc_id,
@@ -19,10 +19,10 @@ async def reindex():

    conn = get_connection(cfg.db_path)
    try:
-        # Fetch all chunks
-        rows = conn.execute("SELECT id, text FROM chunks ORDER BY id").fetchall()
+        # Fetch all chunks — use enriched_text for embedding (includes title context)
+        rows = conn.execute("SELECT id, enriched_text FROM chunks ORDER BY id").fetchall()
        chunk_ids = [row["id"] for row in rows]
-        chunk_texts = [row["text"] for row in rows]
+        chunk_texts = [row["enriched_text"] or "" for row in rows]

        logger.info("Reindexing %d chunks with model '%s'", len(chunk_ids), cfg.model)

@@ -4,9 +4,11 @@ import asyncio
 import hashlib
 import json
 import logging
+import shutil
 from pathlib import Path

 from kb import config, database, embeddings, staging
+from kb.database import build_enriched_text
 from kb.ingest import detector

 logger = logging.getLogger("kb.worker")
@@ -145,20 +147,30 @@ def _process_job(job_row) -> tuple[str, int | None, int]:
        )

        chunk_texts = [c if isinstance(c, str) else c["text"] for c in chunks]
-        vectors = embeddings.embed_texts(chunk_texts)
+        chunk_metas = []
+        for idx, c in enumerate(chunks):
+            if isinstance(c, str):
+                chunk_metas.append(None)
+            else:
+                meta = {k: v for k, v in c.items() if k != "text"} or None
+                chunk_metas.append(meta)

-        for idx, (chunk_text, vector) in enumerate(zip(chunk_texts, vectors)):
-            metadata = None
-            if not isinstance(chunks[idx], str):
-                metadata = {
-                    k: v for k, v in chunks[idx].items() if k != "text"
-                } or None
+        enriched_texts = [
+            build_enriched_text(title, ct, cm)
+            for ct, cm in zip(chunk_texts, chunk_metas)
+        ]
+        vectors = embeddings.embed_texts(enriched_texts)
+
+        for idx, (chunk_text, enriched, vector) in enumerate(
+            zip(chunk_texts, enriched_texts, vectors)
+        ):
            chunk_id = database.insert_chunk(
                conn,
                document_id=doc_id,
                chunk_index=idx,
                text=chunk_text,
-                metadata=metadata,
+                enriched_text=enriched,
+                metadata=chunk_metas[idx],
            )
            database.insert_embedding(conn, chunk_id, vector)

@@ -168,8 +180,31 @@ def _process_job(job_row) -> tuple[str, int | None, int]:
            database.tag_document(conn, doc_id, tags)

        conn.commit()
+
+        # --- Move original file to persistent storage ---------------------
+        ext = Path(filename).suffix or staged_path.suffix
+        dest = cfg.documents_dir / f"{content_hash}{ext}"
+        try:
+            cfg.documents_dir.mkdir(parents=True, exist_ok=True)
+            shutil.move(str(staged_path), str(dest))
+            conn_update = database.get_connection(cfg.db_path)
+            try:
+                conn_update.execute(
+                    "UPDATE documents SET stored_path = ?, original_filename = ? WHERE id = ?",
+                    (str(dest), filename, doc_id),
+                )
+                conn_update.commit()
+            finally:
+                conn_update.close()
+            logger.info("Stored original file: %s", dest)
+        except Exception as exc:
+            logger.warning("Failed to store original file: %s", exc)
+            staging.cleanup(staged_path)
+
        return ("done", doc_id, len(chunk_texts))

    finally:
        conn.close()
-        staging.cleanup(staged_path)
+        # Only clean up staging if the file is still there (not moved)
+        if staged_path.exists():
+            staging.cleanup(staged_path)
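
Review note: the storage step in this worker change boils down to "name the file after its content hash, keep the upload's extension, and move it". A minimal sketch of that step in isolation follows. `store_original` is a hypothetical helper, and it hashes the file itself for self-containment, whereas the worker reuses the `content_hash` already computed for dedup.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def store_original(staged_path: Path, documents_dir: Path, filename: str) -> Path:
    """Move a staged upload to documents_dir/<sha256><ext> and return the new path."""
    # Assumption: hash computed here; the real worker already has it on the job row.
    content_hash = hashlib.sha256(staged_path.read_bytes()).hexdigest()
    # Prefer the extension of the uploaded filename, fall back to the staged file's.
    ext = Path(filename).suffix or staged_path.suffix
    dest = documents_dir / f"{content_hash}{ext}"
    documents_dir.mkdir(parents=True, exist_ok=True)
    shutil.move(str(staged_path), str(dest))
    return dest


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    staged = root / "staging" / "3f2a_upload.pdf"  # hypothetical staging name
    staged.parent.mkdir()
    staged.write_bytes(b"%PDF-1.4 example")
    dest = store_original(staged, root / "documents", "report.pdf")
    moved_ok = dest.exists() and not staged.exists()
    dest_name = dest.name
```

Since the destination name depends only on content and extension, re-ingesting identical bytes would target the same path rather than accumulate copies.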
@@ -0,0 +1,223 @@
+"""Tests for original document storage feature."""
+
+import hashlib
+import shutil
+import sqlite3
+from pathlib import Path
+from unittest.mock import patch
+
+import pytest
+from fastapi.testclient import TestClient
+
+
+@pytest.fixture
+def data_dir(tmp_path):
+    """Create a temporary data directory with required subdirectories."""
+    staging = tmp_path / "staging"
+    staging.mkdir()
+    documents = tmp_path / "documents"
+    documents.mkdir()
+    return tmp_path
+
+
+@pytest.fixture
+def db_conn(data_dir):
+    """Create a file-backed SQLite DB in the temp data dir with the tables these tests need."""
+    db_path = data_dir / "kb.db"
+    conn = sqlite3.connect(str(db_path))
+    conn.row_factory = sqlite3.Row
+    conn.execute("PRAGMA foreign_keys=ON")
+    conn.executescript("""
+        CREATE TABLE IF NOT EXISTS documents (
+            id INTEGER PRIMARY KEY,
+            title TEXT,
+            source_path TEXT,
+            content_hash TEXT UNIQUE,
+            doc_type TEXT,
+            language TEXT,
+            stored_path TEXT,
+            original_filename TEXT,
+            created_at TEXT DEFAULT current_timestamp
+        );
+
+        CREATE TABLE IF NOT EXISTS chunks (
+            id INTEGER PRIMARY KEY,
+            document_id INTEGER REFERENCES documents(id) ON DELETE CASCADE,
+            chunk_index INTEGER,
+            text TEXT,
+            token_count INTEGER,
+            metadata TEXT DEFAULT '{}',
+            UNIQUE(document_id, chunk_index)
+        );
+
+        CREATE TABLE IF NOT EXISTS tags (
+            id INTEGER PRIMARY KEY,
+            name TEXT UNIQUE COLLATE NOCASE
+        );
+
+        CREATE TABLE IF NOT EXISTS document_tags (
+            document_id INTEGER REFERENCES documents(id) ON DELETE CASCADE,
+            tag_id INTEGER REFERENCES tags(id) ON DELETE CASCADE,
+            UNIQUE(document_id, tag_id)
+        );
+
+        CREATE TABLE IF NOT EXISTS jobs (
+            id INTEGER PRIMARY KEY,
+            filename TEXT,
+            status TEXT DEFAULT 'queued',
+            doc_type TEXT,
+            tags_json TEXT DEFAULT '[]',
+            title TEXT,
+            error TEXT,
+            document_id INTEGER,
+            chunk_count INTEGER DEFAULT 0,
+            staging_path TEXT,
+            content_hash TEXT,
+            created_at TEXT DEFAULT current_timestamp,
+            completed_at TEXT
+        );
+    """)
+    conn.commit()
+    yield conn
+    conn.close()
+
+
+@pytest.fixture
+def sample_pdf(data_dir):
+    """Create a fake PDF file in staging."""
+    content = b"%PDF-1.4 fake pdf content for testing"
+    staging = data_dir / "staging"
+    path = staging / "test_upload.pdf"
+    path.write_bytes(content)
+    return path, content
+
+
+class TestWorkerFileStorage:
+    """Tests for worker moving files to persistent storage."""
+
+    def test_successful_ingestion_stores_file(self, data_dir, db_conn, sample_pdf):
+        """7.1 - Test successful ingestion stores file at expected path."""
+        staged_path, content = sample_pdf
+        content_hash = hashlib.sha256(content).hexdigest()
+        documents_dir = data_dir / "documents"
+
+        expected_dest = documents_dir / f"{content_hash}.pdf"
+
+        # Simulate what the worker does: move file to documents dir
+        shutil.move(str(staged_path), str(expected_dest))
+
+        assert expected_dest.exists()
+        assert expected_dest.read_bytes() == content
+        assert not staged_path.exists()
+
+        # Simulate DB update
+        db_conn.execute(
+            "INSERT INTO documents(title, source_path, content_hash, doc_type, stored_path, original_filename) "
+            "VALUES (?, ?, ?, ?, ?, ?)",
+            ("Test PDF", str(staged_path), content_hash, "pdf", str(expected_dest), "test_upload.pdf"),
+        )
+        db_conn.commit()
+
+        row = db_conn.execute("SELECT stored_path, original_filename FROM documents WHERE content_hash = ?", (content_hash,)).fetchone()
+        assert row["stored_path"] == str(expected_dest)
+        assert row["original_filename"] == "test_upload.pdf"
+
+    def test_failed_ingestion_no_file_in_documents(self, data_dir, sample_pdf):
+        """7.2 - Test failed ingestion does not leave file in documents dir."""
+        staged_path, _ = sample_pdf
+        documents_dir = data_dir / "documents"
+
+        # Simulate failure: staging file gets cleaned up, nothing in documents dir
+        staged_path.unlink()
+
+        assert len(list(documents_dir.iterdir())) == 0
+
+    def test_document_deletion_removes_stored_file(self, data_dir, db_conn, sample_pdf):
+        """7.4 - Test document deletion removes stored file."""
+        staged_path, content = sample_pdf
+        content_hash = hashlib.sha256(content).hexdigest()
+        documents_dir = data_dir / "documents"
+
+        dest = documents_dir / f"{content_hash}.pdf"
+        shutil.move(str(staged_path), str(dest))
+
+        db_conn.execute(
+            "INSERT INTO documents(title, source_path, content_hash, doc_type, stored_path, original_filename) "
+            "VALUES (?, ?, ?, ?, ?, ?)",
+            ("Test PDF", str(staged_path), content_hash, "pdf", str(dest), "test_upload.pdf"),
+        )
+        db_conn.commit()
+
+        # Simulate delete: remove from DB and disk
+        doc = db_conn.execute("SELECT id, stored_path FROM documents WHERE content_hash = ?", (content_hash,)).fetchone()
+        stored = Path(doc["stored_path"])
+        db_conn.execute("DELETE FROM documents WHERE id = ?", (doc["id"],))
+        db_conn.commit()
+
+        if stored.exists():
+            stored.unlink()
+
+        assert not stored.exists()
+        assert db_conn.execute("SELECT COUNT(*) FROM documents", ()).fetchone()[0] == 0
+
+    def test_download_404_for_document_without_stored_file(self, db_conn):
+        """7.5 - Test download returns 404 for documents without stored files."""
+        db_conn.execute(
+            "INSERT INTO documents(title, source_path, content_hash, doc_type) "
+            "VALUES (?, ?, ?, ?)",
+            ("Old Doc", "/tmp/gone", "abc123", "pdf"),
+        )
+        db_conn.commit()
+
+        row = db_conn.execute("SELECT stored_path FROM documents WHERE content_hash = 'abc123'").fetchone()
+        assert row["stored_path"] is None
+
+
+class TestFileDownloadEndpoint:
+    """Tests for the /api/v1/documents/{id}/file endpoint logic."""
+
+    def test_file_response_uses_original_filename(self, data_dir, db_conn, sample_pdf):
+        """7.3 - Test file download uses correct original filename."""
+        staged_path, content = sample_pdf
+        content_hash = hashlib.sha256(content).hexdigest()
+        documents_dir = data_dir / "documents"
+
+        dest = documents_dir / f"{content_hash}.pdf"
+        shutil.move(str(staged_path), str(dest))
+
+        db_conn.execute(
+            "INSERT INTO documents(title, source_path, content_hash, doc_type, stored_path, original_filename) "
+            "VALUES (?, ?, ?, ?, ?, ?)",
+            ("My Report", str(staged_path), content_hash, "pdf", str(dest), "quarterly_report.pdf"),
+        )
+        db_conn.commit()
+
+        doc = db_conn.execute("SELECT stored_path, original_filename, title FROM documents WHERE content_hash = ?", (content_hash,)).fetchone()
+
+        # Verify the original filename is preserved and different from title
+        assert doc["original_filename"] == "quarterly_report.pdf"
+        assert doc["title"] == "My Report"
+        assert Path(doc["stored_path"]).exists()
+
+    def test_fallback_to_title_when_no_original_filename(self, data_dir, db_conn):
+        """Test that title+ext is used when original_filename is NULL."""
+        documents_dir = data_dir / "documents"
+        fake_file = documents_dir / "somehash.pdf"
+        fake_file.write_bytes(b"fake")
+
+        db_conn.execute(
+            "INSERT INTO documents(title, source_path, content_hash, doc_type, stored_path) "
+            "VALUES (?, ?, ?, ?, ?)",
+            ("Engine Manual", "/tmp/old", "hash456", "pdf", str(fake_file)),
+        )
+        db_conn.commit()
+
+        doc = db_conn.execute("SELECT original_filename, title, stored_path FROM documents WHERE content_hash = 'hash456'").fetchone()
+
+        # When original_filename is NULL, the endpoint should fall back to title + ext
+        original_filename = doc["original_filename"]
+        if not original_filename:
+            ext = Path(doc["stored_path"]).suffix
+            original_filename = (doc["title"] or "document") + ext
+
+        assert original_filename == "Engine Manual.pdf"
@@ -0,0 +1,2 @@
+schema: spec-driven
+created: 2026-03-27
@@ -0,0 +1,84 @@
+## Context
+
+Currently, uploaded files pass through a staging directory and are deleted after the worker extracts chunks and embeddings. The `documents.source_path` column stores the (now-stale) staging path. Users who want the original file must re-source it externally. The data directory structure today is:
+
+```
+/data/
+  kb.db
+  hf_cache/
+  staging/      # temporary, cleaned after processing
+```
+
+## Goals / Non-Goals
+
+**Goals:**
+- Persist every successfully-ingested original file for the lifetime of the document
+- Serve the original file via API (`GET /api/v1/documents/{id}/file`)
+- Clean up stored files when a document is deleted
+- Work transparently with the existing Docker volume mount (`/data`)
+
+**Non-Goals:**
+- Serving transformed/converted versions of documents (e.g. PDF→HTML)
+- De-duplicating file storage (same content hash = same row, so 1:1 is fine)
+- Compression or archival of stored files
+- Retroactive storage of files ingested before this change (they're already gone)
+
+## Decisions
+
+### 1. Storage layout: content-hash-based flat directory
+
+Store files at `{data_dir}/documents/{content_hash}{ext}` (e.g. `documents/a1b2c3...d4.pdf`).
+
+**Why over document-ID naming:** Content hash is available at staging time before the DB row exists, avoids race conditions, and makes dedup trivially safe (same hash = same file, overwrite is harmless). The hash is already computed for dedup checks.
+
+**Why flat over nested:** The KB is a personal tool — expected scale is hundreds to low-thousands of documents. A flat directory is simpler and sufficient. If needed later, a `ab/cd/` prefix scheme is easy to add.
+
+**Alternatives considered:**
+- *Store in SQLite as BLOBs*: Bloats the DB, complicates backups, and degrades WAL performance for large files. Rejected.
+- *Keep the staging path as-is*: Staging uses UUID prefixes which are meaningless; content-hash naming is deterministic and self-deduplicating.
+
+### 2. Move file from staging to documents dir (not copy)
+
+Use `shutil.move()` from staging to documents dir after successful ingestion, before `staging.cleanup()`. This avoids doubling disk usage during processing.
+
+**Why not copy-then-delete:** Move is atomic on the same filesystem (which `/data/staging` and `/data/documents` share). Faster, no temporary disk spike.
+
+### 3. New columns `stored_path` and `original_filename` on `documents` table
+
+Add two nullable columns:
+- `stored_path TEXT` — permanent file location on disk
+- `original_filename TEXT` — the exact filename from the upload (e.g. `report.pdf`)
+
+Both are nullable because existing documents (ingested before this change) won't have values.
+
+**Why `original_filename` separate from `title`:** The `title` field can be user-overridden (e.g. "Engine Manual" instead of `report.pdf`). When serving the file for download, the `Content-Disposition` header should use the original filename so the downloaded file has the correct name and extension. The `original_filename` is sourced from `jobs.filename` which is already captured at upload time.
+
+Keep `source_path` as-is for backward compatibility (it records what the staging path was). `stored_path` is the permanent location.
+
+**Migration:** Two `ALTER TABLE` statements — safe additive migrations, no data rewrite needed.
+
+### 4. File download endpoint returns the file directly
+
+`GET /api/v1/documents/{id}/file` uses FastAPI's `FileResponse` with:
+- `media_type` derived from the file extension
+- `Content-Disposition: attachment; filename="{original_filename}"` (falls back to `{title}{ext}` if `original_filename` is NULL)
+- Returns 404 if `stored_path` is NULL or file is missing from disk
+
+### 5. Delete cascades to file removal
+
+When `DELETE /api/v1/documents/{id}` is called, delete the stored file from disk after the DB delete succeeds. If file removal fails (already gone, permissions), log a warning but don't fail the API call — the DB is the source of truth.
+
+## Risks / Trade-offs
+
+- **Disk usage increases** — every ingested file persists. For the personal-use scale this is expected and acceptable. Users manage this via document deletion.
+  → Mitigation: Document the storage behavior; `GET /api/v1/status` already shows DB size, could add documents-dir size later.
+
+- **Pre-existing documents have no stored file** — `stored_path` will be NULL for documents ingested before this change.
+  → Mitigation: The download endpoint returns 404 with a clear message ("original file not available — ingested before document storage was enabled"). No attempt to backfill.
+
+- **File-DB consistency** — crash between DB commit and file move could leave orphan staged files or missing stored files.
+  → Mitigation: Commit the document and its chunks first, then move the file and record `stored_path` in a follow-up update. If the move or the update fails, the worker logs a warning and cleans up the staged file; the document stays searchable and the download endpoint returns 404 for it.
+
+## Open Questions
+
+None — the scope is straightforward enough to proceed.
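
Review note: Decision 4's filename fallback and media-type lookup can be sketched as a pure function, which makes the rules easy to check without a running engine. `download_name_and_type` is a hypothetical name used only for this illustration; the endpoint in this change inlines the same logic before building its `FileResponse`.

```python
import mimetypes
from pathlib import Path
from typing import Optional, Tuple


def download_name_and_type(
    stored_path: str,
    original_filename: Optional[str],
    title: Optional[str],
) -> Tuple[str, str]:
    """Pick the download filename and media type for a stored document."""
    # Fall back to "<title><ext>" (or "document<ext>") when no upload name was recorded.
    name = original_filename or (title or "document") + Path(stored_path).suffix
    # Unknown extensions are served as a generic binary download.
    media_type = mimetypes.guess_type(name)[0] or "application/octet-stream"
    return name, media_type


with_name = download_name_and_type("/data/documents/abc.pdf", "quarterly_report.pdf", "My Report")
without_name = download_name_and_type("/data/documents/abc.pdf", None, "Engine Manual")
no_extension = download_name_and_type("/data/documents/abc", None, None)
```

Here `with_name` keeps the upload's filename, `without_name` becomes `Engine Manual.pdf`, and `no_extension` falls back to `document` with the generic `application/octet-stream` type.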
@@ -0,0 +1,30 @@
+## Why
+
+The knowledge base currently discards original files after chunking and embedding. Once a document is ingested, only the extracted text chunks and vectors remain — the original PDF, markdown, or code file is deleted from staging. Users cannot retrieve the source document from the KB, which limits its usefulness as a document store and prevents use cases like re-processing with a different model or serving the original file to downstream tools.
+
+## What Changes
+
+- Add a persistent document storage directory (`{data_dir}/documents/`) alongside the SQLite database
+- After successful ingestion, move the original file from staging to permanent storage instead of deleting it
+- Store the permanent file path in the `documents` table (`stored_path` column) and the original upload filename (`original_filename` column) so downloads use the correct name
+- Add an API endpoint to download the original file by document ID
+- Add a CLI command to export/retrieve the original document
+- **BREAKING**: Delete document now also removes the stored file from disk
+- Notes (text-only) are stored as `.note` files in the same directory for consistency
+
+## Capabilities
+
+### New Capabilities
+- `document-storage`: Persistent storage of original uploaded files on disk, lifecycle management (store on ingest, delete on document removal), and retrieval via API
+
+### Modified Capabilities
+- `engine-api`: New endpoint `GET /api/v1/documents/{id}/file` to download the original file; delete endpoint must also clean up stored files; ingestion worker stores files instead of discarding them
+
+## Impact
+
+- **Engine config**: New `documents_dir` property on Config, new directory created at startup via `ensure_dirs()`
+- **Worker**: After successful chunking, move/copy file from staging to documents dir; update `source_path` → `stored_path` with permanent location
+- **Database schema**: Add `stored_path` and `original_filename` columns to `documents` table (migration for existing DBs)
+- **Routes**: New file-download endpoint; update delete handler to remove stored file
+- **Go client**: New `export` / `get-file` subcommand to download original documents
+- **Docker**: `documents/` directory lives inside the existing `/data` volume — no new mounts needed
@@ -0,0 +1,83 @@
+## ADDED Requirements
+
+### Requirement: Persistent original file storage
+
+The engine SHALL persistently store the original uploaded file on disk after successful ingestion. Files SHALL be stored at `{data_dir}/documents/{content_hash}{extension}` where `content_hash` is the SHA-256 hex digest already computed for dedup and `extension` is preserved from the original filename. The `documents` table SHALL record the stored file path in a `stored_path` column and the original upload filename in an `original_filename` column.
+
+#### Scenario: File stored after successful ingestion
+- **WHEN** the background worker successfully processes an ingestion job for a PDF file
+- **THEN** the worker SHALL move the staged file to `{data_dir}/documents/{content_hash}.pdf`, store the permanent path in `documents.stored_path`, store the original filename in `documents.original_filename`, and delete the staging entry
+
+#### Scenario: Note stored after successful ingestion
+- **WHEN** the background worker successfully processes an ingestion job for a text note
+- **THEN** the worker SHALL move the staged `.note` file to `{data_dir}/documents/{content_hash}.note` and store the permanent path in `documents.stored_path`
+
+#### Scenario: Markdown file stored after successful ingestion
+- **WHEN** the background worker successfully processes an ingestion job for a markdown file
+- **THEN** the worker SHALL move the staged file to `{data_dir}/documents/{content_hash}.md` and store the permanent path in `documents.stored_path`
+
+#### Scenario: Code file stored after successful ingestion
+- **WHEN** the background worker successfully processes an ingestion job for a code file (e.g. `.py`, `.go`)
+- **THEN** the worker SHALL move the staged file to `{data_dir}/documents/{content_hash}{original_extension}` and store the permanent path in `documents.stored_path`
+
+#### Scenario: Documents directory created at startup
+- **WHEN** the engine starts up and calls `ensure_dirs()`
+- **THEN** the `{data_dir}/documents/` directory SHALL be created if it does not exist
+
+#### Scenario: Ingestion failure does not store file
+- **WHEN** the background worker fails to process an ingestion job
+- **THEN** the staged file SHALL be cleaned up as before and no file SHALL be written to the documents directory
+
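The path rule in this requirement can be sketched as follows. The helper names are hypothetical, and the sketch recomputes the SHA-256 digest for self-containment, whereas the requirement says the engine reuses the hash already computed for dedup.

```python
import hashlib
import shutil
from pathlib import Path


def stored_path_for(data_dir: Path, content: bytes, original_filename: str) -> Path:
    """Build {data_dir}/documents/{content_hash}{extension} for an upload."""
    content_hash = hashlib.sha256(content).hexdigest()
    extension = Path(original_filename).suffix  # e.g. ".pdf"; "" when absent
    return data_dir / "documents" / f"{content_hash}{extension}"


def persist_staged_file(staged: Path, data_dir: Path, original_filename: str) -> Path:
    """Move a staged file into permanent storage and return its new path."""
    target = stored_path_for(data_dir, staged.read_bytes(), original_filename)
    target.parent.mkdir(parents=True, exist_ok=True)
    shutil.move(str(staged), str(target))
    return target
```

Because the name is derived from the content hash, two uploads with identical bytes and the same extension map to the same stored path.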
+---
+
+### Requirement: File retrieval via API
+
+The engine SHALL serve the original stored file for any document that has a stored file on disk.
+
+#### Scenario: Download original file
+- **WHEN** a client sends `GET /api/v1/documents/{id}/file` for a document with a stored file
+- **THEN** the engine SHALL return the file with appropriate `Content-Type` based on file extension and `Content-Disposition: attachment; filename="{original_filename}"` header, falling back to `{title}{ext}` if `original_filename` is NULL
+
+#### Scenario: Download file for pre-existing document
+- **WHEN** a client sends `GET /api/v1/documents/{id}/file` for a document ingested before this feature was added (stored_path is NULL)
+- **THEN** the engine SHALL return HTTP 404 with `{"error": "Original file not available - ingested before document storage was enabled"}`
+
+#### Scenario: Download file when file missing from disk
+- **WHEN** a client sends `GET /api/v1/documents/{id}/file` for a document whose `stored_path` is set but the file no longer exists on disk
+- **THEN** the engine SHALL return HTTP 404 with `{"error": "Stored file not found on disk"}`
+
+#### Scenario: Download file for non-existent document
+- **WHEN** a client sends `GET /api/v1/documents/{id}/file` with a non-existent document ID
+- **THEN** the engine SHALL return HTTP 404 with `{"error": "Document not found"}`
+
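The four download scenarios above reduce to a small decision function. The sketch below deliberately avoids any web framework so it runs on the standard library alone. `resolve_download` and the dict shape of `doc` are assumptions, and the `application/octet-stream` fallback for unknown extensions is not specified in this document.

```python
import mimetypes
from pathlib import Path
from typing import Optional, Tuple


def resolve_download(doc: Optional[dict]) -> Tuple[int, dict]:
    """Return (status, payload) for GET /api/v1/documents/{id}/file.

    `doc` is a hypothetical dict view of a documents row, or None if no row exists.
    """
    if doc is None:
        return 404, {"error": "Document not found"}
    stored_path = doc.get("stored_path")
    if stored_path is None:
        return 404, {
            "error": "Original file not available - ingested before document storage was enabled"
        }
    path = Path(stored_path)
    if not path.is_file():
        return 404, {"error": "Stored file not found on disk"}
    # Fall back to {title}{ext} when original_filename is NULL.
    filename = doc.get("original_filename") or f"{doc['title']}{path.suffix}"
    # Assumed fallback media type for extensions mimetypes does not know.
    media_type = mimetypes.guess_type(path.name)[0] or "application/octet-stream"
    return 200, {
        "path": str(path),
        "media_type": media_type,
        "content_disposition": f'attachment; filename="{filename}"',
    }
```

A route handler would translate the 200 payload into a file response and the 404 payloads into JSON error bodies.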
+---
+
+### Requirement: File cleanup on document deletion
+
+The engine SHALL remove the stored original file from disk when a document is deleted.
+
+#### Scenario: Delete document with stored file
+- **WHEN** a client sends `DELETE /api/v1/documents/{id}` for a document with a stored file
+- **THEN** the engine SHALL delete the document from the database (cascading to chunks, embeddings, tags) AND delete the stored file from disk
+
+#### Scenario: Delete document when stored file already missing
+- **WHEN** a client sends `DELETE /api/v1/documents/{id}` for a document whose stored file has been manually removed from disk
+- **THEN** the engine SHALL delete the document from the database successfully and log a warning about the missing file
+
+#### Scenario: Delete document without stored file (pre-existing)
+- **WHEN** a client sends `DELETE /api/v1/documents/{id}` for a document with `stored_path` NULL
+- **THEN** the engine SHALL delete the document from the database without attempting file removal
+
+---
+
+### Requirement: Database schema migration for stored_path and original_filename
+
+The engine SHALL add `stored_path` and `original_filename` columns to the `documents` table for tracking permanent file locations and original upload filenames.
+
+#### Scenario: Fresh database initialization
+- **WHEN** the engine initializes a new database
+- **THEN** the `documents` table SHALL include `stored_path TEXT` and `original_filename TEXT` columns in its schema
+
+#### Scenario: Existing database migration
+- **WHEN** the engine starts with a database created before this feature
+- **THEN** the engine SHALL add `stored_path TEXT` and `original_filename TEXT` to the `documents` table via `ALTER TABLE` if the columns do not exist
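
One way to make the `ALTER TABLE` migration idempotent is to inspect the table's columns first. This is a sketch under the assumption that the engine uses the standard `sqlite3` module; the function name `add_missing_columns` is hypothetical and the real `init_schema()` may be structured differently.

```python
import sqlite3


def add_missing_columns(conn: sqlite3.Connection) -> list:
    """Add stored_path / original_filename to documents if absent; return what was added."""
    # PRAGMA table_info yields one row per column; index 1 is the column name.
    existing = {row[1] for row in conn.execute("PRAGMA table_info(documents)")}
    added = []
    for column in ("stored_path", "original_filename"):
        if column not in existing:
            conn.execute(f"ALTER TABLE documents ADD COLUMN {column} TEXT")
            added.append(column)
    conn.commit()
    return added
```

Running it a second time against the same database adds nothing, so it is safe to call on every startup.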
@@ -0,0 +1,61 @@
+## MODIFIED Requirements
+
+### Requirement: Background ingestion worker
+
+The engine SHALL run a background worker that processes queued jobs. The worker SHALL process one job at a time. For each job, it SHALL: detect document type, run the appropriate chunking pipeline (Docling for PDFs, header-based for Markdown, AST-based for code, whole-text for notes), generate embeddings using the resident model, insert chunks and vectors into the database, and move the original file to persistent storage.
+
+#### Scenario: Successful PDF ingestion
+- **WHEN** the background worker picks up a queued PDF job
+- **THEN** it SHALL update the job status to `processing`, run Docling conversion and chunking, embed all chunks, insert document and chunks into the database, move the staged file to `{data_dir}/documents/{content_hash}.pdf`, update `documents.stored_path` with the permanent path, store the original filename in `documents.original_filename`, update the job status to `done` with the resulting document_id and chunk count, and clean up the staging entry
+
+#### Scenario: Ingestion failure
+- **WHEN** the background worker encounters an error during processing (e.g., corrupt PDF)
+- **THEN** it SHALL update the job status to `failed` with the error message, delete the staged file, and continue processing the next queued job
+
+#### Scenario: Search during active ingestion
+- **WHEN** a search request arrives while the background worker is processing a job
+- **THEN** the search SHALL execute without blocking (SQLite WAL mode) and return results from already-ingested documents
+
+---
+
+### Requirement: Document management
+
+The engine SHALL provide endpoints to list, inspect, remove, and download original files for ingested documents.
+
+#### Scenario: List documents
+- **WHEN** a client sends `GET /api/v1/documents`
+- **THEN** the engine SHALL return a JSON array of documents with id, title, doc_type, tags, chunk_count, and created_at
+
+#### Scenario: List documents with filters
+- **WHEN** a client sends `GET /api/v1/documents?type=pdf&tags=manual`
+- **THEN** the engine SHALL return only documents matching all specified filters
+
+#### Scenario: Get document details
+- **WHEN** a client sends `GET /api/v1/documents/{id}`
+- **THEN** the engine SHALL return the full document record including all chunks, their text content, and whether the original file is available (`has_file: true/false`)
+
+#### Scenario: Download original file
+- **WHEN** a client sends `GET /api/v1/documents/{id}/file`
+- **THEN** the engine SHALL return the original file with appropriate Content-Type and `Content-Disposition: attachment; filename="{original_filename}"` headers, or HTTP 404 if the file is not available
+
+#### Scenario: Remove a document
+- **WHEN** a client sends `DELETE /api/v1/documents/{id}`
+- **THEN** the engine SHALL delete the document, all its chunks, associated embeddings, tag associations, and the stored original file from disk, and return HTTP 200 with a confirmation
+
+#### Scenario: Remove non-existent document
+- **WHEN** a client sends `DELETE /api/v1/documents/{id}` with a non-existent ID
+- **THEN** the engine SHALL return HTTP 404
+
+---
+
+### Requirement: Engine configuration via environment variables
+
+The engine SHALL be configured via environment variables. The engine reads no config file; all configuration comes from the environment (set via `compose.yaml` or `docker run`).
+
+#### Scenario: Default configuration
+- **WHEN** the engine starts with no environment variables set
+- **THEN** it SHALL use defaults: data directory `/data`, model `all-MiniLM-L6-v2`, device `auto`, no API key required. It SHALL create `staging/` and `documents/` subdirectories under the data directory.
+
+#### Scenario: Custom model
+- **WHEN** `KB_MODEL` is set to `BAAI/bge-small-en-v1.5`
+- **THEN** the engine SHALL download and load that model instead of the default
@@ -0,0 +1,38 @@
+## 1. Config and Schema
+
+- [x] 1.1 Add `documents_dir` property to `Config` in `engine/kb/config.py` returning `{data_dir}/documents`
+- [x] 1.2 Add `documents_dir.mkdir()` to `Config.ensure_dirs()`
+- [x] 1.3 Add `stored_path TEXT` and `original_filename TEXT` columns to `documents` table in `init_schema()` (both CREATE TABLE and ALTER TABLE migration for existing DBs)
+
+## 2. Worker — File Persistence
+
+- [x] 2.1 In `worker._process_job()`, after successful DB commit, move staged file to `{documents_dir}/{content_hash}{ext}` using `shutil.move()`
+- [x] 2.2 Update `documents.stored_path` and `documents.original_filename` (from `jobs.filename`) after moving the file
+- [x] 2.3 Remove `staging.cleanup()` call for successful jobs (file is moved, not deleted); keep cleanup on failure path
+
+## 3. API — File Download Endpoint
+
+- [x] 3.1 Add `GET /api/v1/documents/{id}/file` route in `engine/kb/routes/documents.py` using FastAPI `FileResponse`
+- [x] 3.2 Return appropriate `Content-Type` from file extension and `Content-Disposition: attachment; filename="{original_filename}"` (fall back to `{title}{ext}` if NULL)
+- [x] 3.3 Handle 404 cases: document not found, `stored_path` is NULL, file missing from disk
+
+## 4. API — Delete Cleanup
+
+- [x] 4.1 Update `DELETE /api/v1/documents/{id}` in `engine/kb/routes/documents.py` to also delete the stored file from disk
+- [x] 4.2 Handle missing file gracefully (log warning, don't fail the request)
+
+## 5. Document Details Enhancement
+
+- [x] 5.1 Add `has_file` boolean to `GET /api/v1/documents/{id}` response based on `stored_path` presence and file existence on disk
+
+## 6. Go Client
+
+- [x] 6.1 Add `kb export <doc_id>` subcommand to the Go client that calls `GET /api/v1/documents/{id}/file` and writes to stdout or a specified output path
+
+## 7. Testing
+
+- [x] 7.1 Test successful ingestion stores file at expected path
+- [x] 7.2 Test failed ingestion does not leave file in documents dir
+- [x] 7.3 Test file download endpoint returns correct content and headers
+- [x] 7.4 Test document deletion removes stored file
+- [x] 7.5 Test download returns 404 for documents without stored files
@@ -0,0 +1,2 @@
+schema: spec-driven
+created: 2026-03-29
@@ -0,0 +1,23 @@
+## Context
+
+The engine's `POST /api/v1/reindex` re-embeds all chunks synchronously and returns `{"chunks_reindexed": N, "model": "..."}`. The client has an established confirmation pattern in `remove.go` using `--yes`/`-y` flag.
+
+## Goals / Non-Goals
+
+**Goals:**
+- Add `kb reindex` with confirmation prompt matching `kb remove` pattern
+- Display human-readable and JSON output
+
+**Non-Goals:**
+- Progress reporting during reindex (engine returns synchronously)
+- Model selection from the client (model is engine-side config)
+
+## Decisions
+
+### 1. Confirmation prompt before reindex
+
+Reindex drops and rebuilds the vector table — destructive if interrupted. Use the same `[y/N]` prompt pattern as `kb remove`, skippable with `--yes`/`-y`.
+
+### 2. Warn that it may take a while
+
+The prompt should mention that reindex re-embeds all chunks, so the user knows it's not instant.
@@ -0,0 +1,22 @@
+## Why
+
+The engine exposes `POST /api/v1/reindex` but there's no client command for it. Users switching embedding models must use curl directly. Adding `kb reindex` with a confirmation prompt keeps it consistent with other destructive commands like `kb remove`.
+
+## What Changes
+
+- Add `kb reindex` command to the Go client with confirmation prompt (skip with `--yes`/`-y`)
+- Display reindex results (chunks reindexed, model used)
+
+## Capabilities
+
+### New Capabilities
+
+(none)
+
+### Modified Capabilities
+
+- `go-client`: Add reindex command requirement
+
+## Impact
+
+- New file: `client/cmd/reindex.go`
@@ -0,0 +1,25 @@
+## ADDED Requirements
+
+### Requirement: Reindex command
+
+The client SHALL provide a `kb reindex` command that triggers re-embedding of all chunks on the engine. The command SHALL prompt for confirmation before proceeding.
+
+#### Scenario: Reindex with confirmation
+- **WHEN** the user runs `kb reindex`
+- **THEN** the client SHALL display a warning that all chunks will be re-embedded and prompt `Reindex all chunks? This will re-embed everything. [y/N]`. If confirmed, it SHALL POST to `/api/v1/reindex` and display the result.
+
+#### Scenario: Reindex with skip confirmation
+- **WHEN** the user runs `kb reindex --yes`
+- **THEN** the client SHALL skip the confirmation prompt and POST to `/api/v1/reindex` immediately
+
+#### Scenario: Reindex cancelled
+- **WHEN** the user runs `kb reindex` and responds with anything other than `y` or `yes`
+- **THEN** the client SHALL print `Cancelled.` and exit with code 0
+
+#### Scenario: Reindex human output
+- **WHEN** the reindex completes successfully with default format
+- **THEN** the client SHALL print `Reindexed N chunks (model: <model_name>)`
+
+#### Scenario: Reindex JSON output
+- **WHEN** the user runs `kb reindex --yes --format json`
+- **THEN** the client SHALL output the raw JSON response from the engine
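
The confirmation and human-output rules in these scenarios can be sketched as two small functions. Python is used here only for illustration since the client itself is Go. The function names are hypothetical, and treating the answer case-insensitively is an assumption that the scenarios do not state.

```python
def confirmed(answer: str) -> bool:
    """True only for 'y' or 'yes'; everything else, including empty input, cancels."""
    # Assumption: surrounding whitespace and letter case are ignored.
    return answer.strip().lower() in {"y", "yes"}


def reindex_summary(response: dict) -> str:
    """Human-readable line for the engine's reindex response."""
    return f"Reindexed {response['chunks_reindexed']} chunks (model: {response['model']})"
```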
@@ -0,0 +1,5 @@
+## 1. Implementation
+
+- [x] 1.1 Create `client/cmd/reindex.go` with `kb reindex` command, `--yes`/`-y` flag, confirmation prompt matching `remove.go` pattern
+- [x] 1.2 POST to `/api/v1/reindex`, handle human output (`Reindexed N chunks (model: ...)`) and JSON output
+- [x] 1.3 Verify build compiles and command appears in `kb --help`
@@ -0,0 +1,2 @@
+schema: spec-driven
+created: 2026-03-29
@@ -0,0 +1,43 @@
+## Context
+
+The `add` command currently handles both file uploads and notes via a `--note` string flag. This creates confusing flag parsing and a muddled help screen. The engine already auto-detects file type from extension (`detector.py`) and rejects unsupported ones, so the client's `--type` flag is redundant.
+
+## Goals / Non-Goals
+
+**Goals:**
+- `kb "my note"` as the sole note entry path (replaces `kb add --note`)
+- `kb addfile <path>` as a file-only upload command (replaces `kb add`)
+- Client-side extension validation before uploading
+- Clean, unambiguous help text for both paths
+
+**Non-Goals:**
+- Engine changes — type detection stays server-side
+- Backward compatibility shim for `kb add` — clean break
+- Client-side MIME type detection — extension check is sufficient
+
+## Decisions
+
+### Rename add → addfile, strip note/type flags
+
+Rename the cobra command from `add` to `addfile`. Remove the `--note`, `--title`, and `--type` flags. Keep `--tags` and `--recursive`. The command becomes purely about file uploads.
+
+**Why not keep `add` as an alias?** Clean break is simpler. The old form was confusing — better to force a quick migration than maintain two paths.
+
+### Extension validation on single file uploads
+
+The `supportedExts` map already gates recursive walks. Apply the same check to single file uploads — reject with a clear error listing supported extensions. This gives instant feedback instead of a round-trip to the engine.
+
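The extension gate can be sketched as below. Python is used for illustration only since the client is Go. The extension set shown is a hypothetical stand-in for the client's `supportedExts` map, and lowercasing the extension before the lookup is an assumption.

```python
from pathlib import Path

# Hypothetical extension set; the real list lives in the client's `supportedExts` map.
SUPPORTED_EXTS = {".pdf", ".md", ".py", ".go"}


def check_extension(path: str, supported=SUPPORTED_EXTS) -> None:
    """Raise ValueError listing supported extensions if `path` is not uploadable."""
    ext = Path(path).suffix.lower()  # assumption: matching is case-insensitive
    if ext not in supported:
        allowed = ", ".join(sorted(supported))
        raise ValueError(f"unsupported file extension {ext!r}; supported: {allowed}")
```

Running the check before any request is built is what provides the instant feedback described above.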
+### Root command RunE for note shorthand
+
+Use cobra's `Args: cobra.ArbitraryArgs` and `RunE` on the root command. When args are present and no subcommand matched, join all args into a single note string and submit. `--tags` flag on root for tagging notes. No `--title` — keep it minimal.
+
+**Why join all args?** `kb remember to update dns` (unquoted) should work the same as `kb "remember to update dns"`.
+
+### Reuse note submission logic via shared helper
+
+Extract `submitNote` from the current `runAdd` so both the root command and any future callers use the same POST + duplicate-handling + output logic.
+
+## Risks / Trade-offs
+
+- **Breaking change** → Anyone with `kb add` in scripts needs to update to `kb addfile`. Acceptable for a personal tool.
+- **No `--type` override** → If a user ever needs to force a type, they'd have to go through the engine API directly. Low risk since the engine's auto-detection covers all supported formats.
@@ -0,0 +1,34 @@
+## Why
+
+Adding a note requires `kb add --note "my note"` — too much ceremony for what should be instant. The `--note` flag taking a string value also creates confusing flag parsing (e.g. `kb add --note --tags foo` parses `--tags` as the note value). Meanwhile, `kb add` tries to do two things (files and notes) which muddies its help text and UX.
+
+Splitting these into distinct paths makes the CLI clearer:
+- **Notes**: `kb "my note"` — zero-friction, no subcommand needed
+- **Files**: `kb addfile report.pdf` — explicit, file-only command
+
+## What Changes
+
+- **Add `kb "text"` shorthand**: bare string arguments without a subcommand are treated as notes, submitted via `POST /api/v1/jobs`
+- **Rename `add` → `addfile`**: the command becomes file-only, no more `--note`/`--title` flags
+- **Drop `--type` flag**: the engine already auto-detects type from file extension (`detector.py`); the client doesn't need to override this
+- **Add client-side extension validation**: reject unsupported file extensions with a clear error before uploading, using the same extension set as recursive directory walks
+- **Update README**: document the new shorthand and renamed command
+- **BREAKING**: `kb add` no longer exists; `kb add --note` no longer exists
+
+## Capabilities
+
+### New Capabilities
+
+_(none)_
+
+### Modified Capabilities
+
+- `go-client`: Rename `add` to `addfile`, remove `--note`/`--title`/`--type` flags, add extension validation for single file uploads, add implicit note shorthand on root command
+
+## Impact
+
+- `client/cmd/add.go` → renamed/refactored to `addfile` command, stripped of note logic, added extension check
+- `client/cmd/root.go` — bare args handling + `--tags` flag for note shorthand
+- `README.md` — updated usage examples
+- No engine changes — engine already detects type from extension and rejects unsupported files
+- Breaking change for any scripts using `kb add` or `kb add --note`
@@ -0,0 +1,95 @@
+## ADDED Requirements
+
+### Requirement: Implicit note shorthand
+
+The client SHALL treat bare string arguments (with no subcommand) as an implicit note. `kb "my note"` SHALL behave identically to submitting a note via `POST /api/v1/jobs`. All persistent flags (`--format`, `--engine`, `--api-key`) and the root `--tags` flag SHALL work with the shorthand form.
+
+#### Scenario: Quick note via bare argument
+- **WHEN** the user runs `kb "remember to update DNS"`
+- **THEN** the client SHALL submit the text as a note via `POST /api/v1/jobs` and print `Queued: note`
+
+#### Scenario: Bare argument with tags
+- **WHEN** the user runs `kb "server room is building 3" --tags ops`
+- **THEN** the client SHALL submit the note with the specified tags
+
+#### Scenario: Bare argument with JSON output
+- **WHEN** the user runs `kb "my note" --format json`
+- **THEN** the client SHALL output the raw JSON response from the engine
+
+#### Scenario: Bare argument duplicate detection
+- **WHEN** the user runs `kb "my note"` and the engine returns HTTP 409
+- **THEN** the client SHALL handle the duplicate response identically to the previous `kb add --note` behaviour
+
+#### Scenario: Multiple unquoted words
+- **WHEN** the user runs `kb remember to update dns` (without quotes)
+- **THEN** the client SHALL join all arguments into a single note string and submit it
+
+#### Scenario: No interference with subcommands
+- **WHEN** the user runs `kb search "query"` or any other existing subcommand
+- **THEN** the client SHALL route to the subcommand as before — the implicit note shorthand SHALL NOT interfere
+
+#### Scenario: No arguments
+- **WHEN** the user runs `kb` with no arguments
+- **THEN** the client SHALL display the help text
+
+---
+
+## MODIFIED Requirements
+
+### Requirement: Add command (file and note ingestion)
+
+The client SHALL provide a `kb addfile` command that uploads files to the engine for async ingestion. The command SHALL validate file extensions before uploading and reject unsupported types. The client SHALL handle duplicate rejection (HTTP 409) and display the existing document information. The command SHALL NOT handle notes — notes are submitted via the implicit note shorthand (`kb "text"`).
+
+#### Scenario: Add a single file
+- **WHEN** the user runs `kb addfile report.pdf`
+- **THEN** the client SHALL validate the file extension, upload the file via `POST /api/v1/jobs` (multipart), print "Queued: report.pdf", and exit
+
+#### Scenario: Add a file with tags
+- **WHEN** the user runs `kb addfile manual.pdf --tags car,maintenance`
+- **THEN** the client SHALL include the tags in the multipart upload metadata
+
+#### Scenario: Add a directory recursively
+- **WHEN** the user runs `kb addfile ~/documents/ --recursive`
+- **THEN** the client SHALL discover all supported files in the directory tree, upload each one sequentially, and print "Queued: N files"
+
+#### Scenario: Unsupported file extension
+- **WHEN** the user runs `kb addfile photo.jpg`
+- **THEN** the client SHALL print an error listing supported extensions and exit with a non-zero code without making any API call
+
+#### Scenario: Duplicate file rejected (already ingested)
+- **WHEN** the user runs `kb addfile report.pdf` and the engine returns HTTP 409 with `{"error": "duplicate", "document_id": 42, "title": "report.pdf"}`
+- **THEN** the client SHALL print "Already imported: report.pdf (doc ID: 42)" and exit with code 0
+
+#### Scenario: Duplicate file rejected (in-flight job)
+- **WHEN** the user runs `kb addfile report.pdf` and the engine returns HTTP 409 with `{"error": "duplicate", "job_id": 7, "title": "report.pdf"}`
+- **THEN** the client SHALL print "Already queued: report.pdf (job ID: 7)" and exit with code 0
+
+#### Scenario: Duplicate file in recursive add
+- **WHEN** the user runs `kb addfile ~/documents/ --recursive` and some files are rejected as duplicates
+- **THEN** the client SHALL print the duplicate message for each rejected file, continue uploading remaining files, and include a summary (e.g., "Queued: 5 files, 2 duplicates skipped")
+
+#### Scenario: Duplicate with JSON output
+- **WHEN** the user runs `kb addfile report.pdf --format json` and the engine returns HTTP 409
+- **THEN** the client SHALL output the raw JSON response from the engine including the document_id and title
+
+#### Scenario: Add with JSON output
+- **WHEN** the user runs `kb addfile report.pdf --format json`
+- **THEN** the client SHALL output the JSON response from the engine including the job_id
+
+#### Scenario: File not found
+- **WHEN** the user runs `kb addfile nonexistent.pdf`
+- **THEN** the client SHALL print an error and exit with a non-zero code without making any API call
+
+#### Scenario: Upload failure
+- **WHEN** the upload fails (network error, engine returns 4xx/5xx other than 409)
+- **THEN** the client SHALL print the error and exit with a non-zero code
+
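The two duplicate scenarios differ only in which ID the engine's HTTP 409 body carries. A sketch of the message selection follows, in Python for illustration since the client is Go. The function name is hypothetical, and the final fallback branch covers a case these scenarios do not define.

```python
def duplicate_message(body: dict) -> str:
    """Format the human-readable line for an HTTP 409 duplicate response."""
    title = body.get("title", "")
    if "document_id" in body:
        # Already ingested: the engine points at the existing document.
        return f"Already imported: {title} (doc ID: {body['document_id']})"
    if "job_id" in body:
        # Still in flight: the engine points at the queued job.
        return f"Already queued: {title} (job ID: {body['job_id']})"
    # Neither ID present: not covered by the scenarios, so this fallback is a guess.
    return f"Duplicate: {title}"
```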
+## REMOVED Requirements
+
+### Requirement: Note ingestion via add command
+**Reason**: Notes are now submitted via the implicit note shorthand (`kb "text"`). The `--note` and `--title` flags on the add command are removed.
+**Migration**: Use `kb "my note"` or `kb "my note" --tags ops` instead of `kb add --note "my note" --tags ops`.
+
+### Requirement: Document type override via add command
+**Reason**: The engine auto-detects document type from file extension (`detector.py`). The client `--type` flag is redundant.
+**Migration**: Remove `--type` from scripts. The engine handles type detection automatically.
@@ -0,0 +1,19 @@
+## 1. Refactor note submission
+
+- [x] 1.1 Extract note submission logic from `runAdd` into a shared `submitNote` helper (multipart POST, duplicate detection, output formatting)
+
+## 2. Root command shorthand
+
+- [x] 2.1 Add `Args: cobra.ArbitraryArgs` and `RunE` to the root command — join args into a note string, call `submitNote`; show help when no args
+- [x] 2.2 Add `--tags` flag on the root command for note tagging
+
+## 3. Rename add → addfile
+
+- [x] 3.1 Rename command from `add` to `addfile` (`Use: "addfile <path>"`)
+- [x] 3.2 Remove `--note`, `--title`, and `--type` flags from the command
+- [x] 3.3 Add extension validation for single file uploads — reject unsupported extensions with a clear error listing supported types
+
+## 4. Documentation and verification
+
+- [x] 4.1 Update README.md usage section: show `kb "text"` shorthand, rename `add` references to `addfile`
+- [x] 4.2 Verify build compiles, `kb --help` and `kb addfile --help` show expected output
@@ -0,0 +1,2 @@
+schema: spec-driven
+created: 2026-03-28
@@ -0,0 +1,93 @@
+## Context
+
+Currently the project uses a single version number shared between client and engine, managed by `release.sh`. Both `client/VERSION` and `engine/VERSION` are always bumped to the same value. A single git tag `vX.Y.Z` is created, and a single Gitea release bundles Go client binaries and Docker engine image references. This means any change to either component forces a full release of both.
+
+The client is a Go binary distributed as platform-specific downloads. The engine is a Python FastAPI server distributed as Docker images. They communicate over HTTP via `/api/v1/` endpoints. The engine already exposes its version via `GET /api/v1/status` → `{"version": "X.Y.Z", ...}`.
+
+## Goals / Non-Goals
+
+**Goals:**
+- Allow client and engine to have independent version numbers and release cadences
+- Provide a runtime compatibility check so users get a clear error when their client is too new for their engine
+- Split release tooling so each component can be released without touching the other
+
+**Non-Goals:**
+- API versioning beyond the existing `/api/v1/` path prefix
+- Backward-compatible negotiation or feature detection (client either works or fails)
+- Automatic upgrades or update notifications
+- Version checking in the other direction (engine requiring minimum client)
+
+## Decisions
+
+### 1. Tag naming: `client-vX.Y.Z` and `engine-vX.Y.Z`
+
+Prefix-style tags clearly identify which component a release belongs to and sort well in git tag listings.
+
+**Why over path-style (`client/vX.Y.Z`):** Slashes in git tags can cause issues with some tooling and are less conventional. Prefix-style is simpler and widely used in monorepos.
+
+**Why over separate repos:** The project is small and tightly coupled at the API level. A monorepo with prefixed tags keeps everything together while allowing independent releases.
+
+### 2. Two release scripts: `release-client.sh` and `release-engine.sh`
+
+Each script handles its own component end-to-end: version bump, build, tag, release, push.
+
+**Why over a single script with flags:** Two simple scripts are easier to understand and maintain than one script with component-selection logic. Each script is ~100 lines instead of one ~200-line script with branching. The shared logic (version helpers, pre-flight checks) is minimal and acceptable to duplicate.
+
+**Shared structure for both scripts:**
+1. Pre-flight checks (on main branch, tag doesn't exist)
+2. Version bump (reads/writes component's VERSION file only)
+3. Build artifacts (Go binaries or Docker images)
+4. Commit version bump, create prefixed tag, push
+5. Create Gitea release with assets
+6. (Engine only) Push Docker images
+
+### 3. `MinEngineVersion` as a build-time constant in the Go client
+
+The client embeds a `MinEngineVersion` string constant alongside the existing `Version` constant. It is set via `-ldflags` at build time, sourced from a `client/MIN_ENGINE_VERSION` file.
+
+**Why a separate file over embedding in `VERSION`:** The two values have different lifecycles. `VERSION` changes every release; `MIN_ENGINE_VERSION` changes only when the client starts using a new engine feature. A separate file makes the intent clear.
+
+**Why ldflags over hardcoding in Go source:** Consistent with how `Version` is already injected. The value lives in a plain text file that's easy to bump manually.
+
+### 4. Compatibility check on first API call via the `Client` struct
+
+The `api.Client` checks engine compatibility on its first HTTP call by hitting `GET /api/v1/status` and comparing the `version` field against `MinEngineVersion`. The result is cached on the `Client` instance — subsequent calls skip the check.
+
+**Flow:**
+1. First call to any `Client` method (Get/Post/Delete/Put)
+2. Before the actual request, call `GET /api/v1/status`
+3. Parse `version` from response
+4. Compare against `MinEngineVersion` using semver major.minor.patch comparison
+5. If engine version < min: print error to stderr, `os.Exit(1)`
+6. If check passes: set `versionChecked = true`, proceed with original request
+7. If status endpoint unreachable: proceed with original request (connectivity error will surface on the actual call)
+
+**Why hard fail, no skip flag:** This is a personal tool. If the client needs a newer engine, the user needs to update. A skip flag adds complexity for a scenario where the outcome (broken behavior) is worse than the error.
+
+**Why check on first API call, not at startup:** The `PersistentPreRunE` in cobra runs before every command, but some future commands might not need the engine (e.g. `kb version`, `kb help`). Checking in the `Client` ensures we only check when actually contacting the engine.
+
+**Why proceed when status endpoint is unreachable:** If we can't reach `/status`, the actual API call will also fail with a connection error. No point in double-failing. The compatibility check is for version mismatch, not connectivity.
+
+### 5. Compose files: use `build:` context, not pinned image tags
+
+The compose files currently use `build:` directives, not pre-built image references. Users who build locally don't need pinned tags — they're building from source. Users pulling pre-built images will reference the image tag directly in their own compose file or `docker run` command.
+
+**Decision:** Leave compose files as-is. Release notes for engine releases will include the exact `docker pull` command with the versioned tag.
+
+### 6. Semver comparison: major.minor.patch, no pre-release
+
+Compare versions as three integers. No support for pre-release suffixes (`-rc1`, `-beta`) — the project doesn't use them. If `MinEngineVersion` is `2.1.0` and engine reports `2.1.5`, the check passes. If engine reports `2.0.9`, it fails.
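A minimal sketch of the three-integer comparison, written in Python for illustration only (the real check lives in the Go client). The function names are hypothetical, and rejecting anything that is not exactly `X.Y.Z` is an assumption about how unsupported pre-release suffixes might be handled.

```python
def parse_version(text):
    """Split 'X.Y.Z' into a tuple of three ints; reject anything else."""
    parts = text.strip().split(".")
    if len(parts) != 3 or not all(p.isdigit() for p in parts):
        raise ValueError(f"expected major.minor.patch, got {text!r}")
    return tuple(int(p) for p in parts)


def engine_is_compatible(engine_version, min_engine_version):
    """True when the engine version is at least the minimum version."""
    # Tuples compare element by element, so (2, 10, 0) > (2, 9, 0),
    # which a plain string comparison would get wrong.
    return parse_version(engine_version) >= parse_version(min_engine_version)
```

Comparing integer tuples rather than strings is what makes `2.10.0` rank above `2.9.0`.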
+
+## Risks / Trade-offs
+
+- **Extra HTTP round-trip on first command** — One additional `GET /api/v1/status` call per client invocation. Negligible for a local-network tool.
+  → Mitigation: Cached after first check within the Client instance.
+
+- **Developer must remember to bump `MIN_ENGINE_VERSION`** — When adding client code that depends on a new engine endpoint/field, the developer must manually update the file.
+  → Mitigation: This is a conscious decision point. The file's existence serves as a reminder. Could add a CI check later if needed.
+
+- **Breaking change to git tag format** — Existing `v2.0.x` tags won't match the new `client-v*` / `engine-v*` convention. Old tags remain in history.
+  → Mitigation: No migration needed. Old tags stay as historical artifacts. New convention starts from the first independent release.
+
+- **Two Gitea releases per coordinated release** — When both components change, two releases are created instead of one.
+  → Mitigation: Acceptable trade-off. Each release is self-contained with its own assets and notes.
@@ -0,0 +1,32 @@
+## Why
+
+Client and engine are currently locked to the same version number and released together via a single script. This means a client-only bug fix (e.g. output formatting) forces a full engine Docker image rebuild and push, and vice versa. Decoupling versions allows each component to be released independently on its own cadence, while a compatibility check ensures users don't run a client that requires engine features not yet deployed.
+
+## What Changes
+
+- **Separate version files** — `client/VERSION` and `engine/VERSION` may diverge (they already exist as separate files, but are currently always set to the same value)
+- **Split release script** — Replace single `release.sh` with `release-client.sh` (builds Go binaries, tags `client-vX.Y.Z`, creates release) and `release-engine.sh` (builds Docker images, tags `engine-vX.Y.Z`, creates release, pushes images)
+- **Client compatibility check** — Client embeds a `MinEngineVersion` value (set at build time via `-ldflags`). On every command that contacts the engine, the client calls `GET /api/v1/status` once, compares the engine's reported version against `MinEngineVersion`, and hard-fails with an actionable error if the engine is too old. No skip flag, no warning — just a clear error with upgrade instructions.
+- **Tag naming convention** — `client-vX.Y.Z` and `engine-vX.Y.Z` replace the current `vX.Y.Z` tag format. **BREAKING** — existing tag format changes.
+
+## Capabilities
+
+### New Capabilities
+
+(none)
+
+### Modified Capabilities
+
+- `go-client`: Add engine version compatibility check requirement (hard fail if engine version < MinEngineVersion)
+- `engine-api`: Status endpoint already returns `version` — no change needed, but delta spec documents the contract that the version field is required for compatibility checking
+- `docker-deployment`: Compose files keep their `build:` context rather than pinning an image tag; release notes document the versioned image tag, and release script changes affect image tagging
+
+## Impact
+
+- `release.sh` — replaced by `release-client.sh` + `release-engine.sh`
+- `client/cmd/root.go` — new `MinEngineVersion` variable (set via ldflags)
+- `client/internal/api/client.go` — version check on first API call
+- `client/Makefile` — may inject `MinEngineVersion` via ldflags alongside `Version`
+- Git tags — new naming convention (`client-v*`, `engine-v*`)
+- Gitea releases — two separate releases per independent release cycle
+- `engine/compose.nvidia.yaml`, `engine/compose.rocm.yaml` — no change; they keep the `build:` context, and versioned image tags are documented in release notes
@@ -0,0 +1,25 @@
+## MODIFIED Requirements
+
+### Requirement: Compose files for deployment
+
+The project SHALL provide Docker Compose files for single-command deployment. Compose files SHALL use `build:` context for local development. Release notes SHALL document the versioned image tag for users pulling pre-built images.
+
+#### Scenario: Start NVIDIA deployment
+- **WHEN** an admin runs `docker compose -f compose.nvidia.yaml up -d`
+- **THEN** the engine SHALL start with GPU access, bind-mount the data directory, and be reachable on the configured port
+
+#### Scenario: Start ROCm deployment
+- **WHEN** an admin runs `docker compose -f compose.rocm.yaml up -d`
+- **THEN** the engine SHALL start with GPU access via ROCm device passthrough, bind-mount the data directory, and be reachable on the configured port
+
+#### Scenario: Automatic restart
+- **WHEN** the engine process crashes or the host reboots
+- **THEN** Docker SHALL automatically restart the container (restart policy `unless-stopped`)
+
+#### Scenario: Configure via environment
+- **WHEN** an admin sets environment variables in the compose file (KB_MODEL, KB_API_KEY, KB_DEVICE, etc.)
+- **THEN** the engine SHALL use those values
+
+#### Scenario: Pre-built image deployment
+- **WHEN** an admin wants to use a pre-built engine image without building from source
+- **THEN** the engine release notes SHALL include the exact `docker pull` command with the versioned tag (e.g. `docker.dcglab.co.uk/dcg/kb/engine:engine-v2.1.0-nvidia`)
@@ -0,0 +1,13 @@
+## MODIFIED Requirements
+
+### Requirement: Engine status and reindex
+
+The engine SHALL provide status information and support re-embedding all chunks. The `version` field in the status response SHALL always be present and SHALL reflect the engine's release version as read from the `VERSION` file. This field is the contract used by clients for compatibility checking.
+
+#### Scenario: Get engine status
+- **WHEN** a client sends `GET /api/v1/status`
+- **THEN** the engine SHALL return JSON with `version` (string, from the `VERSION` file), `model_name`, `embedding_dim`, GPU device info, database stats (document count by type, total chunks, DB size), and queue stats (queued/processing job count)
+
+#### Scenario: Trigger reindex
+- **WHEN** a client sends `POST /api/v1/reindex`
+- **THEN** the engine SHALL re-embed all existing chunks using the currently loaded model and return progress information. This operation SHALL NOT block search queries.
@@ -0,0 +1,45 @@
+## ADDED Requirements
+
+### Requirement: Engine version compatibility check
+
+The client SHALL verify that the connected engine meets a minimum version requirement before executing any API command. The minimum required engine version SHALL be embedded in the client binary at build time. If the engine version is below the minimum, the client SHALL print an error message and exit with a non-zero code. There SHALL be no flag to skip or suppress this check.
+
+#### Scenario: Compatible engine version
+- **WHEN** the client connects to an engine reporting version `2.1.5` and `MinEngineVersion` is `2.1.0`
+- **THEN** the client SHALL proceed with the command normally
+
+#### Scenario: Incompatible engine version
+- **WHEN** the client connects to an engine reporting version `2.0.3` and `MinEngineVersion` is `2.1.0`
+- **THEN** the client SHALL print to stderr: `Error: kb client vX.Y.Z requires engine v2.1.0+ (connected engine is v2.0.3)` followed by an upgrade hint, and exit with code 1
+
+#### Scenario: Engine unreachable during version check
+- **WHEN** the client cannot reach the engine's `/api/v1/status` endpoint
+- **THEN** the client SHALL skip the version check and proceed with the original command (the actual API call will surface the connectivity error)
+
+#### Scenario: Version check is cached per invocation
+- **WHEN** the client has already verified engine compatibility during the current invocation
+- **THEN** subsequent API calls within the same invocation SHALL NOT repeat the version check
+
+#### Scenario: Client version command does not check engine
+- **WHEN** the user runs `kb --version`
+- **THEN** the client SHALL print the client version without contacting the engine
+
+#### Scenario: MinEngineVersion not set
+- **WHEN** the client binary has `MinEngineVersion` set to empty string or `dev`
+- **THEN** the client SHALL skip the version check entirely (development builds)
+
+---
+
+## MODIFIED Requirements
+
+### Requirement: Single static binary with zero runtime dependencies
+
+The Go client SHALL compile to a single static binary with no runtime dependencies. It SHALL support cross-compilation for Linux (amd64, arm64), macOS (amd64, arm64), and Windows (amd64). The build SHALL inject both `Version` and `MinEngineVersion` via ldflags.
+
+#### Scenario: Install on a clean machine
+- **WHEN** a user downloads the `kb` binary for their platform
+- **THEN** they SHALL be able to run it immediately with no additional installs (no Python, no Docker, no shared libraries)
+
+#### Scenario: Version and compatibility info embedded at build time
+- **WHEN** the client is built with `make all VERSION=2.1.0 MIN_ENGINE_VERSION=2.0.0`
+- **THEN** `kb --version` SHALL report `2.1.0` and the compatibility check SHALL use `2.0.0` as the minimum engine version
@@ -0,0 +1,35 @@
+## 1. Client Compatibility Check
+
+- [x] 1.1 Create `client/MIN_ENGINE_VERSION` file with initial value `2.0.0`
+- [x] 1.2 Add `MinEngineVersion` variable to `client/cmd/root.go` (set via ldflags, default `dev`)
+- [x] 1.3 Update `client/Makefile` to read `MIN_ENGINE_VERSION` file and inject via `-ldflags "-X cmd.MinEngineVersion=..."` alongside existing `Version`
+- [x] 1.4 Add `CheckEngineVersion(minVersion string)` method to `client/internal/api/client.go` that calls `GET /api/v1/status`, parses `version` field, and compares against `minVersion` using semver major.minor.patch
+- [x] 1.5 Add `versionChecked bool` field to `Client` struct; guard `CheckEngineVersion` so it runs at most once per Client instance
+- [x] 1.6 Call `CheckEngineVersion` at the start of `Client.do()` (before executing the actual request); skip if `MinEngineVersion` is empty or `dev`
+- [x] 1.7 On version mismatch: print `Error: kb client vX.Y.Z requires engine vM.N.P+ (connected engine is vA.B.C)\nUpdate your engine image to engine-vM.N.P or later.` to stderr and `os.Exit(1)`
+- [x] 1.8 On status endpoint unreachable: skip version check silently (let the actual request surface the error)
+
+## 2. Release Script — Client
+
+- [x] 2.1 Create `release-client.sh` extracting client-specific logic from `release.sh`: version bump of `client/VERSION`, Go binary build, git tag `client-vX.Y.Z`, Gitea release with binary assets
+- [x] 2.2 Release notes template: include `MinEngineVersion` requirement (e.g. "Requires engine v2.0.0+")
+- [x] 2.3 Pass `MIN_ENGINE_VERSION` to `make all` in the build step
+
+## 3. Release Script — Engine
+
+- [x] 3.1 Create `release-engine.sh` extracting engine-specific logic from `release.sh`: version bump of `engine/VERSION`, Docker image build (nvidia + rocm), git tag `engine-vX.Y.Z`, Gitea release, image push
+- [x] 3.2 Release notes template: include Docker pull commands with `engine-vX.Y.Z` prefixed tags
+
+## 4. Cleanup
+
+- [x] 4.1 Remove old `release.sh` (replaced by the two new scripts)
+- [x] 4.2 Update Docker image tag format in release scripts from `vX.Y.Z-nvidia` to `engine-vX.Y.Z-nvidia` (and same for rocm/latest)
+
+## 5. Testing
+
+- [x] 5.1 Test client version check passes when engine version >= MinEngineVersion
+- [x] 5.2 Test client version check fails with correct error message when engine version < MinEngineVersion
+- [x] 5.3 Test client skips version check when MinEngineVersion is empty or `dev`
+- [x] 5.4 Test client skips version check when engine is unreachable
+- [x] 5.5 Dry-run `release-client.sh --dry-run --gitea` and verify correct tag format and build
+- [x] 5.6 Dry-run `release-engine.sh --dry-run --gitea` and verify correct tag format and image names
@@ -0,0 +1,2 @@
+schema: spec-driven
+created: 2026-03-29
@@ -0,0 +1,29 @@
+## Context
+
+The root cobra command in `client/cmd/root.go` uses `cobra.ArbitraryArgs` and its `RunE` handler to catch any arguments not matching a subcommand. Currently, any non-empty args are joined and submitted as a note. This means a single mistyped word (e.g., `kb infow` instead of `kb info`) silently creates a junk note in the knowledge base.
+
+## Goals / Non-Goals
+
+**Goals:**
+- Prevent single bare words from being silently ingested as notes
+- Provide a clear error message that helps the user correct their input
+- Preserve the multi-word implicit note shorthand (`kb remember to update dns`)
+
+**Non-Goals:**
+- Detecting "close matches" to real commands (fuzzy matching / did-you-mean)
+- Changing how quoted strings work at the shell level (we can't detect quotes after shell expansion)
+
+## Decisions
+
+### Guard on argument count in RunE
+
+When `len(args) == 1`, reject with an error message instead of submitting as a note. When `len(args) > 1`, continue treating as implicit note shorthand.
+
+**Rationale**: This is the simplest reliable heuristic. The shell strips quotes before cobra sees args, so we cannot distinguish `kb "singleword"` from `kb singleword`. However, single-word notes are rare in practice, and the error message tells the user how to work around it (use multiple words or the full note workflow). Multi-word input is almost certainly intentional note text, not a mistyped command.
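To illustrate the rationale, here is a small Python sketch (the real handler is Go/cobra). `shell_args` uses `shlex.split` to approximate POSIX shell tokenisation and `classify_bare_args` applies the count-based guard literally; both names are hypothetical, and the zero-argument `help` branch is taken from the existing "no arguments shows help" behaviour.

```python
import shlex


def shell_args(command_line):
    """Approximate what the program receives after the shell strips quotes."""
    return shlex.split(command_line)[1:]  # drop the program name


def classify_bare_args(args):
    """Return 'help', 'reject', or 'note' based purely on argument count."""
    if len(args) == 0:
        return "help"    # no arguments: show help text
    if len(args) == 1:
        return "reject"  # single bare word: most likely a mistyped command
    return "note"        # several arguments: join and submit as a note


# Quoted and unquoted single words are indistinguishable after tokenisation.
print(shell_args('kb "singleword"'), shell_args("kb singleword"))
print(classify_bare_args(shell_args("kb remember to update dns")))
```

One thing the sketch makes visible: a quoted multi-word string such as `kb "my note"` also arrives as a single argument, so a guard that is meant to accept it would need to look for whitespace inside that argument rather than at the count alone.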
+
+**Alternative considered**: Checking against a list of known subcommand names — rejected because it wouldn't catch typos of commands we don't know about and adds maintenance burden.
+
+## Risks / Trade-offs
+
+- **Single-word notes no longer work via shorthand** → Users must use `kb add --note "singleword"` or include additional words. This is an acceptable trade-off since single-word notes are uncommon and the error message is clear.
+- **Shell quote stripping means we can't be perfect** → A quoted single word such as `kb "singleword"` reaches the client as one bare word after quote removal and will be rejected. This is a known limitation but very rare in practice.
@@ -0,0 +1,24 @@
+## Why
+
+A single unquoted word passed to `kb` (e.g., `kb infow`) is silently treated as a note and ingested. This is almost always a mistyped command, not an intentional note. Users lose trust when typos pollute their knowledge base.
+
+## What Changes
+
+- The implicit note shorthand will require **more than one argument** to be treated as a note. A single bare word will be rejected with a helpful error suggesting the user check their command or quote a multi-word note.
+- This is a **BREAKING** change to the implicit note shorthand: `kb singleword` no longer creates a note. Users must write `kb "singleword is important"` or use multiple words.
+
+## Capabilities
+
+### New Capabilities
+
+_(none)_
+
+### Modified Capabilities
+
+- `go-client`: The "Implicit note shorthand" requirement changes to reject single-word bare arguments and print an error instead of submitting them as notes.
+
+## Impact
+
+- **Code**: `client/cmd/root.go` — `RunE` handler for the root command
+- **Tests**: `client/cmd/root_test.go` or equivalent — add/update tests for single-word rejection
+- **Users**: Anyone who intentionally used `kb singleword` as a note shorthand will need to use multiple words or quotes
@@ -0,0 +1,37 @@
+## MODIFIED Requirements
+
+### Requirement: Implicit note shorthand
+
+The client SHALL treat bare string arguments (with no subcommand) as an implicit note only when **more than one argument** is provided. `kb "my note"` SHALL behave identically to submitting a note via `POST /api/v1/jobs`. All persistent flags (`--format`, `--engine`, `--api-key`) and the root `--tags` flag SHALL work with the shorthand form. A single bare word SHALL be rejected with an error message.
+
+#### Scenario: Quick note via bare argument
+- **WHEN** the user runs `kb "remember to update DNS"`
+- **THEN** the client SHALL submit the text as a note via `POST /api/v1/jobs` and print `Queued: note`
+
+#### Scenario: Bare argument with tags
+- **WHEN** the user runs `kb "server room is building 3" --tags ops`
+- **THEN** the client SHALL submit the note with the specified tags
+
+#### Scenario: Bare argument with JSON output
+- **WHEN** the user runs `kb "my note" --format json`
+- **THEN** the client SHALL output the raw JSON response from the engine
+
+#### Scenario: Bare argument duplicate detection
+- **WHEN** the user runs `kb "my note"` and the engine returns HTTP 409
+- **THEN** the client SHALL handle the duplicate response identically to the previous `kb add --note` behaviour
+
+#### Scenario: Multiple unquoted words
+- **WHEN** the user runs `kb remember to update dns` (without quotes)
+- **THEN** the client SHALL join all arguments into a single note string and submit it
+
+#### Scenario: Single bare word rejected
+- **WHEN** the user runs `kb infow` (a single unrecognized word)
+- **THEN** the client SHALL print to stderr: `Unknown command "infow". Run 'kb --help' for available commands.` followed by a hint about note usage, and exit with a non-zero code
+
+#### Scenario: No interference with subcommands
+- **WHEN** the user runs `kb search "query"` or any other existing subcommand
+- **THEN** the client SHALL route to the subcommand as before — the implicit note shorthand SHALL NOT interfere
+
+#### Scenario: No arguments
+- **WHEN** the user runs `kb` with no arguments
+- **THEN** the client SHALL display the help text
@@ -0,0 +1,10 @@
+## 1. Core Implementation
+
+- [x] 1.1 Update `RunE` in `client/cmd/root.go` to reject single-word bare arguments with an error message and non-zero exit
+- [x] 1.2 Update usage template in `root.go` to reflect that note shorthand requires multiple words
+
+## 2. Tests
+
+- [x] 2.1 Add test: single bare word prints error to stderr and exits non-zero
+- [x] 2.2 Add test: multiple bare words are submitted as a note (existing behavior preserved)
+- [x] 2.3 Add test: zero arguments shows help (existing behavior preserved)
@@ -0,0 +1,81 @@
+# Chunk Enrichment
+
+## Purpose
+
+Chunk enrichment prepends document titles and section headers to chunk text before indexing and embedding, ensuring that document-level context participates in both full-text and semantic search.
+
+## Requirements
+
+### Requirement: Chunk text enrichment with document title
+
+The engine SHALL prepend the document title to each chunk's text before FTS indexing and vector embedding. The enriched text SHALL be stored in a dedicated `enriched_text` column on the `chunks` table. The original chunk text SHALL remain in the `text` column for display purposes.
+
+The enrichment format SHALL be:
+- Without section header: `"{title}\n\n{chunk_text}"`
+- With section header: `"{title} > {section_header}\n\n{chunk_text}"`
+
+Where `section_header` is the value from the chunk's metadata `section_header` field, when present.
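The format can be expressed as a small helper. This is a sketch only: `build_enriched_text` is a hypothetical name, and treating an empty `section_header` the same as a missing one is an assumption.

```python
from typing import Optional


def build_enriched_text(title: str, chunk_text: str,
                        section_header: Optional[str] = None) -> str:
    """Prepend the document title (and section header, if any) to a chunk."""
    if section_header:
        return f"{title} > {section_header}\n\n{chunk_text}"
    return f"{title}\n\n{chunk_text}"
```

The raw `chunk_text` is passed through unchanged, so the same value can still be stored in the `text` column for display.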
+
+#### Scenario: Note ingestion with title enrichment
+- **WHEN** a note titled "Suitcase Locks" with content "Steve = 363" is ingested
+- **THEN** the `chunks.text` column SHALL contain "Steve = 363" and the `chunks.enriched_text` column SHALL contain "Suitcase Locks\n\nSteve = 363"
+
+#### Scenario: Markdown chunk with section header enrichment
+- **WHEN** a markdown document titled "DCG Lab Hardware" produces a chunk with section_header "GRIMDAWN > motherboard" and text "MSI X870 Tomahawk"
+- **THEN** the `chunks.enriched_text` SHALL contain "DCG Lab Hardware > GRIMDAWN > motherboard\n\nMSI X870 Tomahawk"
+
+#### Scenario: Chunk without section header
+- **WHEN** a document titled "Docker Tips" produces a chunk with no section_header in metadata and text "dbash() { docker exec -it $1 bash; }"
+- **THEN** the `chunks.enriched_text` SHALL contain "Docker Tips\n\ndbash() { docker exec -it $1 bash; }"
+
+---
+
+### Requirement: FTS5 indexes enriched text
+
+The FTS5 virtual table `chunks_fts` SHALL index the `enriched_text` column instead of the `text` column. All FTS sync triggers (insert, update, delete) SHALL operate on `enriched_text`.
+
+#### Scenario: FTS search matches document title
+- **WHEN** a user searches for "suitcase locks" and a document titled "Suitcase Locks" exists with chunk text "Steve = 363"
+- **THEN** the FTS5 search SHALL return that chunk as a match
+
+#### Scenario: FTS search still matches chunk content
+- **WHEN** a user searches for "MSI X870" and a chunk contains that text in its body
+- **THEN** the FTS5 search SHALL return that chunk as a match (enrichment does not break content matching)
+
+---
+
+### Requirement: Vector embeddings use enriched text
+
+The embedding model SHALL receive `enriched_text` (not raw `text`) when generating vectors during both initial ingestion and reindex operations.
+
+#### Scenario: Vector search matches document title
+- **WHEN** a user searches semantically for "luggage combination codes" and a document titled "Suitcase Locks" exists
+- **THEN** the vector search SHALL return that chunk with higher similarity than it would without title enrichment
+
+#### Scenario: Reindex uses enriched text
+- **WHEN** `POST /api/v1/reindex` is called
+- **THEN** the engine SHALL read `enriched_text` from the chunks table and embed that (not `text`)
+
+---
+
+### Requirement: Schema migration adds enriched_text column
+
+On startup, `init_schema` SHALL add the `enriched_text` column to the `chunks` table if it does not exist. It SHALL then backfill `enriched_text` for all existing chunks by joining with `documents.title` and parsing chunk metadata for section headers. It SHALL rebuild the FTS5 table and triggers to index `enriched_text`.
+
+#### Scenario: First startup after upgrade
+- **WHEN** the engine starts and `chunks.enriched_text` column does not exist
+- **THEN** the engine SHALL add the column, backfill all rows, drop and recreate `chunks_fts` to index `enriched_text`, and recreate the FTS sync triggers
+
+#### Scenario: Subsequent startup
+- **WHEN** the engine starts and `chunks.enriched_text` column already exists
+- **THEN** the engine SHALL NOT perform any migration and SHALL start normally
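A minimal sketch of the idempotent column-add and backfill, using Python's built-in `sqlite3`. The function name and the `id`, `document_id`, and `metadata` (JSON text) columns are assumptions made for illustration; rebuilding `chunks_fts` and its triggers is left out.

```python
import json
import sqlite3


def migrate_enriched_text(conn):
    """Add and backfill chunks.enriched_text; return True if a migration ran."""
    columns = [row[1] for row in conn.execute("PRAGMA table_info(chunks)")]
    if "enriched_text" in columns:
        return False  # subsequent startup: nothing to do

    conn.execute("ALTER TABLE chunks ADD COLUMN enriched_text TEXT")
    rows = conn.execute(
        "SELECT c.id, c.text, c.metadata, d.title "
        "FROM chunks c JOIN documents d ON d.id = c.document_id"
    ).fetchall()
    for chunk_id, text, metadata, title in rows:
        # Assumed: metadata is a JSON object that may carry section_header.
        header = json.loads(metadata or "{}").get("section_header")
        prefix = f"{title} > {header}" if header else title
        conn.execute(
            "UPDATE chunks SET enriched_text = ? WHERE id = ?",
            (f"{prefix}\n\n{text}", chunk_id),
        )
    conn.commit()
    return True
```

Checking `PRAGMA table_info` first is what makes the function safe to call on every startup.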
+
+---
+
+### Requirement: Search results return raw text
+
+Search results SHALL continue to return the original chunk text (from `chunks.text`) in the `text` field, not the enriched text. The document title is already returned as a separate `title` field.
+
+#### Scenario: Search result text field
+- **WHEN** a search returns a chunk from document "Suitcase Locks" with raw text "Steve = 363"
+- **THEN** the result `text` field SHALL be "Steve = 363" (not "Suitcase Locks\n\nSteve = 363")
@@ -67,7 +67,7 @@ The engine SHALL store all persistent state (SQLite database, HF model cache, st

 ### Requirement: Compose files for deployment

-The project SHALL provide Docker Compose files for single-command deployment.
+The project SHALL provide Docker Compose files for single-command deployment. Compose files SHALL use `build:` context for local development. Release notes SHALL document the versioned image tag for users pulling pre-built images.

 #### Scenario: Start NVIDIA deployment
 - **WHEN** an admin runs `docker compose -f compose.nvidia.yaml up -d`
@@ -85,6 +85,10 @@ The project SHALL provide Docker Compose files for single-command deployment.
 - **WHEN** an admin sets environment variables in the compose file (KB_MODEL, KB_API_KEY, KB_DEVICE, etc.)
 - **THEN** the engine SHALL use those values

+#### Scenario: Pre-built image deployment
+- **WHEN** an admin wants to use a pre-built engine image without building from source
+- **THEN** the engine release notes SHALL include the exact `docker pull` command with the versioned tag (e.g. `docker.dcglab.co.uk/dcg/kb/engine:engine-v2.1.0-nvidia`)
+
 ---

 ### Requirement: CPU-only fallback
@@ -0,0 +1,89 @@
+# Document Storage
+
+## Purpose
+
+Persistent storage, retrieval, and lifecycle management of original uploaded document files.
+
+## Requirements
+
+### Requirement: Persistent original file storage
+
+The engine SHALL persistently store the original uploaded file on disk after successful ingestion. Files SHALL be stored at `{data_dir}/documents/{content_hash}{extension}` where `content_hash` is the SHA-256 hex digest already computed for dedup and `extension` is preserved from the original filename. The `documents` table SHALL record the stored file path in a `stored_path` column and the original upload filename in an `original_filename` column.
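As a rough illustration of the path layout, the following sketch derives the permanent location from the file content and the original filename. `stored_path_for` is a hypothetical helper; in the engine the hash is already available from the dedup step, so recomputing it here is only for self-containment.

```python
import hashlib
from pathlib import Path


def stored_path_for(data_dir, content: bytes, original_filename: str) -> Path:
    """Build {data_dir}/documents/{content_hash}{extension}."""
    content_hash = hashlib.sha256(content).hexdigest()  # 64 hex characters
    extension = Path(original_filename).suffix  # ".pdf"; "" when there is none
    return Path(data_dir) / "documents" / f"{content_hash}{extension}"
```

Note that `Path.suffix` keeps only the last extension, so `archive.tar.gz` would be stored with `.gz`.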
+
+#### Scenario: File stored after successful ingestion
+- **WHEN** the background worker successfully processes an ingestion job for a PDF file
+- **THEN** the worker SHALL move the staged file to `{data_dir}/documents/{content_hash}.pdf`, store the permanent path in `documents.stored_path`, store the original filename in `documents.original_filename`, and delete the staging entry
+
+#### Scenario: Note stored after successful ingestion
+- **WHEN** the background worker successfully processes an ingestion job for a text note
+- **THEN** the worker SHALL move the staged `.note` file to `{data_dir}/documents/{content_hash}.note` and store the permanent path in `documents.stored_path`
+
+#### Scenario: Markdown file stored after successful ingestion
+- **WHEN** the background worker successfully processes an ingestion job for a markdown file
+- **THEN** the worker SHALL move the staged file to `{data_dir}/documents/{content_hash}.md` and store the permanent path in `documents.stored_path`
+
+#### Scenario: Code file stored after successful ingestion
+- **WHEN** the background worker successfully processes an ingestion job for a code file (e.g. `.py`, `.go`)
+- **THEN** the worker SHALL move the staged file to `{data_dir}/documents/{content_hash}{original_extension}` and store the permanent path in `documents.stored_path`
+
+#### Scenario: Documents directory created at startup
+- **WHEN** the engine starts up and calls `ensure_dirs()`
+- **THEN** the `{data_dir}/documents/` directory SHALL be created if it does not exist
+
+#### Scenario: Ingestion failure does not store file
+- **WHEN** the background worker fails to process an ingestion job
+- **THEN** the staged file SHALL be cleaned up as before and no file SHALL be written to the documents directory
+
+---
+
+### Requirement: File retrieval via API
+
+The engine SHALL serve the original stored file for any document that has a stored file on disk.
+
+#### Scenario: Download original file
+- **WHEN** a client sends `GET /api/v1/documents/{id}/file` for a document with a stored file
+- **THEN** the engine SHALL return the file with appropriate `Content-Type` based on file extension and `Content-Disposition: attachment; filename="{original_filename}"` header, falling back to `{title}{ext}` if `original_filename` is NULL
+
+#### Scenario: Download file for pre-existing document
+- **WHEN** a client sends `GET /api/v1/documents/{id}/file` for a document ingested before this feature was added (stored_path is NULL)
+- **THEN** the engine SHALL return HTTP 404 with `{"error": "Original file not available - ingested before document storage was enabled"}`
+
+#### Scenario: Download file when file missing from disk
+- **WHEN** a client sends `GET /api/v1/documents/{id}/file` for a document whose `stored_path` is set but the file no longer exists on disk
+- **THEN** the engine SHALL return HTTP 404 with `{"error": "Stored file not found on disk"}`
+
+#### Scenario: Download file for non-existent document
+- **WHEN** a client sends `GET /api/v1/documents/{id}/file` with a non-existent document ID
+- **THEN** the engine SHALL return HTTP 404 with `{"error": "Document not found"}`
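The four scenarios above reduce to a short decision chain. The sketch below is framework-free and hypothetical: `resolve_download`, the dict-shaped document record, and deriving `{ext}` from the stored path's suffix are all assumptions, not the engine's actual handler.

```python
import os
from pathlib import Path


def resolve_download(doc, file_exists=os.path.exists):
    """Return (status, payload) for GET /api/v1/documents/{id}/file."""
    if doc is None:
        return 404, {"error": "Document not found"}
    stored_path = doc.get("stored_path")
    if stored_path is None:
        return 404, {"error": "Original file not available - ingested "
                              "before document storage was enabled"}
    if not file_exists(stored_path):
        return 404, {"error": "Stored file not found on disk"}
    # Fall back to {title}{ext} when original_filename is NULL.
    filename = doc.get("original_filename") or (
        f"{doc['title']}{Path(stored_path).suffix}"
    )
    return 200, {"path": stored_path, "filename": filename}
```

The returned `filename` is what would go into the `Content-Disposition` header; passing `file_exists` as a parameter lets the disk check be replaced in tests.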
+
+---
+
+### Requirement: File cleanup on document deletion
+
+The engine SHALL remove the stored original file from disk when a document is deleted.
+
+#### Scenario: Delete document with stored file
+- **WHEN** a client sends `DELETE /api/v1/documents/{id}` for a document with a stored file
+- **THEN** the engine SHALL delete the document from the database (cascading to chunks, embeddings, tags) AND delete the stored file from disk
+
+#### Scenario: Delete document when stored file already missing
+- **WHEN** a client sends `DELETE /api/v1/documents/{id}` for a document whose stored file has been manually removed from disk
+- **THEN** the engine SHALL delete the document from the database successfully and log a warning about the missing file
+
+#### Scenario: Delete document without stored file (pre-existing)
+- **WHEN** a client sends `DELETE /api/v1/documents/{id}` for a document with `stored_path` NULL
+- **THEN** the engine SHALL delete the document from the database without attempting file removal
+
+---
+
+### Requirement: Database schema migration for stored_path and original_filename
+
+The engine SHALL add `stored_path` and `original_filename` columns to the `documents` table for tracking permanent file locations and original upload filenames.
+
+#### Scenario: Fresh database initialization
+- **WHEN** the engine initializes a new database
+- **THEN** the `documents` table SHALL include `stored_path TEXT` and `original_filename TEXT` columns in its schema
+
+#### Scenario: Existing database migration
+- **WHEN** the engine starts with a database created before this feature
+- **THEN** the engine SHALL add `stored_path TEXT` and `original_filename TEXT` to the `documents` table via `ALTER TABLE` if the columns do not exist
@@ -128,11 +128,11 @@ The engine SHALL maintain job records in SQLite with status tracking. Jobs SHALL

 ### Requirement: Background ingestion worker

-The engine SHALL run a background worker that processes queued jobs. The worker SHALL process one job at a time. For each job, it SHALL: detect document type, run the appropriate chunking pipeline (Docling for PDFs, header-based for Markdown, AST-based for code, whole-text for notes), generate embeddings using the resident model, and insert chunks and vectors into the database.
+The engine SHALL run a background worker that processes queued jobs. The worker SHALL process one job at a time. For each job, it SHALL: detect document type, run the appropriate chunking pipeline (Docling for PDFs, header-based for Markdown, AST-based for code, whole-text for notes), build enriched text by prepending the document title (and section header when present) to each chunk's text, generate embeddings using the enriched text and the resident model, insert chunks (with both raw text and enriched text) and vectors into the database, and move the original file to persistent storage.

 #### Scenario: Successful PDF ingestion
 - **WHEN** the background worker picks up a queued PDF job
- **THEN** it SHALL update the job status to `processing`, run Docling conversion and chunking, embed all chunks, insert document and chunks into the database, update the job status to `done` with the resulting document_id and chunk count, and delete the staged file
+- **THEN** it SHALL update the job status to `processing`, run Docling conversion and chunking, build enriched text for each chunk by prepending the document title, embed all chunks using enriched text, insert document and chunks into the database, move the staged file to `{data_dir}/documents/{content_hash}.pdf`, update `documents.stored_path` with the permanent path, store the original filename in `documents.original_filename`, update the job status to `done` with the resulting document_id and chunk count, and clean up the staging entry

 #### Scenario: Ingestion failure
 - **WHEN** the background worker encounters an error during processing (e.g., corrupt PDF)
@@ -146,7 +146,7 @@ The engine SHALL run a background worker that processes queued jobs. The worker

 ### Requirement: Document management

-The engine SHALL provide endpoints to list, inspect, and remove ingested documents.
+The engine SHALL provide endpoints to list, inspect, remove, and download original files for ingested documents.

 #### Scenario: List documents
 - **WHEN** a client sends `GET /api/v1/documents`
@@ -158,11 +158,15 @@ The engine SHALL provide endpoints to list, inspect, and remove ingested documen

 #### Scenario: Get document details
 - **WHEN** a client sends `GET /api/v1/documents/{id}`
- **THEN** the engine SHALL return the full document record including all chunks and their text content
+- **THEN** the engine SHALL return the full document record including all chunks, their text content, and whether the original file is available (`has_file: true/false`)
+
+#### Scenario: Download original file
+- **WHEN** a client sends `GET /api/v1/documents/{id}/file`
+- **THEN** the engine SHALL return the original file with appropriate Content-Type and `Content-Disposition: attachment; filename="{original_filename}"` headers, or HTTP 404 if the file is not available

 #### Scenario: Remove a document
 - **WHEN** a client sends `DELETE /api/v1/documents/{id}`
- **THEN** the engine SHALL delete the document, all its chunks, associated embeddings, and tag associations, and return HTTP 200 with a confirmation
+- **THEN** the engine SHALL delete the document, all its chunks, associated embeddings, tag associations, and the stored original file from disk, and return HTTP 200 with a confirmation

 #### Scenario: Remove non-existent document
 - **WHEN** a client sends `DELETE /api/v1/documents/{id}` with a non-existent ID
@@ -190,15 +194,15 @@ The engine SHALL provide endpoints to list all tags and manage tags on documents

 ### Requirement: Engine status and reindex

-The engine SHALL provide status information and support re-embedding all chunks.
+The engine SHALL provide status information and support re-embedding all chunks. The `version` field in the status response SHALL always be present and SHALL reflect the engine's release version as read from the `VERSION` file. This field is the contract used by clients for compatibility checking.

 #### Scenario: Get engine status
 - **WHEN** a client sends `GET /api/v1/status`
- **THEN** the engine SHALL return JSON with model_name, embedding_dim, GPU device info, database stats (document count by type, total chunks, DB size), and queue stats (queued/processing job count)
+- **THEN** the engine SHALL return JSON with `version` (string, from VERSION file), model_name, embedding_dim, GPU device info, database stats (document count by type, total chunks, DB size), and queue stats (queued/processing job count)

 #### Scenario: Trigger reindex
 - **WHEN** a client sends `POST /api/v1/reindex`
- **THEN** the engine SHALL re-embed all existing chunks using the currently loaded model and return progress information. This operation SHALL NOT block search queries.
+- **THEN** the engine SHALL re-embed all existing chunks using the `enriched_text` column and the currently loaded model, and return progress information. This operation SHALL NOT block search queries.

 ---

@@ -230,7 +234,7 @@ The engine SHALL be configured via environment variables. No config file is read

 #### Scenario: Default configuration
 - **WHEN** the engine starts with no environment variables set
- **THEN** it SHALL use defaults: data directory `/data`, model `all-MiniLM-L6-v2`, device `auto`, no API key required
+- **THEN** it SHALL use defaults: data directory `/data`, model `all-MiniLM-L6-v2`, device `auto`, no API key required. It SHALL create `staging/` and `documents/` subdirectories under the data directory.

 #### Scenario: Custom model
 - **WHEN** `KB_MODEL` is set to `BAAI/bge-small-en-v1.5`
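
The "Successful PDF ingestion" scenario in the engine spec above moves the staged file to `{data_dir}/documents/{content_hash}.pdf` and records that path in `documents.stored_path`. A minimal Bash sketch of that move follows. It assumes SHA-256 as the content hash and GNU `sha256sum`, neither of which the spec names; the function `store_original` and its arguments are hypothetical, and the engine's own implementation is not part of this diff.

```bash
# Hypothetical sketch: move a staged upload to a content-addressed permanent path.
# Assumption: the content hash is SHA-256, computed here with GNU coreutils sha256sum.
store_original() {
    local data_dir="$1" staged="$2" original_filename="$3"
    # Take the extension from the original upload name (assumes the name contains a dot).
    local ext="${original_filename##*.}"
    local hash
    hash="$(sha256sum "$staged" | cut -d' ' -f1)"
    mkdir -p "$data_dir/documents"
    local dest="$data_dir/documents/$hash.$ext"
    mv "$staged" "$dest"
    # The printed path is what would be stored in documents.stored_path.
    echo "$dest"
}
```

Because the destination name is derived from the file contents, two uploads of the same bytes map to the same path, which fits the duplicate rejection described elsewhere in the spec.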
@@ -8,12 +8,16 @@ The Go client (`kb`) provides a command-line interface for interacting with the

 ### Requirement: Single static binary with zero runtime dependencies

-The Go client SHALL compile to a single static binary with no runtime dependencies. It SHALL support cross-compilation for Linux (amd64, arm64), macOS (amd64, arm64), and Windows (amd64).
+The Go client SHALL compile to a single static binary with no runtime dependencies. It SHALL support cross-compilation for Linux (amd64, arm64), macOS (amd64, arm64), and Windows (amd64). The build SHALL inject both `Version` and `MinEngineVersion` via ldflags.

 #### Scenario: Install on a clean machine
 - **WHEN** a user downloads the `kb` binary for their platform
 - **THEN** they SHALL be able to run it immediately with no additional installs (no Python, no Docker, no shared libraries)

+#### Scenario: Version and compatibility info embedded at build time
+- **WHEN** the client is built with `make all VERSION=2.1.0 MIN_ENGINE_VERSION=2.0.0`
+- **THEN** `kb --version` SHALL report `2.1.0` and the compatibility check SHALL use `2.0.0` as the minimum engine version
+
 ---

 ### Requirement: Client configuration
@@ -64,48 +68,86 @@ The client SHALL provide a `kb search <query>` command that sends the query to t

 ---

+### Requirement: Implicit note shorthand
+
+The client SHALL treat bare string arguments (with no subcommand) as an implicit note only when the text contains **more than one word**, whether it is passed as a single quoted argument or as several unquoted arguments. `kb "my note"` SHALL behave identically to submitting a note via `POST /api/v1/jobs`. All persistent flags (`--format`, `--engine`, `--api-key`) and the root `--tags` flag SHALL work with the shorthand form. A single bare word SHALL be rejected with an error message.
+
+#### Scenario: Quick note via bare argument
+- **WHEN** the user runs `kb "remember to update DNS"`
+- **THEN** the client SHALL submit the text as a note via `POST /api/v1/jobs` and print `Queued: note`
+
+#### Scenario: Bare argument with tags
+- **WHEN** the user runs `kb "server room is building 3" --tags ops`
+- **THEN** the client SHALL submit the note with the specified tags
+
+#### Scenario: Bare argument with JSON output
+- **WHEN** the user runs `kb "my note" --format json`
+- **THEN** the client SHALL output the raw JSON response from the engine
+
+#### Scenario: Bare argument duplicate detection
+- **WHEN** the user runs `kb "my note"` and the engine returns HTTP 409
+- **THEN** the client SHALL handle the duplicate response identically to the previous `kb add --note` behaviour
+
+#### Scenario: Multiple unquoted words
+- **WHEN** the user runs `kb remember to update dns` (without quotes)
+- **THEN** the client SHALL join all arguments into a single note string and submit it
+
+#### Scenario: Single bare word rejected
+- **WHEN** the user runs `kb infow` (a single unrecognized word)
+- **THEN** the client SHALL print to stderr: `Unknown command "infow". Run 'kb --help' for available commands.` followed by a hint about note usage, and exit with a non-zero code
+
+#### Scenario: No interference with subcommands
+- **WHEN** the user runs `kb search "query"` or any other existing subcommand
+- **THEN** the client SHALL route to the subcommand as before — the implicit note shorthand SHALL NOT interfere
+
+#### Scenario: No arguments
+- **WHEN** the user runs `kb` with no arguments
+- **THEN** the client SHALL display the help text
+
+---
+
 ### Requirement: Add command (file and note ingestion)

-The client SHALL provide a `kb add` command that uploads files or notes to the engine for async ingestion. The client SHALL exit immediately after a successful upload. The client SHALL handle duplicate rejection (HTTP 409) and display the existing document information.
+The client SHALL provide a `kb addfile` command that uploads files to the engine for async ingestion. The command SHALL validate file extensions before uploading and reject unsupported types. The client SHALL handle duplicate rejection (HTTP 409) and display the existing document information. The command SHALL NOT handle notes — notes are submitted via the implicit note shorthand (`kb "text"`).

 #### Scenario: Add a single file
- **WHEN** the user runs `kb add report.pdf`
- **THEN** the client SHALL upload the file via `POST /api/v1/jobs` (multipart), print "Queued: report.pdf", and exit
+- **WHEN** the user runs `kb addfile report.pdf`
+- **THEN** the client SHALL validate the file extension, upload the file via `POST /api/v1/jobs` (multipart), print "Queued: report.pdf", and exit

 #### Scenario: Add a file with tags
- **WHEN** the user runs `kb add manual.pdf --tags car,maintenance`
+- **WHEN** the user runs `kb addfile manual.pdf --tags car,maintenance`
 - **THEN** the client SHALL include the tags in the multipart upload metadata

 #### Scenario: Add a directory recursively
- **WHEN** the user runs `kb add ~/documents/ --recursive`
+- **WHEN** the user runs `kb addfile ~/documents/ --recursive`
 - **THEN** the client SHALL discover all supported files in the directory tree, upload each one sequentially, and print "Queued: N files"

-#### Scenario: Add a text note
- **WHEN** the user runs `kb add --note "The server room is in building 3, floor 2"`
- **THEN** the client SHALL submit the note text via `POST /api/v1/jobs` (multipart with note field), print "Queued: note", and exit
+#### Scenario: Unsupported file extension
+- **WHEN** the user runs `kb addfile photo.jpg`
+- **THEN** the client SHALL print an error listing supported extensions and exit with a non-zero code without making any API call

 #### Scenario: Duplicate file rejected (already ingested)
- **WHEN** the user runs `kb add report.pdf` and the engine returns HTTP 409 with `{"error": "duplicate", "document_id": 42, "title": "report.pdf"}`
+- **WHEN** the user runs `kb addfile report.pdf` and the engine returns HTTP 409 with `{"error": "duplicate", "document_id": 42, "title": "report.pdf"}`
 - **THEN** the client SHALL print "Already imported: report.pdf (doc ID: 42)" and exit with code 0

 #### Scenario: Duplicate file rejected (in-flight job)
- **WHEN** the user runs `kb add report.pdf` and the engine returns HTTP 409 with `{"error": "duplicate", "job_id": 7, "title": "report.pdf"}`
+- **WHEN** the user runs `kb addfile report.pdf` and the engine returns HTTP 409 with `{"error": "duplicate", "job_id": 7, "title": "report.pdf"}`
 - **THEN** the client SHALL print "Already queued: report.pdf (job ID: 7)" and exit with code 0

 #### Scenario: Duplicate file in recursive add
- **WHEN** the user runs `kb add ~/documents/ --recursive` and some files are rejected as duplicates
- **THEN** the client SHALL print the duplicate message for each rejected file (distinguishing "Already imported" from "Already queued"), continue uploading remaining files, and include a summary (e.g., "Queued: 5 files, 2 duplicates skipped")
+- **WHEN** the user runs `kb addfile ~/documents/ --recursive` and some files are rejected as duplicates
+- **THEN** the client SHALL print the duplicate message for each rejected file, continue uploading remaining files, and include a summary (e.g., "Queued: 5 files, 2 duplicates skipped")

 #### Scenario: Duplicate with JSON output
- **WHEN** the user runs `kb add report.pdf --format json` and the engine returns HTTP 409
+- **WHEN** the user runs `kb addfile report.pdf --format json` and the engine returns HTTP 409
 - **THEN** the client SHALL output the raw JSON response from the engine including the document_id and title

 #### Scenario: Add with JSON output
- **WHEN** the user runs `kb add report.pdf --format json`
+- **WHEN** the user runs `kb addfile report.pdf --format json`
 - **THEN** the client SHALL output the JSON response from the engine including the job_id

 #### Scenario: File not found
- **WHEN** the user runs `kb add nonexistent.pdf`
+- **WHEN** the user runs `kb addfile nonexistent.pdf`
 - **THEN** the client SHALL print an error and exit with a non-zero code without making any API call

 #### Scenario: Upload failure
@@ -186,6 +228,62 @@ The client SHALL provide a `kb status` command to display engine status.

 ---

+### Requirement: Reindex command
+
+The client SHALL provide a `kb reindex` command that triggers re-embedding of all chunks on the engine. The command SHALL prompt for confirmation before proceeding.
+
+#### Scenario: Reindex with confirmation
+- **WHEN** the user runs `kb reindex`
+- **THEN** the client SHALL display a warning that all chunks will be re-embedded and prompt `Reindex all chunks? This will re-embed everything. [y/N]`. If confirmed, it SHALL POST to `/api/v1/reindex` and display the result.
+
+#### Scenario: Reindex with skip confirmation
+- **WHEN** the user runs `kb reindex --yes`
+- **THEN** the client SHALL skip the confirmation prompt and POST to `/api/v1/reindex` immediately
+
+#### Scenario: Reindex cancelled
+- **WHEN** the user runs `kb reindex` and responds with anything other than `y` or `yes`
+- **THEN** the client SHALL print `Cancelled.` and exit with code 0
+
+#### Scenario: Reindex human output
+- **WHEN** the reindex completes successfully with default format
+- **THEN** the client SHALL print `Reindexed N chunks (model: <model_name>)`
+
+#### Scenario: Reindex JSON output
+- **WHEN** the user runs `kb reindex --yes --format json`
+- **THEN** the client SHALL output the raw JSON response from the engine
+
+---
+
+### Requirement: Engine version compatibility check
+
+The client SHALL verify that the connected engine meets a minimum version requirement before executing any API command. The minimum required engine version SHALL be embedded in the client binary at build time. If the engine version is below the minimum, the client SHALL print an error message and exit with a non-zero code. There SHALL be no flag to skip or suppress this check.
+
+#### Scenario: Compatible engine version
+- **WHEN** the client connects to an engine reporting version `2.1.5` and `MinEngineVersion` is `2.1.0`
+- **THEN** the client SHALL proceed with the command normally
+
+#### Scenario: Incompatible engine version
+- **WHEN** the client connects to an engine reporting version `2.0.3` and `MinEngineVersion` is `2.1.0`
+- **THEN** the client SHALL print to stderr: `Error: kb client vX.Y.Z requires engine v2.1.0+ (connected engine is v2.0.3)` followed by an upgrade hint, and exit with code 1
+
+#### Scenario: Engine unreachable during version check
+- **WHEN** the client cannot reach the engine's `/api/v1/status` endpoint
+- **THEN** the client SHALL skip the version check and proceed with the original command (the actual API call will surface the connectivity error)
+
+#### Scenario: Version check is cached per invocation
+- **WHEN** the client has already verified engine compatibility during the current invocation
+- **THEN** subsequent API calls within the same invocation SHALL NOT repeat the version check
+
+#### Scenario: Client version command does not check engine
+- **WHEN** the user runs `kb --version`
+- **THEN** the client SHALL print the client version without contacting the engine
+
+#### Scenario: MinEngineVersion not set
+- **WHEN** the client binary has `MinEngineVersion` set to empty string or `dev`
+- **THEN** the client SHALL skip the version check entirely (development builds)
+
+---
+
 ### Requirement: Global output format flag

 All commands SHALL support a `--format` flag accepting `human` (default) or `json`. The default MAY be changed via the `default_format` config value.
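
The rule in the "Engine version compatibility check" requirement of the client spec above reduces to a component-wise comparison of two `MAJOR.MINOR.PATCH` strings. A minimal Bash sketch follows, assuming plain numeric versions with no pre-release suffix. The function name `engine_compatible` is hypothetical; the real check lives in the Go client, which this diff does not show.

```bash
# Hypothetical helper: returns 0 when the engine version meets the minimum.
# Assumption: both versions are plain MAJOR.MINOR.PATCH strings.
engine_compatible() {
    local min="$1" engine="$2"
    # Development builds (empty or "dev" minimum) skip the check entirely.
    if [[ -z "$min" || "$min" == "dev" ]]; then
        return 0
    fi
    local -a m e
    IFS='.' read -r -a m <<< "$min"
    IFS='.' read -r -a e <<< "$engine"
    local i mv ev
    for i in 0 1 2; do
        # 10# forces base 10 so a component such as "08" is not parsed as octal.
        mv=$((10#${m[i]:-0}))
        ev=$((10#${e[i]:-0}))
        if (( ev > mv )); then return 0; fi
        if (( ev < mv )); then return 1; fi
    done
    return 0  # all three components are equal
}

engine_compatible 2.1.0 2.1.5 && echo "compatible"      # prints: compatible
engine_compatible 2.1.0 2.0.3 || echo "incompatible"    # prints: incompatible
```

The two calls mirror the "Compatible engine version" and "Incompatible engine version" scenarios. Comparing numerically rather than as strings matters once a component reaches two digits, for example `2.10.0` against `2.9.0`.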
@@ -0,0 +1,218 @@
+#!/usr/bin/env bash
+#
+# release-client.sh — Build, tag, and release the Go client
+#
+# Usage:
+#   ./release-client.sh --gitea|--github [--dry-run] [--no-increment] [--patch|--minor|--major]
+
+set -euo pipefail
+
+#──────────────────────────────────────────────────────────────────────
+# Config
+#──────────────────────────────────────────────────────────────────────
+SCRIPT_DIR="$(cd "$(dirname "$0")" && pwd)"
+CLIENT_DIR="$SCRIPT_DIR/client"
+VERSION_FILE="$CLIENT_DIR/VERSION"
+MIN_ENGINE_FILE="$CLIENT_DIR/MIN_ENGINE_VERSION"
+
+#──────────────────────────────────────────────────────────────────────
+# Parse args
+#──────────────────────────────────────────────────────────────────────
+DRY_RUN=false
+INCREMENT=true
+BUMP="patch"
+FORGE=""
+
+for arg in "$@"; do
+    case "$arg" in
+        --dry-run)       DRY_RUN=true ;;
+        --no-increment)  INCREMENT=false ;;
+        --minor)         BUMP="minor" ;;
+        --major)         BUMP="major" ;;
+        --patch)         BUMP="patch" ;;
+        --gitea)         FORGE="tea" ;;
+        --github)        FORGE="gh" ;;
+        *)
+            echo "Unknown argument: $arg" >&2
+            echo "Usage: $0 --gitea|--github [--dry-run] [--no-increment] [--patch|--minor|--major]" >&2
+            exit 1
+            ;;
+    esac
+done
+
+if [[ -z "$FORGE" ]]; then
+    echo "Error: specify --gitea or --github" >&2
+    echo "Usage: $0 --gitea|--github [--dry-run] [--no-increment] [--patch|--minor|--major]" >&2
+    exit 1
+fi
+
+# Ensure we're on main branch
+CURRENT_BRANCH="$(git -C "$SCRIPT_DIR" rev-parse --abbrev-ref HEAD)"
+if [[ "$CURRENT_BRANCH" != "main" ]]; then
+    echo "Error: releases must be made from the main branch (currently on '$CURRENT_BRANCH')" >&2
+    exit 1
+fi
+
+if ! command -v "$FORGE" &>/dev/null; then
+    echo "Error: '$FORGE' not found in PATH" >&2
+    exit 1
+fi
+
+#──────────────────────────────────────────────────────────────────────
+# Version helpers
+#──────────────────────────────────────────────────────────────────────
+read_version() {
+    local file="$1"
+    if [[ ! -f "$file" ]]; then
+        echo "Error: version file not found: $file" >&2
+        exit 1
+    fi
+    tr -d '[:space:]' < "$file"
+}
+
+bump_version() {
+    local ver="$1" part="$2"
+    local major minor patch
+    IFS='.' read -r major minor patch <<< "$ver"
+
+    case "$part" in
+        major) echo "$((major + 1)).0.0" ;;
+        minor) echo "${major}.$((minor + 1)).0" ;;
+        patch) echo "${major}.${minor}.$((patch + 1))" ;;
+    esac
+}
+
+write_version() {
+    local file="$1" ver="$2"
+    echo "$ver" > "$file"
+}
+
+run() {
+    echo "  $ $*"
+    if [[ "$DRY_RUN" == false ]]; then
+        "$@"
+    fi
+}
+
+#──────────────────────────────────────────────────────────────────────
+# Determine release version
+#──────────────────────────────────────────────────────────────────────
+CURRENT_VERSION="$(read_version "$VERSION_FILE")"
+MIN_ENGINE_VERSION="$(read_version "$MIN_ENGINE_FILE")"
+
+if [[ "$INCREMENT" == true ]]; then
+    VERSION="$(bump_version "$CURRENT_VERSION" "$BUMP")"
+    echo "==> Client version bump: $CURRENT_VERSION → $VERSION ($BUMP)"
+else
+    VERSION="$CURRENT_VERSION"
+    echo "==> Client version: $VERSION (no increment)"
+fi
+
+TAG="client-v${VERSION}"
+
+echo "    Tag:              $TAG"
+echo "    Min engine:       v$MIN_ENGINE_VERSION"
+echo "    Forge CLI:        $FORGE"
+echo "    Dry run:          $DRY_RUN"
+echo ""
+
+#──────────────────────────────────────────────────────────────────────
+# 1. Pre-flight checks
+#──────────────────────────────────────────────────────────────────────
+echo "==> Pre-flight checks"
+
+if [[ "$DRY_RUN" == false ]]; then
+    if git -C "$SCRIPT_DIR" rev-parse "$TAG" &>/dev/null; then
+        echo "Error: tag $TAG already exists" >&2
+        exit 1
+    fi
+fi
+
+echo "    OK"
+echo ""
+
+#──────────────────────────────────────────────────────────────────────
+# 2. Update version file
+#──────────────────────────────────────────────────────────────────────
+if [[ "$INCREMENT" == true ]]; then
+    echo "==> Updating client version to $VERSION"
+    run write_version "$VERSION_FILE" "$VERSION"
+    echo ""
+fi
+
+#──────────────────────────────────────────────────────────────────────
+# 3. Build Go client binaries
+#──────────────────────────────────────────────────────────────────────
+echo "==> Building Go client binaries ($VERSION, min engine $MIN_ENGINE_VERSION)"
+
+run make -C "$CLIENT_DIR" clean
+run make -C "$CLIENT_DIR" all VERSION="$VERSION" MIN_ENGINE_VERSION="$MIN_ENGINE_VERSION"
+
+# Collect release assets
+ASSETS=()
+if [[ "$DRY_RUN" == false ]]; then
+    for bin in "$CLIENT_DIR"/dist/kb-*; do
+        if [[ -e "$bin" ]]; then ASSETS+=("$bin"); fi  # skip the unexpanded pattern when nothing matched
+    done
+    echo "    Built ${#ASSETS[@]} binaries"
+else
+    echo "    (skipped — dry run)"
+fi
+echo ""
+
+#──────────────────────────────────────────────────────────────────────
+# 4. Commit, tag, and push
+#──────────────────────────────────────────────────────────────────────
+echo "==> Committing and tagging $TAG"
+
+if [[ "$INCREMENT" == true ]]; then
+    run git -C "$SCRIPT_DIR" add "$VERSION_FILE"
+    run git -C "$SCRIPT_DIR" commit -m "Bump client version to $VERSION"
+fi
+
+run git -C "$SCRIPT_DIR" tag -a "$TAG" -m "Release $TAG"
+run git -C "$SCRIPT_DIR" push origin HEAD
+run git -C "$SCRIPT_DIR" push origin "$TAG"
+
+echo ""
+
+#──────────────────────────────────────────────────────────────────────
+# 5. Create release with assets
+#──────────────────────────────────────────────────────────────────────
+echo "==> Creating release via $FORGE"
+
+RELEASE_TITLE="Client $TAG"
+RELEASE_NOTES="## Go client v${VERSION}
+
+Requires engine v${MIN_ENGINE_VERSION}+
+
+## Client binaries
+
+Download the binary for your platform from the assets below, rename to \`kb\`, and place on your PATH."
+
+if [[ "$FORGE" == "gh" ]]; then
+    ASSET_FLAGS=()
+    for f in "${ASSETS[@]+"${ASSETS[@]}"}"; do
+        ASSET_FLAGS+=("$f")
+    done
+    run gh release create "$TAG" \
+        --title "$RELEASE_TITLE" \
+        --notes "$RELEASE_NOTES" \
+        "${ASSET_FLAGS[@]+"${ASSET_FLAGS[@]}"}"
+
+elif [[ "$FORGE" == "tea" ]]; then
+    run tea release create \
+        --tag "$TAG" \
+        --title "$RELEASE_TITLE" \
+        --note "$RELEASE_NOTES"
+
+    for f in "${ASSETS[@]+"${ASSETS[@]}"}"; do
+        run tea release asset create "$TAG" "$f"
+    done
+fi
+
+echo ""
+echo "==> Release $TAG complete!"
+echo ""
+echo "    Binaries: ${#ASSETS[@]} platform(s) attached to release"
+echo "    Min engine: v$MIN_ENGINE_VERSION"
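
The `bump_version` helper in `release-client.sh` above splits the version on dots and increments one part without checking the input, so a malformed `VERSION` file would produce a malformed tag. A possible stricter variant is sketched below. The name `bump_version_checked` is hypothetical and not part of the script, and the sketch assumes versions are always plain `MAJOR.MINOR.PATCH`.

```bash
# Hypothetical stricter variant of bump_version: rejects malformed input
# instead of silently producing a broken version string.
bump_version_checked() {
    local ver="$1" part="$2"
    if [[ ! "$ver" =~ ^([0-9]+)\.([0-9]+)\.([0-9]+)$ ]]; then
        echo "Error: not a MAJOR.MINOR.PATCH version: '$ver'" >&2
        return 1
    fi
    local major="${BASH_REMATCH[1]}" minor="${BASH_REMATCH[2]}" patch="${BASH_REMATCH[3]}"
    # 10# forces base 10 so components with leading zeros are not parsed as octal.
    case "$part" in
        major) echo "$((10#$major + 1)).0.0" ;;
        minor) echo "$((10#$major)).$((10#$minor + 1)).0" ;;
        patch) echo "$((10#$major)).$((10#$minor)).$((10#$patch + 1))" ;;
        *)
            echo "Error: unknown bump part: '$part'" >&2
            return 1
            ;;
    esac
}

bump_version_checked 2.0.5 minor   # prints 2.1.0
bump_version_checked 2.1.0 major   # prints 3.0.0
```

Under `set -euo pipefail`, a non-zero return inside the command substitution `VERSION="$(bump_version_checked ...)"` would stop the release before any file is written or any tag is created.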
@@ -1,21 +1,9 @@
 #!/usr/bin/env bash
 #
-# release.sh — Build, tag, and release kb-search
-#
-# Builds Go client binaries, Docker engine images, creates a Git tag + release,
-# and pushes container images to the registry.
+# release-engine.sh — Build, tag, and release the engine Docker images
 #
 # Usage:
-#   ./release.sh                    # auto-increment patch, build, release
-#   ./release.sh --no-increment     # release using current VERSION files as-is
-#   ./release.sh --dry-run          # show what would happen without doing it
-#   ./release.sh --minor            # bump minor version (e.g. 2.0.1 → 2.1.0)
-#   ./release.sh --major            # bump major version (e.g. 2.1.0 → 3.0.0)
-#   ./release.sh --gitea            # use Gitea (tea) for release creation
-#   ./release.sh --github           # use GitHub (gh) for release creation
-#
-# One of --gitea or --github is required.
-# Assumes Docker is already authenticated to the registry.
+#   ./release-engine.sh --gitea|--github [--dry-run] [--no-increment] [--patch|--minor|--major]

 set -euo pipefail

@@ -23,11 +11,8 @@ set -euo pipefail
 # Config
 #──────────────────────────────────────────────────────────────────────
 SCRIPT_DIR="$(cd "$(dirname "$0")" && pwd)"
-CLIENT_DIR="$SCRIPT_DIR/client"
 ENGINE_DIR="$SCRIPT_DIR/engine"
-
-CLIENT_VERSION_FILE="$CLIENT_DIR/VERSION"
-ENGINE_VERSION_FILE="$ENGINE_DIR/VERSION"
+VERSION_FILE="$ENGINE_DIR/VERSION"

 # Container registry
 REGISTRY="${REGISTRY:-docker.dcglab.co.uk}"
@@ -106,27 +91,6 @@ write_version() {
    echo "$ver" > "$file"
 }

-#──────────────────────────────────────────────────────────────────────
-# Determine release version
-#──────────────────────────────────────────────────────────────────────
-CURRENT_VERSION="$(read_version "$CLIENT_VERSION_FILE")"
-
-if [[ "$INCREMENT" == true ]]; then
-    VERSION="$(bump_version "$CURRENT_VERSION" "$BUMP")"
-    echo "==> Version bump: $CURRENT_VERSION → $VERSION ($BUMP)"
-else
-    VERSION="$CURRENT_VERSION"
-    echo "==> Version: $VERSION (no increment)"
-fi
-
-TAG="v${VERSION}"
-
-echo "    Tag:       $TAG"
-echo "    Registry:  $IMAGE_BASE"
-echo "    Forge CLI: $FORGE"
-echo "    Dry run:   $DRY_RUN"
-echo ""
-
 run() {
    echo "  $ $*"
    if [[ "$DRY_RUN" == false ]]; then
@@ -134,13 +98,33 @@ run() {
    fi
 }

+#──────────────────────────────────────────────────────────────────────
+# Determine release version
+#──────────────────────────────────────────────────────────────────────
+CURRENT_VERSION="$(read_version "$VERSION_FILE")"
+
+if [[ "$INCREMENT" == true ]]; then
+    VERSION="$(bump_version "$CURRENT_VERSION" "$BUMP")"
+    echo "==> Engine version bump: $CURRENT_VERSION → $VERSION ($BUMP)"
+else
+    VERSION="$CURRENT_VERSION"
+    echo "==> Engine version: $VERSION (no increment)"
+fi
+
+TAG="engine-v${VERSION}"
+
+echo "    Tag:       $TAG"
+echo "    Registry:  $IMAGE_BASE"
+echo "    Forge CLI: $FORGE"
+echo "    Dry run:   $DRY_RUN"
+echo ""
+
 #──────────────────────────────────────────────────────────────────────
 # 1. Pre-flight checks
 #──────────────────────────────────────────────────────────────────────
 echo "==> Pre-flight checks"

 if [[ "$DRY_RUN" == false ]]; then
-    # Check tag doesn't already exist
    if git -C "$SCRIPT_DIR" rev-parse "$TAG" &>/dev/null; then
        echo "Error: tag $TAG already exists"
        exit 1
@@ -151,37 +135,16 @@ echo "    OK"
 echo ""

 #──────────────────────────────────────────────────────────────────────
-# 2. Update version files
+# 2. Update version file
 #──────────────────────────────────────────────────────────────────────
 if [[ "$INCREMENT" == true ]]; then
-    echo "==> Updating version files to $VERSION"
-    run write_version "$CLIENT_VERSION_FILE" "$VERSION"
-    run write_version "$ENGINE_VERSION_FILE" "$VERSION"
+    echo "==> Updating engine version to $VERSION"
+    run write_version "$VERSION_FILE" "$VERSION"
    echo ""
 fi

 #──────────────────────────────────────────────────────────────────────
-# 3. Build Go client binaries
-#──────────────────────────────────────────────────────────────────────
-echo "==> Building Go client binaries ($VERSION)"
-
-run make -C "$CLIENT_DIR" clean
-run make -C "$CLIENT_DIR" all VERSION="$VERSION"
-
-# Collect release assets
-ASSETS=()
-if [[ "$DRY_RUN" == false ]]; then
-    for bin in "$CLIENT_DIR"/dist/kb-*; do
-        ASSETS+=("$bin")
-    done
-    echo "    Built ${#ASSETS[@]} binaries"
-else
-    echo "    (skipped — dry run)"
-fi
-echo ""
-
-#──────────────────────────────────────────────────────────────────────
-# 4. Build Docker engine images
+# 3. Build Docker engine images
 #──────────────────────────────────────────────────────────────────────
 echo "==> Building Docker engine images ($VERSION)"

@@ -196,13 +159,13 @@ run docker build -t "$ROCM_IMAGE" -t "$ROCM_LATEST" -f "$ENGINE_DIR/Dockerfile.r
 echo ""

 #──────────────────────────────────────────────────────────────────────
-# 5. Commit version bump, tag, and push
+# 4. Commit, tag, and push
 #──────────────────────────────────────────────────────────────────────
 echo "==> Committing and tagging $TAG"

 if [[ "$INCREMENT" == true ]]; then
-    run git -C "$SCRIPT_DIR" add "$CLIENT_VERSION_FILE" "$ENGINE_VERSION_FILE"
-    run git -C "$SCRIPT_DIR" commit -m "Bump version to $VERSION"
+    run git -C "$SCRIPT_DIR" add "$VERSION_FILE"
+    run git -C "$SCRIPT_DIR" commit -m "Bump engine version to $VERSION"
 fi

 run git -C "$SCRIPT_DIR" tag -a "$TAG" -m "Release $TAG"
@@ -212,11 +175,11 @@ run git -C "$SCRIPT_DIR" push origin "$TAG"
 echo ""

 #──────────────────────────────────────────────────────────────────────
-# 6. Create release with assets
+# 5. Create release
 #──────────────────────────────────────────────────────────────────────
 echo "==> Creating release via $FORGE"

-RELEASE_TITLE="$TAG"
+RELEASE_TITLE="Engine $TAG"
 RELEASE_NOTES="## Docker images

 \`\`\`bash
@@ -225,38 +188,24 @@ docker pull ${NVIDIA_IMAGE}

 # AMD GPU (ROCm)
 docker pull ${ROCM_IMAGE}
-\`\`\`
-
-## Client binaries
-
-Download the binary for your platform from the assets below, rename to \`kb\`, and place on your PATH."
+\`\`\`"

 if [[ "$FORGE" == "gh" ]]; then
-    ASSET_FLAGS=()
-    for f in "${ASSETS[@]+"${ASSETS[@]}"}"; do
-        ASSET_FLAGS+=("$f")
-    done
    run gh release create "$TAG" \
        --title "$RELEASE_TITLE" \
-        --notes "$RELEASE_NOTES" \
-        "${ASSET_FLAGS[@]+"${ASSET_FLAGS[@]}"}"
+        --notes "$RELEASE_NOTES"

 elif [[ "$FORGE" == "tea" ]]; then
    run tea release create \
        --tag "$TAG" \
        --title "$RELEASE_TITLE" \
        --note "$RELEASE_NOTES"
-
-    # tea attaches assets as positional args: tea release asset create <tag> <file>...
-    for f in "${ASSETS[@]+"${ASSETS[@]}"}"; do
-        run tea release asset create "$TAG" "$f"
-    done
 fi

 echo ""

 #──────────────────────────────────────────────────────────────────────
-# 7. Push Docker images to registry
+# 6. Push Docker images to registry
 #──────────────────────────────────────────────────────────────────────
 echo "==> Pushing Docker images to $REGISTRY"

@@ -271,5 +220,3 @@ echo ""
 echo "    Images:"
 echo "      $NVIDIA_IMAGE"
 echo "      $ROCM_IMAGE"
-echo ""
-echo "    Binaries: ${#ASSETS[@]} platform(s) attached to release"
Author	SHA1	Message	Date
steve	0f3b3be59f	Bump engine version to 2.1.0	2026-03-29 21:06:04 +01:00
steve	2fa2ac1134	Reject single bare word as implicit note shorthand Single unrecognized words now print an error with usage hint instead of being submitted as a note. Prevents typos from creating junk notes. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-29 21:03:52 +01:00
steve	b2176c36ea	Chunk enrichment: prepend document title to embeddings Adds enriched_text column to chunks table that prepends document title (and section header when present) to chunk text. Embeddings and FTS now use enriched text for better search relevance. Includes schema migration with backfill for existing data. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-29 21:03:48 +01:00
steve	5f9946efc9	Added manual download to README	2026-03-29 14:03:35 +01:00
steve	ea3d5707e1	Bump client version to 2.1.0	2026-03-29 13:58:42 +01:00
steve	7f4decee26	Reindex command, implicit note shorthand, add→addfile rename - Add `kb reindex` command with confirmation prompt and --yes flag - Add implicit note shorthand: `kb "my note"` submits a note directly - Rename `add` to `addfile`, remove --note/--title/--type flags - Add client-side file extension validation before upload - Add `kb examples` command for common usage patterns - Update README, SKILL.md, and main specs - Archive completed changes and sync delta specs BREAKING: `kb add` renamed to `kb addfile`, `kb add --note` replaced by `kb "text"` Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-29 13:58:04 +01:00
steve	528a09ca90	Independent client/engine versioning with compatibility check Split release.sh into release-client.sh and release-engine.sh for independent release cadences. Client checks engine version on first API call and hard-fails if engine is below MinEngineVersion. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-28 15:59:16 +00:00
steve	b04823e67b	Store original documents for download after ingestion Persist uploaded files to {data_dir}/documents/{content_hash}{ext} after successful ingestion. Add GET /documents/{id}/file endpoint for retrieval, delete stored files on document deletion, and add `kb export` client command. Includes schema migration, tests, and spec updates. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-28 15:16:27 +00:00
steve	6a4bce4659	Bump version to 2.0.5	2026-03-26 23:08:48 +00:00
steve	4590c124ad	Merge pull request 'Upload-time dedup, FTS5 query sanitization, release guard' (#1 ) from 2.0.5 into main Reviewed-on: #1	2026-03-26 23:06:08 +00:00
@@ -1 +1 @@
 .0.4
 .1.0