The standalone Registry v2 host (docker.dcglab.co.uk, briefly registry.dcglab.co.uk)
is being scrapped. Move all kb images to Gitea's built-in container registry.
Agents were misreading kb_search as keyword-only because the vector/semantic
component was only mentioned in the negative ("fts_only: no vector similarity").
Lead with hybrid semantic + BM25 + RRF in the server instructions, kb_search
docstring, and MCP.md so agents recognise it as a vector search tool.
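
For readers unfamiliar with the fusion step, here is a minimal sketch of reciprocal rank fusion over a vector ranking and a BM25 ranking. The function name and the constant k=60 are assumptions chosen for illustration; the engine's actual fusion code may differ.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of ids into a single ranking.

    Each id scores 1 / (k + rank) per list it appears in (rank is
    1-based); the scores are summed across lists. k=60 is an assumed
    smoothing constant, not a value taken from the engine.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by id for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


vector_hits = ["a", "b", "c"]  # hypothetical semantic ranking
bm25_hits = ["b", "d", "a"]    # hypothetical keyword ranking
fused = reciprocal_rank_fusion([vector_hits, bm25_hits])
```

Ids that rank well in both lists rise to the top, which is why a hybrid search can surface results that neither ranking alone would put first.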
Captures pain points found while trying to locate an uploaded PDF: kb
list silently ignores positional args, kb search results lack
document_id, kb info dumps all chunks with no summary mode, and
scan-heavy PDFs produce noisy single-char chunk hits.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
BREAKING: Remove Dockerfile.rocm, compose.rocm.yaml, and ROCm image
build/push from the release pipeline. Remove AMD quick-start and ROCm
references from README and DEVELOPER docs. Update docker-deployment
and developer-docs specs to reflect CPU + NVIDIA only.
The ROCm variant added significant complexity (4.2GB torch wheel,
>20GB container) with limited usage. Users on AMD GPUs should stay
on engine v3.2.x or switch to CPU mode.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
NVIDIA: install torch+torchvision from PyTorch cu130 index, drop
onnxruntime-gpu. ROCm: use local torch wheel with rocm6.4 index for
torchvision, clean up NVIDIA remnants from the venv.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Sanitize / and \ in note titles and filenames when writing to the
staging directory. The leading slash in a title like "/reset skill" was
interpreted as a path separator, causing a FileNotFoundError and a 500
from the jobs endpoint. Also add PRAGMA busy_timeout=5000 to SQLite
connections to prevent immediate failure under concurrent write load.
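
A minimal sketch of both fixes, assuming separators are replaced with an underscore and that empty results fall back to a placeholder name. The helper names `safe_filename` and `connect` are hypothetical and not taken from the engine.

```python
import re
import sqlite3


def safe_filename(title: str) -> str:
    """Replace / and \\ so a title can never be read as a path."""
    cleaned = re.sub(r"[/\\]", "_", title).strip()
    # Assumed fallback for titles that are empty after cleaning.
    return cleaned or "untitled"


def connect(path: str) -> sqlite3.Connection:
    """Open a connection that waits up to 5 s on a locked database."""
    conn = sqlite3.connect(path)
    conn.execute("PRAGMA busy_timeout=5000")
    return conn


name = safe_filename("/reset skill")
conn = connect(":memory:")
timeout_ms = conn.execute("PRAGMA busy_timeout").fetchone()[0]
```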
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The MCP SDK's DNS rebinding protection rejects remote clients with 421
when the Host header isn't in the allowlist. Add KB_MCP_ALLOWED_HOSTS env
var (comma-separated IPs/FQDNs) to configure additional allowed hosts
while keeping localhost always permitted.
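
The parsing described above could look roughly like this. It is a sketch only: the default host list and the function name are assumptions, and how the result is handed to the MCP SDK is not shown.

```python
import os

# Assumed set of hosts that are always permitted.
DEFAULT_HOSTS = ("localhost", "127.0.0.1")


def allowed_hosts(env=None):
    """Return the default hosts plus any listed in KB_MCP_ALLOWED_HOSTS."""
    env = os.environ if env is None else env
    raw = env.get("KB_MCP_ALLOWED_HOSTS", "")
    hosts = list(DEFAULT_HOSTS)
    for host in (h.strip() for h in raw.split(",")):
        # Skip blanks (e.g. a trailing comma) and duplicates.
        if host and host not in hosts:
            hosts.append(host)
    return hosts


hosts = allowed_hosts({"KB_MCP_ALLOWED_HOSTS": " 10.0.0.5, kb.example.com ,"})
```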
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- kb_status now returns authenticated: true/false so clients can verify auth
- Server instructions mention Bearer token auth requirement
- Add .env, .venv/, test_mcp_client.py to .gitignore
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Move MCP server documentation from README into dedicated MCP.md.
Add configuration examples for Claude Code, VS Code, Cursor,
Windsurf, and JetBrains IDEs.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
When the engine returns 401 (auth required) or another non-200 response,
the version check parsed the error body, got an empty version string, and
exited fatally. The client now skips the check on non-200 responses and
lets the actual API call surface the real error.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
New MCP server (mcp/) exposes kb operations as native MCP tools over
Streamable HTTP with Bearer token auth. Supports collections via tag
conventions, chunked file uploads, and agent-side search patterns.
Engine gains PATCH /api/v1/notes/{id} for in-place note updates with
transactional re-chunk/re-embed, and updated_at column on documents.
Go client adds updatenote command and Patch HTTP method.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The Go client struct expected a nested document object and top-level
page/section fields, but the engine returns flat results with metadata
in chunk_metadata. This caused empty display for title, type, tags,
page, and section in human output mode.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Add Dockerfile.cpu and compose.cpu.yaml for CPU-only deployments
- Use sentence-transformers[onnx] + CPU-only torch for ~4x smaller image
- Fix release script: separate git tags (engine-v*) from Docker tags (v*)
- Add CPU image to release build/push pipeline
- Update README with CPU deployment instructions
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Two changes:
1. structured-add-commands: The implicit note shorthand (kb "text") caused
accidental note creation from mistyped commands. Replaced with explicit
kb addnote <text> command. Root command reverts to standard Cobra
behaviour. Updated examples, tests, SKILL.md, and specs.
2. split-readme-developer-docs: Moved build-from-source instructions, release
process, API reference, and ROCm migration notes from README.md into a
new DEVELOPER.md. README now links to DEVELOPER.md for dev workflows.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Single unrecognized words now print an error with usage hint instead of
being submitted as a note. Prevents typos from creating junk notes.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Adds enriched_text column to chunks table that prepends document title
(and section header when present) to chunk text. Embeddings and FTS now
use enriched text for better search relevance. Includes schema migration
with backfill for existing data.
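
As an illustration of the enrichment step, a minimal sketch follows. The newline separator and the function name `enrich` are assumptions; the engine may join the parts differently.

```python
def enrich(chunk_text, title, section=None):
    """Prepend the document title, and the section header if present."""
    parts = [title]
    if section:
        parts.append(section)
    parts.append(chunk_text)
    # Assumed separator: one part per line.
    return "\n".join(parts)


with_section = enrich("Run the installer.", "Setup Guide", "Linux")
without_section = enrich("Run the installer.", "Setup Guide")
```

The enriched string is what would be embedded and indexed, so a short chunk still carries the context of the document it came from.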
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Split release.sh into release-client.sh and release-engine.sh for
independent release cadences. Client checks engine version on first
API call and hard-fails if engine is below MinEngineVersion.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Persist uploaded files to {data_dir}/documents/{content_hash}{ext} after
successful ingestion. Add GET /documents/{id}/file endpoint for retrieval,
delete stored files on document deletion, and add `kb export` client command.
Includes schema migration, tests, and spec updates.
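
The storage layout above can be sketched as follows, assuming the content hash is a SHA-256 hex digest and the extension comes from the uploaded filename. Both are assumptions, as is the helper name `stored_path`.

```python
import hashlib
from pathlib import Path


def stored_path(data_dir, content: bytes, original_name: str) -> Path:
    """Build {data_dir}/documents/{content_hash}{ext} for an upload."""
    digest = hashlib.sha256(content).hexdigest()  # assumed hash function
    ext = Path(original_name).suffix              # e.g. ".pdf", or "" if none
    return Path(data_dir) / "documents" / f"{digest}{ext}"


path = stored_path("/data", b"hello", "report.pdf")
```

Naming files by content hash means two uploads of the same bytes map to the same path regardless of the original filename.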
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Reject duplicate uploads at the API boundary (HTTP 409) instead of
silently skipping in the background worker. Checks both ingested
documents and in-flight jobs via content_hash on the jobs table.
- Go client handles 409 with distinct messages for already-imported
documents vs already-queued jobs.
- Sanitize FTS5 search queries by quoting each token to prevent syntax
errors from special characters like ?, *, ", (), AND, OR, NOT.
- Add try/except safety net around FTS5 execute for edge cases.
- Add main branch guard to release.sh to prevent releasing from
feature branches.
- Update specs and README to reflect new behaviour.
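
The token-quoting approach to FTS5 sanitization can be sketched like this. The function name is hypothetical and whitespace tokenization is an assumption; the engine's tokenization may differ.

```python
def fts5_quote(query: str) -> str:
    """Quote each whitespace-separated token as an FTS5 string.

    Inside an FTS5 string, a double quote is escaped by doubling it.
    Quoted tokens are treated as plain text, so AND, OR, NOT, * and
    parentheses lose their special meaning.
    """
    tokens = query.split()
    return " ".join('"' + t.replace('"', '""') + '"' for t in tokens)


quoted = fts5_quote("what is AND?")
```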
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Remove v1 Python CLI (src/kb_search/, tests/, root pyproject.toml, uv.lock, .venv)
- Add Go client with cross-platform build (client/)
- Add FastAPI engine with NVIDIA and multi-stage ROCm Dockerfiles (engine/)
- Add VERSION files for client and engine, wired into builds
- Add release.sh for automated build, tag, release, and Docker push
- Update README with build/release docs and ROCm migration note
- Clean up .gitignore for v2 project structure
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Add configurable device selection for embeddings (embedding.device) and
Docling ingestion (ingestion.device) with env var overrides (KB_DEVICE,
KB_INGEST_DEVICE) to control GPU/CPU usage per component
- Add `kb doctor` command for safe GPU diagnostics
- Add Dockerfile (NVIDIA CUDA) and compose.yaml for containerised GPU usage
- Add OpenSpec v2 change (kb-v2-client-server): proposal, design, specs, and
tasks for client-server architecture with Go CLI, FastAPI engine, async
ingestion queue, and GPU-vendor-agnostic Docker deployment
- Add uv.lock for reproducible installs
- Gitignore examples/ directory (test data only)
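
The per-component device override could be resolved roughly as below. This is a sketch: the function name and the rule that a non-empty env var wins over the config value are assumptions.

```python
import os


def resolve_device(config_value, env_var, env=None):
    """Return the env var override if set and non-empty, else the config value."""
    env = os.environ if env is None else env
    override = env.get(env_var, "").strip()
    return override or config_value


# Hypothetical values: config says "cuda", the environment forces CPU ingestion.
env = {"KB_INGEST_DEVICE": "cpu"}
embed_device = resolve_device("cuda", "KB_DEVICE", env)
ingest_device = resolve_device("cuda", "KB_INGEST_DEVICE", env)
```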
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>