BREAKING: Remove Dockerfile.rocm, compose.rocm.yaml, and ROCm image
build/push from the release pipeline. Remove AMD quick-start and ROCm
references from README and DEVELOPER docs. Update docker-deployment
and developer-docs specs to reflect CPU + NVIDIA only.
The ROCm variant added significant complexity (a 4.2 GB torch wheel and
a >20 GB container) for limited usage. Users on AMD GPUs should stay
on engine v3.2.x or switch to CPU mode.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
New MCP server (mcp/) exposes kb operations as native MCP tools over
Streamable HTTP with Bearer token auth. Supports collections via tag
conventions, chunked file uploads, and agent-side search patterns.
Engine gains PATCH /api/v1/notes/{id} for in-place note updates with
transactional re-chunk/re-embed, and an updated_at column on documents.
Go client adds updatenote command and Patch HTTP method.
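
A minimal sketch of what a call to the new PATCH endpoint might look
like from a script. Only the path /api/v1/notes/{id} comes from this
change; the base URL, the note id, and the "content" field name are
assumptions for illustration, and the request is built but not sent.

```python
import json
import urllib.request


def build_note_patch(base_url: str, note_id: str, fields: dict) -> urllib.request.Request:
    """Build (but do not send) a PATCH request for an existing note."""
    # Path shape taken from the commit message; everything else is assumed.
    url = f"{base_url.rstrip('/')}/api/v1/notes/{note_id}"
    body = json.dumps(fields).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="PATCH",
        headers={"Content-Type": "application/json"},
    )


# Hypothetical host, note id, and payload field.
req = build_note_patch("http://localhost:8000", "42", {"content": "revised text"})
print(req.get_method(), req.full_url)
```

Passing the request to urllib.request.urlopen would send it; any
authentication the engine requires is not shown here.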
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Add Dockerfile.cpu and compose.cpu.yaml for CPU-only deployments
- Use sentence-transformers[onnx] + CPU-only torch for a ~4x smaller image
- Fix release script: separate git tags (engine-v*) from Docker tags (v*)
- Add CPU image to release build/push pipeline
- Update README with CPU deployment instructions
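
The tag separation in the release script fix can be illustrated with a
small sketch. It assumes the Docker tag is simply the git tag with the
"engine-" prefix removed; the actual release script is a shell script
and may derive the tag differently. The function name and the version
number are hypothetical.

```python
GIT_TAG_PREFIX = "engine-"


def docker_tag_from_git_tag(git_tag: str) -> str:
    """Map a git tag of the form engine-v* to a Docker tag of the form v*."""
    if not git_tag.startswith(GIT_TAG_PREFIX + "v"):
        raise ValueError(f"expected a tag like engine-vX.Y.Z, got {git_tag!r}")
    # Drop only the "engine-" prefix so the leading "v" is kept.
    return git_tag[len(GIT_TAG_PREFIX):]


print(docker_tag_from_git_tag("engine-v3.3.0"))
```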
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Split release.sh into release-client.sh and release-engine.sh for
independent release cadences. Client checks engine version on first
API call and hard-fails if engine is below MinEngineVersion.
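
The hard-fail check described above could look roughly like the sketch
below. It is written in Python for illustration only (the real client
is Go), assumes plain MAJOR.MINOR.PATCH version strings, and uses a
made-up minimum version and error type.

```python
MIN_ENGINE_VERSION = "3.3.0"  # hypothetical value, not taken from the client


class EngineTooOldError(RuntimeError):
    """Raised when the engine reports a version below the required minimum."""


def parse_version(text: str) -> tuple:
    """Turn '3.2.1' or 'v3.2.1' into (3, 2, 1) for numeric comparison."""
    if text.startswith("v"):
        text = text[1:]
    return tuple(int(part) for part in text.split("."))


def check_engine_version(reported: str, minimum: str = MIN_ENGINE_VERSION) -> None:
    """Fail hard if the reported engine version is below the minimum."""
    if parse_version(reported) < parse_version(minimum):
        raise EngineTooOldError(
            f"engine {reported} is older than required {minimum}"
        )


check_engine_version("v3.10.0")  # passes: 10 > 3 when compared as integers
```

Comparing integer tuples rather than raw strings avoids treating
"3.10.0" as older than "3.3.0".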
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>