Remove AMD ROCm support — CPU and NVIDIA only

BREAKING: Remove Dockerfile.rocm, compose.rocm.yaml, and ROCm image
build/push from the release pipeline. Remove AMD quick-start and ROCm
references from README and DEVELOPER docs. Update docker-deployment
and developer-docs specs to reflect CPU + NVIDIA only.

The ROCm variant added significant complexity (4.2GB torch wheel,
>20GB container) with limited usage. Users on AMD GPUs should stay
on engine v3.2.x or switch to CPU mode.
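
A minimal sketch of the CPU fallback mentioned above, assuming the compose file paths shown in the README (`engine/compose.nvidia.yaml`, `engine/compose.cpu.yaml`). The helper name `kb_compose_file` is hypothetical and not part of the repo. It only maps a device name to one of the remaining compose files, sending former ROCm users to the CPU file.

```shell
#!/bin/sh
# Hypothetical helper (not in the repo): map a device name to one of the
# compose files that remain after this commit. "rocm" and "amd" fall back
# to the CPU compose file, since ROCm images are no longer built.
kb_compose_file() {
  case "$1" in
    nvidia)
      echo "engine/compose.nvidia.yaml" ;;
    cpu)
      echo "engine/compose.cpu.yaml" ;;
    rocm|amd)
      echo "ROCm is no longer supported; falling back to CPU" >&2
      echo "engine/compose.cpu.yaml" ;;
    *)
      echo "unknown device: $1" >&2
      return 1 ;;
  esac
}

# Usage (not executed here): start the engine on CPU with the existing data.
# KB_DATA_PATH=~/kb-data docker compose -f "$(kb_compose_file rocm)" up -d
```

Because the data directory is device-agnostic, the same `KB_DATA_PATH` should work unchanged after the switch.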

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 16:39:37 +01:00
parent 17b19999de
commit 574370e8d1
12 changed files with 174 additions and 185 deletions
+2 -17
@@ -12,7 +12,7 @@ Go CLI (kb) ──HTTP──▶ FastAPI Engine (Docker) ──▶ SQLite + GPU
MCP Agents ──MCP/HTTP──▶ MCP Server (Docker) ──┘
```
-- **Engine**: Keeps the embedding model warm in memory. Handles search, ingestion, document management, and note mutation via REST API. Runs in Docker with NVIDIA GPU, AMD GPU (ROCm), or CPU-only support.
+- **Engine**: Keeps the embedding model warm in memory. Handles search, ingestion, document management, and note mutation via REST API. Runs in Docker with NVIDIA GPU or CPU-only support.
- **Client**: Single static Go binary. No Python, no ML dependencies, instant startup. Talks to the engine over HTTP.
- **MCP Server**: Exposes kb operations as native MCP tools over Streamable HTTP. Runs as a separate Docker container alongside the engine. Use tags to scope agent data from user documents.
- **Storage**: Single SQLite database with FTS5 (keyword search) and sqlite-vec (vector search). Portable via bind mount — just copy the data directory between hosts.
@@ -35,18 +35,6 @@ docker run -d --name kb-engine \
--restart unless-stopped \
docker.dcglab.co.uk/dcg/kb/engine:latest-nvidia
-# AMD GPU (ROCm)
-docker run -d --name kb-engine \
---device /dev/kfd --device /dev/dri \
---group-add video \
--p 8000:8000 \
--v ~/kb-data:/data \
--e KB_MODEL=all-MiniLM-L6-v2 \
--e KB_DEVICE=auto \
--e KB_API_KEY=your-secret-key \
---restart unless-stopped \
-docker.dcglab.co.uk/dcg/kb/engine:latest-rocm
# CPU only (no GPU required — smaller image)
docker run -d --name kb-engine \
-p 8000:8000 \
@@ -63,9 +51,6 @@ Or use a compose file from the repo:
# NVIDIA GPU
KB_DATA_PATH=~/kb-data docker compose -f engine/compose.nvidia.yaml up -d
-# AMD GPU (ROCm)
-KB_DATA_PATH=~/kb-data docker compose -f engine/compose.rocm.yaml up -d
# CPU only
KB_DATA_PATH=~/kb-data docker compose -f engine/compose.cpu.yaml up -d
```
@@ -192,7 +177,7 @@ rsync -a ~/kb-data/ user@target:/home/user/kb-data/
KB_DATA_PATH=~/kb-data docker compose -f compose.nvidia.yaml up -d
```
-Data is device-agnostic — you can ingest on NVIDIA and serve from AMD or CPU (or any combination) with the same data directory.
+Data is device-agnostic — you can ingest on NVIDIA and serve from CPU (or vice versa) with the same data directory.
## MCP server (agent integration)