Add GPU device control, Docker support, and v2 client-server design

- Add configurable device selection for embeddings (embedding.device) and
  Docling ingestion (ingestion.device) with env var overrides (KB_DEVICE,
  KB_INGEST_DEVICE) to control GPU/CPU usage per component
- Add `kb doctor` command for safe GPU diagnostics
- Add Dockerfile (NVIDIA CUDA) and compose.yaml for containerised GPU usage
- Add OpenSpec v2 change (kb-v2-client-server): proposal, design, specs, and
  tasks for client-server architecture with Go CLI, FastAPI engine, async
  ingestion queue, and GPU-vendor-agnostic Docker deployment
- Add uv.lock for reproducible installs
- Gitignore examples/ directory (test data only)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
This commit is contained in:
2026-03-25 20:17:31 +00:00
parent f245c24928
commit 2030976b85
20 changed files with 4321 additions and 29 deletions
@@ -0,0 +1,177 @@
## ADDED Requirements
### Requirement: Single static binary with zero runtime dependencies
The Go client SHALL compile to a single static binary with no runtime dependencies. It SHALL support cross-compilation for Linux (amd64, arm64), macOS (amd64, arm64), and Windows (amd64).
#### Scenario: Install on a clean machine
- **WHEN** a user downloads the `kb` binary for their platform
- **THEN** they SHALL be able to run it immediately with no additional installs (no Python, no Docker, no shared libraries)
---
### Requirement: Client configuration
The client SHALL read configuration from `~/.kb/client.yaml`. Configuration values SHALL be overridable via environment variables and CLI flags. Precedence: CLI flags > environment variables > config file > defaults.
#### Scenario: Default configuration
- **WHEN** no config file exists and no env vars or flags are set
- **THEN** the client SHALL use defaults: engine URL `http://localhost:8000`, no API key, format `human`
#### Scenario: Config file
- **WHEN** `~/.kb/client.yaml` contains `engine_url: https://kb.example.com`
- **THEN** the client SHALL use that URL for all API requests
#### Scenario: Environment variable override
- **WHEN** `KB_ENGINE_URL` is set
- **THEN** it SHALL override the config file value
#### Scenario: CLI flag override
- **WHEN** the user passes `--engine https://other.host:8000`
- **THEN** it SHALL override both the config file and environment variable
#### Scenario: Engine unreachable
- **WHEN** the client cannot connect to the engine URL
- **THEN** it SHALL print a clear error message (e.g., "Cannot reach engine at http://localhost:8000 — is it running?") and exit with a non-zero code
---
### Requirement: Search command
The client SHALL provide a `kb search <query>` command that sends the query to the engine and displays results.
#### Scenario: Human-readable search output
- **WHEN** the user runs `kb search "how to change oil"`
- **THEN** the client SHALL POST to `/api/v1/search`, and display results in a human-readable format showing rank, score, document title, page/section, doc type, tags, and a text snippet
#### Scenario: JSON search output
- **WHEN** the user runs `kb search "query" --format json`
- **THEN** the client SHALL output the raw JSON response from the engine
#### Scenario: Search with filters
- **WHEN** the user runs `kb search "brakes" --tags maintenance --type pdf --top 3`
- **THEN** the client SHALL include the filters in the API request body
#### Scenario: Search mode flags
- **WHEN** the user runs `kb search "error" --fts-only`
- **THEN** the client SHALL set `fts_only: true` in the request body
---
### Requirement: Add command (file and note ingestion)
The client SHALL provide a `kb add` command that uploads files or notes to the engine for async ingestion. The client SHALL exit immediately after a successful upload.
#### Scenario: Add a single file
- **WHEN** the user runs `kb add report.pdf`
- **THEN** the client SHALL upload the file via `POST /api/v1/jobs` (multipart), print "Queued: report.pdf", and exit
#### Scenario: Add a file with tags
- **WHEN** the user runs `kb add manual.pdf --tags car,maintenance`
- **THEN** the client SHALL include the tags in the multipart upload metadata
#### Scenario: Add a directory recursively
- **WHEN** the user runs `kb add ~/documents/ --recursive`
- **THEN** the client SHALL discover all supported files in the directory tree, upload each one sequentially, and print "Queued: N files"
#### Scenario: Add a text note
- **WHEN** the user runs `kb add --note "The server room is in building 3, floor 2"`
- **THEN** the client SHALL submit the note text via `POST /api/v1/jobs` (multipart with note field), print "Queued: note", and exit
#### Scenario: Add with JSON output
- **WHEN** the user runs `kb add report.pdf --format json`
- **THEN** the client SHALL output the JSON response from the engine including the job_id
#### Scenario: File not found
- **WHEN** the user runs `kb add nonexistent.pdf`
- **THEN** the client SHALL print an error and exit with a non-zero code without making any API call
#### Scenario: Upload failure
- **WHEN** the upload fails (network error, engine returns 4xx/5xx)
- **THEN** the client SHALL print the error and exit with a non-zero code
---
### Requirement: Jobs command
The client SHALL provide a `kb jobs` command to view the ingestion queue.
#### Scenario: List all jobs
- **WHEN** the user runs `kb jobs`
- **THEN** the client SHALL fetch `GET /api/v1/jobs` and display a table of recent jobs showing ID, filename, status, and timestamp
#### Scenario: Filter jobs by status
- **WHEN** the user runs `kb jobs --status failed`
- **THEN** the client SHALL pass the status filter and display only matching jobs
#### Scenario: Job details
- **WHEN** the user runs `kb jobs <id>`
- **THEN** the client SHALL fetch `GET /api/v1/jobs/{id}` and display full job details including error message (if failed), document_id (if done), and chunk count
---
### Requirement: Document management commands
The client SHALL provide commands to list, inspect, and remove documents.
#### Scenario: List documents
- **WHEN** the user runs `kb list`
- **THEN** the client SHALL fetch `GET /api/v1/documents` and display a table of documents with ID, title, type, tags, chunk count, and date
#### Scenario: List with filters
- **WHEN** the user runs `kb list --type pdf --tags manual`
- **THEN** the client SHALL pass filters as query parameters
#### Scenario: Document info
- **WHEN** the user runs `kb info <id>`
- **THEN** the client SHALL fetch `GET /api/v1/documents/{id}` and display full document details
#### Scenario: Remove a document
- **WHEN** the user runs `kb remove <id>`
- **THEN** the client SHALL prompt for confirmation, then send `DELETE /api/v1/documents/{id}` and display the result
#### Scenario: Remove with skip confirmation
- **WHEN** the user runs `kb remove <id> --yes`
- **THEN** the client SHALL skip the confirmation prompt
---
### Requirement: Tag management commands
The client SHALL provide commands to list and manage tags.
#### Scenario: List tags
- **WHEN** the user runs `kb tags`
- **THEN** the client SHALL fetch `GET /api/v1/tags` and display tags with document counts
#### Scenario: Add tags to a document
- **WHEN** the user runs `kb tag <id> --add manual,v2`
- **THEN** the client SHALL send `PUT /api/v1/documents/{id}/tags` with the add payload
#### Scenario: Remove tags from a document
- **WHEN** the user runs `kb tag <id> --remove draft`
- **THEN** the client SHALL send `PUT /api/v1/documents/{id}/tags` with the remove payload
---
### Requirement: Status command
The client SHALL provide a `kb status` command to display engine status.
#### Scenario: Display engine status
- **WHEN** the user runs `kb status`
- **THEN** the client SHALL fetch `GET /api/v1/status` and display model name, embedding dimensions, GPU info, document counts by type, total chunks, database size, and queue status
---
### Requirement: Global output format flag
All commands SHALL support a `--format` flag accepting `human` (default) or `json`. The default MAY be changed via the `default_format` config value.
#### Scenario: JSON output on any command
- **WHEN** the user passes `--format json` to any command
- **THEN** the client SHALL output the raw JSON response from the engine without human formatting
#### Scenario: Human output (default)
- **WHEN** the user runs any command without `--format`
- **THEN** the client SHALL format the response in a human-readable table or structured text output