# Go Client

## Purpose

The Go client (`kb`) provides a command-line interface for interacting with the knowledge base engine, supporting search, document ingestion, job tracking, document management, tag management, and status display.

## Requirements

### Requirement: Single static binary with zero runtime dependencies

The Go client SHALL compile to a single static binary with no runtime dependencies. It SHALL support cross-compilation for Linux (amd64, arm64), macOS (amd64, arm64), and Windows (amd64). The build SHALL inject both `Version` and `MinEngineVersion` via ldflags.

#### Scenario: Install on a clean machine
- **WHEN** a user downloads the `kb` binary for their platform
- **THEN** they SHALL be able to run it immediately with no additional installs (no Python, no Docker, no shared libraries)

#### Scenario: Version and compatibility info embedded at build time
- **WHEN** the client is built with `make all VERSION=2.1.0 MIN_ENGINE_VERSION=2.0.0`
- **THEN** `kb --version` SHALL report `2.1.0` and the compatibility check SHALL use `2.0.0` as the minimum engine version

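
The build-time injection described above could look roughly like the following. This is a minimal sketch, not the project's actual source: it assumes the two variables live in package `main`, that `"dev"` is the default for builds without ldflags, and that the Makefile ultimately runs a `go build` command similar to the one in the comment.

```go
package main

import "fmt"

// Version and MinEngineVersion default to "dev" (an assumed placeholder) and
// are overwritten at link time, for example:
//
//	CGO_ENABLED=0 GOOS=linux GOARCH=arm64 go build \
//	  -ldflags "-X main.Version=2.1.0 -X main.MinEngineVersion=2.0.0" -o kb .
//
// CGO_ENABLED=0 keeps the binary statically linked; GOOS and GOARCH select
// the cross-compilation target.
var (
	Version          = "dev"
	MinEngineVersion = "dev"
)

// versionString returns the text printed for `kb --version`.
func versionString() string {
	return Version
}

func main() {
	fmt.Println(versionString())
	fmt.Println("minimum engine version:", MinEngineVersion)
}
```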
---

### Requirement: Client configuration

The client SHALL read configuration from `~/.kb/client.yaml`. Configuration values SHALL be overridable via environment variables and CLI flags. Precedence: CLI flags > environment variables > config file > defaults.

#### Scenario: Default configuration
- **WHEN** no config file exists and no env vars or flags are set
- **THEN** the client SHALL use defaults: engine URL `http://localhost:8000`, no API key, format `human`

#### Scenario: Config file
- **WHEN** `~/.kb/client.yaml` contains `engine_url: https://kb.example.com`
- **THEN** the client SHALL use that URL for all API requests

#### Scenario: Environment variable override
- **WHEN** `KB_ENGINE_URL` is set
- **THEN** it SHALL override the config file value

#### Scenario: CLI flag override
- **WHEN** the user passes `--engine https://other.host:8000`
- **THEN** it SHALL override both the config file and environment variable

#### Scenario: Engine unreachable
- **WHEN** the client cannot connect to the engine URL
- **THEN** it SHALL print a clear error message (e.g., "Cannot reach engine at http://localhost:8000 — is it running?") and exit with a non-zero code

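
The precedence rule for the engine URL can be illustrated with a small resolver. This is a sketch under stated assumptions: the helper names `firstNonEmpty` and `resolveEngineURL` are hypothetical, an empty string stands for "not set", and the config-file value is passed in already parsed rather than read from `~/.kb/client.yaml`.

```go
package main

import (
	"fmt"
	"os"
)

// defaultEngineURL is the default named in the "Default configuration" scenario.
const defaultEngineURL = "http://localhost:8000"

// firstNonEmpty returns the first non-empty value, or "" if all are empty.
func firstNonEmpty(values ...string) string {
	for _, v := range values {
		if v != "" {
			return v
		}
	}
	return ""
}

// resolveEngineURL applies the precedence:
// CLI flag > environment variable > config file > default.
func resolveEngineURL(flagValue, envValue, fileValue string) string {
	return firstNonEmpty(flagValue, envValue, fileValue, defaultEngineURL)
}

func main() {
	// Hypothetical inputs: no --engine flag, KB_ENGINE_URL taken from the
	// environment, and a value that would have come from the config file.
	url := resolveEngineURL("", os.Getenv("KB_ENGINE_URL"), "https://kb.example.com")
	fmt.Println("engine URL:", url)
}
```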
---

### Requirement: Search command

The client SHALL provide a `kb search <query>` command that sends the query to the engine and displays results.

#### Scenario: Human-readable search output
- **WHEN** the user runs `kb search "how to change oil"`
- **THEN** the client SHALL POST to `/api/v1/search` and display results in a human-readable format showing rank, score, document title, page/section, doc type, tags, and a text snippet
- **THEN** the client SHALL parse search results as flat objects with top-level `title`, `doc_type`, `tags`, `score`, `text`, `chunk_index` fields
- **THEN** the client SHALL extract `page` from `chunk_metadata` when present (PDF documents)
- **THEN** the client SHALL extract `section_header` from `chunk_metadata` when present (markdown documents)

#### Scenario: JSON search output
- **WHEN** the user runs `kb search "query" --format json`
- **THEN** the client SHALL output the raw JSON response from the engine

#### Scenario: Search with filters
- **WHEN** the user runs `kb search "brakes" --tags maintenance --type pdf --top 3`
- **THEN** the client SHALL include the filters in the API request body

#### Scenario: Search mode flags
- **WHEN** the user runs `kb search "error" --fts-only`
- **THEN** the client SHALL set `fts_only: true` in the request body

#### Scenario: PDF result with page number
- **WHEN** a search result has `chunk_metadata` containing `{"page": 12}`
- **THEN** the human output SHALL display "Page 12" in the location line

#### Scenario: Markdown result with section header
- **WHEN** a search result has `chunk_metadata` containing `{"section_header": "Installation > Prerequisites"}`
- **THEN** the human output SHALL display "Installation > Prerequisites" in the location line

#### Scenario: Result with both page and section
- **WHEN** a search result has `chunk_metadata` containing both `page` and `section_header`
- **THEN** the human output SHALL display both separated by " / "

#### Scenario: Result with no location metadata
- **WHEN** a search result has empty `chunk_metadata` or no page/section keys
- **THEN** the human output SHALL omit the location line entirely

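
The flat result shape and the location-line scenarios above could be handled along these lines. This is a sketch, not the client's actual struct: the type and function names are hypothetical, `tags` is assumed to be a JSON array of strings, and putting the page before the section header is an assumption, since the scenarios only fix the " / " separator.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// SearchResult mirrors the flat result object. Tags as []string is an assumption.
type SearchResult struct {
	Title         string                 `json:"title"`
	DocType       string                 `json:"doc_type"`
	Tags          []string               `json:"tags"`
	Score         float64                `json:"score"`
	Text          string                 `json:"text"`
	ChunkIndex    int                    `json:"chunk_index"`
	ChunkMetadata map[string]interface{} `json:"chunk_metadata"`
}

// parseResult decodes a single result object.
func parseResult(data []byte) (SearchResult, error) {
	var r SearchResult
	err := json.Unmarshal(data, &r)
	return r, err
}

// locationLine builds the location text from chunk_metadata.
// It returns "" when the location line should be omitted.
func locationLine(r SearchResult) string {
	var parts []string
	// encoding/json decodes JSON numbers into float64 inside interface{} values.
	if page, ok := r.ChunkMetadata["page"].(float64); ok {
		parts = append(parts, fmt.Sprintf("Page %d", int(page)))
	}
	if section, ok := r.ChunkMetadata["section_header"].(string); ok && section != "" {
		parts = append(parts, section)
	}
	return strings.Join(parts, " / ")
}

func main() {
	raw := []byte(`{"title":"Manual","doc_type":"pdf","tags":["car"],"score":0.91,
		"text":"...","chunk_index":3,"chunk_metadata":{"page":12}}`)
	r, err := parseResult(raw)
	if err != nil {
		panic(err)
	}
	if loc := locationLine(r); loc != "" {
		fmt.Println(loc)
	}
}
```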
---

### Requirement: Add note command

The client SHALL provide a `kb addnote <text>` command that submits a text note to the engine for ingestion. The command SHALL take exactly one positional argument (the note text) and support a `--tags` flag for comma-separated tags. The note SHALL be submitted via `POST /api/v1/jobs` with the `note` field in a multipart request.

#### Scenario: Add a note
- **WHEN** the user runs `kb addnote "remember to update DNS records"`
- **THEN** the client SHALL submit the text as a note via `POST /api/v1/jobs` and print `Queued: note`

#### Scenario: Add a note with tags
- **WHEN** the user runs `kb addnote "server room is building 3" --tags ops`
- **THEN** the client SHALL submit the note with the specified tags

#### Scenario: Add a note with JSON output
- **WHEN** the user runs `kb addnote "my note" --format json`
- **THEN** the client SHALL output the raw JSON response from the engine

#### Scenario: Duplicate note detection
- **WHEN** the user runs `kb addnote "my note"` and the engine returns HTTP 409
- **THEN** the client SHALL display the duplicate information (document ID or job ID) and exit with code 0

#### Scenario: Missing argument
- **WHEN** the user runs `kb addnote` with no arguments
- **THEN** the client SHALL display an error indicating that the note text argument is required

#### Scenario: Too many arguments
- **WHEN** the user runs `kb addnote remember to update dns` (unquoted, multiple args)
- **THEN** the client SHALL display an error indicating that exactly one argument is required, with a hint to quote the text

---

### Requirement: Add file command (file ingestion)

The client SHALL provide a `kb addfile` command that uploads files to the engine for async ingestion. The command SHALL validate file extensions before uploading and reject unsupported types. The client SHALL handle duplicate rejection (HTTP 409) and display the existing document information. Notes are handled by the separate `addnote` command; `addfile` is exclusively for file uploads.

#### Scenario: Add a single file
- **WHEN** the user runs `kb addfile report.pdf`
- **THEN** the client SHALL validate the file extension, upload the file via `POST /api/v1/jobs` (multipart), print "Queued: report.pdf", and exit

#### Scenario: Add a file with tags
- **WHEN** the user runs `kb addfile manual.pdf --tags car,maintenance`
- **THEN** the client SHALL include the tags in the multipart upload metadata

#### Scenario: Add a directory recursively
- **WHEN** the user runs `kb addfile ~/documents/ --recursive`
- **THEN** the client SHALL discover all supported files in the directory tree, upload each one sequentially, and print "Queued: N files"

#### Scenario: Unsupported file extension
- **WHEN** the user runs `kb addfile photo.jpg`
- **THEN** the client SHALL print an error listing supported extensions and exit with a non-zero code without making any API call

#### Scenario: Duplicate file rejected (already ingested)
- **WHEN** the user runs `kb addfile report.pdf` and the engine returns HTTP 409 with `{"error": "duplicate", "document_id": 42, "title": "report.pdf"}`
- **THEN** the client SHALL print "Already imported: report.pdf (doc ID: 42)" and exit with code 0

#### Scenario: Duplicate file rejected (in-flight job)
- **WHEN** the user runs `kb addfile report.pdf` and the engine returns HTTP 409 with `{"error": "duplicate", "job_id": 7, "title": "report.pdf"}`
- **THEN** the client SHALL print "Already queued: report.pdf (job ID: 7)" and exit with code 0

#### Scenario: Duplicate file in recursive add
- **WHEN** the user runs `kb addfile ~/documents/ --recursive` and some files are rejected as duplicates
- **THEN** the client SHALL print the duplicate message for each rejected file, continue uploading remaining files, and include a summary (e.g., "Queued: 5 files, 2 duplicates skipped")

#### Scenario: Duplicate with JSON output
- **WHEN** the user runs `kb addfile report.pdf --format json` and the engine returns HTTP 409
- **THEN** the client SHALL output the raw JSON response from the engine, including the `document_id` (or `job_id` for an in-flight job) and `title`

#### Scenario: Add with JSON output
- **WHEN** the user runs `kb addfile report.pdf --format json`
- **THEN** the client SHALL output the JSON response from the engine, including the `job_id`

#### Scenario: File not found
- **WHEN** the user runs `kb addfile nonexistent.pdf`
- **THEN** the client SHALL print an error and exit with a non-zero code without making any API call

#### Scenario: Upload failure
- **WHEN** the upload fails (network error, engine returns 4xx/5xx other than 409)
- **THEN** the client SHALL print the error and exit with a non-zero code

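
The two duplicate scenarios differ only in which ID the 409 body carries, which suggests decoding both IDs as optional fields. The sketch below assumes exactly the two payload shapes shown in the scenarios; the type and function names are hypothetical, and the error for a body with neither ID is an illustrative choice rather than specified behavior.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// duplicateResponse models the HTTP 409 body. Pointer fields distinguish
// "absent" from a zero ID.
type duplicateResponse struct {
	Error      string `json:"error"`
	DocumentID *int   `json:"document_id"`
	JobID      *int   `json:"job_id"`
	Title      string `json:"title"`
}

// duplicateMessage turns a 409 body into the human-readable message.
func duplicateMessage(body []byte) (string, error) {
	var d duplicateResponse
	if err := json.Unmarshal(body, &d); err != nil {
		return "", err
	}
	switch {
	case d.DocumentID != nil:
		return fmt.Sprintf("Already imported: %s (doc ID: %d)", d.Title, *d.DocumentID), nil
	case d.JobID != nil:
		return fmt.Sprintf("Already queued: %s (job ID: %d)", d.Title, *d.JobID), nil
	default:
		return "", errors.New("409 response has neither document_id nor job_id")
	}
}

func main() {
	msg, err := duplicateMessage([]byte(`{"error": "duplicate", "job_id": 7, "title": "report.pdf"}`))
	if err != nil {
		panic(err)
	}
	// A duplicate is not a failure: the client prints the message and exits 0.
	fmt.Println(msg)
}
```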
---

### Requirement: Jobs command

The client SHALL provide a `kb jobs` command to view the ingestion queue.

#### Scenario: List all jobs
- **WHEN** the user runs `kb jobs`
- **THEN** the client SHALL fetch `GET /api/v1/jobs` and display a table of recent jobs showing ID, filename, status, and timestamp

#### Scenario: Filter jobs by status
- **WHEN** the user runs `kb jobs --status failed`
- **THEN** the client SHALL pass the status filter and display only matching jobs

#### Scenario: Job details
- **WHEN** the user runs `kb jobs <id>`
- **THEN** the client SHALL fetch `GET /api/v1/jobs/{id}` and display full job details including error message (if failed), `document_id` (if done), and chunk count

---

### Requirement: Document management commands

The client SHALL provide commands to list, inspect, and remove documents.

#### Scenario: List documents
- **WHEN** the user runs `kb list`
- **THEN** the client SHALL fetch `GET /api/v1/documents` and display a table of documents with ID, title, type, tags, chunk count, and date

#### Scenario: List with filters
- **WHEN** the user runs `kb list --type pdf --tags manual`
- **THEN** the client SHALL pass filters as query parameters

#### Scenario: Document info
- **WHEN** the user runs `kb info <id>`
- **THEN** the client SHALL fetch `GET /api/v1/documents/{id}` and display full document details

#### Scenario: Remove a document
- **WHEN** the user runs `kb remove <id>`
- **THEN** the client SHALL prompt for confirmation, then send `DELETE /api/v1/documents/{id}` and display the result

#### Scenario: Remove with skip confirmation
- **WHEN** the user runs `kb remove <id> --yes`
- **THEN** the client SHALL skip the confirmation prompt

---

### Requirement: Tag management commands

The client SHALL provide commands to list and manage tags.

#### Scenario: List tags
- **WHEN** the user runs `kb tags`
- **THEN** the client SHALL fetch `GET /api/v1/tags` and display tags with document counts

#### Scenario: Add tags to a document
- **WHEN** the user runs `kb tag <id> --add manual,v2`
- **THEN** the client SHALL send `PUT /api/v1/documents/{id}/tags` with the add payload

#### Scenario: Remove tags from a document
- **WHEN** the user runs `kb tag <id> --remove draft`
- **THEN** the client SHALL send `PUT /api/v1/documents/{id}/tags` with the remove payload

---

### Requirement: Status command

The client SHALL provide a `kb status` command to display engine status.

#### Scenario: Display engine status
- **WHEN** the user runs `kb status`
- **THEN** the client SHALL fetch `GET /api/v1/status` and display model name, embedding dimensions, GPU info, document counts by type, total chunks, database size, and queue status

---

### Requirement: Reindex command

The client SHALL provide a `kb reindex` command that triggers re-embedding of all chunks on the engine. The command SHALL prompt for confirmation before proceeding.

#### Scenario: Reindex with confirmation
- **WHEN** the user runs `kb reindex`
- **THEN** the client SHALL display a warning that all chunks will be re-embedded and prompt `Reindex all chunks? This will re-embed everything. [y/N]`. If confirmed, it SHALL POST to `/api/v1/reindex` and display the result.

#### Scenario: Reindex with skip confirmation
- **WHEN** the user runs `kb reindex --yes`
- **THEN** the client SHALL skip the confirmation prompt and POST to `/api/v1/reindex` immediately

#### Scenario: Reindex cancelled
- **WHEN** the user runs `kb reindex` and responds with anything other than `y` or `yes`
- **THEN** the client SHALL print `Cancelled.` and exit with code 0

#### Scenario: Reindex human output
- **WHEN** the reindex completes successfully with default format
- **THEN** the client SHALL print `Reindexed N chunks (model: <model_name>)`

#### Scenario: Reindex JSON output
- **WHEN** the user runs `kb reindex --yes --format json`
- **THEN** the client SHALL output the raw JSON response from the engine

---

### Requirement: Engine version compatibility check

The client SHALL verify that the connected engine meets a minimum version requirement before executing any API command. The minimum required engine version SHALL be embedded in the client binary at build time. If the engine version is below the minimum, the client SHALL print an error message and exit with a non-zero code. There SHALL be no flag to skip or suppress this check.

#### Scenario: Compatible engine version
- **WHEN** the client connects to an engine reporting version `2.1.5` and `MinEngineVersion` is `2.1.0`
- **THEN** the client SHALL proceed with the command normally

#### Scenario: Incompatible engine version
- **WHEN** the client connects to an engine reporting version `2.0.3` and `MinEngineVersion` is `2.1.0`
- **THEN** the client SHALL print to stderr: `Error: kb client vX.Y.Z requires engine v2.1.0+ (connected engine is v2.0.3)` followed by an upgrade hint, and exit with code 1

#### Scenario: Engine unreachable during version check
- **WHEN** the client cannot reach the engine's `/api/v1/status` endpoint
- **THEN** the client SHALL skip the version check and proceed with the original command (the actual API call will surface the connectivity error)

#### Scenario: Version check is cached per session
- **WHEN** the client has already verified engine compatibility during the current invocation
- **THEN** subsequent API calls within the same invocation SHALL NOT repeat the version check

#### Scenario: Client version command does not check engine
- **WHEN** the user runs `kb --version`
- **THEN** the client SHALL print the client version without contacting the engine

#### Scenario: MinEngineVersion not set
- **WHEN** the client binary has `MinEngineVersion` set to empty string or `dev`
- **THEN** the client SHALL skip the version check entirely (development builds)

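
The comparison at the heart of this check can be sketched as a numeric, component-wise version compare. The sketch assumes versions are plain `MAJOR.MINOR.PATCH` strings with an optional leading `v`; pre-release or build suffixes are not handled, and the function names are hypothetical.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseVersion converts "2.1.0" or "v2.1.0" into its three numeric components.
func parseVersion(s string) ([3]int, error) {
	var v [3]int
	parts := strings.Split(strings.TrimPrefix(s, "v"), ".")
	if len(parts) != 3 {
		return v, fmt.Errorf("invalid version %q", s)
	}
	for i, p := range parts {
		n, err := strconv.Atoi(p)
		if err != nil {
			return v, fmt.Errorf("invalid version %q: %w", s, err)
		}
		v[i] = n
	}
	return v, nil
}

// engineCompatible reports whether engine >= minVersion. An empty or "dev"
// minimum skips the check, matching the development-build scenario.
func engineCompatible(engine, minVersion string) (bool, error) {
	if minVersion == "" || minVersion == "dev" {
		return true, nil
	}
	e, err := parseVersion(engine)
	if err != nil {
		return false, err
	}
	m, err := parseVersion(minVersion)
	if err != nil {
		return false, err
	}
	// The first differing component decides; equal versions are compatible.
	for i := range e {
		if e[i] != m[i] {
			return e[i] > m[i], nil
		}
	}
	return true, nil
}

func main() {
	ok, err := engineCompatible("2.0.3", "2.1.0")
	if err != nil {
		panic(err)
	}
	if !ok {
		fmt.Println("engine v2.0.3 is older than the required v2.1.0")
	}
}
```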
---

### Requirement: Global output format flag

All commands SHALL support a `--format` flag accepting `human` (default) or `json`. The default MAY be changed via the `default_format` config value.

#### Scenario: JSON output on any command
- **WHEN** the user passes `--format json` to any command
- **THEN** the client SHALL output the raw JSON response from the engine without human formatting

#### Scenario: Human output (default)
- **WHEN** the user runs any command without `--format`
- **THEN** the client SHALL format the response in a human-readable table or structured text output