Files
kb/openspec/changes/kb-search/specs/configuration/spec.md
T
2026-03-23 20:38:42 +00:00

73 lines
3.6 KiB
Markdown

## ADDED Requirements
### Requirement: YAML configuration file
The system SHALL read configuration from `~/.kb/config.yaml`. If the file does not exist, the system SHALL use built-in defaults. The configuration file SHALL be optional — the tool MUST work with zero configuration.
#### Scenario: No config file
- **WHEN** `~/.kb/config.yaml` does not exist
- **THEN** the system uses built-in defaults for all settings and operates normally
#### Scenario: Partial config file
- **WHEN** `~/.kb/config.yaml` exists but only specifies `chunking.pdf.max_tokens: 2048`
- **THEN** the system uses built-in defaults for all other settings, overriding only `chunking.pdf.max_tokens`
#### Scenario: Invalid config file
- **WHEN** `~/.kb/config.yaml` contains invalid YAML
- **THEN** the system prints a clear error message identifying the YAML syntax issue and exits with non-zero status
### Requirement: Environment variable overrides
The system SHALL support environment variable overrides with the prefix `KB_`. ENV variables SHALL take precedence over the YAML config file. Supported variables: `KB_DATA_DIR`, `KB_MODEL`, `KB_DEFAULT_TOP`, `KB_DEFAULT_FORMAT`.
#### Scenario: Override data directory
- **WHEN** `KB_DATA_DIR=/tmp/test-kb` is set
- **THEN** the system uses `/tmp/test-kb/` instead of `~/.kb/` for the database and config
#### Scenario: Override model
- **WHEN** `KB_MODEL=nomic-embed-text` is set
- **THEN** the system uses `nomic-embed-text` as the embedding model, overriding the YAML config
#### Scenario: ENV overrides YAML
- **WHEN** YAML config has `search.default_top: 10` and `KB_DEFAULT_TOP=20` is set
- **THEN** the default top value is 20
### Requirement: Configuration precedence
The system SHALL apply configuration in this order (highest to lowest precedence): CLI flags, environment variables, YAML config file, built-in defaults.
#### Scenario: CLI flag overrides everything
- **WHEN** YAML config has `search.default_top: 10`, ENV has `KB_DEFAULT_TOP=20`, and user runs `kb search "test" --top 5`
- **THEN** 5 results are returned
### Requirement: View and set configuration
The system SHALL support viewing the current effective configuration via `kb config` and setting individual values via `kb config set <key> <value>`.
#### Scenario: View configuration
- **WHEN** user runs `kb config`
- **THEN** the system displays the fully resolved configuration (defaults merged with YAML merged with ENV), indicating the source of each value
#### Scenario: Set a config value
- **WHEN** user runs `kb config set chunking.pdf.max_tokens 2048`
- **THEN** the value is written to `~/.kb/config.yaml`, creating the file if necessary
### Requirement: Configurable chunking parameters
The system SHALL support per-document-type chunking configuration with sensible defaults.
#### Scenario: Default chunking for PDF
- **WHEN** no chunking config is specified for PDF
- **THEN** the system uses `strategy: hierarchy, max_tokens: 1024`
#### Scenario: Default chunking for markdown
- **WHEN** no chunking config is specified for markdown
- **THEN** the system uses `strategy: header, min_tokens: 50, max_tokens: 1024`
#### Scenario: Default chunking for code
- **WHEN** no chunking config is specified for code
- **THEN** the system uses `strategy: ast, include_context: true, max_tokens: 1024`
#### Scenario: Default chunking for notes
- **WHEN** no chunking config is specified for notes
- **THEN** the system uses `strategy: whole`
#### Scenario: Custom chunking overrides
- **WHEN** YAML config specifies `chunking.pdf.strategy: fixed` and `chunking.pdf.max_tokens: 512`
- **THEN** PDFs are chunked with fixed-size windows of 512 tokens instead of hierarchy-aware chunking