## ADDED Requirements ### Requirement: YAML configuration file The system SHALL read configuration from `~/.kb/config.yaml`. If the file does not exist, the system SHALL use built-in defaults. The configuration file SHALL be optional — the tool MUST work with zero configuration. #### Scenario: No config file - **WHEN** `~/.kb/config.yaml` does not exist - **THEN** the system uses built-in defaults for all settings and operates normally #### Scenario: Partial config file - **WHEN** `~/.kb/config.yaml` exists but only specifies `chunking.pdf.max_tokens: 2048` - **THEN** the system uses built-in defaults for all other settings, overriding only `chunking.pdf.max_tokens` #### Scenario: Invalid config file - **WHEN** `~/.kb/config.yaml` contains invalid YAML - **THEN** the system prints a clear error message identifying the YAML syntax issue and exits with non-zero status ### Requirement: Environment variable overrides The system SHALL support environment variable overrides with the prefix `KB_`. ENV variables SHALL take precedence over the YAML config file. Supported variables: `KB_DATA_DIR`, `KB_MODEL`, `KB_DEFAULT_TOP`, `KB_DEFAULT_FORMAT`. #### Scenario: Override data directory - **WHEN** `KB_DATA_DIR=/tmp/test-kb` is set - **THEN** the system uses `/tmp/test-kb/` instead of `~/.kb/` for the database and config #### Scenario: Override model - **WHEN** `KB_MODEL=nomic-embed-text` is set - **THEN** the system uses `nomic-embed-text` as the embedding model, overriding the YAML config #### Scenario: ENV overrides YAML - **WHEN** YAML config has `search.default_top: 10` and `KB_DEFAULT_TOP=20` is set - **THEN** the default top value is 20 ### Requirement: Configuration precedence The system SHALL apply configuration in this order (highest to lowest precedence): CLI flags, environment variables, YAML config file, built-in defaults. #### Scenario: CLI flag overrides everything - **WHEN** YAML config has `search.default_top: 10`, ENV has `KB_DEFAULT_TOP=20`, and user runs `kb search "test" --top 5` - **THEN** 5 results are returned ### Requirement: View and set configuration The system SHALL support viewing the current effective configuration via `kb config` and setting individual values via `kb config set `. #### Scenario: View configuration - **WHEN** user runs `kb config` - **THEN** the system displays the fully resolved configuration (defaults merged with YAML merged with ENV), indicating the source of each value #### Scenario: Set a config value - **WHEN** user runs `kb config set chunking.pdf.max_tokens 2048` - **THEN** the value is written to `~/.kb/config.yaml`, creating the file if necessary ### Requirement: Configurable chunking parameters The system SHALL support per-document-type chunking configuration with sensible defaults. #### Scenario: Default chunking for PDF - **WHEN** no chunking config is specified for PDF - **THEN** the system uses `strategy: hierarchy, max_tokens: 1024` #### Scenario: Default chunking for markdown - **WHEN** no chunking config is specified for markdown - **THEN** the system uses `strategy: header, min_tokens: 50, max_tokens: 1024` #### Scenario: Default chunking for code - **WHEN** no chunking config is specified for code - **THEN** the system uses `strategy: ast, include_context: true, max_tokens: 1024` #### Scenario: Default chunking for notes - **WHEN** no chunking config is specified for notes - **THEN** the system uses `strategy: whole` #### Scenario: Custom chunking overrides - **WHEN** YAML config specifies `chunking.pdf.strategy: fixed` and `chunking.pdf.max_tokens: 512` - **THEN** PDFs are chunked with fixed-size windows of 512 tokens instead of hierarchy-aware chunking