## ADDED Requirements

### Requirement: List documents

The system SHALL list all indexed documents via `kb list`. Results SHALL include document ID, title, type, tag count, chunk count, and creation date. Output SHALL support `--format json` and `--format human`.

#### Scenario: List all documents
- **WHEN** user runs `kb list`
- **THEN** all documents are listed with their ID, title, type, tags, chunk count, and creation date

#### Scenario: Filter by type
- **WHEN** user runs `kb list --type pdf`
- **THEN** only PDF documents are listed

#### Scenario: Filter by tags
- **WHEN** user runs `kb list --tags admin,ops`
- **THEN** only documents tagged with BOTH "admin" AND "ops" are listed

#### Scenario: Empty database
- **WHEN** user runs `kb list` with no documents indexed
- **THEN** the system prints "No documents indexed. Run `kb add` to get started." and exits with zero status

### Requirement: Document info

The system SHALL display detailed information about a single document via `kb info <id>`, including all metadata, tags, chunk count, and chunk previews (first 100 characters of each chunk).

#### Scenario: View document info
- **WHEN** user runs `kb info 42`
- **THEN** the system displays: title, source path, type, language (if code), content hash, creation date, tags, total chunks, and a preview of each chunk

#### Scenario: Invalid document ID
- **WHEN** user runs `kb info 9999` and no document with ID 9999 exists
- **THEN** the system prints "Document not found: 9999" and exits with non-zero status

### Requirement: Remove document

The system SHALL remove a document and all its associated chunks, embeddings, and tag associations via `kb remove <id>`. The system SHALL ask for confirmation before deletion unless `--yes` is passed.

#### Scenario: Remove with confirmation
- **WHEN** user runs `kb remove 42`
- **THEN** the system displays the document title and asks "Remove 'Git Admin Guide' and its 28 chunks? [y/N]".
  On confirmation, the document, its chunks, FTS entries, vector embeddings, and tag associations are deleted.

#### Scenario: Remove with --yes flag
- **WHEN** user runs `kb remove 42 --yes`
- **THEN** the document is removed without a confirmation prompt

#### Scenario: Cascading delete
- **WHEN** a document is removed
- **THEN** all rows in `chunks`, `chunks_fts`, `chunks_vec`, and `document_tags` referencing that document SHALL be deleted

### Requirement: Tag management

The system SHALL support adding and removing tags on documents via `kb tag <id> --add tag1,tag2` and `kb tag <id> --remove tag1`. Tags are case-insensitive and stored lowercase. The system SHALL list all tags with document counts via `kb tags`.

#### Scenario: Add tags to a document
- **WHEN** user runs `kb tag 42 --add git,admin`
- **THEN** the tags "git" and "admin" are associated with document 42. Tags are created if they don't exist.

#### Scenario: Remove a tag from a document
- **WHEN** user runs `kb tag 42 --remove admin`
- **THEN** the "admin" tag association is removed from document 42. The tag itself remains in the tags table if other documents use it.

#### Scenario: List all tags
- **WHEN** user runs `kb tags`
- **THEN** the system lists all tags with the count of documents using each tag, sorted by count descending

#### Scenario: Tag on ingestion
- **WHEN** user runs `kb add report.pdf --tags compliance,q1`
- **THEN** the document is ingested and immediately tagged with "compliance" and "q1"

#### Scenario: Tags in JSON format
- **WHEN** user runs `kb tags --format json`
- **THEN** output is a JSON array of objects: `[{"name": "git", "count": 15}, ...]`

### Requirement: Database status

The system SHALL report database statistics via `kb status`, including: total documents (by type), total chunks, database file size, active model name and dimension, and schema version.
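
The figures listed in this requirement could be gathered along the following lines. This is a minimal sketch, not the tool's implementation: it assumes the knowledge base is a single SQLite file, and the `documents` table with a `type` column, the `meta` key/value table, and the key names `model_name`, `embedding_dimension`, and `schema_version` are all hypothetical names chosen for illustration. Only the `chunks` table name comes from this spec.

```python
import os
import sqlite3


def collect_status(db_path):
    """Return the statistics that `kb status` is required to report.

    Assumed (hypothetical) schema:
      documents(id, type), chunks(id, ...), meta(key, value)
    """
    conn = sqlite3.connect(db_path)
    try:
        # Document counts grouped by type, e.g. {"pdf": 2, "markdown": 1}.
        by_type = dict(conn.execute(
            "SELECT type, COUNT(*) FROM documents GROUP BY type"))
        total_chunks = conn.execute(
            "SELECT COUNT(*) FROM chunks").fetchone()[0]
        # Model and schema details are read from an assumed key/value table.
        meta = dict(conn.execute("SELECT key, value FROM meta"))
    finally:
        conn.close()

    return {
        "documents_by_type": by_type,
        "total_documents": sum(by_type.values()),
        "total_chunks": total_chunks,
        "db_size_bytes": os.path.getsize(db_path),
        "model_name": meta.get("model_name"),
        "embedding_dimension": int(meta["embedding_dimension"]),
        "schema_version": meta.get("schema_version"),
    }
```

Returning a plain dictionary keeps the collection step separate from rendering, so a human-readable view and a machine-readable one could share the same data.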

#### Scenario: Show status
- **WHEN** user runs `kb status`
- **THEN** the system displays: document counts by type, total chunks, DB file size, model name, embedding dimension, and schema version

#### Scenario: Status before init
- **WHEN** user runs `kb status` before `kb init`
- **THEN** the system prints "Knowledge base not initialised. Run `kb init` first." and exits with non-zero status
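
Returning to the Cascading delete scenario under Remove document: one way to satisfy it is to run every delete inside a single transaction, so that a failure partway through leaves no orphaned rows. The sketch below is illustrative only. It assumes a SQLite store, a `documents` table, a `document_id` column on `chunks` and `document_tags`, and index tables whose `rowid` equals the chunk ID. None of those details are stated in this spec, and `chunks_fts` and `chunks_vec` are modelled as plain tables standing in for the real full-text and vector indexes.

```python
import sqlite3


def create_demo_schema(conn):
    """Create stand-in tables; column names are assumptions for this sketch."""
    conn.executescript("""
        CREATE TABLE documents (id INTEGER PRIMARY KEY, title TEXT);
        CREATE TABLE chunks (id INTEGER PRIMARY KEY, document_id INTEGER, text TEXT);
        CREATE TABLE chunks_fts (text TEXT);       -- stand-in for a full-text index
        CREATE TABLE chunks_vec (embedding BLOB);  -- stand-in for a vector index
        CREATE TABLE document_tags (document_id INTEGER, tag_id INTEGER);
    """)


def remove_document(conn, doc_id):
    """Delete a document and every row that references it, atomically.

    Returns True if the document existed, False otherwise.
    """
    with conn:  # commits on success, rolls back if any statement raises
        chunk_ids = [row[0] for row in conn.execute(
            "SELECT id FROM chunks WHERE document_id = ?", (doc_id,))]
        # Index rows are assumed to share their rowid with the chunk ID.
        for table in ("chunks_fts", "chunks_vec"):
            conn.executemany(
                f"DELETE FROM {table} WHERE rowid = ?",
                [(cid,) for cid in chunk_ids])
        conn.execute("DELETE FROM chunks WHERE document_id = ?", (doc_id,))
        conn.execute(
            "DELETE FROM document_tags WHERE document_id = ?", (doc_id,))
        cur = conn.execute("DELETE FROM documents WHERE id = ?", (doc_id,))
    return cur.rowcount == 1
```

The boolean return value gives a caller enough to distinguish a successful removal from a request for an ID that does not exist.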