kb/openspec/changes/kb-search/specs/document-management/spec.md
2026-03-23 20:38:42 +00:00

ADDED Requirements

Requirement: List documents

The system SHALL list all indexed documents via kb list. Results SHALL include document ID, title, type, tag count, chunk count, and creation date. Output SHALL support --format json and --format human.

Scenario: List all documents

  • WHEN user runs kb list
  • THEN all documents are listed with their ID, title, type, tag count, chunk count, and creation date

Scenario: Filter by type

  • WHEN user runs kb list --type pdf
  • THEN only PDF documents are listed

Scenario: Filter by tags

  • WHEN user runs kb list --tags admin,ops
  • THEN only documents tagged with BOTH "admin" AND "ops" are listed

Scenario: Empty database

  • WHEN user runs kb list with no documents indexed
  • THEN the system prints "No documents indexed. Run kb add to get started." and exits with zero status
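
The "BOTH ... AND" tag semantics above can be sketched as a single SQL query. This is a minimal illustration, not the project's implementation: the table and column names (`documents`, `tags`, `document_tags`, `document_id`, `tag_id`) and the helpers `make_demo_db` and `list_documents` are assumptions made for the example.

```python
import sqlite3


def make_demo_db() -> sqlite3.Connection:
    """Build an in-memory database with a hypothetical minimal schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE documents (id INTEGER PRIMARY KEY, title TEXT, type TEXT);
        CREATE TABLE tags (id INTEGER PRIMARY KEY, name TEXT UNIQUE);
        CREATE TABLE document_tags (document_id INTEGER, tag_id INTEGER,
                                    PRIMARY KEY (document_id, tag_id));
    """)
    return conn


def list_documents(conn, doc_type=None, tags=()):
    """Return (id, title, type) rows matching the type and ALL given tags."""
    # Tags are compared in lowercase; duplicates and blanks are dropped.
    wanted = sorted({t.strip().lower() for t in tags if t.strip()})
    sql = "SELECT d.id, d.title, d.type FROM documents d"
    where, params = [], []
    if wanted:
        placeholders = ",".join("?" for _ in wanted)
        sql += (" JOIN document_tags dt ON dt.document_id = d.id"
                " JOIN tags t ON t.id = dt.tag_id")
        where.append(f"t.name IN ({placeholders})")
        params.extend(wanted)
    if doc_type:
        where.append("d.type = ?")
        params.append(doc_type)
    if where:
        sql += " WHERE " + " AND ".join(where)
    if wanted:
        # A document qualifies only if it matched every requested tag.
        sql += " GROUP BY d.id HAVING COUNT(DISTINCT t.name) = ?"
        params.append(len(wanted))
    sql += " ORDER BY d.id"
    return conn.execute(sql, params).fetchall()
```

An empty result from `list_documents` is where a CLI wrapper would print the "No documents indexed" message and exit with status zero.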

Requirement: Document info

The system SHALL display detailed information about a single document via kb info <doc_id>, including all metadata, tags, chunk count, and chunk previews (first 100 characters of each chunk).

Scenario: View document info

  • WHEN user runs kb info 42
  • THEN the system displays: title, source path, type, language (if code), content hash, creation date, tags, total chunks, and a preview of each chunk

Scenario: Invalid document ID

  • WHEN user runs kb info 9999 and no document with ID 9999 exists
  • THEN the system prints "Document not found: 9999" and exits with non-zero status
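
As a rough sketch of the two scenarios above, the lookup can return an exit code together with the lines to print. The `documents` and `chunks` tables, their columns, and the function names are hypothetical; only the 100-character preview and the "Document not found" message come from the requirement.

```python
import sqlite3

PREVIEW_CHARS = 100  # from the requirement: first 100 characters of each chunk


def chunk_preview(text: str, limit: int = PREVIEW_CHARS) -> str:
    """Return at most `limit` leading characters of a chunk."""
    return text[:limit]


def document_info(conn: sqlite3.Connection, doc_id: int):
    """Return (exit_code, lines) for a hypothetical `kb info` handler."""
    row = conn.execute(
        "SELECT title, type FROM documents WHERE id = ?", (doc_id,)
    ).fetchone()
    if row is None:
        # Non-zero exit code signals the invalid-ID scenario.
        return 1, [f"Document not found: {doc_id}"]
    chunks = conn.execute(
        "SELECT content FROM chunks WHERE document_id = ? ORDER BY id", (doc_id,)
    ).fetchall()
    lines = [f"Title: {row[0]}", f"Type: {row[1]}", f"Total chunks: {len(chunks)}"]
    for i, (content,) in enumerate(chunks, start=1):
        lines.append(f"  [{i}] {chunk_preview(content)}")
    return 0, lines
```

The sketch prints only a subset of the fields listed in the scenario; source path, content hash, tags, and the rest would follow the same pattern.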

Requirement: Remove document

The system SHALL remove a document and all its associated chunks, embeddings, and tag associations via kb remove <doc_id>. The system SHALL ask for confirmation before deletion unless --yes is passed.

Scenario: Remove with confirmation

  • WHEN user runs kb remove 42
  • THEN the system displays the document title and asks "Remove 'Git Admin Guide' and its 28 chunks? [y/N]". On confirmation, the document, its chunks, FTS entries, vector embeddings, and tag associations are deleted.

Scenario: Remove with --yes flag

  • WHEN user runs kb remove 42 --yes
  • THEN the document is removed without confirmation prompt

Scenario: Cascading delete

  • WHEN a document is removed
  • THEN all rows in chunks, chunks_fts, chunks_vec, and document_tags referencing that document SHALL be deleted
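
One way to satisfy the cascading-delete scenario is to issue explicit deletes inside a single transaction, since full-text and vector index tables generally cannot rely on foreign-key cascades. The sketch below assumes that `chunks_fts` and `chunks_vec` are keyed by the chunk's rowid and that `chunks` and `document_tags` carry a `document_id` column; those details, and the function names, are assumptions rather than the project's schema.

```python
import sqlite3


def confirmed(answer: str) -> bool:
    """Interpret a reply to a [y/N] prompt; anything but y/yes means no."""
    return answer.strip().lower() in {"y", "yes"}


def remove_document(conn: sqlite3.Connection, doc_id: int,
                    assume_yes: bool = False, ask=input) -> bool:
    """Delete a document and everything that references it.

    Returns True if the document was removed, False if it was missing
    or the user declined. `ask` is injectable so the prompt can be tested.
    """
    row = conn.execute(
        "SELECT title FROM documents WHERE id = ?", (doc_id,)
    ).fetchone()
    if row is None:
        return False
    (n_chunks,) = conn.execute(
        "SELECT COUNT(*) FROM chunks WHERE document_id = ?", (doc_id,)
    ).fetchone()
    if not assume_yes:
        prompt = f"Remove '{row[0]}' and its {n_chunks} chunks? [y/N] "
        if not confirmed(ask(prompt)):
            return False
    chunk_ids = "SELECT id FROM chunks WHERE document_id = ?"
    with conn:  # one transaction: commit on success, roll back on error
        # Index rows go first, while the chunk ids can still be looked up.
        conn.execute(f"DELETE FROM chunks_fts WHERE rowid IN ({chunk_ids})", (doc_id,))
        conn.execute(f"DELETE FROM chunks_vec WHERE rowid IN ({chunk_ids})", (doc_id,))
        conn.execute("DELETE FROM chunks WHERE document_id = ?", (doc_id,))
        conn.execute("DELETE FROM document_tags WHERE document_id = ?", (doc_id,))
        conn.execute("DELETE FROM documents WHERE id = ?", (doc_id,))
    return True
```

Passing `assume_yes=True` corresponds to the --yes flag and skips the prompt entirely.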

Requirement: Tag management

The system SHALL support adding and removing tags on documents via kb tag <doc_id> --add tag1,tag2 and kb tag <doc_id> --remove tag1. Tags SHALL be treated as case-insensitive and stored in lowercase. The system SHALL list all tags with document counts via kb tags.

Scenario: Add tags to a document

  • WHEN user runs kb tag 42 --add git,admin
  • THEN the tags "git" and "admin" are associated with document 42. Tags are created if they don't exist.

Scenario: Remove a tag from a document

  • WHEN user runs kb tag 42 --remove admin
  • THEN the "admin" tag association is removed from document 42. The tag itself remains in the tags table if other documents use it.

Scenario: List all tags

  • WHEN user runs kb tags
  • THEN the system lists all tags with the count of documents using each tag, sorted by count descending

Scenario: Tag on ingestion

  • WHEN user runs kb add report.pdf --tags compliance,q1
  • THEN the document is ingested and immediately tagged with "compliance" and "q1"

Scenario: Tags in JSON format

  • WHEN user runs kb tags --format json
  • THEN output is a JSON array of objects: [{"name": "git", "count": 15}, ...]
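
The tag scenarios above could be backed by a few small helpers. This is a hedged sketch: the `tags` and `document_tags` schema and all function names are assumptions, as are trimming whitespace around tag names and breaking count ties alphabetically. Lowercasing and the JSON shape follow the requirement.

```python
import json
import sqlite3


def normalize_tags(raw: str) -> list[str]:
    """Split a comma-separated tag string; lowercase, drop blanks and duplicates."""
    seen: list[str] = []
    for part in raw.split(","):
        name = part.strip().lower()
        if name and name not in seen:
            seen.append(name)
    return seen


def add_tags(conn: sqlite3.Connection, doc_id: int, raw: str) -> None:
    """Associate tags with a document, creating missing tags on the fly."""
    with conn:
        for name in normalize_tags(raw):
            conn.execute("INSERT OR IGNORE INTO tags (name) VALUES (?)", (name,))
            conn.execute(
                "INSERT OR IGNORE INTO document_tags (document_id, tag_id) "
                "SELECT ?, id FROM tags WHERE name = ?", (doc_id, name))


def remove_tags(conn: sqlite3.Connection, doc_id: int, raw: str) -> None:
    """Drop the association only; the tag row itself is left in place."""
    with conn:
        for name in normalize_tags(raw):
            conn.execute(
                "DELETE FROM document_tags WHERE document_id = ? "
                "AND tag_id = (SELECT id FROM tags WHERE name = ?)", (doc_id, name))


def tags_json(conn: sqlite3.Connection) -> str:
    """Return tags with document counts as JSON, most used first."""
    rows = conn.execute(
        "SELECT t.name, COUNT(dt.document_id) AS n FROM tags t "
        "JOIN document_tags dt ON dt.tag_id = t.id "
        "GROUP BY t.id ORDER BY n DESC, t.name").fetchall()
    return json.dumps([{"name": name, "count": n} for name, n in rows])
```

Because `tags_json` uses an inner join, tags that no document uses are omitted from the listing; whether such tags should be shown or deleted is not stated in the scenarios.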

Requirement: Database status

The system SHALL report database statistics via kb status, including: total documents (by type), total chunks, database file size, active model name and dimension, and schema version.

Scenario: Show status

  • WHEN user runs kb status
  • THEN the system displays: document counts by type, total chunks, DB file size, model name, embedding dimension, and schema version

Scenario: Status before init

  • WHEN user runs kb status before kb init
  • THEN the system prints "Knowledge base not initialised. Run kb init first." and exits with non-zero status
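
A minimal sketch of the status check follows, assuming the knowledge base is a single SQLite file and that the schema version is kept in `PRAGMA user_version`; both are assumptions, as are the table names and the `status_lines` helper. Model name and embedding dimension are left out because the section does not say where they are stored.

```python
import os
import sqlite3


def status_lines(db_path: str):
    """Return (exit_code, lines) for a hypothetical `kb status` handler."""
    if not os.path.exists(db_path):
        # Status before init: report and signal failure with a non-zero code.
        return 1, ["Knowledge base not initialised. Run kb init first."]
    conn = sqlite3.connect(db_path)
    try:
        by_type = conn.execute(
            "SELECT type, COUNT(*) FROM documents GROUP BY type ORDER BY type"
        ).fetchall()
        (chunks,) = conn.execute("SELECT COUNT(*) FROM chunks").fetchone()
        (schema_version,) = conn.execute("PRAGMA user_version").fetchone()
    finally:
        conn.close()
    lines = [f"Documents ({doc_type}): {n}" for doc_type, n in by_type]
    lines.append(f"Total chunks: {chunks}")
    lines.append(f"DB file size: {os.path.getsize(db_path)} bytes")
    lines.append(f"Schema version: {schema_version}")
    return 0, lines
```

Checking for the file before connecting matters here, because `sqlite3.connect` would otherwise create an empty database and hide the uninitialised state.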