ADDED Requirements
Requirement: Model initialisation
The system SHALL download the embedding model on kb init. The default model SHALL be all-MiniLM-L6-v2. The user MAY specify a different model via kb init --model <name>. The model SHALL be downloaded via sentence-transformers to the HuggingFace default cache (~/.cache/huggingface/). On first load, the model SHALL be exported to ONNX format for inference.
Scenario: Default init
- WHEN user runs kb init
- THEN the system downloads all-MiniLM-L6-v2, creates ~/.kb/kb.db with the schema, and records model_name=all-MiniLM-L6-v2 and embedding_dim=384 in the DB config table
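The database side of this scenario can be sketched as follows. This is a minimal illustration, assuming the config table is a plain key/value table in SQLite; the table layout and the helper names `init_db` and `read_config` are hypothetical, since the spec only names the `model_name` and `embedding_dim` entries.

```python
import sqlite3


def init_db(conn: sqlite3.Connection, model_name: str, embedding_dim: int) -> None:
    """Create a key/value config table (assumed layout) and record the model."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "CREATE TABLE IF NOT EXISTS config ("
            "key TEXT PRIMARY KEY, value TEXT NOT NULL)"
        )
        conn.executemany(
            "INSERT OR REPLACE INTO config (key, value) VALUES (?, ?)",
            [("model_name", model_name), ("embedding_dim", str(embedding_dim))],
        )


def read_config(conn: sqlite3.Connection) -> dict:
    """Return the config table as a plain dict of strings."""
    return dict(conn.execute("SELECT key, value FROM config").fetchall())
```

Values are stored as text here, so a caller would convert `embedding_dim` back to an integer when reading it.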
Scenario: Init with custom model
- WHEN user runs kb init --model nomic-embed-text
- THEN the system downloads nomic-embed-text, creates the database, and records the model name and its dimension in the DB config table
Scenario: Init status check
- WHEN user runs kb init --status
- THEN the system reports: whether ~/.kb/ exists, whether the DB is initialised, which model is configured, whether the model is downloaded, and Docling model status
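Part of this status report can be sketched with the standard library alone. The example below is an assumption-laden illustration: `init_status` is a hypothetical helper, the key/value `config` table layout is assumed, and the "model downloaded" and Docling checks are left out because the spec does not say how they are detected.

```python
import sqlite3
from pathlib import Path


def init_status(kb_dir: Path) -> dict:
    """Report directory, DB, and configured-model status (partial sketch)."""
    db_path = kb_dir / "kb.db"
    status = {
        "kb_dir_exists": kb_dir.is_dir(),
        "db_initialised": False,
        "model_name": None,
    }
    if not db_path.is_file():
        return status
    conn = sqlite3.connect(db_path)
    try:
        row = conn.execute(
            "SELECT value FROM config WHERE key = 'model_name'"
        ).fetchone()
    except sqlite3.DatabaseError:
        # No config table yet, or the file is not a valid SQLite database.
        return status
    finally:
        conn.close()
    if row is not None:
        status["db_initialised"] = True
        status["model_name"] = row[0]
    return status
```

In this sketch the DB counts as initialised only once a `model_name` row exists, which is one possible reading of "initialised".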
Scenario: ONNX export on first load
- WHEN the embedding model is loaded for the first time after download
- THEN the system SHALL display "Optimising model for ONNX inference (one-time)..." and export the model to ONNX format. Subsequent loads SHALL use the cached ONNX export.
Requirement: Model-database binding
The system SHALL store the active model name and embedding dimension in the database config table. Every operation that uses the embedding model (add, search, reindex) SHALL verify that the loaded model matches the DB record. A mismatch SHALL be a hard error.
Scenario: Model mismatch on add
- WHEN user runs kb add doc.pdf but the config YAML specifies a different model than what the DB was initialised with
- THEN the system SHALL print an error: "Model mismatch: DB uses 'all-MiniLM-L6-v2' (384 dim) but config specifies 'nomic-embed-text'. Run kb reindex --model nomic-embed-text to switch models." and exit with non-zero status
Scenario: Model match on add
- WHEN user runs kb add doc.pdf and the config model matches the DB model
- THEN ingestion proceeds normally
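The binding check shared by add, search, and reindex could look roughly like the sketch below. It assumes a key/value `config` table; `ModelMismatchError` and `verify_model` are hypothetical names, and only the error text comes from the scenario above.

```python
import sqlite3


class ModelMismatchError(RuntimeError):
    """Raised when the configured model differs from the one recorded in the DB."""


def verify_model(conn: sqlite3.Connection, configured_model: str) -> None:
    """Raise if the configured model is not the one the DB was built with."""
    rows = dict(conn.execute("SELECT key, value FROM config").fetchall())
    db_model, db_dim = rows["model_name"], rows["embedding_dim"]
    if configured_model != db_model:
        raise ModelMismatchError(
            f"Model mismatch: DB uses '{db_model}' ({db_dim} dim) but config "
            f"specifies '{configured_model}'. Run kb reindex --model "
            f"{configured_model} to switch models."
        )
```

A CLI entry point would catch `ModelMismatchError`, print the message, and exit with a non-zero status, for example via `raise SystemExit(str(err))`.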
Requirement: Full reindex with model switching
The system SHALL support re-embedding all chunks via kb reindex. If --model is specified, the system SHALL download the new model, re-embed all chunks, replace all vectors, and update the DB config. A progress bar SHALL be displayed. The operation SHALL be atomic — if interrupted, the old embeddings remain intact.
Scenario: Reindex with same model
- WHEN user runs kb reindex
- THEN all chunks are re-embedded with the current model and vectors are replaced. This is useful if the model's ONNX export was corrupted or chunks were modified.
Scenario: Reindex with new model
- WHEN user runs kb reindex --model bge-small-en-v1.5
- THEN the system downloads the new model, re-embeds all chunks (showing progress), replaces all vectors in chunks_vec (recreating the table if dimension changed), and updates model_name and embedding_dim in the DB config table
Scenario: Interrupted reindex
- WHEN a reindex is interrupted partway through
- THEN the old embeddings remain intact (the vector table is only replaced on successful completion of all embeddings). The user can rerun kb reindex to retry.
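One way to get the atomicity described above is to compute every new vector first and only then swap them in inside a single SQLite transaction. The sketch below makes several assumptions: `chunks_vec` is modelled as an ordinary table holding JSON text rather than a real vector table, the `chunks` and `config` layouts are illustrative, `reindex` and `Embedder` are hypothetical names, and the progress bar and table recreation on dimension change are omitted.

```python
import json
import sqlite3
from typing import Callable, Sequence

Embedder = Callable[[str], Sequence[float]]


def reindex(conn: sqlite3.Connection, embed: Embedder, model_name: str) -> int:
    """Re-embed every chunk, then replace vectors and config in one transaction."""
    chunks = conn.execute("SELECT id, text FROM chunks ORDER BY id").fetchall()

    # Phase 1: embed everything without touching stored data.
    # An interruption here leaves chunks_vec and config exactly as they were.
    vectors = [(chunk_id, list(embed(text))) for chunk_id, text in chunks]
    if not vectors:
        return 0
    dim = len(vectors[0][1])

    # Phase 2: swap vectors and update config atomically.
    # `with conn` commits on success and rolls back if anything raises.
    with conn:
        conn.execute("DELETE FROM chunks_vec")
        conn.executemany(
            "INSERT INTO chunks_vec (chunk_id, embedding) VALUES (?, ?)",
            [(chunk_id, json.dumps(vec)) for chunk_id, vec in vectors],
        )
        conn.executemany(
            "UPDATE config SET value = ? WHERE key = ?",
            [(model_name, "model_name"), (str(dim), "embedding_dim")],
        )
    return len(vectors)
```

Holding all vectors in memory keeps the sketch short; for a large knowledge base, writing to a staging table and renaming it on completion would achieve the same guarantee with bounded memory.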
Requirement: Embedding model inference via ONNX
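The system SHALL use sentence-transformers with the ONNX backend for all embedding inference. This keeps PyTorch out of the inference path (the sentence-transformers package itself still declares torch as an install dependency). The ONNX Runtime (onnxruntime) SHALL be the inference engine.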
Scenario: Embed a chunk
- WHEN a chunk of text needs to be embedded during ingestion
- THEN the system uses the sentence-transformers ONNX backend to produce a float vector of the correct dimension for the active model
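The "correct dimension" check in this scenario can be sketched independently of the model library. The example assumes only that the model object exposes an `encode` method taking a list of strings, as sentence-transformers models do; `embed_chunks` and the `Encoder` protocol are hypothetical names. With sentence-transformers 3.2 or later, such a model would typically be constructed as `SentenceTransformer("all-MiniLM-L6-v2", backend="onnx")`.

```python
from typing import List, Protocol, Sequence


class Encoder(Protocol):
    """Anything with a sentence-transformers-style encode method."""

    def encode(self, sentences: List[str]) -> Sequence[Sequence[float]]: ...


def embed_chunks(model: Encoder, texts: List[str], expected_dim: int) -> List[List[float]]:
    """Embed chunk texts and verify each vector has the DB's recorded dimension."""
    vectors = [[float(x) for x in vec] for vec in model.encode(texts)]
    for vec in vectors:
        if len(vec) != expected_dim:
            raise ValueError(
                f"expected {expected_dim}-dim embedding, got {len(vec)}"
            )
    return vectors
```

Here `expected_dim` would come from the `embedding_dim` entry in the DB config table, so a wrong model is caught before any vector is stored.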
Scenario: Embed a query
- WHEN a search query needs to be embedded
- THEN the system applies the configured query_prefix (if any) to the query text before embedding, and uses the same ONNX model used for chunk embeddings
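Prefix handling for queries can be sketched as below. It assumes the prefix is prepended verbatim, so any separator such as a trailing space belongs in the configured value; `prepare_query` and `embed_query` are hypothetical names, and the model is any object with a sentence-transformers-style `encode` method.

```python
from typing import List, Optional


def prepare_query(query: str, query_prefix: Optional[str] = None) -> str:
    """Prepend the configured prefix, if one is set, to the raw query text."""
    return f"{query_prefix}{query}" if query_prefix else query


def embed_query(model, query: str, query_prefix: Optional[str] = None) -> List[float]:
    """Embed one query with the same model object used for chunk embeddings."""
    vector = model.encode([prepare_query(query, query_prefix)])[0]
    return [float(x) for x in vector]
```

Chunks are embedded without the prefix in this sketch; only the query side is modified, matching the scenario above.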