Initial MVP

2026-03-23 20:38:42 +00:00
commit f245c24928
57 changed files with 6812 additions and 0 deletions
@@ -0,0 +1,53 @@
# kb-search
CLI knowledge base with hybrid search (full-text + semantic vector search).
## Install
```bash
pipx install kb-search
```
## Quickstart
```bash
# Initialise (downloads embedding model ~90MB)
kb init
# Add documents
kb add ~/docs/manual.pdf --tags admin
kb add ~/notes/ --recursive
kb add --note "Always restart nginx after config changes" --tags ops
# Search
kb search "how to install git"
kb search "deploy process" --tags ops --type pdf
kb search "authentication" --format human
# Manage
kb list --format human
kb tags
kb status
```
## How it works
- **Ingestion**: Documents are chunked (PDFs via Docling, markdown by headers, code by AST/functions) and embedded locally
- **Storage**: Everything lives in a single SQLite database (`~/.kb/kb.db`), with FTS5 for keyword search and sqlite-vec for vector search
- **Search**: Hybrid retrieval combining BM25 keyword scoring and vector similarity via Reciprocal Rank Fusion
- **Output**: JSON (for LLM tool use) or human-readable terminal format
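
The fusion step in the Search bullet can be illustrated with a minimal sketch of Reciprocal Rank Fusion. This is not taken from the kb-search source: the function name, the chunk IDs, and the constant `k=60` (a commonly used default for RRF) are assumptions made for illustration. Each input list is assumed to be already ordered best-first, one from BM25 and one from vector similarity.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several best-first rankings of chunk IDs into a single ranking.

    Each chunk receives 1 / (k + rank) from every list it appears in,
    and the contributions are summed. k=60 is an assumed default, not a
    value confirmed by kb-search.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, chunk_id in enumerate(ranking, start=1):
            scores[chunk_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by chunk ID for determinism.
    return sorted(scores, key=lambda cid: (-scores[cid], cid))


# Hypothetical result lists from the two retrievers.
bm25_hits = ["c3", "c1", "c7"]
vector_hits = ["c1", "c9", "c3"]

fused = reciprocal_rank_fusion([bm25_hits, vector_hits])
print(fused)  # ['c1', 'c3', 'c9', 'c7']
```

Because only ranks are used, the BM25 and vector scores never need to be normalised against each other. Chunks that appear in both lists (`c1`, `c3`) rise above chunks found by a single retriever.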
## Configuration
Optional YAML config at `~/.kb/config.yaml`. Works with zero configuration.
```bash
kb config # View current config
kb config set chunking.pdf.max_tokens 2048 # Change a value
```
Environment variable overrides: `KB_DATA_DIR`, `KB_MODEL`, `KB_DEFAULT_TOP`, `KB_DEFAULT_FORMAT`
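
One plausible way such overrides could be layered on top of file-based settings is sketched below. The four variable names come from this README. Everything else is hypothetical: the config key names, the sample values, the assumption that `KB_DEFAULT_TOP` is an integer, and the helper `apply_env_overrides` itself. The actual precedence rules in kb-search may differ.

```python
import os

# Hypothetical mapping from the documented variables to config keys;
# the real key names used by kb-search are not given in the README.
ENV_TO_KEY = {
    "KB_DATA_DIR": "data_dir",
    "KB_MODEL": "model",
    "KB_DEFAULT_TOP": "default_top",
    "KB_DEFAULT_FORMAT": "default_format",
}


def apply_env_overrides(config, environ=None):
    """Return a copy of config with any set environment variables applied."""
    environ = os.environ if environ is None else environ
    merged = dict(config)
    for env_name, key in ENV_TO_KEY.items():
        value = environ.get(env_name)
        if not value:
            continue  # unset or empty: keep the file/default value
        # Assumption: the "top" setting is a result count, so parse it as int.
        merged[key] = int(value) if key == "default_top" else value
    return merged


# Placeholder values standing in for whatever config.yaml provides.
file_config = {"data_dir": "~/.kb", "default_top": 5, "default_format": "json"}
effective = apply_env_overrides(
    file_config,
    {"KB_DEFAULT_TOP": "10", "KB_DEFAULT_FORMAT": "human"},
)
print(effective)
```

Passing the environment as a parameter keeps the function easy to test without mutating `os.environ`.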
## Claude Code Skill
This tool is designed to be wrapped as a Claude Code skill. See `SKILL.md` for the skill definition.