# kb-search skill

Search, manage, and add to the user's personal knowledge base containing PDFs, Word docs, HTML, markdown, code files, and text notes.

## When to use

- User asks a question that might be answered by their stored documents, notes, or code
- User explicitly says "check my notes", "search kb", "look in my knowledge base", "what do my docs say about..."
- User references documents or notes they've previously stored
- User asks "how do I..." style questions that their knowledge base likely covers
- User wants to save a note, add a file, or manage their knowledge base

## Quick notes

```bash
kb "remember to update DNS records"                    # add a note
kb "server room is building 3, floor 2" --tags ops     # add a tagged note
```

Bare text without a subcommand is treated as a note and submitted for ingestion.

## Search (primary use case)

```bash
kb search "" --top 10 --format json
```

Returns JSON with ranked results combining full-text and semantic search.

**Flags:**

- `-n, --top N` — number of results (default: 10)
- `--tags tag1,tag2` — filter by tags (AND logic)
- `--type pdf|markdown|code|note` — filter by document type
- `--format json|human` — output format (always use json for parsing)
- `--fts-only` — keyword search only (skip semantic)
- `--vec-only` — semantic search only (skip keyword)
- `--threshold FLOAT` — minimum score cutoff

## Adding files

```bash
kb addfile report.pdf                              # single file
kb addfile report.pdf --tags admin,reference       # with tags
kb addfile ~/docs/ --recursive                     # directory (recursive)
kb addfile ~/docs/ --recursive --tags reference    # directory with tags
```

Supported file types: `.pdf`, `.docx`, `.html`, `.md`, `.txt`, `.py`, `.sh`, `.go`. Unsupported extensions are rejected before upload.

**Flags:**

- `--tags tag1,tag2` — tags (comma-separated)
- `-r, --recursive` — recursively add directory contents

## Document management

```bash
kb list --format json                # list all documents
kb list --type pdf --format json     # filter by type
kb list --tags admin --format json   # filter by tags
kb info --format json                # document details with chunks
kb export -o file.pdf                # download original file
kb remove                            # remove (prompts for confirmation)
kb remove --yes                      # remove without confirmation
```

## Tag management

```bash
kb tags --format json        # list all tags with counts
kb tag --add important,ops   # add tags to a document
kb tag --remove draft        # remove tags from a document
```

## Jobs (ingestion queue)

```bash
kb jobs --format json                   # list recent jobs
kb jobs --status failed --format json   # filter by status
kb jobs --format json                   # job details
```

## Engine status and maintenance

```bash
kb status --format json   # engine status, GPU info, DB stats
kb reindex --yes          # re-embed all chunks (skip confirmation)
```

## Global flags

All commands support:

- `--format json|human` — output format (always use `json` for machine parsing)
- `--engine ` — engine API URL (default: http://localhost:8000)
- `--api-key ` — API key for authentication

## Search output format

```json
{
  "query": "how to install git",
  "results": [
    {
      "chunk_id": 1423,
      "score": 0.031,
      "score_breakdown": {"fts": 0.016, "vector": 0.015},
      "text": "To install the latest version of git from source...",
      "source": {
        "document_id": 42,
        "title": "Git Admin Guide",
        "path": "/home/user/docs/git-admin.pdf",
        "type": "pdf",
        "page": 12,
        "chunk_index": 3,
        "total_chunks": 28,
        "tags": ["git", "admin"]
      }
    }
  ],
  "total_matches": 47,
  "returned": 10
}
```

## How to answer search queries

1. Run `kb search "" --top 10 --format json`
2. Read the returned chunks
3. Synthesise a natural language answer from the top results
4. **ALWAYS cite sources**: "According to [title] (p.X)..." or "From [title], section [header]..."
5. If results have low scores (all below 0.01) or `returned: 0`, tell the user: "I couldn't find anything in your knowledge base about this"
6. If initial results seem off-target, try refining the query and searching again

## Multi-query strategy

For complex questions, search multiple times with different queries:

- Decompose the question into sub-queries
- Run each query separately
- Combine and deduplicate results across queries
- Synthesise a unified answer citing all relevant sources

Example:

```
User: "What's the difference between git rebase and merge?"

Query 1: kb search "git rebase explanation" --top 5 --format json
Query 2: kb search "git merge explanation" --top 5 --format json
Query 3: kb search "git rebase vs merge" --top 5 --format json
```

## Filtering tips

Use filters when the question implies a specific domain:

- Code question → `--type code`
- From a specific topic → `--tags `
- Check available tags first: `kb tags --format json`

## Important notes

- Always use `--format json` for machine parsing
- The `score` field is relative, not absolute — compare scores within a result set
- `source.page` is only present for PDF documents
- `source.section_header` is only present for markdown documents with headers
- Results are already ranked by relevance (hybrid FTS + vector search)
- Duplicate files are detected at upload time (HTTP 409) — the client handles this gracefully
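The answering steps and the multi-query strategy described above can be sketched in Python. This is a minimal illustration under stated assumptions, not part of the `kb` client: the helper names (`run_search`, `merge_results`, `has_useful_results`, `cite`) are hypothetical, `run_search` assumes `kb` is on `PATH` and prints only the JSON document to stdout, and the sample responses are invented for the demo (the first entry mirrors the search output example). Because scores are only comparable within one result set, the merge orders chunks by their rank within each query instead of by raw score.

```python
import json
import subprocess

LOW_SCORE = 0.01  # cutoff from step 5 of "How to answer search queries"


def run_search(query, top=5):
    """Run `kb search` and parse its JSON output (assumes `kb` is on PATH)."""
    proc = subprocess.run(
        ["kb", "search", query, "--top", str(top), "--format", "json"],
        capture_output=True, text=True, check=True,
    )
    return json.loads(proc.stdout)


def merge_results(responses):
    """Deduplicate chunks across responses by chunk_id.

    Each chunk keeps its best (rank, query index) position, and the merged
    list is ordered by that position, so raw scores from different result
    sets are never compared with each other.
    """
    seen = {}
    for q_index, response in enumerate(responses):
        for rank, result in enumerate(response.get("results", [])):
            key = (rank, q_index)
            cid = result["chunk_id"]
            if cid not in seen or key < seen[cid][0]:
                seen[cid] = (key, result)
    ordered = sorted(seen.values(), key=lambda item: item[0])
    return [result for _, result in ordered]


def has_useful_results(results, low_score=LOW_SCORE):
    """False when nothing was returned or every score is below the cutoff."""
    return any(r["score"] >= low_score for r in results)


def cite(result):
    """Build a citation prefix from the optional source fields."""
    source = result["source"]
    title = source["title"]
    if "page" in source:  # only present for PDF documents
        return f"According to {title} (p.{source['page']})"
    if "section_header" in source:  # only present for markdown with headers
        return f"From {title}, section {source['section_header']}"
    return f"From {title}"


# Hypothetical sample responses standing in for two `kb search` calls.
GIT_GUIDE = {"chunk_id": 1423, "score": 0.031,
             "source": {"title": "Git Admin Guide", "type": "pdf", "page": 12}}
REBASE_A = {"chunk_id": 77, "score": 0.020,
            "source": {"title": "Rebase Notes", "type": "markdown",
                       "section_header": "Rebasing"}}
REBASE_B = dict(REBASE_A, score=0.025)  # same chunk, found by another query
DNS_NOTE = {"chunk_id": 9, "score": 0.004,
            "source": {"title": "DNS reminder", "type": "note"}}

FIRST = {"results": [GIT_GUIDE, REBASE_A]}
SECOND = {"results": [REBASE_B, DNS_NOTE]}

merged = merge_results([FIRST, SECOND])
if has_useful_results(merged):
    for result in merged:
        print(cite(result))
else:
    print("I couldn't find anything in your knowledge base about this")
```

To run this against a live knowledge base, replace the sample responses with `[run_search(q, top=5) for q in queries]`, where `queries` holds the decomposed sub-queries.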