kb-search skill
Search, manage, and add to the user's personal knowledge base containing PDFs, Word docs, HTML, markdown, code files, and text notes.
When to use
- User asks a question that might be answered by their stored documents, notes, or code
- User explicitly says "check my notes", "search kb", "look in my knowledge base", "what do my docs say about..."
- User references documents or notes they've previously stored
- User asks "how do I..." style questions that their knowledge base likely covers
- User wants to save a note, add a file, or manage their knowledge base
Adding notes
kb addnote "remember to update DNS records" # add a note
kb addnote "server room is building 3, floor 2" --tags ops # add a tagged note
The note text must be a single quoted argument.
Search (primary use case)
kb search "<query>" --top 10 --format json
Returns JSON with ranked results combining full-text and semantic search.
Flags:
- -n, --top N: number of results (default: 10)
- --tags tag1,tag2: filter by tags (AND logic)
- --type pdf|markdown|code|note: filter by document type
- --format json|human: output format (always use json for parsing)
- --fts-only: keyword search only (skip semantic)
- --vec-only: semantic search only (skip keyword)
- --threshold FLOAT: minimum score cutoff
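When the CLI is driven from a script rather than typed by hand, it can help to assemble the argument list programmatically. The sketch below is one possible way to do that using only the flags listed above; the function name `build_search_argv` and its `mode` parameter are hypothetical helpers for illustration, not part of kb.

```python
from typing import List, Optional, Sequence


def build_search_argv(
    query: str,
    top: int = 10,
    tags: Optional[Sequence[str]] = None,
    doc_type: Optional[str] = None,
    threshold: Optional[float] = None,
    mode: str = "hybrid",
) -> List[str]:
    """Build an argument list for `kb search` (hypothetical helper).

    `mode` is an illustrative parameter: "hybrid" adds no flag,
    "fts" adds --fts-only, "vec" adds --vec-only.
    """
    if mode not in ("hybrid", "fts", "vec"):
        raise ValueError(f"unknown mode: {mode!r}")

    # JSON output is always requested, as the flag list recommends for parsing.
    argv = ["kb", "search", query, "--top", str(top), "--format", "json"]
    if tags:
        # Tags are comma-separated and combine with AND logic.
        argv += ["--tags", ",".join(tags)]
    if doc_type:
        argv += ["--type", doc_type]
    if threshold is not None:
        argv += ["--threshold", str(threshold)]
    if mode == "fts":
        argv.append("--fts-only")
    elif mode == "vec":
        argv.append("--vec-only")
    return argv
```

The resulting list can be passed to `subprocess.run(argv, capture_output=True, text=True, check=True)` and the captured stdout decoded with `json.loads`. Passing a list instead of a shell string keeps the query as a single argument without manual quoting.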
Adding files
kb addfile report.pdf # single file
kb addfile report.pdf --tags admin,reference # with tags
kb addfile ~/docs/ --recursive # directory (recursive)
kb addfile ~/docs/ --recursive --tags reference # directory with tags
Supported file types: .pdf, .docx, .html, .md, .txt, .py, .sh, .go. Unsupported extensions are rejected before upload.
Flags:
- --tags tag1,tag2: tags (comma-separated)
- -r, --recursive: recursively add directory contents
Document management
kb list --format json # list all documents
kb list --type pdf --format json # filter by type
kb list --tags admin --format json # filter by tags
kb info <doc_id> --format json # document details with chunks
kb export <doc_id> -o file.pdf # download original file
kb remove <doc_id> # remove (prompts for confirmation)
kb remove <doc_id> --yes # remove without confirmation
Tag management
kb tags --format json # list all tags with counts
kb tag <doc_id> --add important,ops # add tags to a document
kb tag <doc_id> --remove draft # remove tags from a document
Bulk operations
Operate on multiple documents at once using filter-based selection. Filters combine with AND logic.
Filter flags (shared across all bulk commands):
- --tags tag1,tag2: match documents with ALL specified tags
- --type pdf|note|...: match by document type
- --ids 1,5,12: match specific document IDs
- --from-id N: match documents with id >= N
- --to-id N: match documents with id <= N
- --force/-f: override safety threshold (blocks operations affecting >70% of all documents)
- --yes/-y: skip confirmation prompt
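To make the AND semantics concrete, here is a small client-side model of how the selection filters might combine. It is only a sketch for reasoning about which documents a bulk command would touch: the `Doc` and `BulkFilter` classes are hypothetical, and the real matching happens inside the engine.

```python
from dataclasses import dataclass, field
from typing import Optional, Set


@dataclass
class Doc:
    """Hypothetical stand-in for a stored document."""
    id: int
    doc_type: str
    tags: Set[str] = field(default_factory=set)


@dataclass
class BulkFilter:
    """Hypothetical model of the selection flags; None means 'not given'."""
    tags: Optional[Set[str]] = None   # --tags: document must carry ALL of these
    doc_type: Optional[str] = None    # --type
    ids: Optional[Set[int]] = None    # --ids
    from_id: Optional[int] = None     # --from-id (inclusive)
    to_id: Optional[int] = None       # --to-id (inclusive)

    def matches(self, doc: Doc) -> bool:
        # Every filter that was supplied must pass (AND logic).
        if self.tags is not None and not self.tags <= doc.tags:
            return False
        if self.doc_type is not None and doc.doc_type != self.doc_type:
            return False
        if self.ids is not None and doc.id not in self.ids:
            return False
        if self.from_id is not None and doc.id < self.from_id:
            return False
        if self.to_id is not None and doc.id > self.to_id:
            return False
        return True
```

In this model a filter with no fields set matches every document, which is the kind of overly broad selection the safety threshold is meant to catch.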
# Bulk delete
kb bulk-remove --tags "draft,old" --type note --yes # delete matching docs
kb bulk-remove --from-id 10 --to-id 50 --yes # delete by ID range
kb bulk-remove --ids "3,7,12" --yes # delete specific IDs
# Bulk tag add/remove
kb bulk-tag --tags "agent:mybot" --add "reviewed" --remove "pending" --yes
kb bulk-tag --type note --add "archived" --yes # tag all notes
# Bulk replace tags
kb bulk-set-tags --tags "old-scheme" --set "new-scheme,migrated" --yes
All bulk commands return a summary: matched count, succeeded count, failed count, and errors.
A safety threshold prevents accidentally affecting more than 70% of documents unless --force is used.
The threshold is configurable on the engine via KB_BULK_SAFETY_PERCENT (integer 0-100, default 70; 0 disables).
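The threshold arithmetic can be sketched as follows. This is an assumption about how such a check could work, based only on the description above ("more than 70%", 0 disables, --force overrides); the function name is hypothetical and the engine's exact boundary handling may differ.

```python
def blocked_by_safety_threshold(
    matched: int,
    total: int,
    percent: int = 70,
    force: bool = False,
) -> bool:
    """Return True if a bulk operation would be refused (illustrative only).

    `percent` mirrors KB_BULK_SAFETY_PERCENT: an integer 0-100, where 0
    disables the check. The strict "greater than" comparison is an assumption.
    """
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    if force or percent == 0 or total == 0:
        return False
    # Integer comparison avoids floating-point rounding at the boundary.
    return matched * 100 > percent * total
```

Under these assumptions, matching 70 of 100 documents passes, while matching 71 of 100 is blocked unless --force is given.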
Jobs (ingestion queue)
kb jobs --format json # list recent jobs
kb jobs --status failed --format json # filter by status
kb jobs <job_id> --format json # job details
Examples
kb examples # show common usage examples
Engine status and maintenance
kb status --format json # engine status, GPU info, DB stats
kb reindex --yes # re-embed all chunks (skip confirmation)
Global flags
All commands support:
- --format json|human: output format (always use json for machine parsing)
- --engine <url>: engine API URL (default: http://localhost:8000)
- --api-key <key>: API key for authentication
Search output format
{
"query": "how to install git",
"results": [
{
"chunk_id": 1423,
"score": 0.031,
"text": "To install the latest version of git from source...",
"chunk_index": 3,
"chunk_metadata": {"page": 12},
"title": "Git Admin Guide",
"doc_type": "pdf",
"source_path": "/home/user/docs/git-admin.pdf",
"created_at": "2026-03-15T10:30:00",
"tags": ["git", "admin"]
}
],
"total_matches": 47,
"returned": 10
}
How to answer search queries
- Run kb search "<query>" --top 10 --format json
- Read the returned chunks
- Synthesise a natural language answer from the top results
- ALWAYS cite sources: "According to [title] (p.X)..." or "From [title], section [header]..."
- If results have low scores (all below 0.01) or returned: 0, tell the user: "I couldn't find anything in your knowledge base about this"
- If initial results seem off-target, try refining the query and searching again
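The low-score check and the citation wording from the steps above could be sketched like this, operating on the already-parsed JSON response. The helper names are hypothetical, and the 0.01 cutoff is simply the figure quoted above, not a tuned value.

```python
from typing import Any, Dict

LOW_SCORE = 0.01  # cutoff quoted in the steps above


def has_useful_results(response: Dict[str, Any]) -> bool:
    """False when nothing was returned or every score is below LOW_SCORE."""
    results = response.get("results", [])
    if not results or response.get("returned", len(results)) == 0:
        return False
    return any(r.get("score", 0.0) >= LOW_SCORE for r in results)


def cite(result: Dict[str, Any]) -> str:
    """Build a citation prefix from the fields in the search output format."""
    title = result.get("title", "untitled")
    meta = result.get("chunk_metadata") or {}
    if "page" in meta:
        # Page numbers are only present for PDF documents.
        return f"According to {title} (p.{meta['page']})"
    if "section_header" in meta:
        # Section headers are only present for markdown documents with headers.
        return f"From {title}, section {meta['section_header']}"
    return f"From {title}"
```

If `has_useful_results` returns False, the fallback message from the steps above applies; otherwise each statement in the answer can be prefixed with `cite(result)` for the chunk it came from.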
Multi-query strategy
For complex questions, search multiple times with different queries:
- Decompose the question into sub-queries
- Run each query separately
- Combine and deduplicate results across queries
- Synthesise a unified answer citing all relevant sources
Example:
User: "What's the difference between git rebase and merge?"
Query 1: kb search "git rebase explanation" --top 5 --format json
Query 2: kb search "git merge explanation" --top 5 --format json
Query 3: kb search "git rebase vs merge" --top 5 --format json
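The combine-and-deduplicate step might look like the sketch below. Because scores are relative to each result set, this version deliberately avoids comparing scores across queries and instead keeps each chunk at its best rank position; that ordering rule and the function name `merge_results` are assumptions for illustration.

```python
from typing import Any, Dict, List, Tuple


def merge_results(responses: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Merge several parsed `kb search` responses, deduplicating by chunk_id.

    Each chunk keeps its best (lowest) rank across queries; ties are broken
    by the order in which the queries were run.
    """
    best: Dict[Any, Tuple[int, int, Dict[str, Any]]] = {}
    for query_index, response in enumerate(responses):
        for rank, result in enumerate(response.get("results", [])):
            key = result["chunk_id"]
            candidate = (rank, query_index, result)
            if key not in best or candidate[:2] < best[key][:2]:
                best[key] = candidate
    ordered = sorted(best.values(), key=lambda item: item[:2])
    return [result for _, _, result in ordered]
```

For the rebase-versus-merge example, the three parsed responses would be passed in as a list, and the merged output read top to bottom when synthesising the answer.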
Filtering tips
Use filters when the question implies a specific domain:
- Code question → --type code
- From a specific topic → --tags <topic>
- Check available tags first: kb tags --format json
Updating notes
kb updatenote 42 "revised note content" # update note by ID
Updates the text of an existing note in place, preserving its ID, creation timestamp, and tags. Re-chunks and re-embeds the new text.
MCP server (agent integration)
For agent-to-agent integration, kb provides an MCP server alongside the CLI. The MCP server exposes the same operations as native MCP tools over Streamable HTTP transport, which agents can connect to directly without subprocess overhead.
MCP tools: kb_search, kb_addnote, kb_update_note, kb_get, kb_delete, kb_status,
kb_jobs, kb_upload_start, kb_upload_chunk, kb_upload_finish, kb_bulk_delete,
kb_bulk_tags, kb_bulk_set_tags.
Use tags to separate agent data from user documents (e.g. tag all agent notes with
agent:mybot and filter by that tag when searching). This convention is communicated
via system prompt — no special server-side enforcement needed.
If the kb engine is already running via Docker Compose, add the MCP server by deploying the
kb-mcp service from the same compose file. Agents connect to it on port 3000 (default).
Important notes
- Always use --format json for machine parsing
- The score field is relative, not absolute: compare scores within a result set
- chunk_metadata.page is only present for PDF documents
- chunk_metadata.section_header is only present for markdown documents with headers
- Results are already ranked by relevance (hybrid FTS + vector search)
- Duplicate files are detected at upload time (HTTP 409); the client handles this gracefully