Files
kb/SKILL.md
T
steve afbe270181 Replace implicit note shorthand with explicit addnote command and split README
Two changes:

1. structured-add-commands: The implicit note shorthand (kb "text") caused
   accidental note creation from mistyped commands. Replaced with explicit
   kb addnote <text> command. Root command reverts to standard Cobra
   behaviour. Updated examples, tests, SKILL.md, and specs.

2. split-readme-developer-docs: Moved build-from-source instructions, release
   process, API reference, and ROCm migration notes from README.md into a
   new DEVELOPER.md. README now links to DEVELOPER.md for dev workflows.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-31 20:48:22 +01:00

167 lines
5.5 KiB
Markdown

# kb-search skill
Search, manage, and add to the user's personal knowledge base containing PDFs, Word docs, HTML, markdown, code files, and text notes.
## When to use
- User asks a question that might be answered by their stored documents, notes, or code
- User explicitly says "check my notes", "search kb", "look in my knowledge base", "what do my docs say about..."
- User references documents or notes they've previously stored
- User asks "how do I..." style questions that their knowledge base likely covers
- User wants to save a note, add a file, or manage their knowledge base
## Adding notes
```bash
kb addnote "remember to update DNS records" # add a note
kb addnote "server room is building 3, floor 2" --tags ops # add a tagged note
```
The note text must be a single quoted argument.
## Search (primary use case)
```bash
kb search "<query>" --top 10 --format json
```
Returns JSON with ranked results combining full-text and semantic search.
**Flags:**
- `-n, --top N` — number of results (default: 10)
- `--tags tag1,tag2` — filter by tags (AND logic)
- `--type pdf|markdown|code|note` — filter by document type
- `--format json|human` — output format (always use json for parsing)
- `--fts-only` — keyword search only (skip semantic)
- `--vec-only` — semantic search only (skip keyword)
- `--threshold FLOAT` — minimum score cutoff
## Adding files
```bash
kb addfile report.pdf # single file
kb addfile report.pdf --tags admin,reference # with tags
kb addfile ~/docs/ --recursive # directory (recursive)
kb addfile ~/docs/ --recursive --tags reference # directory with tags
```
Supported file types: `.pdf`, `.docx`, `.html`, `.md`, `.txt`, `.py`, `.sh`, `.go`. Unsupported extensions are rejected before upload.
**Flags:**
- `--tags tag1,tag2` — tags (comma-separated)
- `-r, --recursive` — recursively add directory contents
## Document management
```bash
kb list --format json # list all documents
kb list --type pdf --format json # filter by type
kb list --tags admin --format json # filter by tags
kb info <doc_id> --format json # document details with chunks
kb export <doc_id> -o file.pdf # download original file
kb remove <doc_id> # remove (prompts for confirmation)
kb remove <doc_id> --yes # remove without confirmation
```
## Tag management
```bash
kb tags --format json # list all tags with counts
kb tag <doc_id> --add important,ops # add tags to a document
kb tag <doc_id> --remove draft # remove tags from a document
```
## Jobs (ingestion queue)
```bash
kb jobs --format json # list recent jobs
kb jobs --status failed --format json # filter by status
kb jobs <job_id> --format json # job details
```
## Engine status and maintenance
```bash
kb status --format json # engine status, GPU info, DB stats
kb reindex --yes # re-embed all chunks (skip confirmation)
```
## Global flags
All commands support:
- `--format json|human` — output format (always use `json` for machine parsing)
- `--engine <url>` — engine API URL (default: http://localhost:8000)
- `--api-key <key>` — API key for authentication
## Search output format
```json
{
"query": "how to install git",
"results": [
{
"chunk_id": 1423,
"score": 0.031,
"score_breakdown": {"fts": 0.016, "vector": 0.015},
"text": "To install the latest version of git from source...",
"source": {
"document_id": 42,
"title": "Git Admin Guide",
"path": "/home/user/docs/git-admin.pdf",
"type": "pdf",
"page": 12,
"chunk_index": 3,
"total_chunks": 28,
"tags": ["git", "admin"]
}
}
],
"total_matches": 47,
"returned": 10
}
```
## How to answer search queries
1. Run `kb search "<query>" --top 10 --format json`
2. Read the returned chunks
3. Synthesise a natural language answer from the top results
4. **ALWAYS cite sources**: "According to [title] (p.X)..." or "From [title], section [header]..."
5. If results have low scores (all below 0.01) or `returned: 0`, tell the user: "I couldn't find anything in your knowledge base about this"
6. If initial results seem off-target, try refining the query and searching again
## Multi-query strategy
For complex questions, search multiple times with different queries:
- Decompose the question into sub-queries
- Run each query separately
- Combine and deduplicate results across queries
- Synthesise a unified answer citing all relevant sources
Example:
```
User: "What's the difference between git rebase and merge?"
Query 1: kb search "git rebase explanation" --top 5 --format json
Query 2: kb search "git merge explanation" --top 5 --format json
Query 3: kb search "git rebase vs merge" --top 5 --format json
```
## Filtering tips
Use filters when the question implies a specific domain:
- Code question → `--type code`
- From a specific topic → `--tags <topic>`
- Check available tags first: `kb tags --format json`
## Important notes
- Always use `--format json` for machine parsing
- The `score` field is relative, not absolute — compare scores within a result set
- `source.page` is only present for PDF documents
- `source.section_header` is only present for markdown documents with headers
- Results are already ranked by relevance (hybrid FTS + vector search)
- Duplicate files are detected at upload time (HTTP 409) — the client handles this gracefully