Files
kb/SKILL.md
T
steve 2d179af557 Fix search human-mode output to match engine API response
The Go client struct expected a nested document object and top-level
page/section fields, but the engine returns flat results with metadata
in chunk_metadata. This caused empty display for title, type, tags,
page, and section in human output mode.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-02 16:17:35 +01:00

169 lines
5.5 KiB
Markdown

# kb-search skill
Search, manage, and add to the user's personal knowledge base containing PDFs, Word docs, HTML, markdown, code files, and text notes.
## When to use
- User asks a question that might be answered by their stored documents, notes, or code
- User explicitly says "check my notes", "search kb", "look in my knowledge base", "what do my docs say about..."
- User references documents or notes they've previously stored
- User asks "how do I..." style questions that their knowledge base likely covers
- User wants to save a note, add a file, or manage their knowledge base
## Adding notes
```bash
kb addnote "remember to update DNS records" # add a note
kb addnote "server room is building 3, floor 2" --tags ops # add a tagged note
```
The note text must be a single quoted argument.
## Search (primary use case)
```bash
kb search "<query>" --top 10 --format json
```
Returns JSON with ranked results combining full-text and semantic search.
**Flags:**
- `-n, --top N` — number of results (default: 10)
- `--tags tag1,tag2` — filter by tags (AND logic)
- `--type pdf|markdown|code|note` — filter by document type
- `--format json|human` — output format (always use json for parsing)
- `--fts-only` — keyword search only (skip semantic)
- `--vec-only` — semantic search only (skip keyword)
- `--threshold FLOAT` — minimum score cutoff
## Adding files
```bash
kb addfile report.pdf # single file
kb addfile report.pdf --tags admin,reference # with tags
kb addfile ~/docs/ --recursive # directory (recursive)
kb addfile ~/docs/ --recursive --tags reference # directory with tags
```
Supported file types: `.pdf`, `.docx`, `.html`, `.md`, `.txt`, `.py`, `.sh`, `.go`. Unsupported extensions are rejected before upload.
**Flags:**
- `--tags tag1,tag2` — tags (comma-separated)
- `-r, --recursive` — recursively add directory contents
## Document management
```bash
kb list --format json # list all documents
kb list --type pdf --format json # filter by type
kb list --tags admin --format json # filter by tags
kb info <doc_id> --format json # document details with chunks
kb export <doc_id> -o file.pdf # download original file
kb remove <doc_id> # remove (prompts for confirmation)
kb remove <doc_id> --yes # remove without confirmation
```
## Tag management
```bash
kb tags --format json # list all tags with counts
kb tag <doc_id> --add important,ops # add tags to a document
kb tag <doc_id> --remove draft # remove tags from a document
```
## Jobs (ingestion queue)
```bash
kb jobs --format json # list recent jobs
kb jobs --status failed --format json # filter by status
kb jobs <job_id> --format json # job details
```
## Examples
```bash
kb examples # show common usage examples
```
## Engine status and maintenance
```bash
kb status --format json # engine status, GPU info, DB stats
kb reindex --yes # re-embed all chunks (skip confirmation)
```
## Global flags
All commands support:
- `--format json|human` — output format (always use `json` for machine parsing)
- `--engine <url>` — engine API URL (default: http://localhost:8000)
- `--api-key <key>` — API key for authentication
## Search output format
```json
{
"query": "how to install git",
"results": [
{
"chunk_id": 1423,
"score": 0.031,
"text": "To install the latest version of git from source...",
"chunk_index": 3,
"chunk_metadata": {"page": 12},
"title": "Git Admin Guide",
"doc_type": "pdf",
"source_path": "/home/user/docs/git-admin.pdf",
"created_at": "2026-03-15T10:30:00",
"tags": ["git", "admin"]
}
],
"total_matches": 47,
"returned": 10
}
```
## How to answer search queries
1. Run `kb search "<query>" --top 10 --format json`
2. Read the returned chunks
3. Synthesise a natural language answer from the top results
4. **ALWAYS cite sources**: "According to [title] (p.X)..." or "From [title], section [header]..."
5. If results have low scores (all below 0.01) or `returned: 0`, tell the user: "I couldn't find anything in your knowledge base about this"
6. If initial results seem off-target, try refining the query and searching again
## Multi-query strategy
For complex questions, search multiple times with different queries:
- Decompose the question into sub-queries
- Run each query separately
- Combine and deduplicate results across queries
- Synthesise a unified answer citing all relevant sources
Example:
```
User: "What's the difference between git rebase and merge?"
Query 1: kb search "git rebase explanation" --top 5 --format json
Query 2: kb search "git merge explanation" --top 5 --format json
Query 3: kb search "git rebase vs merge" --top 5 --format json
```
## Filtering tips
Use filters when the question implies a specific domain:
- Code question → `--type code`
- From a specific topic → `--tags <topic>`
- Check available tags first: `kb tags --format json`
## Important notes
- Always use `--format json` for machine parsing
- The `score` field is relative, not absolute — compare scores within a result set
- `chunk_metadata.page` is only present for PDF documents
- `chunk_metadata.section_header` is only present for markdown documents with headers
- Results are already ranked by relevance (hybrid FTS + vector search)
- Duplicate files are detected at upload time (HTTP 409) — the client handles this gracefully