# kb-search skill

Search, manage, and add to the user's personal knowledge base containing PDFs, Word docs, HTML, markdown, code files, and text notes.

## When to use

- User asks a question that might be answered by their stored documents, notes, or code
- User explicitly says "check my notes", "search kb", "look in my knowledge base", "what do my docs say about..."
- User references documents or notes they've previously stored
- User asks "how do I..." style questions that their knowledge base likely covers
- User wants to save a note, add a file, or manage their knowledge base

## Adding notes

```bash
kb addnote "remember to update DNS records"                  # add a note
kb addnote "server room is building 3, floor 2" --tags ops   # add a tagged note
```

The note text must be a single quoted argument.

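When the command is launched from a script instead of an interactive shell, one way to respect the single-argument rule is to pass an argument list so the note text is never split. The sketch below assumes `kb` is on the `PATH` and that `addnote` accepts the same comma-separated `--tags` form shown for `addfile`; the helper name `build_addnote_args` is hypothetical and not part of the CLI.

```python
import shlex


def build_addnote_args(text, tags=None):
    """Return an argv list for `kb addnote`; the note text stays one argument."""
    args = ["kb", "addnote", text]
    if tags:
        # Assumption: addnote takes comma-separated tags like addfile does.
        args += ["--tags", ",".join(tags)]
    return args


args = build_addnote_args("server room is building 3, floor 2", tags=["ops"])
# shlex.join renders the equivalent, safely quoted shell command line.
print(shlex.join(args))
```

The list can be handed to `subprocess.run(args)` directly, which avoids any shell quoting issues with notes that contain quotes or spaces.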
## Search (primary use case)

```bash
kb search "<query>" --top 10 --format json
```

Returns JSON with ranked results combining full-text and semantic search.

**Flags:**
- `-n, --top N`: number of results (default: 10)
- `--tags tag1,tag2`: filter by tags (AND logic)
- `--type pdf|markdown|code|note`: filter by document type
- `--format json|human`: output format (always use `json` for parsing)
- `--fts-only`: keyword search only (skip semantic)
- `--vec-only`: semantic search only (skip keyword)
- `--threshold FLOAT`: minimum score cutoff

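The flags above can be combined programmatically. The following is a minimal Python sketch that assembles a `kb search` argument list from only the flags listed in this section; the function name `build_search_args` and its `mode` parameter are hypothetical conveniences, not part of the CLI.

```python
def build_search_args(query, top=10, tags=None, doc_type=None,
                      mode="hybrid", threshold=None):
    """Build an argv list for `kb search` using the documented flags."""
    args = ["kb", "search", query, "--top", str(top), "--format", "json"]
    if tags:
        args += ["--tags", ",".join(tags)]  # tags are combined with AND logic
    if doc_type:
        args += ["--type", doc_type]
    if mode == "fts":
        args.append("--fts-only")   # keyword search only
    elif mode == "vec":
        args.append("--vec-only")   # semantic search only
    elif mode != "hybrid":
        raise ValueError(f"unknown mode: {mode!r}")
    if threshold is not None:
        args += ["--threshold", str(threshold)]
    return args


print(build_search_args("git rebase", top=5, doc_type="pdf", mode="fts"))
```

Keeping `--format json` unconditional mirrors the advice to always parse JSON output.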
## Adding files

```bash
kb addfile report.pdf                              # single file
kb addfile report.pdf --tags admin,reference       # with tags
kb addfile ~/docs/ --recursive                     # directory (recursive)
kb addfile ~/docs/ --recursive --tags reference    # directory with tags
```

Supported file types: `.pdf`, `.docx`, `.html`, `.md`, `.txt`, `.py`, `.sh`, `.go`. Unsupported extensions are rejected before upload.

**Flags:**
- `--tags tag1,tag2`: tags (comma-separated)
- `-r, --recursive`: recursively add directory contents

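Because unsupported extensions are rejected, it can be useful to pre-check a batch of paths before calling `kb addfile`. This is a small sketch, not the client's actual check: the extension set is copied from the list above, the helper names are hypothetical, and the case-insensitive comparison is an assumption.

```python
from pathlib import Path

# Extensions copied from the supported-types list above.
SUPPORTED_EXTENSIONS = {".pdf", ".docx", ".html", ".md", ".txt", ".py", ".sh", ".go"}


def is_supported(path):
    """True if the file extension is in the supported list.

    Assumption: matching is case-insensitive (".PDF" counts as ".pdf").
    """
    return Path(path).suffix.lower() in SUPPORTED_EXTENSIONS


def split_supported(paths):
    """Partition paths into (supported, unsupported), preserving order."""
    ok = [p for p in paths if is_supported(p)]
    bad = [p for p in paths if not is_supported(p)]
    return ok, bad


ok, bad = split_supported(["report.pdf", "notes.md", "photo.png"])
print("add:", ok, "skip:", bad)
```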
## Document management

```bash
kb list --format json               # list all documents
kb list --type pdf --format json    # filter by type
kb list --tags admin --format json  # filter by tags
kb info <doc_id> --format json      # document details with chunks
kb export <doc_id> -o file.pdf      # download original file
kb remove <doc_id>                  # remove (prompts for confirmation)
kb remove <doc_id> --yes            # remove without confirmation
```

## Tag management

```bash
kb tags --format json                # list all tags with counts
kb tag <doc_id> --add important,ops  # add tags to a document
kb tag <doc_id> --remove draft       # remove tags from a document
```

## Jobs (ingestion queue)

```bash
kb jobs --format json                  # list recent jobs
kb jobs --status failed --format json  # filter by status
kb jobs <job_id> --format json         # job details
```

## Examples

```bash
kb examples  # show common usage examples
```

## Engine status and maintenance

```bash
kb status --format json  # engine status, GPU info, DB stats
kb reindex --yes         # re-embed all chunks (skip confirmation)
```

## Global flags

All commands support:
- `--format json|human`: output format (always use `json` for machine parsing)
- `--engine <url>`: engine API URL (default: `http://localhost:8000`)
- `--api-key <key>`: API key for authentication

## Search output format

```json
{
  "query": "how to install git",
  "results": [
    {
      "chunk_id": 1423,
      "score": 0.031,
      "text": "To install the latest version of git from source...",
      "chunk_index": 3,
      "chunk_metadata": {"page": 12},
      "title": "Git Admin Guide",
      "doc_type": "pdf",
      "source_path": "/home/user/docs/git-admin.pdf",
      "created_at": "2026-03-15T10:30:00",
      "tags": ["git", "admin"]
    }
  ],
  "total_matches": 47,
  "returned": 10
}
```

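The results are flat: document fields such as `title` and `tags` sit directly on each result, while page and section information lives inside `chunk_metadata`. A minimal Python sketch of reading one result follows; the `summarize` helper is hypothetical, and the sample is a trimmed copy of the JSON above.

```python
import json

SAMPLE = """
{
  "query": "how to install git",
  "results": [
    {
      "chunk_id": 1423,
      "score": 0.031,
      "text": "To install the latest version of git from source...",
      "chunk_metadata": {"page": 12},
      "title": "Git Admin Guide",
      "doc_type": "pdf",
      "tags": ["git", "admin"]
    }
  ],
  "total_matches": 47,
  "returned": 10
}
"""


def summarize(result):
    """Pull display fields from one flat search result."""
    meta = result.get("chunk_metadata") or {}
    return {
        "title": result.get("title"),
        "doc_type": result.get("doc_type"),
        "tags": result.get("tags", []),
        "page": meta.get("page"),               # PDFs only
        "section": meta.get("section_header"),  # markdown with headers only
        "score": result.get("score"),
    }


response = json.loads(SAMPLE)
for result in response["results"]:
    print(summarize(result))
```

Using `.get` for the metadata keys keeps the code working for document types that carry neither a page nor a section header.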
## How to answer search queries

1. Run `kb search "<query>" --top 10 --format json`
2. Read the returned chunks
3. Synthesise a natural language answer from the top results
4. **ALWAYS cite sources**: "According to [title] (p.X)..." or "From [title], section [header]..."
5. If results have low scores (all below 0.01) or `returned: 0`, tell the user: "I couldn't find anything in your knowledge base about this"
6. If initial results seem off-target, try refining the query and searching again

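Steps 4 and 5 above can be sketched in Python as follows. This is an illustration under stated assumptions, not the skill's implementation: `response` is the parsed JSON from `kb search`, the helper names are hypothetical, and the 0.01 cutoff is taken from step 5.

```python
NOT_FOUND = "I couldn't find anything in your knowledge base about this"


def has_useful_results(response, min_score=0.01):
    """False when nothing was returned or every score is below min_score."""
    results = response.get("results", [])
    if not results or response.get("returned", len(results)) == 0:
        return False
    return any(r.get("score", 0.0) >= min_score for r in results)


def cite(result):
    """Build a citation prefix from the title plus page or section, if present."""
    meta = result.get("chunk_metadata") or {}
    title = result.get("title", "untitled")
    if "page" in meta:
        return f"According to {title} (p.{meta['page']})"
    if "section_header" in meta:
        return f"From {title}, section {meta['section_header']}"
    return f"According to {title}"


response = {
    "results": [{"title": "Git Admin Guide", "score": 0.031,
                 "chunk_metadata": {"page": 12}}],
    "returned": 1,
}
if has_useful_results(response):
    print(cite(response["results"][0]))
else:
    print(NOT_FOUND)
```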
## Multi-query strategy

For complex questions, search multiple times with different queries:

- Decompose the question into sub-queries
- Run each query separately
- Combine and deduplicate results across queries
- Synthesise a unified answer citing all relevant sources

Example:
```
User: "What's the difference between git rebase and merge?"

Query 1: kb search "git rebase explanation" --top 5 --format json
Query 2: kb search "git merge explanation" --top 5 --format json
Query 3: kb search "git rebase vs merge" --top 5 --format json
```

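One possible way to combine and deduplicate the per-query results is sketched below in Python. It interleaves the ranked lists round-robin and drops repeated `chunk_id` values; merging by rank instead of raw score is a design choice made here because scores are only meaningful within a single result set. The function name `merge_results` is hypothetical.

```python
def merge_results(result_sets):
    """Interleave ranked result lists round-robin, skipping repeated chunk_ids."""
    merged, seen = [], set()
    longest = max((len(rs) for rs in result_sets), default=0)
    for rank in range(longest):
        for rs in result_sets:
            if rank < len(rs):
                chunk_id = rs[rank]["chunk_id"]
                if chunk_id not in seen:
                    seen.add(chunk_id)
                    merged.append(rs[rank])
    return merged


# Each inner list stands for the "results" array of one kb search call.
rebase = [{"chunk_id": 1}, {"chunk_id": 2}]
merge = [{"chunk_id": 2}, {"chunk_id": 3}]
versus = [{"chunk_id": 1}, {"chunk_id": 4}]
print([r["chunk_id"] for r in merge_results([rebase, merge, versus])])
```

The top hit of every query appears before any second-ranked hit, so no single sub-query dominates the combined list.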
## Filtering tips

Use filters when the question implies a specific domain:

- Code question → `--type code`
- From a specific topic → `--tags <topic>`
- Check available tags first: `kb tags --format json`

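Checking tags first matters because `--tags` uses AND logic, so a tag that does not exist would filter out every result. A minimal Python sketch of that check follows. It assumes the tag names have already been extracted from `kb tags --format json` into a plain list (the exact shape of that output is not shown in this document), and both the helper name and the case-insensitive match are assumptions.

```python
def usable_tags(wanted, available):
    """Keep only the wanted tags that exist, returning their stored spelling.

    Assumption: tags are matched case-insensitively.
    """
    known = {tag.lower(): tag for tag in available}
    return [known[w.lower()] for w in wanted if w.lower() in known]


available = ["git", "admin", "ops"]
print(usable_tags(["Git", "docker"], available))
```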
## Important notes

- Always use `--format json` for machine parsing
- The `score` field is relative, not absolute: compare scores within a result set
- `chunk_metadata.page` is only present for PDF documents
- `chunk_metadata.section_header` is only present for markdown documents with headers
- Results are already ranked by relevance (hybrid FTS + vector search)
- Duplicate files are detected at upload time (HTTP 409); the client handles this gracefully