steve/kb

steve e7136a4a20 Add MCP server, note mutation endpoint, and updated_at tracking (v3.0.0)

New MCP server (mcp/) exposes kb operations as native MCP tools over
Streamable HTTP with Bearer token auth. Supports collections via tag
conventions, chunked file uploads, and agent-side search patterns.

Engine gains PATCH /api/v1/notes/{id} for in-place note updates with
transactional re-chunk/re-embed, and updated_at column on documents.

Go client adds updatenote command and Patch HTTP method.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

2026-04-02 21:34:55 +01:00

kb-search skill

Search, manage, and add to the user's personal knowledge base containing PDFs, Word docs, HTML, markdown, code files, and text notes.

When to use

User asks a question that might be answered by their stored documents, notes, or code
User explicitly says "check my notes", "search kb", "look in my knowledge base", "what do my docs say about..."
User references documents or notes they've previously stored
User asks "how do I..." style questions that their knowledge base likely covers
User wants to save a note, add a file, or manage their knowledge base

Adding notes

kb addnote "remember to update DNS records"                # add a note
kb addnote "server room is building 3, floor 2" --tags ops # add a tagged note

The note text must be a single quoted argument.

Search (primary use case)

kb search "<query>" --top 10 --format json

Returns JSON with ranked results combining full-text and semantic search.

Flags:

-n, --top N — number of results (default: 10)
--tags tag1,tag2 — filter by tags (AND logic)
--type pdf|markdown|code|note — filter by document type
--format json|human — output format (always use json for parsing)
--fts-only — keyword search only (skip semantic)
--vec-only — semantic search only (skip keyword)
--threshold FLOAT — minimum score cutoff
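
When the search is driven from a script rather than typed by hand, the flags above can be assembled into an argument list. The following is a minimal Python sketch, not part of kb itself; the function name build_search_cmd and its parameters are hypothetical, and only flags listed above are used.

```python
from typing import List, Optional, Sequence


def build_search_cmd(
    query: str,
    top: int = 10,
    tags: Optional[Sequence[str]] = None,
    doc_type: Optional[str] = None,
    threshold: Optional[float] = None,
) -> List[str]:
    """Return an argv list for `kb search` (hypothetical helper)."""
    # Always request JSON so the output can be parsed.
    cmd = ["kb", "search", query, "--top", str(top), "--format", "json"]
    if tags:
        # Tags are comma-separated and combined with AND logic.
        cmd += ["--tags", ",".join(tags)]
    if doc_type:
        cmd += ["--type", doc_type]
    if threshold is not None:
        cmd += ["--threshold", str(threshold)]
    return cmd
```

Passing the list to subprocess.run(cmd, capture_output=True, text=True) keeps the query as a single argument without any shell quoting.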

Adding files

kb addfile report.pdf                           # single file
kb addfile report.pdf --tags admin,reference    # with tags
kb addfile ~/docs/ --recursive                  # directory (recursive)
kb addfile ~/docs/ --recursive --tags reference # directory with tags

Supported file types: .pdf, .docx, .html, .md, .txt, .py, .sh, .go. Unsupported extensions are rejected before upload.

Flags:

--tags tag1,tag2 — tags (comma-separated)
-r, --recursive — recursively add directory contents
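
A script that collects files before calling kb addfile could filter on the supported extensions listed above. This is a rough Python sketch under assumptions: is_supported is a hypothetical helper, and treating extensions case-insensitively is a guess about the client's behaviour, not something this document states.

```python
from pathlib import Path

# Extensions copied from the supported file types listed above.
SUPPORTED = {".pdf", ".docx", ".html", ".md", ".txt", ".py", ".sh", ".go"}


def is_supported(path: str) -> bool:
    """Return True if the file extension is in the supported list.

    Assumption: matching is case-insensitive (".PDF" counts as ".pdf").
    """
    return Path(path).suffix.lower() in SUPPORTED
```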

Document management

kb list --format json                    # list all documents
kb list --type pdf --format json         # filter by type
kb list --tags admin --format json       # filter by tags
kb info <doc_id> --format json           # document details with chunks
kb export <doc_id> -o file.pdf           # download original file
kb remove <doc_id>                       # remove (prompts for confirmation)
kb remove <doc_id> --yes                 # remove without confirmation

Tag management

kb tags --format json                    # list all tags with counts
kb tag <doc_id> --add important,ops      # add tags to a document
kb tag <doc_id> --remove draft           # remove tags from a document

Jobs (ingestion queue)

kb jobs --format json                    # list recent jobs
kb jobs --status failed --format json    # filter by status
kb jobs <job_id> --format json           # job details

Examples

kb examples                              # show common usage examples

Engine status and maintenance

kb status --format json                  # engine status, GPU info, DB stats
kb reindex --yes                         # re-embed all chunks (skip confirmation)

Global flags

All commands support:

--format json|human — output format (always use json for machine parsing)
--engine <url> — engine API URL (default: http://localhost:8000)
--api-key <key> — API key for authentication

Search output format

{
  "query": "how to install git",
  "results": [
    {
      "chunk_id": 1423,
      "score": 0.031,
      "text": "To install the latest version of git from source...",
      "chunk_index": 3,
      "chunk_metadata": {"page": 12},
      "title": "Git Admin Guide",
      "doc_type": "pdf",
      "source_path": "/home/user/docs/git-admin.pdf",
      "created_at": "2026-03-15T10:30:00",
      "tags": ["git", "admin"]
    }
  ],
  "total_matches": 47,
  "returned": 10
}

How to answer search queries

Run kb search "<query>" --top 10 --format json
Read the returned chunks
Synthesise a natural language answer from the top results
ALWAYS cite sources: "According to [title] (p.X)..." or "From [title], section [header]..."
If every result scores below 0.01, or returned is 0, tell the user: "I couldn't find anything in your knowledge base about this"
If initial results seem off-target, try refining the query and searching again
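
The low-score check and the citation wording in the steps above can be sketched in Python as follows. The helper names has_useful_results and cite are hypothetical; the field names come from the search output format shown earlier, and the 0.01 cutoff is the one given in the steps.

```python
from typing import Any, Dict

LOW_SCORE = 0.01  # cutoff taken from the steps above
NOT_FOUND = "I couldn't find anything in your knowledge base about this"


def has_useful_results(response: Dict[str, Any]) -> bool:
    """False when nothing was returned or every score is below the cutoff."""
    results = response.get("results", [])
    if not results or response.get("returned", len(results)) == 0:
        return False
    return any(r.get("score", 0.0) >= LOW_SCORE for r in results)


def cite(result: Dict[str, Any]) -> str:
    """Build a citation prefix from whichever metadata the chunk carries."""
    meta = result.get("chunk_metadata") or {}
    title = result.get("title", "untitled")
    if "page" in meta:  # PDF chunks
        return f"According to {title} (p.{meta['page']})"
    if "section_header" in meta:  # markdown chunks with headers
        return f"From {title}, section {meta['section_header']}"
    return f"From {title}"
```

A caller would reply with NOT_FOUND when has_useful_results returns False, and otherwise prefix each claim with cite(result).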

Multi-query strategy

For complex questions, search multiple times with different queries:

Decompose the question into sub-queries
Run each query separately
Combine and deduplicate results across queries
Synthesise a unified answer citing all relevant sources

Example:

User: "What's the difference between git rebase and merge?"

Query 1: kb search "git rebase explanation" --top 5 --format json
Query 2: kb search "git merge explanation" --top 5 --format json
Query 3: kb search "git rebase vs merge" --top 5 --format json
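
One way to combine and deduplicate the three result lists is sketched below in Python. Because scores are only comparable within a single result set, this sketch does not sort by score across queries; it interleaves the lists by rank and keeps the first occurrence of each chunk_id. The function name merge_results is hypothetical.

```python
from itertools import zip_longest
from typing import Any, Dict, List


def merge_results(result_sets: List[List[Dict[str, Any]]]) -> List[Dict[str, Any]]:
    """Interleave ranked result lists and drop repeated chunks."""
    merged: List[Dict[str, Any]] = []
    seen = set()
    # Walk rank 1 of every query, then rank 2, and so on.
    for rank_group in zip_longest(*result_sets):
        for result in rank_group:
            # None pads the shorter lists; skip it and any chunk already kept.
            if result is None or result["chunk_id"] in seen:
                continue
            seen.add(result["chunk_id"])
            merged.append(result)
    return merged
```

Each element of result_sets is the "results" array from one kb search response.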

Filtering tips

Use filters when the question implies a specific domain:

Code question → --type code
Question about a specific topic → --tags <topic>
Check available tags first: kb tags --format json

Updating notes

kb updatenote 42 "revised note content"           # update note by ID

Updates the text of an existing note in place, preserving its ID, creation timestamp, and tags. Re-chunks and re-embeds the new text.

MCP server (agent integration)

For agent-to-agent integration, kb provides an MCP server alongside the CLI. The server exposes the same operations as native MCP tools over Streamable HTTP transport, so agents can connect to it directly instead of spawning a CLI subprocess for each call.

MCP tools: kb_search, kb_addnote, kb_update_note, kb_get, kb_status, kb_jobs, kb_upload_start, kb_upload_chunk, kb_upload_finish.

The MCP server supports collections — scoped document namespaces (e.g. memory, documents, workspace) implemented via tag conventions. This is the recommended way for agents to separate their memory from user documents.

If the kb engine is already running via Docker Compose, add the MCP server by deploying the kb-mcp service from the same compose file. Agents connect to it on port 3000 (default).

Important notes

Always use --format json for machine parsing
The score field is relative, not absolute — compare scores within a result set
chunk_metadata.page is only present for PDF documents
chunk_metadata.section_header is only present for markdown documents with headers
Results are already ranked by relevance (hybrid FTS + vector search)
Duplicate files are detected at upload time (HTTP 409) — the client handles this gracefully
