Add bulk operations and remove collections abstraction

- Add bulk delete, bulk tags, and bulk set-tags engine endpoints
  (POST /api/v1/bulk/delete, /bulk/tags, /bulk/set-tags)
- Filter-based selection: by tags, doc_type, ID list, ID range
- Safety threshold (KB_BULK_SAFETY_PERCENT, default 70%) prevents
  accidental mass operations unless force=true
- Synchronous execution with audit trail via jobs table
- Add kb_bulk_delete, kb_bulk_tags, kb_bulk_set_tags MCP tools
- Add kb bulk-remove, bulk-tag, bulk-set-tags CLI commands
- Remove collection abstraction from MCP server (use tags instead)
- Remove kb_set_collection MCP tool
- Update SKILL.md, MCP.md, README.md documentation

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-04 22:34:47 +01:00
parent 0c124c4ab7
commit b5a203d2aa
21 changed files with 1619 additions and 112 deletions
+36 -5
@@ -71,6 +71,36 @@ kb tag <doc_id> --add important,ops # add tags to a document
kb tag <doc_id> --remove draft # remove tags from a document
```
## Bulk operations
Operate on multiple documents at once using filter-based selection. Filters combine with AND logic.
**Filter flags (shared across all bulk commands):**
- `--tags tag1,tag2` — match documents with ALL specified tags
- `--type pdf|note|...` — match by document type
- `--ids 1,5,12` — match specific document IDs
- `--from-id N` — match documents with id >= N
- `--to-id N` — match documents with id <= N
- `--force` / `-f` — override safety threshold (blocks operations affecting >70% of all documents)
- `--yes` / `-y` — skip confirmation prompt
```bash
# Bulk delete
kb bulk-remove --tags "draft,old" --type note --yes # delete matching docs
kb bulk-remove --from-id 10 --to-id 50 --yes # delete by ID range
kb bulk-remove --ids "3,7,12" --yes # delete specific IDs
# Bulk tag add/remove
kb bulk-tag --tags "agent:mybot" --add "reviewed" --remove "pending" --yes
kb bulk-tag --type note --add "archived" --yes # tag all notes
# Bulk replace tags
kb bulk-set-tags --tags "old-scheme" --set "new-scheme,migrated" --yes
```
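The selection rules above can be modeled in a few lines of shell. This is only a sketch of the documented semantics (ALL listed tags must be present, ID bounds are inclusive, filters combine with AND); the helper names `has_all_tags` and `in_id_range` are hypothetical and are not part of kb.

```bash
#!/usr/bin/env bash
# Hypothetical model of the selection rules, not kb's actual code.

# Succeeds when every tag in the comma-separated `required` list
# appears in the comma-separated `doc_tags` list.
has_all_tags() {
  local doc_tags=",$1," required=$2 tag
  local IFS=','
  for tag in $required; do
    [[ $doc_tags == *",$tag,"* ]] || return 1
  done
}

# Succeeds when from_id <= id <= to_id (both bounds inclusive).
in_id_range() {
  local id=$1 from_id=$2 to_id=$3
  (( id >= from_id && id <= to_id ))
}

# AND logic: a document is selected only if every given filter matches.
if has_all_tags "draft,old,ops" "draft,old" && in_id_range 12 10 50; then
  echo "document 12 selected"
fi
```

Wrapping the tag list in commas before matching keeps `draft` from matching a longer tag such as `drafts`.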
All bulk commands return a summary with the matched, succeeded, and failed counts, plus any errors.
The safety threshold (70% by default, configurable via `KB_BULK_SAFETY_PERCENT`) blocks any operation that would affect a larger share of all documents unless `--force` is given.
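As a rough illustration of the threshold rule, the check can be expressed with integer arithmetic. This is a sketch under assumptions: the function name `exceeds_safety_threshold` is hypothetical, and the strict "greater than" comparison is inferred from the ">70%" wording rather than taken from the engine's code.

```bash
#!/usr/bin/env bash
# Hypothetical helper, not part of kb: succeeds (exit 0) when an operation
# matching `matched` of `total` documents would exceed the safety threshold.
exceeds_safety_threshold() {
  local matched=$1 total=$2
  local percent=${KB_BULK_SAFETY_PERCENT:-70}  # default taken from the commit message
  # Integer form of matched/total > percent/100; avoids floating point.
  (( matched * 100 > percent * total ))
}

if exceeds_safety_threshold 80 100; then
  echo "blocked: pass --force to proceed"
fi
```

Under this reading, an operation matching exactly 70 of 100 documents is still allowed, and 71 of 100 is blocked.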
## Jobs (ingestion queue)
```bash
@@ -172,12 +202,13 @@ For agent-to-agent integration, kb provides an MCP server alongside the CLI. The
exposes the same operations as native MCP tools over Streamable HTTP transport, which agents
can connect to directly without subprocess overhead.
-**MCP tools:** `kb_search`, `kb_addnote`, `kb_update_note`, `kb_get`, `kb_status`, `kb_jobs`,
-`kb_upload_start`, `kb_upload_chunk`, `kb_upload_finish`.
+**MCP tools:** `kb_search`, `kb_addnote`, `kb_update_note`, `kb_get`, `kb_delete`, `kb_status`,
+`kb_jobs`, `kb_upload_start`, `kb_upload_chunk`, `kb_upload_finish`, `kb_bulk_delete`,
+`kb_bulk_tags`, `kb_bulk_set_tags`.
-The MCP server supports **collections**: scoped document namespaces (e.g. `memory`, `documents`,
-`workspace`) implemented via tag conventions. This is the recommended way for agents to separate
-their memory from user documents.
+Use tags to separate agent data from user documents (e.g. tag all agent notes with
+`agent:mybot` and filter by that tag when searching). This convention is communicated
+via the system prompt; no special server-side enforcement is needed.
If the kb engine is already running via Docker Compose, add the MCP server by deploying the
`kb-mcp` service from the same compose file. Agents connect to it on port 3000 (default).