Add bulk operations and remove collections abstraction

- Add bulk delete, bulk tags, and bulk set-tags engine endpoints
  (POST /api/v1/bulk/delete, /bulk/tags, /bulk/set-tags)
- Filter-based selection: by tags, doc_type, ID list, ID range
- Safety threshold (KB_BULK_SAFETY_PERCENT, default 70%) prevents
  accidental mass operations unless force=true
- Synchronous execution with audit trail via jobs table
- Add kb_bulk_delete, kb_bulk_tags, kb_bulk_set_tags MCP tools
- Add kb bulk-remove, bulk-tag, bulk-set-tags CLI commands
- Remove collection abstraction from MCP server (use tags instead)
- Remove kb_set_collection MCP tool
- Update SKILL.md, MCP.md, README.md documentation

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-04 22:34:47 +01:00
parent 0c124c4ab7
commit b5a203d2aa
21 changed files with 1619 additions and 112 deletions
+8 -8
@@ -27,25 +27,25 @@ docker run -d --name kb-mcp \
| Tool | Description |
|---|---|
| `kb_search` | Hybrid search with optional collection/tag/type filters |
| `kb_search` | Hybrid search with optional tag/type filters |
| `kb_addnote` | Add a text note (queued for async ingestion) |
| `kb_update_note` | Update an existing note in place |
| `kb_get` | Get document details by ID or source path |
| `kb_delete` | Permanently delete a document by ID |
| `kb_status` | Engine health and statistics |
| `kb_jobs` | Ingestion queue status |
| `kb_upload_start` | Start a chunked file upload |
| `kb_upload_chunk` | Upload a base64-encoded file chunk |
| `kb_upload_finish` | Finish upload and submit for ingestion |
| `kb_bulk_delete` | Delete multiple documents matching a filter |
| `kb_bulk_tags` | Add/remove tags on multiple documents |
| `kb_bulk_set_tags` | Replace all tags on multiple documents |
## Collections
## Organising with tags
The MCP server supports **collections** — scoped document namespaces implemented via tag conventions. Use these to separate agent memory from user documents:
Use tags to separate agent data from user documents. For example, an agent can tag all its notes with `agent:mybot` and filter by that tag when searching. This is a naming convention — configure it in your agent's system prompt. No special server-side enforcement is needed.
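The tagging convention above can be sketched in a few lines of Python. This is an in-memory illustration only: the helper names `agent_tag` and `filter_by_tag` and the document shape (a dict with a `tags` list) are assumptions made for the example, not part of the MCP server or engine API.

```python
from typing import Dict, List


def agent_tag(agent_name: str) -> str:
    """Build the conventional tag for an agent, e.g. 'agent:mybot'."""
    return f"agent:{agent_name}"


def filter_by_tag(docs: List[Dict], tag: str) -> List[Dict]:
    """Keep only documents that carry the given tag (hypothetical doc shape)."""
    return [doc for doc in docs if tag in doc.get("tags", [])]


# Hypothetical documents: one written by the agent, one by the user.
docs = [
    {"id": 1, "tags": ["agent:mybot", "preferences"]},
    {"id": 2, "tags": ["invoice"]},
]
agent_docs = filter_by_tag(docs, agent_tag("mybot"))
```

In practice the agent would pass the same tag string to the search tool's tag filter rather than filtering locally.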
- `documents` (default) — user-facing documents
- `memory` — agent memory and preferences
- `workspace` — working context
Tools accept a `collection` parameter. The MCP server translates this to `collection:<name>` tags on the engine, and strips them from responses so agents see a clean `"collection": "memory"` field.
Bulk tools accept filter-based selection (by tags, doc_type, ID list, or ID range) so agents can manage thousands of documents in a single call instead of looping. A safety threshold (default 70%) prevents accidental mass operations unless `force: true` is set.
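As a rough illustration of the safety threshold, the Python sketch below shows one way such a check could work. The function name `check_bulk_safety`, its parameters, and the choice to block only when the matched share is strictly above the threshold are assumptions for illustration. Only the 70% default and the `force` override come from the description above.

```python
def check_bulk_safety(matched: int, total: int, force: bool = False,
                      threshold_percent: float = 70.0) -> bool:
    """Return True if a bulk operation may proceed.

    matched: number of documents selected by the filter
    total:   number of documents in the knowledge base
    """
    # An explicit force flag bypasses the check; an empty store has nothing to protect.
    if force or total == 0:
        return True
    affected_percent = 100.0 * matched / total
    # Assumption: the operation is refused only above the threshold, not at it.
    return affected_percent <= threshold_percent
```

With the defaults, a filter matching 80 of 100 documents would be refused unless the caller sets `force`, while one matching 50 of 100 would go through.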
## MCP server configuration