- Add bulk delete, bulk tags, and bulk set-tags engine endpoints (POST /api/v1/bulk/delete, /bulk/tags, /bulk/set-tags)
- Filter-based selection: by tags, doc_type, ID list, ID range
- Safety threshold (KB_BULK_SAFETY_PERCENT, default 70%) prevents accidental mass operations unless force=true
- Synchronous execution with audit trail via jobs table
- Add kb_bulk_delete, kb_bulk_tags, kb_bulk_set_tags MCP tools
- Add kb bulk-remove, bulk-tag, bulk-set-tags CLI commands
- Remove collection abstraction from MCP server (use tags instead)
- Remove kb_set_collection MCP tool
- Update SKILL.md, MCP.md, README.md documentation

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Why
Bulk operations on documents (delete, tag, retag) currently require one API/MCP call per document. When an LLM manages hundreds or thousands of documents, this means hundreds of tool calls — burning tokens, adding latency, and creating fragile multi-step flows that can fail partway through.
Additionally, the "collection" abstraction in the MCP server adds complexity without real benefit. Collections are implemented as collection:-prefixed tags, but this convention is only enforced in the MCP layer — the CLI and engine don't know about it. This creates inconsistency and extra code. Tags alone, with a naming convention communicated via system prompt or configuration, achieve the same namespace isolation more simply and uniformly.
What Changes
1. Remove collections from MCP server
Strip all collection logic from mcp/server.py:
- Remove COLLECTION_TAG_PREFIX, DEFAULT_COLLECTION, and all collection helper functions
- Remove collection parameter from kb_search, kb_addnote, kb_upload_start
- Remove kb_set_collection tool entirely
- Remove _process_document / _process_search_results collection-tag stripping
- Update MCP server instructions to explain tag-based namespace convention
2. Add bulk engine endpoints
Three new endpoints in the engine API:
- POST /api/v1/bulk/delete — Delete multiple documents matching a filter
- POST /api/v1/bulk/tags — Add/remove tags on multiple documents matching a filter
- POST /api/v1/bulk/set-tags — Replace all tags on multiple documents matching a filter
All accept a common selection filter (combinable with AND logic):
- document_ids: explicit list of IDs
- tags: documents matching ALL specified tags
- doc_type: documents of this type
- from_id / to_id: ID range (inclusive)
At least one selection criterion is required.
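The AND semantics of the selection filter can be illustrated with a minimal in-memory sketch. The field names come from the list above; the BulkFilter class, the select helper, and the shape of the document dicts are hypothetical and stand in for whatever query the engine actually builds. The sketch treats an empty list the same as an unset criterion, which is an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class BulkFilter:
    """Hypothetical container for the selection criteria (not the engine's model)."""
    document_ids: Optional[list] = None
    tags: Optional[list] = None
    doc_type: Optional[str] = None
    from_id: Optional[int] = None
    to_id: Optional[int] = None

    def validate(self) -> None:
        # At least one selection criterion is required.
        criteria = [
            bool(self.document_ids),
            bool(self.tags),
            bool(self.doc_type),
            self.from_id is not None,
            self.to_id is not None,
        ]
        if not any(criteria):
            raise ValueError("at least one selection criterion is required")

    def matches(self, doc: dict) -> bool:
        # Every criterion that is set must hold (AND logic).
        if self.document_ids and doc["id"] not in self.document_ids:
            return False
        # The document must carry ALL of the requested tags.
        if self.tags and not set(self.tags) <= set(doc["tags"]):
            return False
        if self.doc_type and doc["doc_type"] != self.doc_type:
            return False
        # The ID range is inclusive on both ends.
        if self.from_id is not None and doc["id"] < self.from_id:
            return False
        if self.to_id is not None and doc["id"] > self.to_id:
            return False
        return True


def select(docs: list, flt: BulkFilter) -> list:
    """Return the IDs of all documents matching the filter."""
    flt.validate()
    return [d["id"] for d in docs if flt.matches(d)]
```

In a real engine the same conditions would more likely be translated into a single database query than evaluated per document in Python.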
Safety threshold: If the operation would affect more than N% of all documents (default 70%, configurable via KB_BULK_SAFETY_PERCENT env var), the request is rejected with a 409 response showing what would be affected. The caller must re-send with force: true to proceed.
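One way to read the threshold rule is sketched below. Only KB_BULK_SAFETY_PERCENT, the 70% default, the 409 status, and the force flag come from the text; the function names and the fields of the rejection body are assumptions made for illustration.

```python
import os
from typing import Optional


def safety_percent() -> float:
    # Env var name and default of 70 are taken from the proposal text.
    return float(os.environ.get("KB_BULK_SAFETY_PERCENT", "70"))


def check_safety(matched: int, total: int, force: bool = False,
                 threshold: Optional[float] = None) -> tuple:
    """Return (status_code, body); the body layout is hypothetical."""
    limit = safety_percent() if threshold is None else threshold
    # Guard against an empty database to avoid dividing by zero.
    percent = 100.0 * matched / total if total else 0.0
    # Reject only when strictly MORE than the limit is affected and
    # the caller has not opted in with force.
    if percent > limit and not force:
        return 409, {
            "matched": matched,
            "total": total,
            "percent": round(percent, 1),
            "threshold": limit,
        }
    return 200, {"matched": matched}
```

Under this reading, an operation that touches exactly 70% of the documents still passes, because the rule is "more than N%".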
Response model: Synchronous execution with summary response. The operation is logged to the jobs table for audit trail:
{
  "job_id": 42,
  "status": "done",
  "matched": 750,
  "succeeded": 748,
  "failed": 2,
  "errors": [
    {"document_id": 42, "error": "file locked"},
    {"document_id": 99, "error": "not found"}
  ]
}
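A summary of this shape could be assembled by a loop that applies the operation to each matched document and collects per-document failures rather than aborting. This is a sketch only: run_bulk and the operation callback are hypothetical names, and writing the result to the jobs table is left out.

```python
from typing import Callable


def run_bulk(job_id: int, document_ids: list,
             operation: Callable[[int], None]) -> dict:
    """Apply operation to each document and build the summary response."""
    errors = []
    for doc_id in document_ids:
        try:
            operation(doc_id)
        except Exception as exc:
            # A failure on one document is recorded and the loop continues.
            errors.append({"document_id": doc_id, "error": str(exc)})
    return {
        "job_id": job_id,
        # As in the example response, the job is "done" even if some
        # individual documents failed.
        "status": "done",
        "matched": len(document_ids),
        "succeeded": len(document_ids) - len(errors),
        "failed": len(errors),
        "errors": errors,
    }
```

By construction, succeeded plus failed always equals matched, which matches the example (748 + 2 = 750).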
3. Add bulk MCP tools
Expose the bulk engine endpoints as MCP tools:
- kb_bulk_delete: bulk delete with filter selection
- kb_bulk_tags: bulk add/remove tags with filter selection
- kb_bulk_set_tags: bulk replace tags with filter selection
These are thin wrappers around the engine bulk endpoints — no collection translation, no special logic.
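To show what "thin wrapper" could mean in practice, the sketch below forwards the tool arguments to the engine endpoint unchanged. The endpoint path and filter field names come from this proposal; the add and remove request fields (borrowed from the CLI flag names), the injected post callable, and the function signature are assumptions, not the actual MCP server code.

```python
from typing import Callable, Optional


def kb_bulk_tags(post: Callable[[str, dict], dict],
                 add: Optional[list] = None,
                 remove: Optional[list] = None,
                 document_ids: Optional[list] = None,
                 tags: Optional[list] = None,
                 doc_type: Optional[str] = None,
                 from_id: Optional[int] = None,
                 to_id: Optional[int] = None,
                 force: bool = False) -> dict:
    """Hypothetical wrapper: build the request body and call the engine."""
    payload = {
        "add": add,            # assumed request field name
        "remove": remove,      # assumed request field name
        "document_ids": document_ids,
        "tags": tags,
        "doc_type": doc_type,
        "from_id": from_id,
        "to_id": to_id,
        "force": force,
    }
    # Drop unset criteria; no collection translation or other rewriting.
    body = {key: value for key, value in payload.items() if value is not None}
    return post("/api/v1/bulk/tags", body)
```

Passing the HTTP call in as a parameter keeps the sketch free of any particular client library and makes the wrapper easy to exercise with a fake.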
4. Add bulk CLI commands
- kb bulk-remove: bulk delete with --tags, --type, --ids, --from-id, --to-id, --force flags
- kb bulk-tag: bulk tag/untag with --add, --remove, and the same filter flags
- kb bulk-set-tags: bulk replace tags with --tags (new tags) and the same filter flags
All three commands show a confirmation prompt with the match count before executing, unless --yes is passed.
Capabilities
New Capabilities
bulk-operations: Engine endpoints, MCP tools, and CLI commands for bulk delete, tag, and set-tags operations with filter-based selection and safety threshold.
Modified Capabilities
mcp-document-management: Remove kb_set_collection tool. Remove collection parameter from all tools.
Removed Capabilities
mcp-collections: The collection abstraction (collection helpers, collection parameters, collection tag stripping) is removed from the MCP server entirely.
Impact
- Engine API (engine/kb/routes/): New bulk.py route module with 3 endpoints. New bulk job type in jobs table.
- Engine database (engine/kb/database.py): Helper functions for bulk selection queries and bulk delete/tag operations.
- MCP server (mcp/server.py): Remove ~70 lines of collection logic. Add 3 bulk tool definitions. Remove collection param from kb_search, kb_addnote, kb_upload_start. Remove kb_set_collection.
- MCP engine client (mcp/engine.py): Add bulk operation methods. Remove code that is no longer needed.
- CLI (client/cmd/): New bulk_remove.go, bulk_tag.go, bulk_set_tags.go command files.
- CLI API client (client/internal/api/): Add Post with JSON body support if not present.
- Breaking changes: kb_set_collection MCP tool removed. collection parameter removed from kb_search, kb_addnote, kb_upload_start MCP tools. Any MCP clients using collections will need to switch to tags.