From e9a282ddb1a6a8fd50d57fbb8ec637bc6f000a4b Mon Sep 17 00:00:00 2001
From: Steve Cliff
Date: Sat, 4 Apr 2026 22:43:42 +0100
Subject: [PATCH] Document KB_BULK_SAFETY_PERCENT in README, DEVELOPER, MCP, and SKILL docs

Co-Authored-By: Claude Opus 4.6 (1M context)
---
 DEVELOPER.md | 3 +++
 MCP.md       | 2 +-
 README.md    | 1 +
 SKILL.md     | 1 +
 4 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/DEVELOPER.md b/DEVELOPER.md
index 5331225..5e1f0fd 100644
--- a/DEVELOPER.md
+++ b/DEVELOPER.md
@@ -94,6 +94,9 @@ All endpoints are under `/api/v1/`. Requires `Authorization: Bearer ` heade
 | `GET` | `/tags` | List all tags |
 | `GET` | `/status` | Engine status, GPU info, DB stats |
 | `POST` | `/reindex` | Re-embed all chunks |
+| `POST` | `/bulk/delete` | Bulk delete documents by filter |
+| `POST` | `/bulk/tags` | Bulk add/remove tags by filter |
+| `POST` | `/bulk/set-tags` | Bulk replace tags by filter |
 
 ## Future: ROCm runtime migration
 
diff --git a/MCP.md b/MCP.md
index 50c2a70..bc7f134 100644
--- a/MCP.md
+++ b/MCP.md
@@ -45,7 +45,7 @@ docker run -d --name kb-mcp \
 
 Use tags to separate agent data from user documents. For example, an agent can tag all its notes with `agent:mybot` and filter by that tag when searching. This is a naming convention — configure it in your agent's system prompt. No special server-side enforcement is needed.
 
-Bulk tools accept filter-based selection (by tags, doc_type, ID list, or ID range) so agents can manage thousands of documents in a single call instead of looping. A safety threshold (default 70%) prevents accidental mass operations unless `force: true` is set.
+Bulk tools accept filter-based selection (by tags, doc_type, ID list, or ID range) so agents can manage thousands of documents in a single call instead of looping. A safety threshold (default 70%, configurable via engine env var `KB_BULK_SAFETY_PERCENT`) prevents accidental mass operations unless `force: true` is set.
 
 ## MCP server configuration
 
diff --git a/README.md b/README.md
index b98a760..e382d0e 100644
--- a/README.md
+++ b/README.md
@@ -174,6 +174,7 @@ The engine is configured via environment variables (set in the compose file or v
 | `KB_INGEST_DEVICE` | `auto` | Docling layout detection device: `auto`, `cpu`, or `cuda` |
 | `KB_API_KEY` | (none) | Optional Bearer token for API authentication |
 | `KB_SEARCH_THRESHOLD` | `0.01` | Minimum score for search results (filters noise) |
+| `KB_BULK_SAFETY_PERCENT` | `70` | Bulk operations affecting more than this % of documents are rejected unless `force` is set (0 disables) |
 | `KB_PORT` | `8000` | Port to expose |
 | `KB_HOST` | `0.0.0.0` | Host to bind to |
 | `HF_HUB_OFFLINE` | (none) | Set to `1` to prevent model downloads (use cached only) |
diff --git a/SKILL.md b/SKILL.md
index 61e56dd..d117d8d 100644
--- a/SKILL.md
+++ b/SKILL.md
@@ -100,6 +100,7 @@ kb bulk-set-tags --tags "old-scheme" --set "new-scheme,migrated" --yes
 
 All bulk commands return a summary: matched count, succeeded count, failed count, and errors.
 A safety threshold prevents accidentally affecting more than 70% of documents unless `--force` is used.
+The threshold is configurable on the engine via `KB_BULK_SAFETY_PERCENT` (integer 0-100, default 70; 0 disables).
 
 ## Jobs (ingestion queue)
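
Reviewer note: the documented rule (reject a bulk operation that affects more than `KB_BULK_SAFETY_PERCENT` percent of documents unless `force` is set, with 0 disabling the check) can be sketched as below. This is only an illustration of the wording in this patch, not the engine's code, which the patch does not show. The function names `load_safety_percent`, `exceeds_bulk_safety`, and `bulk_allowed` are hypothetical, and treating an empty database as always allowed is an assumption.

```python
import os


def load_safety_percent(env=os.environ) -> int:
    """Read KB_BULK_SAFETY_PERCENT (integer 0-100, default 70)."""
    value = int(env.get("KB_BULK_SAFETY_PERCENT", "70"))
    if not 0 <= value <= 100:
        raise ValueError("KB_BULK_SAFETY_PERCENT must be between 0 and 100")
    return value


def exceeds_bulk_safety(matched: int, total: int, percent: int) -> bool:
    """True when `matched` is more than `percent` % of `total` documents."""
    if percent == 0 or total == 0:
        # 0 disables the check; an empty database is assumed to be safe.
        return False
    # Integer comparison avoids floating-point rounding at the boundary.
    return matched * 100 > percent * total


def bulk_allowed(matched: int, total: int, percent: int, force: bool = False) -> bool:
    """Hypothetical gate: `force` bypasses the threshold entirely."""
    return force or not exceeds_bulk_safety(matched, total, percent)
```

Under this reading, an operation matching exactly 70 of 100 documents passes at the default setting, because the docs say "more than" rather than "at least".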