Upload-time duplicate detection, FTS5 query sanitization, release guard

- Reject duplicate uploads at the API boundary (HTTP 409) instead of
  silently skipping in the background worker. Checks both ingested
  documents and in-flight jobs via content_hash on the jobs table.
- Go client handles 409 with distinct messages for already-imported
  documents vs already-queued jobs.
- Sanitize FTS5 search queries by quoting each token to prevent syntax
  errors from special characters like ?, *, ", (), AND, OR, NOT.
- Add try/except safety net around FTS5 execute for edge cases.
- Add main branch guard to release.sh to prevent releasing from
  feature branches.
- Update specs and README to reflect new behaviour.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
This commit is contained in:
2026-03-26 23:05:07 +00:00
parent 63654a59b8
commit 6fec627503
20 changed files with 536 additions and 30 deletions
+2 -2
View File
@@ -1,4 +1,4 @@
# kb-search
# kb
Personal knowledge base with hybrid search (full-text + semantic vector search).
@@ -129,7 +129,7 @@ All endpoints are under `/api/v1/`. Requires `Authorization: Bearer <key>` heade
|---|---|---|
| `GET` | `/health` | Health check (bypasses auth) |
| `POST` | `/search` | Hybrid search (JSON body) |
| `POST` | `/jobs` | Upload file/note for ingestion (multipart, returns 202) |
| `POST` | `/jobs` | Upload file/note for ingestion (multipart, returns 202 or 409 if duplicate) |
| `GET` | `/jobs` | List ingestion jobs |
| `GET` | `/jobs/{id}` | Job details |
| `GET` | `/documents` | List documents |