Upload-time duplicate detection, FTS5 query sanitization, release guard

- Reject duplicate uploads at the API boundary (HTTP 409) instead of
  silently skipping in the background worker. Checks both ingested
  documents and in-flight jobs via content_hash on the jobs table.
- Go client handles 409 with distinct messages for already-imported
  documents vs already-queued jobs.
- Sanitize FTS5 search queries by quoting each token to prevent syntax
  errors from special characters like ?, *, ", (), AND, OR, NOT.
- Add try/except safety net around FTS5 execute for edge cases.
- Add main branch guard to release.sh to prevent releasing from
  feature branches.
- Update specs and README to reflect new behaviour.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
This commit is contained in:
2026-03-26 23:05:07 +00:00
parent 63654a59b8
commit 6fec627503
20 changed files with 536 additions and 30 deletions
@@ -0,0 +1,24 @@
## Why
Searching with natural language queries containing characters like `?`, `"`, `*`, `(`, `)`, `-`, or FTS5 keywords (`AND`, `OR`, `NOT`, `NEAR`) causes a 500 error because the raw query string is passed directly to `chunks_fts MATCH ?` without escaping. Users should be able to type anything into a search query without triggering syntax errors.
## What Changes
- **Sanitize FTS5 query input**: Escape or strip FTS5 special characters from the user's query before passing it to the MATCH operator
- **Graceful fallback**: If the sanitized query produces no valid FTS5 terms, return empty results from FTS instead of erroring
## Capabilities
### New Capabilities
_(none)_
### Modified Capabilities
- `engine-api`: The "Hybrid search" requirement changes — the engine must sanitize user queries to prevent FTS5 syntax errors for any input
## Impact
- **Engine search** (`engine/kb/search.py`): `_fts_search()` needs query sanitization before the MATCH parameter
- **No client changes**: The client already displays results or errors correctly
- **No schema changes**: No database modifications needed