Upload-time duplicate detection, FTS5 query sanitization, release guard
- Reject duplicate uploads at the API boundary (HTTP 409) instead of silently skipping in the background worker. Checks both ingested documents and in-flight jobs via content_hash on the jobs table.
- Go client handles 409 with distinct messages for already-imported documents vs already-queued jobs.
- Sanitize FTS5 search queries by quoting each token to prevent syntax errors from special characters like ?, *, ", (), AND, OR, NOT.
- Add try/except safety net around FTS5 execute for edge cases.
- Add main branch guard to release.sh to prevent releasing from feature branches.
- Update specs and README to reflect new behaviour.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@@ -0,0 +1,15 @@
## 1. Query Sanitization
- [x] 1.1 Add `_sanitize_fts_query(query)` function to `engine/kb/search.py` that splits on whitespace, strips double quotes from each token, wraps each token in double quotes, and joins with spaces
- [x] 1.2 Handle edge case: if no valid tokens remain after sanitization, return empty dict from `_fts_search` without executing the query
## 2. Integration
- [x] 2.1 Call `_sanitize_fts_query()` in `_fts_search()` before adding the query to params (line 92)
- [x] 2.2 Add try/except `sqlite3.OperationalError` around the FTS5 execute call — log a warning and return empty results on error
## 3. Testing
- [x] 3.1 Test: `kb search "what color is grass?"` returns results, not a 500 error
- [x] 3.2 Test: `kb search "NOT something OR (other)"` returns results, treating input as literal terms
- [x] 3.3 Test: query with only special characters returns empty results, not an error