6fec627503
- Reject duplicate uploads at the API boundary (HTTP 409) instead of silently skipping in the background worker. Checks both ingested documents and in-flight jobs via content_hash on the jobs table.
- Go client handles 409 with distinct messages for already-imported documents vs already-queued jobs.
- Sanitize FTS5 search queries by quoting each token to prevent syntax errors from special characters like ?, *, ", (), AND, OR, NOT.
- Add try/except safety net around FTS5 execute for edge cases.
- Add main branch guard to release.sh to prevent releasing from feature branches.
- Update specs and README to reflect new behaviour.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
MODIFIED Requirements
Requirement: Hybrid search
The engine SHALL provide hybrid search combining BM25 full-text search (via FTS5) and vector similarity search (via sqlite-vec), merged using Reciprocal Rank Fusion. Search SHALL complete in under 100ms when the model is warm. The engine SHALL sanitize user query strings to prevent FTS5 syntax errors for any input.
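As a rough illustration of the merge step, Reciprocal Rank Fusion can be sketched as below. The function name `reciprocal_rank_fusion`, the constant `k=60` (a commonly used default, not taken from this spec), and the alphabetical tie-break are assumptions for illustration; the engine's actual implementation may differ.

```python
from collections import defaultdict
from typing import Dict, List, Sequence


def reciprocal_rank_fusion(rankings: Sequence[Sequence[str]], k: int = 60) -> List[str]:
    """Merge several ranked lists of document IDs into one list.

    Each document scores 1 / (k + rank) for every list it appears in,
    where rank starts at 1. Hypothetical helper, not the engine's API.
    """
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID so the order is deterministic.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Example: one ranking from BM25 (FTS5), one from vector similarity (sqlite-vec).
bm25_ranking = ["a", "b", "c"]
vector_ranking = ["b", "c", "d"]
fused = reciprocal_rank_fusion([bm25_ranking, vector_ranking])
```

Documents that appear in both lists ("b" and "c" here) accumulate two contributions and so rise above documents found by only one retriever.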
Scenario: Search with special characters
- WHEN a client sends `POST /api/v1/search` with body `{"query": "what color is grass?"}`
- THEN the engine SHALL sanitize the query for FTS5, execute the search successfully, and return results (not a 500 error)
Scenario: Search with FTS5 operators in query
- WHEN a client sends `POST /api/v1/search` with body `{"query": "NOT something OR (other)"}`
- THEN the engine SHALL treat the input as literal search terms, not FTS5 operators, and return matching results
Scenario: Search with only special characters
- WHEN a client sends `POST /api/v1/search` with body `{"query": "??!@#"}`
- THEN the engine SHALL return HTTP 200 with an empty result set (not a 500 error)
Scenario: Search with quotes in query
- WHEN a client sends `POST /api/v1/search` with body `{"query": "the \"quick\" fox"}`
- THEN the engine SHALL sanitize embedded quotes and return results normally
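
The scenarios above can be exercised against a minimal sketch of the token-quoting approach. The name `sanitize_fts5_query` and the whitespace tokenization are assumptions for illustration, not the engine's actual code. The sketch relies on two FTS5 rules: text inside a double-quoted string is treated as literal terms rather than operators, and an embedded double quote is escaped by doubling it.

```python
def sanitize_fts5_query(raw: str) -> str:
    """Quote every whitespace-separated token so FTS5 reads it literally.

    Hypothetical helper. Returns an empty string when the input holds no
    tokens, in which case the caller can skip the FTS5 query entirely.
    """
    quoted_tokens = []
    for token in raw.split():
        # Inside an FTS5 string, a literal double quote is written as "".
        escaped = token.replace('"', '""')
        quoted_tokens.append('"' + escaped + '"')
    return " ".join(quoted_tokens)


# Inputs taken from the scenarios above.
plain = sanitize_fts5_query("what color is grass?")
operators = sanitize_fts5_query("NOT something OR (other)")
symbols = sanitize_fts5_query("??!@#")
with_quotes = sanitize_fts5_query('the "quick" fox')
```

Quoting alone does not guarantee that every input yields a useful match: a token made only of punctuation may tokenize to nothing. That is presumably why the change also wraps the FTS5 execute call in a try/except and falls back to an empty result set.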