6fec627503
- Reject duplicate uploads at the API boundary (HTTP 409) instead of silently skipping in the background worker. Checks both ingested documents and in-flight jobs via content_hash on the jobs table.
- Go client handles 409 with distinct messages for already-imported documents vs already-queued jobs.
- Sanitize FTS5 search queries by quoting each token to prevent syntax errors from special characters like ?, *, ", (), AND, OR, NOT.
- Add try/except safety net around FTS5 execute for edge cases.
- Add main branch guard to release.sh to prevent releasing from feature branches.
- Update specs and README to reflect new behaviour.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
25 lines
1.1 KiB
Markdown
## Why
Searching with natural language queries containing characters like `?`, `"`, `*`, `(`, `)`, `-`, or FTS5 keywords (`AND`, `OR`, `NOT`, `NEAR`) causes a 500 error because the raw query string is passed directly to `chunks_fts MATCH ?` without escaping. Users should be able to type anything into a search query without triggering syntax errors.
## What Changes
- **Sanitize FTS5 query input**: Escape or strip FTS5 special characters from the user's query before passing it to the MATCH operator
- **Graceful fallback**: If the sanitized query produces no valid FTS5 terms, return empty results from FTS instead of erroring
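The two changes above can be sketched as a small helper. This is a minimal illustration, not the engine's actual code: the function name `sanitize_fts5_query` is hypothetical, and dropping tokens that contain no letters or digits is an assumption based on how the default `unicode61` tokenizer treats punctuation. Each remaining token is wrapped in double quotes, with embedded quotes doubled, so FTS5 reads it as a string literal rather than as an operator or keyword.

```python
def sanitize_fts5_query(raw: str) -> str:
    """Turn free-form user input into a safe FTS5 MATCH expression.

    Hypothetical helper: returns an empty string when nothing searchable
    remains, so the caller can skip the FTS query and return no results.
    """
    quoted = []
    for token in raw.split():
        # Assumption: a token with no letters or digits (e.g. "?", "()")
        # would yield no searchable terms, so it is dropped.
        if not any(ch.isalnum() for ch in token):
            continue
        # Double any embedded quote, then wrap the token in quotes so that
        # AND / OR / NOT / NEAR and characters like * or - are literal text.
        quoted.append('"' + token.replace('"', '""') + '"')
    # Space-separated quoted strings are implicitly ANDed by FTS5.
    return " ".join(quoted)
```

A caller would treat an empty return value as "no valid terms" and return empty FTS results instead of running the `MATCH` query.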
## Capabilities
### New Capabilities
_(none)_
### Modified Capabilities
- `engine-api`: The "Hybrid search" requirement changes: the engine must sanitize user queries so that no input can trigger an FTS5 syntax error
## Impact
- **Engine search** (`engine/kb/search.py`): `_fts_search()` needs query sanitization before the MATCH parameter
- **No client changes**: The client already displays results or errors correctly
- **No schema changes**: No database modifications needed
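As a rough picture of how the engine-side change might look, the sketch below guards the `MATCH` call itself. It is an assumption-laden illustration rather than the real `_fts_search()`: the function name `fts_search_safe`, its signature, and the selected columns are hypothetical; only the `chunks_fts MATCH ?` fragment comes from the proposal. It expects an already-sanitized expression and falls back to an empty result list on an empty expression or an SQLite error.

```python
import sqlite3


def fts_search_safe(conn: sqlite3.Connection, match_expr: str, limit: int = 10) -> list:
    """Run an FTS5 MATCH query, returning [] instead of raising.

    Hypothetical stand-in for the engine's _fts_search(); match_expr is
    assumed to have been sanitized already.
    """
    # Graceful fallback: nothing searchable, so skip the query entirely.
    if not match_expr.strip():
        return []
    sql = (
        "SELECT rowid FROM chunks_fts "
        "WHERE chunks_fts MATCH ? ORDER BY rank LIMIT ?"
    )
    try:
        # Iterating inside the try also catches errors raised while stepping.
        return [row[0] for row in conn.execute(sql, (match_expr, limit))]
    except sqlite3.OperationalError:
        # Safety net for edge cases the sanitizer missed; a real
        # implementation would probably log the error here.
        return []
```

Because the query text is still bound as a parameter, the sanitizer only has to worry about FTS5 syntax, not SQL injection.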