kb/openspec/changes/archive/2026-03-26-upload-time-dedup-check/tasks.md
steve 6fec627503 Upload-time duplicate detection, FTS5 query sanitization, release guard
- Reject duplicate uploads at the API boundary (HTTP 409) instead of
  silently skipping in the background worker. Checks both ingested
  documents and in-flight jobs via content_hash on the jobs table.
- Go client handles 409 with distinct messages for already-imported
  documents vs already-queued jobs.
- Sanitize FTS5 search queries by quoting each token to prevent syntax
  errors from special characters like ?, *, ", (), AND, OR, NOT.
- Add try/except safety net around FTS5 execute for edge cases.
- Add main branch guard to release.sh to prevent releasing from
  feature branches.
- Update specs and README to reflect new behaviour.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-26 23:05:07 +00:00


1. Database Layer

  • 1.1 Add a get_document_by_hash(conn, content_hash) function to engine/kb/database.py that returns (document_id, title), or None if no document matches
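
A minimal sketch of what task 1.1 might look like, assuming the engine uses Python's standard sqlite3 module and that ingested documents live in a table named `documents` with `id`, `title`, and `content_hash` columns. The table and column names are hypothetical; only the function name, arguments, and return shape come from the task.

```python
import sqlite3
from typing import Optional, Tuple


def get_document_by_hash(
    conn: sqlite3.Connection, content_hash: str
) -> Optional[Tuple[int, str]]:
    """Return (document_id, title) for the given content hash, or None."""
    # Assumed schema: documents(id, title, content_hash). Adjust to the real one.
    row = conn.execute(
        "SELECT id, title FROM documents WHERE content_hash = ? LIMIT 1",
        (content_hash,),
    ).fetchone()
    if row is None:
        return None
    return (row[0], row[1])
```

If the real schema follows this shape, an index on the hash column would keep the lookup cheap as the document count grows.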

2. Upload Endpoint

  • 2.1 Update submit_job() in engine/kb/routes/jobs.py to compute the SHA-256 hash of the uploaded file bytes before staging
  • 2.2 Add duplicate check: call get_document_by_hash() and return HTTP 409 with {"error": "duplicate", "document_id": <id>, "title": "<title>"} if a match is found
  • 2.3 Apply the same hash check to note submissions (hash the UTF-8 encoded note text)
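
The hashing and duplicate check in tasks 2.1 to 2.3 can be sketched independently of the web framework. The helper names below (`content_hash`, `note_hash`, `duplicate_response`) and the `lookup` callable are hypothetical, and the hex-digest encoding is an assumption; only the SHA-256 hash, the UTF-8 encoding of notes, the 409 status, and the JSON body shape come from the tasks.

```python
import hashlib
from typing import Callable, Optional, Tuple

# Hypothetical lookup signature: maps a hash to (document_id, title) or None,
# e.g. a wrapper around get_document_by_hash(conn, content_hash).
Lookup = Callable[[str], Optional[Tuple[int, str]]]


def content_hash(data: bytes) -> str:
    """SHA-256 of the raw upload bytes, as a hex string (encoding assumed)."""
    return hashlib.sha256(data).hexdigest()


def note_hash(text: str) -> str:
    """Notes are hashed over their UTF-8 encoded text."""
    return content_hash(text.encode("utf-8"))


def duplicate_response(data: bytes, lookup: Lookup) -> Optional[Tuple[int, dict]]:
    """Return (409, body) if the content is already known, else None."""
    match = lookup(content_hash(data))
    if match is None:
        return None
    document_id, title = match
    return 409, {"error": "duplicate", "document_id": document_id, "title": title}
```

In a handler such as submit_job(), the check would run before staging the file, so a None result means the upload proceeds as usual.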

3. Go Client

  • 3.1 Update uploadFile() in client/cmd/add.go to handle HTTP 409 responses — parse the JSON body and print "Already imported: