b04823e67b
Persist uploaded files to {data_dir}/documents/{content_hash}{ext} after
successful ingestion. Add GET /documents/{id}/file endpoint for retrieval,
delete stored files on document deletion, and add `kb export` client command.
Includes schema migration, tests, and spec updates.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
1. Config and Schema
- 1.1 Add `documents_dir` property to `Config` in `engine/kb/config.py` returning `{data_dir}/documents`
- 1.2 Add `documents_dir.mkdir()` to `Config.ensure_dirs()`
- 1.3 Add `stored_path TEXT` and `original_filename TEXT` columns to `documents` table in `init_schema()` (both CREATE TABLE and ALTER TABLE migration for existing DBs)
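
A minimal sketch of tasks 1.1 to 1.3, assuming the database is SQLite accessed through the standard `sqlite3` module (the task list does not name the engine). The `Config` shown here is a stripped-down, hypothetical stand-in for the real class, and every `documents` column other than `stored_path` and `original_filename` is illustrative only.

```python
import sqlite3
from dataclasses import dataclass
from pathlib import Path


@dataclass
class Config:
    # Hypothetical minimal Config; the real class in engine/kb/config.py has more fields.
    data_dir: Path

    @property
    def documents_dir(self) -> Path:
        # 1.1: stored originals live under {data_dir}/documents
        return self.data_dir / "documents"

    def ensure_dirs(self) -> None:
        # 1.2: create the directory if it does not exist yet
        self.documents_dir.mkdir(parents=True, exist_ok=True)


def init_schema(conn: sqlite3.Connection) -> None:
    # Fresh databases get the new columns directly from CREATE TABLE.
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS documents (
            id INTEGER PRIMARY KEY,
            title TEXT,
            content_hash TEXT,
            stored_path TEXT,
            original_filename TEXT
        )
        """
    )
    # Existing databases are migrated with ALTER TABLE, skipping columns
    # that are already present so the function stays idempotent.
    existing = {row[1] for row in conn.execute("PRAGMA table_info(documents)")}
    for column in ("stored_path", "original_filename"):
        if column not in existing:
            conn.execute(f"ALTER TABLE documents ADD COLUMN {column} TEXT")
    conn.commit()
```

Checking `PRAGMA table_info` before each `ALTER TABLE` is one way to make the migration safe to run on every startup; a schema-version table would work as well.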
2. Worker — File Persistence
- 2.1 In `worker._process_job()`, after successful DB commit, move staged file to `{documents_dir}/{content_hash}{ext}` using `shutil.move()`
- 2.2 Update `documents.stored_path` and `documents.original_filename` (from `jobs.filename`) after moving the file
- 2.3 Remove `staging.cleanup()` call for successful jobs (file is moved, not deleted); keep cleanup on failure path
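
The persistence step in tasks 2.1 and 2.2 might look roughly like the helper below. `persist_staged_file` and its parameters are hypothetical names, the database is assumed to be SQLite, and taking `{ext}` from the original filename is an assumption (the worker may instead use the staged file's suffix).

```python
import shutil
import sqlite3
from pathlib import Path


def persist_staged_file(
    conn: sqlite3.Connection,
    doc_id: int,
    staged_path: Path,
    documents_dir: Path,
    content_hash: str,
    original_filename: str,
) -> Path:
    """Move a staged upload into documents_dir and record where it went."""
    # Assumption: the extension is taken from the original upload name.
    ext = Path(original_filename).suffix
    dest = documents_dir / f"{content_hash}{ext}"
    documents_dir.mkdir(parents=True, exist_ok=True)

    # 2.1: move rather than copy, so no staging cleanup is needed on success
    shutil.move(str(staged_path), str(dest))

    # 2.2: record the stored path and the original upload name
    conn.execute(
        "UPDATE documents SET stored_path = ?, original_filename = ? WHERE id = ?",
        (str(dest), original_filename, doc_id),
    )
    conn.commit()
    return dest
```

Because the destination name is derived from the content hash, re-ingesting identical bytes resolves to the same path instead of accumulating duplicates.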
3. API — File Download Endpoint
- 3.1 Add `GET /api/v1/documents/{id}/file` route in `engine/kb/routes/documents.py` using FastAPI `FileResponse`
- 3.2 Return appropriate `Content-Type` from file extension and `Content-Disposition: attachment; filename="{original_filename}"` (fall back to `{title}{ext}` if NULL)
- 3.3 Handle 404 cases: document not found, `stored_path` is NULL, file missing from disk
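
The lookup logic behind tasks 3.2 and 3.3 can be sketched without the web framework, which keeps it easy to unit test. `resolve_download` and `StoredFileNotFound` are hypothetical names, and the document row is assumed to be available as a plain `dict`.

```python
import mimetypes
from pathlib import Path
from typing import Optional, Tuple


class StoredFileNotFound(Exception):
    """Hypothetical error that the route would translate into HTTP 404."""


def resolve_download(row: Optional[dict]) -> Tuple[Path, str, str]:
    """Return (path, media_type, download_filename) or raise StoredFileNotFound."""
    # 3.3: the three 404 cases
    if row is None:
        raise StoredFileNotFound("document not found")
    if not row.get("stored_path"):
        raise StoredFileNotFound("document has no stored file")
    path = Path(row["stored_path"])
    if not path.is_file():
        raise StoredFileNotFound("stored file missing from disk")

    # 3.2: Content-Type from the extension, with a generic binary fallback
    media_type = mimetypes.guess_type(path.name)[0] or "application/octet-stream"
    # 3.2: fall back to {title}{ext} when original_filename is NULL
    filename = row.get("original_filename") or f"{row['title']}{path.suffix}"
    return path, media_type, filename
```

In the route itself, the result could be handed to `FileResponse(path, media_type=media_type, filename=filename)`; passing `filename` makes FastAPI emit the `Content-Disposition: attachment` header.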
4. API — Delete Cleanup
- 4.1 Update `DELETE /api/v1/documents/{id}` in `engine/kb/routes/documents.py` to also delete the stored file from disk
- 4.2 Handle missing file gracefully (log warning, don't fail the request)
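
One possible shape for the cleanup in tasks 4.1 and 4.2 is a small helper the delete route calls with the row's `stored_path`. The name `remove_stored_file` and the logger setup are assumptions.

```python
import logging
from pathlib import Path
from typing import Optional

logger = logging.getLogger(__name__)


def remove_stored_file(stored_path: Optional[str]) -> bool:
    """Delete the stored file; return True if a file was actually removed."""
    if not stored_path:
        # Documents ingested before this change have no stored file.
        return False
    try:
        Path(stored_path).unlink()
        return True
    except FileNotFoundError:
        # 4.2: a missing file is logged but must not fail the DELETE request
        logger.warning("Stored file already missing: %s", stored_path)
        return False
```

Catching `FileNotFoundError` instead of checking `exists()` first avoids a race between the check and the unlink.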
5. Document Details Enhancement
- 5.1 Add `has_file` boolean to `GET /api/v1/documents/{id}` response based on `stored_path` presence and file existence on disk
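
The `has_file` flag in task 5.1 reduces to a two-part check. A hedged sketch, with `has_file` as a hypothetical standalone helper:

```python
from pathlib import Path
from typing import Optional


def has_file(stored_path: Optional[str]) -> bool:
    # True only when a path is recorded and the file is still on disk.
    return bool(stored_path) and Path(stored_path).is_file()
```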
6. Go Client
- 6.1 Add `kb export <doc_id>` subcommand to the Go client that calls `GET /api/v1/documents/{id}/file` and writes to stdout or a specified output path
7. Testing
- 7.1 Test successful ingestion stores file at expected path
- 7.2 Test failed ingestion does not leave file in documents dir
- 7.3 Test file download endpoint returns correct content and headers
- 7.4 Test document deletion removes stored file
- 7.5 Test download returns 404 for documents without stored files