b04823e67b
Persist uploaded files to {data_dir}/documents/{content_hash}{ext} after
successful ingestion. Add GET /documents/{id}/file endpoint for retrieval,
delete stored files on document deletion, and add `kb export` client command.
Includes schema migration, tests, and spec updates.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
39 lines
2.1 KiB
Markdown
## 1. Config and Schema

- [x] 1.1 Add `documents_dir` property to `Config` in `engine/kb/config.py` returning `{data_dir}/documents`
- [x] 1.2 Add `documents_dir.mkdir()` to `Config.ensure_dirs()`
- [x] 1.3 Add `stored_path TEXT` and `original_filename TEXT` columns to `documents` table in `init_schema()` (both CREATE TABLE and ALTER TABLE migration for existing DBs)

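The config and schema tasks above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: it assumes the database is SQLite, that `Config` is a plain dataclass, and that `documents` has `id`, `title`, and `content_hash` columns. The `parents=True, exist_ok=True` arguments are likewise an assumption, added to keep `ensure_dirs()` idempotent.

```python
import sqlite3
from dataclasses import dataclass
from pathlib import Path


@dataclass
class Config:
    data_dir: Path

    @property
    def documents_dir(self) -> Path:
        # Task 1.1: permanent home for ingested files.
        return self.data_dir / "documents"

    def ensure_dirs(self) -> None:
        # Task 1.2: create the directory; safe to call repeatedly.
        self.documents_dir.mkdir(parents=True, exist_ok=True)


def init_schema(conn: sqlite3.Connection) -> None:
    # Fresh databases get the new columns straight from CREATE TABLE.
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS documents (
            id INTEGER PRIMARY KEY,
            title TEXT,
            content_hash TEXT,
            stored_path TEXT,
            original_filename TEXT
        )
        """
    )
    # Existing databases are migrated by adding whichever columns are missing.
    existing = {row[1] for row in conn.execute("PRAGMA table_info(documents)")}
    for column in ("stored_path", "original_filename"):
        if column not in existing:
            conn.execute(f"ALTER TABLE documents ADD COLUMN {column} TEXT")
    conn.commit()
```

Checking `PRAGMA table_info` before each `ALTER TABLE` lets `init_schema()` run on every startup without failing on a duplicate column.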
## 2. Worker — File Persistence

- [x] 2.1 In `worker._process_job()`, after successful DB commit, move staged file to `{documents_dir}/{content_hash}{ext}` using `shutil.move()`
- [x] 2.2 Update `documents.stored_path` and `documents.original_filename` (from `jobs.filename`) after moving the file
- [x] 2.3 Remove `staging.cleanup()` call for successful jobs (file is moved, not deleted); keep cleanup on failure path

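A rough sketch of the persistence step described above, under stated assumptions: `persist_staged_file` is a hypothetical helper (the real logic lives inside `worker._process_job()`), the database is SQLite, and the extension is taken from the uploaded filename.

```python
import shutil
import sqlite3
from pathlib import Path


def persist_staged_file(
    conn: sqlite3.Connection,
    doc_id: int,
    staged_path: Path,
    documents_dir: Path,
    content_hash: str,
    original_filename: str,
) -> Path:
    """Move a staged upload to permanent storage and record where it went."""
    # Assumption: the extension is derived from the uploaded filename.
    ext = Path(original_filename).suffix
    target = documents_dir / f"{content_hash}{ext}"
    documents_dir.mkdir(parents=True, exist_ok=True)
    # shutil.move falls back to copy-and-delete across filesystems.
    shutil.move(str(staged_path), str(target))
    # Task 2.2: record the final location and the name the user uploaded.
    conn.execute(
        "UPDATE documents SET stored_path = ?, original_filename = ? WHERE id = ?",
        (str(target), original_filename, doc_id),
    )
    conn.commit()
    return target
```

Because the file is moved rather than copied, nothing is left in staging after a successful job, which is why the success path no longer needs `staging.cleanup()`.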
## 3. API — File Download Endpoint

- [x] 3.1 Add `GET /api/v1/documents/{id}/file` route in `engine/kb/routes/documents.py` using FastAPI `FileResponse`
- [x] 3.2 Derive `Content-Type` from the file extension and set `Content-Disposition: attachment; filename="{original_filename}"` (fall back to `{title}{ext}` if `original_filename` is NULL)
- [x] 3.3 Handle 404 cases: document not found, `stored_path` is NULL, file missing from disk

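The header and 404 rules above can be sketched with the standard library alone. `resolve_download` and the `Download` tuple are hypothetical names, the row is assumed to be a dict with `stored_path`, `original_filename`, and `title` keys, and the `application/octet-stream` fallback is an assumption. The route itself would presumably hand the result to FastAPI's `FileResponse`, whose `media_type` and `filename` parameters cover both headers.

```python
import mimetypes
from pathlib import Path
from typing import NamedTuple, Optional


class Download(NamedTuple):
    path: Path
    media_type: str
    disposition: str


def resolve_download(row: Optional[dict]) -> Optional[Download]:
    """Return download details, or None when the route should answer 404."""
    # 404 cases: unknown document, no stored_path, or file gone from disk.
    if row is None or not row.get("stored_path"):
        return None
    path = Path(row["stored_path"])
    if not path.is_file():
        return None
    # Prefer the uploaded name; otherwise fall back to {title}{ext}.
    filename = row.get("original_filename") or f"{row['title']}{path.suffix}"
    # Assumption: unknown extensions are served as generic binary data.
    media_type = mimetypes.guess_type(path.name)[0] or "application/octet-stream"
    return Download(path, media_type, f'attachment; filename="{filename}"')
```

Folding all three 404 conditions into a single `None` return keeps the route handler to one check before it raises the HTTP error.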
## 4. API — Delete Cleanup

- [x] 4.1 Update `DELETE /api/v1/documents/{id}` in `engine/kb/routes/documents.py` to also delete the stored file from disk
- [x] 4.2 Handle missing file gracefully (log warning, don't fail the request)

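One way the tolerant cleanup above might look. `remove_stored_file` and the logger name are hypothetical, and the sketch only covers the missing-file case that task 4.2 calls out.

```python
import logging
from pathlib import Path
from typing import Optional

logger = logging.getLogger("kb.documents")  # hypothetical logger name


def remove_stored_file(stored_path: Optional[str]) -> bool:
    """Delete the stored file if present; return True when a file was removed."""
    if not stored_path:
        # Documents ingested before this feature have no stored file.
        return False
    try:
        Path(stored_path).unlink()
        return True
    except FileNotFoundError:
        # Task 4.2: a missing file is logged but never fails the DELETE request.
        logger.warning("stored file already missing: %s", stored_path)
        return False
```

Catching `FileNotFoundError` instead of checking for existence first avoids a race between the check and the unlink.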
## 5. Document Details Enhancement

- [x] 5.1 Add `has_file` boolean to `GET /api/v1/documents/{id}` response based on `stored_path` presence and file existence on disk

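The `has_file` flag combines two checks, which a small helper makes explicit. The function name and signature here are illustrative assumptions, not the project's actual code.

```python
from pathlib import Path
from typing import Optional


def has_file(stored_path: Optional[str]) -> bool:
    """True only when a path is recorded and the file still exists on disk."""
    # A NULL/empty stored_path short-circuits before touching the filesystem.
    return bool(stored_path) and Path(stored_path).is_file()
```

Checking the disk as well as the column means the flag stays accurate even if a stored file is removed outside the application.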
## 6. Go Client

- [x] 6.1 Add `kb export <doc_id>` subcommand to the Go client that calls `GET /api/v1/documents/{id}/file` and writes to stdout or a specified output path

## 7. Testing

- [x] 7.1 Test successful ingestion stores file at expected path
- [x] 7.2 Test failed ingestion does not leave file in documents dir
- [x] 7.3 Test file download endpoint returns correct content and headers
- [x] 7.4 Test document deletion removes stored file
- [x] 7.5 Test download returns 404 for documents without stored files