- Add Dockerfile.cpu and compose.cpu.yaml for CPU-only deployments
- Use sentence-transformers[onnx] with CPU-only torch for a ~4x smaller image
- Fix release script: separate git tags (engine-v*) from Docker tags (v*)
- Add CPU image to release build/push pipeline
- Update README with CPU deployment instructions
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Split release.sh into release-client.sh and release-engine.sh so the
client and engine can be released on independent cadences. The client
checks the engine version on its first API call and hard-fails if the
engine is below MinEngineVersion.
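
A minimal sketch of the version gate described above, assuming plain
`major.minor.patch` version strings with an optional leading `v`. Only
`MinEngineVersion` comes from the commit; its value here is a placeholder,
and `parseVersion` and `checkEngineVersion` are hypothetical helper names.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// MinEngineVersion is named in the commit; this value is a placeholder.
const MinEngineVersion = "1.2.0"

// parseVersion turns "v1.2.3" or "1.2.3" into [major, minor, patch].
func parseVersion(s string) ([3]int, error) {
	var out [3]int
	parts := strings.Split(strings.TrimPrefix(s, "v"), ".")
	if len(parts) != 3 {
		return out, fmt.Errorf("invalid version %q", s)
	}
	for i, p := range parts {
		n, err := strconv.Atoi(p)
		if err != nil || n < 0 {
			return out, fmt.Errorf("invalid version %q", s)
		}
		out[i] = n
	}
	return out, nil
}

// checkEngineVersion returns an error when engine is older than minimum.
// Components are compared numerically, so 1.10.0 is newer than 1.2.0.
func checkEngineVersion(engine, minimum string) error {
	e, err := parseVersion(engine)
	if err != nil {
		return err
	}
	m, err := parseVersion(minimum)
	if err != nil {
		return err
	}
	for i := range e {
		if e[i] != m[i] {
			if e[i] < m[i] {
				return fmt.Errorf("engine %s is below minimum %s", engine, minimum)
			}
			return nil
		}
	}
	return nil // versions are equal
}

func main() {
	// A caller would run this once, before its first API call, and abort on error.
	if err := checkEngineVersion("1.1.9", MinEngineVersion); err != nil {
		fmt.Println("hard fail:", err)
	}
}
```

Numeric comparison per component is used rather than string comparison,
since a lexical compare would rank "1.10.0" below "1.2.0".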
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>