Agent calls restic snapshots --json after each successful backup
(60s timeout, separate from the backup ctx) and ships the projection
over the existing snapshots.report WS envelope. Failure here is
logged but doesn't fail the job — the next successful backup catches
the projection up.
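
A minimal sketch of the best-effort refresh described above, assuming the
agent shells out to the restic binary and that the repository and password
come from the environment. The names listSnapshots, refreshAfterBackup and
the report callback are hypothetical, not the agent's actual identifiers.

```go
package main

import (
	"context"
	"log"
	"os/exec"
	"time"
)

// snapshotListTimeout mirrors the 60s budget named in the commit message.
const snapshotListTimeout = 60 * time.Second

// listSnapshots runs the command under its own deadline. The context is
// derived from context.Background(), not from the backup context, so a
// cancelled or expired backup context cannot cut the listing short.
func listSnapshots(timeout time.Duration, bin string, args ...string) ([]byte, error) {
	ctx, cancel := context.WithTimeout(context.Background(), timeout)
	defer cancel()
	return exec.CommandContext(ctx, bin, args...).Output()
}

// refreshAfterBackup is best-effort: any failure is logged and reported as
// false, and the caller leaves the backup job marked successful either way.
func refreshAfterBackup(bin string, report func([]byte) error) bool {
	out, err := listSnapshots(snapshotListTimeout, bin, "snapshots", "--json")
	if err != nil {
		log.Printf("snapshot refresh failed (job result unchanged): %v", err)
		return false
	}
	if err := report(out); err != nil {
		log.Printf("snapshot report failed (job result unchanged): %v", err)
		return false
	}
	return true
}
```

In this sketch the runner would call refreshAfterBackup("restic", send)
after a successful backup and ignore the boolean for job status purposes.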
Server-side ReplaceHostSnapshots is delete-then-insert plus a
hosts.snapshot_count update in one transaction so the dashboard's
per-host count stays consistent with the projection. New read
endpoint GET /api/hosts/{id}/snapshots returns the cached list with
a refreshed_at marker so the UI can show staleness when an agent
has been offline.
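
The delete-then-insert replacement can be sketched as below. Table and
column names, the Snapshot fields, and the replacePlan helper are
assumptions made for illustration; only ReplaceHostSnapshots and
hosts.snapshot_count are named in the commit itself.

```go
package main

import (
	"context"
	"database/sql"
)

// Snapshot is a hypothetical, trimmed-down projection row.
type Snapshot struct {
	ID      string
	ShortID string
	Time    string // RFC 3339 timestamp
}

type stmt struct {
	query string
	args  []any
}

// replacePlan builds the statements in order: wipe the host's rows, insert
// the fresh list, then set the cached count to the size of that list.
func replacePlan(hostID string, snaps []Snapshot, refreshedAt string) []stmt {
	plan := []stmt{{query: "DELETE FROM snapshots WHERE host_id = ?", args: []any{hostID}}}
	for _, s := range snaps {
		plan = append(plan, stmt{
			query: "INSERT INTO snapshots (id, host_id, short_id, time, refreshed_at) VALUES (?, ?, ?, ?, ?)",
			args:  []any{s.ID, hostID, s.ShortID, s.Time, refreshedAt},
		})
	}
	plan = append(plan, stmt{
		query: "UPDATE hosts SET snapshot_count = ? WHERE id = ?",
		args:  []any{len(snaps), hostID},
	})
	return plan
}

// ReplaceHostSnapshots applies the plan in a single transaction, so readers
// never observe a count that disagrees with the rows.
func ReplaceHostSnapshots(ctx context.Context, db *sql.DB, hostID string, snaps []Snapshot, refreshedAt string) error {
	tx, err := db.BeginTx(ctx, nil)
	if err != nil {
		return err
	}
	defer tx.Rollback() // returns sql.ErrTxDone after a successful Commit; ignored

	for _, s := range replacePlan(hostID, snaps, refreshedAt) {
		if _, err := tx.ExecContext(ctx, s.query, s.args...); err != nil {
			return err
		}
	}
	return tx.Commit()
}
```

An empty list still yields the DELETE and the count update, which is what
makes the replacement authoritative after forget+prune or a repository wipe.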
Schema: dropped the unused snapshots.repo_id FK (repos as a
first-class entity is P2 work), added short_id and refreshed_at
columns, switched the time index to DESC for the most-recent-first
list query. api.Snapshot gains short_id; size_bytes/file_count come
from the embedded summary block on restic 0.17+ and stay zero on
older clients.
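
A sketch of how the projection could be derived from the restic JSON,
assuming size_bytes and file_count map to the summary's
total_bytes_processed and total_files_processed fields (that mapping is a
guess, as are the Go type names). Snapshots without a summary block keep
zero values.

```go
package main

import (
	"encoding/json"
	"time"
)

// resticSnapshot picks out the fields of interest from one element of the
// `restic snapshots --json` array. Summary is a pointer so its absence in
// older snapshots is distinguishable from a present-but-zero block.
type resticSnapshot struct {
	ID      string    `json:"id"`
	ShortID string    `json:"short_id"`
	Time    time.Time `json:"time"`
	Summary *struct {
		TotalBytesProcessed uint64 `json:"total_bytes_processed"`
		TotalFilesProcessed uint64 `json:"total_files_processed"`
	} `json:"summary"`
}

// Snapshot is a hypothetical stand-in for api.Snapshot.
type Snapshot struct {
	ID        string    `json:"id"`
	ShortID   string    `json:"short_id"`
	Time      time.Time `json:"time"`
	SizeBytes uint64    `json:"size_bytes"`
	FileCount uint64    `json:"file_count"`
}

// project converts raw restic output into the wire projection.
func project(raw []byte) ([]Snapshot, error) {
	var in []resticSnapshot
	if err := json.Unmarshal(raw, &in); err != nil {
		return nil, err
	}
	out := make([]Snapshot, 0, len(in))
	for _, r := range in {
		s := Snapshot{ID: r.ID, ShortID: r.ShortID, Time: r.Time}
		if r.Summary != nil {
			s.SizeBytes = r.Summary.TotalBytesProcessed
			s.FileCount = r.Summary.TotalFilesProcessed
		}
		out = append(out, s)
	}
	return out, nil
}
```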
Tests cover round-trip, authoritative replacement after forget+prune
shrinkage, and empty-after-wipe.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Captures the state landed in this session:
Done (P1-01..03, P1-05, P1-06, P1-08..16, P1-17..20, P1-29):
HTTP server, store + schema, crypto, first-run bootstrap,
every API type with wire-shape tests, WS transport,
enrollment + hello + heartbeat round-trip, agent config +
service unit + WS client + sysinfo, restic wrapper, job
lifecycle store + run-now endpoint, agent runner.
Partial (P1-04, P1-07, P1-21, P1-31):
CSRF middleware lands with the UI work; the audit middleware
sweep lands with the rest of the API; live job-log fan-out needs
the per-job browser hub; signed agent binaries are deferred to
Phase 5.
Open (P1-22..28):
Snapshot listing, full UI suite (login, dashboard, host
detail, live job log, add-host, Tailwind build).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Lands the bottom three layers of Phase 1:
P1-08 internal/api: protocol_version + envelope + every WS message
shape from spec.md §6.2 (Hello, Heartbeat, Job*, Schedule*, etc).
Wire-format tests pin the JSON shape so a rename here breaks
tests instead of silently breaking the agent.
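
A sketch of what pinning the wire format can look like. Only the
protocol_version field name comes from the commit; the other field names,
the "heartbeat" type string, and the wrap helper are hypothetical.

```go
package main

import "encoding/json"

// ProtocolVersion is an illustrative value, not the project's real one.
const ProtocolVersion = 1

// Envelope wraps every WS message. The payload stays raw so the receiver
// can dispatch on Type before decoding the body.
type Envelope struct {
	ProtocolVersion int             `json:"protocol_version"`
	Type            string          `json:"type"`
	Payload         json.RawMessage `json:"payload,omitempty"`
}

// Heartbeat is a hypothetical payload shape.
type Heartbeat struct {
	AgentVersion string `json:"agent_version"`
}

// wrap marshals a payload and places it inside a versioned envelope.
func wrap(msgType string, payload any) ([]byte, error) {
	body, err := json.Marshal(payload)
	if err != nil {
		return nil, err
	}
	return json.Marshal(Envelope{
		ProtocolVersion: ProtocolVersion,
		Type:            msgType,
		Payload:         body,
	})
}
```

A wire-shape test then compares the marshalled bytes against a literal
golden string, so renaming a struct tag fails the test immediately.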
P1-02 + P1-03 internal/store: SQLite via modernc.org/sqlite,
embed.FS + a tiny version table for hand-rolled migrations.
0001_initial.sql covers every table from spec.md §5 plus
enrollment_tokens and host_schedule_version. Typed accessors
for users / sessions / enrollment / audit. WAL + foreign_keys
+ busy_timeout on by default.
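
The hand-rolled migration selection could look roughly like this, assuming
file names of the form NNNN_name.sql as in 0001_initial.sql. The
pendingMigrations helper is hypothetical; in the real store the names would
come from reading the embed.FS directory and the current version from the
version table.

```go
package main

import (
	"fmt"
	"sort"
	"strconv"
	"strings"
)

// pendingMigrations returns the migration files whose numeric prefix is
// greater than the currently applied version, in ascending version order.
func pendingMigrations(names []string, current int) ([]string, error) {
	type migration struct {
		version int
		name    string
	}
	var pending []migration
	for _, n := range names {
		prefix, _, ok := strings.Cut(n, "_")
		if !ok || !strings.HasSuffix(n, ".sql") {
			return nil, fmt.Errorf("unexpected migration file name %q", n)
		}
		v, err := strconv.Atoi(prefix)
		if err != nil {
			return nil, fmt.Errorf("bad version prefix in %q: %w", n, err)
		}
		if v > current {
			pending = append(pending, migration{version: v, name: n})
		}
	}
	sort.Slice(pending, func(i, j int) bool { return pending[i].version < pending[j].version })

	out := make([]string, len(pending))
	for i, m := range pending {
		out[i] = m.name
	}
	return out, nil
}
```

Each returned file would then be executed in its own transaction together
with the version-table update, so a failed migration leaves the recorded
version untouched.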
P1-06 internal/crypto: XChaCha20-Poly1305 AEAD wrapper with
per-message random nonce. Key file lifecycle (generate +
refuse-to-overwrite, load with size validation). Optional
additionalData binds ciphertext to the row that owns it.
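
A sketch of the nonce-prefixed seal/open pattern with additionalData
binding. To stay within the standard library it uses AES-256-GCM as a
stand-in; this is not the cipher the commit uses. The package described
above uses XChaCha20-Poly1305, whose constructor in
golang.org/x/crypto/chacha20poly1305 also returns a cipher.AEAD, so seal and
open would work unchanged with it. All function names here are hypothetical.

```go
package main

import (
	"crypto/aes"
	"crypto/cipher"
	"crypto/rand"
	"errors"
)

// newStandInAEAD builds AES-256-GCM purely for illustration. Its 12-byte
// nonce is less comfortable for random nonces at scale than XChaCha20's
// 24-byte nonce.
func newStandInAEAD(key []byte) (cipher.AEAD, error) {
	block, err := aes.NewCipher(key)
	if err != nil {
		return nil, err
	}
	return cipher.NewGCM(block)
}

// seal draws a fresh random nonce per message and returns nonce||ciphertext.
func seal(aead cipher.AEAD, plaintext, additionalData []byte) ([]byte, error) {
	nonce := make([]byte, aead.NonceSize(), aead.NonceSize()+len(plaintext)+aead.Overhead())
	if _, err := rand.Read(nonce); err != nil {
		return nil, err
	}
	return aead.Seal(nonce, nonce, plaintext, additionalData), nil
}

// open splits off the nonce and authenticates both the ciphertext and the
// additionalData. A blob moved to a different row fails to decrypt.
func open(aead cipher.AEAD, blob, additionalData []byte) ([]byte, error) {
	if len(blob) < aead.NonceSize() {
		return nil, errors.New("ciphertext too short")
	}
	nonce, ct := blob[:aead.NonceSize()], blob[aead.NonceSize():]
	return aead.Open(nil, nonce, ct, additionalData)
}
```

Passing something like a table name plus row ID as additionalData is one
way to bind a ciphertext to its owning row; the exact binding string is a
design choice the commit does not spell out.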
P1-04 internal/auth (partial — passwords + tokens; sessions
middleware lands with the HTTP handlers): argon2id following
RFC 9106 (64 MiB / t=3 / p=4 / 32B), constant-time verify.
HashToken stores SHA-256 of session/agent/enrollment tokens
so a stolen DB doesn't hand over credentials.
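
The token half can be sketched with the standard library alone. HashToken is
named in the commit; its signature, the hex encoding, and the VerifyToken
helper are assumptions.

```go
package main

import (
	"crypto/sha256"
	"crypto/subtle"
	"encoding/hex"
)

// HashToken returns the hex-encoded SHA-256 of a token. Only this digest is
// stored, so reading the database does not reveal a usable credential.
// A plain fast hash is acceptable here only because tokens are long random
// strings. Passwords go through argon2id instead.
func HashToken(token string) string {
	sum := sha256.Sum256([]byte(token))
	return hex.EncodeToString(sum[:])
}

// VerifyToken hashes the presented token and compares it to the stored
// digest in constant time.
func VerifyToken(presented, storedHash string) bool {
	got := HashToken(presented)
	return subtle.ConstantTimeCompare([]byte(got), []byte(storedHash)) == 1
}
```

For the password half, the listed parameters would correspond to a call
like argon2.IDKey(password, salt, 3, 64*1024, 4, 32) in
golang.org/x/crypto/argon2, where memory is expressed in KiB.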
Build floor moves to Go 1.25 (modernc.org/sqlite v1.50+ requires
it); CI + Dockerfile + README updated. Markdown lint diagnostics
on tasks.md cleared.
All packages tested. ~70 new tests pass in <1s.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Doc-only changes captured before any Phase 1 code lands.
spec.md:
- §4.1 nhooyr.io/websocket → github.com/coder/websocket (the
maintained fork; the original is unmaintained)
- §4.1 RM_LISTEN documented as source of truth for the bind port;
add RM_TRUSTED_PROXY env var for X-Forwarded-* handling behind
Caddy/Traefik
- §4.2 Phase 1 ships Linux only; Windows binaries continue to build
in CI to keep the codebase portable, but service integration +
installer move to Phase 2
- §4.2 self-update via apt/choco, not bespoke signed binaries
- §5 add Host.protocol_version + Host.applied_schedule_version
- §6.2 lock protocol_version handshake semantics (clean error on
mismatch, not weird JSON parse failures)
- §6.2 schedule reconciliation when server unreachable: agent keeps
firing last-known-good indefinitely; server's view canonical on
reconnect; UI surfaces drift via applied_schedule_version
- §6.2 schedule.set carries schedule_version; new schedule.ack
agent→server message
- §10.1 cross-reference RM_LISTEN ↔ compose port mapping
- §14.3 hooks rejected at validation on non-backup schedule kinds
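
The §6.2 items on the handshake and schedule versioning can be sketched as
below. The exact-match version rule, the version constant, and all type and
method names are assumptions for illustration; the spec text above only
requires a clean error on mismatch and drift visibility via
applied_schedule_version.

```go
package main

import "fmt"

// ServerProtocolVersion is an illustrative value.
const ServerProtocolVersion = 1

// Hello is a hypothetical, trimmed-down first message from the agent.
type Hello struct {
	ProtocolVersion int `json:"protocol_version"`
}

// checkHello rejects a mismatched agent with an explicit error before any
// other message is decoded, instead of failing later on an unknown field.
func checkHello(h Hello) error {
	if h.ProtocolVersion != ServerProtocolVersion {
		return fmt.Errorf("protocol_version mismatch: agent speaks %d, server speaks %d",
			h.ProtocolVersion, ServerProtocolVersion)
	}
	return nil
}

// Host tracks the server's canonical schedule version and the last version
// the agent acknowledged via schedule.ack.
type Host struct {
	ScheduleVersion        int
	AppliedScheduleVersion int
}

// Ack records a schedule.ack. Acks older than what is already recorded are
// ignored so a delayed message cannot move the host backwards.
func (h *Host) Ack(version int) {
	if version > h.AppliedScheduleVersion {
		h.AppliedScheduleVersion = version
	}
}

// Drifted reports whether the agent is still running an older schedule,
// which is what the UI would surface.
func (h Host) Drifted() bool {
	return h.AppliedScheduleVersion != h.ScheduleVersion
}
```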
tasks.md:
- P1-14 / P1-30 (Windows service + install.ps1) → Phase 2 as
P2-16 / P2-17
- P1-29 install.sh detects existing restic timers/cron and prints
disable commands, doesn't auto-disable
- Phase 1 acceptance: drop Windows from end-to-end criterion,
require windows cross-compile in CI
- P4-01 rewritten: package-manager-based update delivery
- P5-08 removed (duplicate of P4-08 Prometheus /metrics)
- Various references updated
No Go code changes; build still clean.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>