tasks: rewrite P6-01/02 around server-bundled agent self-update

The original plan was apt repo + Chocolatey package. The P5-03 Docker
pivot bundled matching agent binaries into the server image and
exposes them via /agent/binary, so 'update agent' now collapses to
're-fetch from your own server'. No third-party packaging or signing
infra needed. P6-01 drops to S; P6-02 keeps the dashboard reporting
+ fleet-update UX but points at the new mechanism.
2026-05-06 21:08:22 +01:00
parent c32acc0332
commit c80ca90efb
+3 -2
@@ -344,8 +344,8 @@ Sizes: **S** = under a day, **M** = 1–3 days, **L** = 3–7 days.
> Deferred from Phase 4 on 2026-05-05 — operator-experience polish that doesn't gate a working v1.
- [ ] **P6-01** (M) Update delivery via OS package managers — host an apt repo (Linux) and Chocolatey package (Windows) on gitea releases. `restic-manager-agent update` is a thin wrapper over `apt-get install --only-upgrade restic-manager-agent` / `choco upgrade`. Trades flexibility for a much smaller security surface than bespoke signed binaries (see spec.md §4.2). _(Was P4-01.)_
- [ ] **P6-02** (M) Agent version reporting on dashboard: surface "agent N versions behind server"; "update all" admin action calls the package-manager wrapper on each host. _(Was P4-02.)_
- [ ] **P6-01** (S) Agent self-update from the server's bundled binaries. P5-03 already bakes matching `agent-{linux-amd64,linux-arm64,windows-amd64}` into the server image under `/opt/restic-manager/dist/`, served by `/agent/binary`. Add a `restic-manager-agent update` subcommand (and a server-dispatched `command.update` WS envelope) that fetches `$RM_SERVER/agent/binary?os=…&arch=…`, verifies sha256 against a digest the server advertises alongside the binary, atomic-renames over the running binary (`tmp+fsync+rename`), and asks the service manager to restart (`systemctl restart` on Linux, SCM restart on Windows). Version pinning is automatic — the server only ever serves the agent that matches its own release. No apt repo, no Chocolatey, no third-party signing infra. _(Was P4-01; original apt/choco plan dropped after the P5-03 Docker pivot made the server the natural distribution point.)_
- [ ] **P6-02** (M) Agent version reporting + fleet update on dashboard. Server already knows its own build version and each agent's `agent_version` from the WS hello. Surface "N hosts behind" on the dashboard, a per-host "out of date" chip, and an admin-only **Update all** action that fans out `command.update` to every online host (offline hosts queue via `pending_runs`-style retry on reconnect). Per-host **Update** button on host detail for one-shot upgrades. Audit-logged. _(Was P4-02.)_
- [ ] **P6-03** (M) Repo size trend graphs (sparkline on host card, full chart on repo page). _(Was P4-06.)_
- [ ] **P6-04** (M) Prometheus `/metrics` endpoint: per-host gauges (last backup timestamp, last backup status, repo size, snapshot count, agent online), server gauges (active alerts, build info), job duration histograms; protected by bearer token or IP allow-list. _(Was P4-08.)_
- [ ] **P6-05** (S) Document Prometheus integration + sample Grafana dashboard JSON. _(Was P4-09.)_
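
The verify-then-swap step described in the new P6-01 entry can be sketched as below. This is a minimal illustration under stated assumptions, not the project's implementation: `verifyAndReplace`, `digestOf`, and `runDemo` are hypothetical names, the download is replaced by an in-memory reader, and the rename-over-target behaviour is the POSIX one (replacing a running executable on Windows generally needs a different approach). The service restart is left out.

```go
package main

import (
	"bytes"
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"io"
	"os"
	"path/filepath"
)

// digestOf returns the lowercase hex sha256 of b (hypothetical helper).
func digestOf(b []byte) string {
	sum := sha256.Sum256(b)
	return hex.EncodeToString(sum[:])
}

// verifyAndReplace streams body into a temp file beside target and renames
// it over target only if its sha256 equals wantHex (tmp + fsync + rename).
func verifyAndReplace(target string, body io.Reader, wantHex string) error {
	// Same directory as target so the rename stays on one filesystem.
	tmp, err := os.CreateTemp(filepath.Dir(target), ".agent-update-*")
	if err != nil {
		return err
	}
	defer os.Remove(tmp.Name()) // cleans up on failure; harmless after rename
	defer tmp.Close()           // a second Close just returns an ignored error

	h := sha256.New()
	if _, err := io.Copy(io.MultiWriter(tmp, h), body); err != nil {
		return err
	}
	if got := hex.EncodeToString(h.Sum(nil)); got != wantHex {
		return fmt.Errorf("sha256 mismatch: got %s, want %s", got, wantHex)
	}
	if err := tmp.Chmod(0o755); err != nil {
		return err
	}
	if err := tmp.Sync(); err != nil { // flush data before it becomes visible
		return err
	}
	if err := tmp.Close(); err != nil {
		return err
	}
	return os.Rename(tmp.Name(), target)
}

// runDemo installs payload over a throwaway "old" binary and returns the
// file's contents afterwards, so the caller can see whether the swap happened.
func runDemo(payload []byte, wantHex string) (string, error) {
	dir, err := os.MkdirTemp("", "rm-agent-demo")
	if err != nil {
		return "", err
	}
	defer os.RemoveAll(dir)

	target := filepath.Join(dir, "restic-manager-agent")
	if err := os.WriteFile(target, []byte("old-agent"), 0o755); err != nil {
		return "", err
	}
	updateErr := verifyAndReplace(target, bytes.NewReader(payload), wantHex)
	content, err := os.ReadFile(target)
	if err != nil {
		return "", err
	}
	return string(content), updateErr
}

func main() {
	payload := []byte("new-agent")
	content, err := runDemo(payload, digestOf(payload))
	fmt.Println(content, err)
}
```

Hashing while streaming avoids a second read of the file, and a digest mismatch leaves the existing binary untouched because the rename never runs.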
@@ -385,4 +385,5 @@ Sizes: **S** = under a day, **M** = 1–3 days, **L** = 3–7 days.
> security review finding, a real disaster-recovery exercise) bumps them
> back into a phase.
- [ ] **F-02** API tokens (PATs) for automation. Today the only way to drive `/api/*` from a tool is to log in as a real user and reuse the `rm_session` cookie — fine for a single automation account, but bearer-equivalent for the 24h session TTL and not revocable per-tool. Build a proper personal-access-token feature: new `personal_access_tokens` table (id, user_id, sha256 hash, name, optional role cap, created_at, last_used_at, revoked_at), a `/settings/tokens` UI to mint/list/revoke, and a branch in `requireUser` that accepts `Authorization: Bearer …` and falls back to the cookie. Reuse `auth.NewToken()` / `auth.HashToken()` (same primitives used for agent bearers). Audit each mint/revoke. Trigger to promote: second automation consumer, or any external integration request.
- [ ] **F-01** ~~P3-04~~ Cross-host restore. De-scoped from Phase 3 on 2026-05-04. Disaster recovery is already covered: stand up a replacement host, paste the original repo creds at enrolment, snapshots reappear, restore is same-host. The remaining "pull a file from host A onto host C without granting C permanent access" use case is genuinely different (file sharing / migration, not DR) and hasn't been requested. Original spec language was: "target agent receives a temporary scoped read credential for source host's repo (single-job, auto-revoked); UI supports source→target path remapping; warns when source paths need root and target service user is non-root". Re-promote when there's a real ask.
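
The bearer-or-cookie branch that F-02 proposes for `requireUser` might look roughly like the sketch below. Everything named here is an assumption for illustration: `hashToken`, `bearerToken`, `credentialKind`, and `newRequest` are hypothetical helpers (not the project's `auth.HashToken()`), the `/api/hosts` path is made up, and a single stored hash stands in for a lookup in `personal_access_tokens`. One design choice is also assumed: a request that presents an invalid bearer token is rejected outright and does not fall back to the cookie.

```go
package main

import (
	"crypto/sha256"
	"crypto/subtle"
	"encoding/hex"
	"fmt"
	"net/http"
	"strings"
)

// hashToken returns the hex sha256 of a token; only the hash would be stored.
func hashToken(tok string) string {
	sum := sha256.Sum256([]byte(tok))
	return hex.EncodeToString(sum[:])
}

// bearerToken extracts the credential from "Authorization: Bearer <token>".
func bearerToken(r *http.Request) (string, bool) {
	const prefix = "Bearer "
	h := r.Header.Get("Authorization")
	if len(h) <= len(prefix) || !strings.EqualFold(h[:len(prefix)], prefix) {
		return "", false
	}
	return strings.TrimSpace(h[len(prefix):]), true
}

// credentialKind reports which credential a request would authenticate with.
// storedHash stands in for a database lookup of non-revoked token hashes.
func credentialKind(r *http.Request, storedHash string) string {
	if tok, ok := bearerToken(r); ok {
		// Constant-time comparison of the two hex digests.
		if subtle.ConstantTimeCompare([]byte(hashToken(tok)), []byte(storedHash)) == 1 {
			return "token"
		}
		return "rejected" // assumed: a bad bearer never falls back to the cookie
	}
	if _, err := r.Cookie("rm_session"); err == nil {
		return "cookie" // the session itself would still need validating
	}
	return "anonymous"
}

// newRequest builds a request with an optional Authorization header and
// rm_session cookie, for demonstration only.
func newRequest(authHeader, sessionCookie string) *http.Request {
	r, err := http.NewRequest(http.MethodGet, "/api/hosts", nil)
	if err != nil {
		panic(err)
	}
	if authHeader != "" {
		r.Header.Set("Authorization", authHeader)
	}
	if sessionCookie != "" {
		r.AddCookie(&http.Cookie{Name: "rm_session", Value: sessionCookie})
	}
	return r
}

func main() {
	stored := hashToken("example-token")
	fmt.Println(credentialKind(newRequest("Bearer example-token", ""), stored))
	fmt.Println(credentialKind(newRequest("Bearer wrong", "sess"), stored))
	fmt.Println(credentialKind(newRequest("", "sess"), stored))
	fmt.Println(credentialKind(newRequest("", ""), stored))
}
```

Storing only the sha256 hash means a leaked table does not expose usable tokens, and per-token rows are what make per-tool revocation possible.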