restic-manager/docs/book/src/operations/updates.md
steve 89537d417a P5: OSS readiness — docs site, contributor onboarding, e2e harness
P5-01 — Documentation site under docs/book/ rendered with mdBook
(downloaded via Makefile, same static-binary pattern as Tailwind).
Structured chapters: getting started, concepts, operations,
security, reference. `make docs` / `make docs-watch`. Generated
output gitignored.

P5-02 — CONTRIBUTING.md rewritten from placeholder to a full
guide. CODE_OF_CONDUCT.md adapted from Contributor Covenant for a
single-maintainer project. .gitea/issue_template/{bug,feature}.md
and PULL_REQUEST_TEMPLATE.md.

P5-04 — Six README screenshots captured live from a fresh server
bootstrap (login, empty dashboard, add-host, alerts, settings,
audit log). README rewritten to centre the screenshot grid and
link out to the docs site.

P5-05 — SECURITY.md with disclosure policy (3-day ack, 30-day
default window), scope in/out, threat-model summary, operator
hardening checklist. Mirrored as a docs-site chapter.

P5-06 — End-to-end test harness. e2e/compose.e2e.yml brings up
server + sibling Linux agent (alpine + restic) + restic/rest-server.
Agent uses announce-and-approve so Playwright can drive the full
operator flow: bootstrap → login → accept pending → backup →
verify terminal status. Second spec scrapes /metrics to assert
the P6-04 endpoint surface. .gitea/workflows/e2e.yml runs on every
PR; local how-to in docs/e2e.md.
2026-05-08 20:08:23 +01:00


Updating agents

Server updates are a docker compose pull && docker compose up -d away. Agents update via the control plane.

Single-host update

Each host's detail page shows an Update agent button when the agent's reported version is older than the server's. Clicking it starts the following sequence:

  1. The server dispatches a command.update to that host.
  2. The agent fetches the appropriate binary from $RM_SERVER/agent/binary?os=…&arch=… to <binary-path>.new.
  3. The agent copies the running binary to <binary-path>.old (one revision back, in case rollback is needed).
  4. The agent atomically renames <binary-path>.new over the running binary.
  5. The agent exits cleanly. systemd's Restart=always (or the Windows SCM) brings the process back on the new binary.
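The stage, back up, and swap steps above can be sketched in a few lines. This is an illustrative Python sketch, not the agent's actual code: the function name apply_update is hypothetical, the download is assumed to have already produced the new binary's bytes, and the swap relies on POSIX rename semantics (replacing a running executable on Windows generally needs a different approach).

```python
import os
import shutil


def apply_update(binary_path: str, new_binary: bytes) -> None:
    """Stage new_binary beside binary_path, keep one backup, then swap.

    Hypothetical helper: the real agent's function names and error
    handling are not documented here.
    """
    new_path = binary_path + ".new"
    old_path = binary_path + ".old"

    # Write the downloaded bytes to <binary-path>.new and flush them to disk
    # so a crash cannot leave a half-written file that later gets renamed.
    with open(new_path, "wb") as f:
        f.write(new_binary)
        f.flush()
        os.fsync(f.fileno())

    # Carry over the permission bits (including the executable bit).
    shutil.copymode(binary_path, new_path)

    # Keep exactly one revision back in case a rollback is needed.
    shutil.copy2(binary_path, old_path)

    # os.replace renames atomically on POSIX when both paths are on the
    # same filesystem, and overwrites the destination if it exists.
    os.replace(new_path, binary_path)

    # The caller would now exit cleanly and let the service manager
    # restart the process on the new binary.
```

Staging to a sibling path keeps the rename on one filesystem, which is what makes the final swap atomic: any process that opens the binary sees either the old file or the new one, never a partial write.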

On the server side, a 90-second timer waits for a hello at the target version. If one arrives in time, the server marks the update succeeded. If the agent doesn't reconnect at the expected version before the timer expires, the server marks the update failed and raises an update_failed alert.
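The success-or-timeout decision can be illustrated with a small sketch. Everything here is an assumption for illustration: the function name await_target_version, the reported_version callback, and the polling loop, which stands in for whatever event the server actually reacts to when a hello arrives. The clock and sleep parameters are injectable only so the logic can be exercised without waiting 90 real seconds.

```python
import time
from typing import Callable, Optional


def await_target_version(
    reported_version: Callable[[], Optional[str]],
    target: str,
    timeout_s: float = 90.0,
    poll_interval_s: float = 1.0,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Return "succeeded" if the agent reports `target` before the deadline,
    otherwise "failed". Hypothetical helper, not the server's real API."""
    deadline = clock() + timeout_s
    while True:
        # reported_version() returns the version from the agent's latest
        # hello, or None if it has not reconnected yet.
        if reported_version() == target:
            return "succeeded"
        if clock() >= deadline:
            # The caller would raise the update_failed alert here.
            return "failed"
        sleep(poll_interval_s)
```

Note that an agent reconnecting at the old version does not count as success in this sketch; only a hello at the target version does, which matches the behaviour described above.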

Fleet update

The admin-only Settings → Fleet update page drives a rolling update across every host in the fleet:

  • One host at a time.
  • Wait for hello-with-target-version (max 95s).
  • If any host fails, halt the rollout, raise a fleet_update_halted alert, and leave the rest of the fleet on the old version. No surprise mass failures.

You can cancel an in-progress fleet update; the worker stops after the current host finishes.
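The rollout rules (one host at a time, halt on the first failure, cancel only between hosts) fit in a short loop. This is a sketch under assumptions, not the worker's real code: rolling_update, update_host, and cancel_requested are hypothetical names, and update_host is assumed to block until the single-host update has either succeeded or timed out.

```python
from typing import Callable, Iterable, List, Tuple


def rolling_update(
    hosts: Iterable[str],
    update_host: Callable[[str], bool],
    cancel_requested: Callable[[], bool] = lambda: False,
) -> Tuple[str, List[str]]:
    """Update hosts sequentially and report how the rollout ended.

    Returns (status, updated_hosts), where status is "completed",
    "halted", or "cancelled". Hypothetical helper for illustration.
    """
    updated: List[str] = []
    for host in hosts:
        # Cancellation is only checked between hosts, so the host that is
        # currently updating always runs to completion.
        if cancel_requested():
            return "cancelled", updated
        if not update_host(host):
            # First failure stops the rollout; remaining hosts stay on the
            # old version. The caller would raise fleet_update_halted here.
            return "halted", updated
        updated.append(host)
    return "completed", updated
```

For example, with hosts a, b, and c where b fails, the function returns "halted" with only a in the updated list, and c is never touched.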

TLS and corruption

Updates rely on the reverse proxy's TLS to detect corruption in transit. There's no separate sha256 verification step — we chose the simpler model on the basis that the same TLS already gates every other byte the server hands to the agent.

If you'd like a separate signature step before applying updates, that's a future-phase enhancement (see tasks.md Phase 6 candidates).