steve 1d36dcd668 v1 readiness: CHANGELOG + threat model + first-run onboarding polish

- CHANGELOG.md: Keep-a-Changelog format, v1.0.0 entry summarising
  what each phase delivered.
- docs/threat-model.md: structured walkthrough of assets, actors,
  attack surfaces and residual risks; reviewed against v1.0.0.
- cmd/server/main.go: at first-run startup, print a clickable
  $RM_BASE_URL/bootstrap URL alongside the existing one-shot
  bootstrap token (or a fallback hint when RM_BASE_URL is unset).
- web/templates/pages/bootstrap.html: visible "Minimum 12 characters"
  hint under the password field so the rule is communicated
  before the operator submits.
- tasks.md: close X-01, X-04, X-05 with notes.

2026-05-09 12:29:00 +01:00

Threat model

A short, structured walkthrough of the assets restic-manager protects, the actors that interact with it, the attack surfaces exposed, and the mitigations in place. This document is written for operators considering a deployment and for contributors evaluating security-sensitive changes. It is not a formal certification — restic-manager has not been third-party audited.

Last reviewed: 2026-05-09 (against v1.0.0).

1. Assets

In rough order of sensitivity:

Asset	Why it matters
Restic repository passwords	Decrypt every backup in the repo. Server holds them encrypted at rest; agents need plaintext at backup-time.
Repository URLs with embedded credentials (e.g. `rest:https://user:pass@host/repo`)	Same as above — read access to the repo is leak-equivalent to the password.
Agent bearer tokens	Long-lived credentials authenticating each agent → server WS. Compromise lets an attacker impersonate that host (push fake snapshots, ack fake schedule versions, exfiltrate repo creds the server pushes back).
Server session cookies	Browser-side session for human operators. Compromise = full UI access at the user's role for the cookie's TTL (24h).
Database secret key	Wraps every encrypted-at-rest field (repo creds, agent enrolment payloads). Anyone who obtains the key file together with the database can decrypt the stored repo credentials, and with them the backups; rotation requires re-pushing creds to every agent.
Bootstrap / setup tokens	One-shot, time-limited; mint admin or invited-user accounts.
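Audit log	Record of admin actions; append-only from the application and read-only via the UI. Not tamper-evident against OS-level access to the database file (see 3.8).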
Backup data on the wire	Restic itself encrypts on the agent before sending — see "out of scope".

2. Actors

Actor	Trust
Anonymous internet	Untrusted. Should not reach the server unless proxied behind auth (see deployment guide).
Authenticated viewer	Read-only on hosts/jobs/alerts/audit.
Authenticated operator	Add/remove hosts, edit schedules, run backups/restores, mint enrolment tokens, ack alerts.
Authenticated admin	All of the above plus user management, role changes and fleet update controls. Admin rights do not include secret-key visibility.
Agent	Trusted to backup-and-report on its own host only. Cannot read other hosts' creds. Bearer-authenticated.
Restic backend (rest-server / S3 / B2 / etc.)	Out of scope for this document — assumed to authenticate the credentials presented and not collude.

3. Attack surfaces and mitigations

3.1 First-run bootstrap

Surface: /bootstrap UI + /api/bootstrap JSON endpoint.
Risk: race between server start and admin creation — an attacker who reaches the server first can claim admin.
Mitigations:
- Bootstrap token printed to stderr exactly once; held in memory, not persisted.
- The UI form on /bootstrap uses the in-memory token automatically (no token field for the operator to type or expose).
- Both surfaces self-disable the moment any user row exists (CountUsers > 0).
- Token is also blanked from process memory after success (defence in depth).
Residual risk: if an operator brings up the server on the public internet before reaching the bootstrap page, an attacker reaching /bootstrap first wins. Recommendation: bring the server up behind an existing trusted network or with the listener bound to 127.0.0.1 until first-run is complete.
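The gate described in the mitigations above can be sketched roughly as follows. This is a minimal illustration, not the code in cmd/server/main.go: the `bootstrapGate` type, its `claim` method and the `userCount` parameter (standing in for the CountUsers query) are hypothetical names.

```go
package main

import (
	"crypto/subtle"
	"sync"
)

// bootstrapGate is a hypothetical holder for the one-shot first-run token.
// The token lives only in process memory and is never written to disk.
type bootstrapGate struct {
	mu    sync.Mutex
	token []byte
}

func newBootstrapGate(token string) *bootstrapGate {
	return &bootstrapGate{token: []byte(token)}
}

// claim reports whether the caller may create the first admin account.
// userCount stands in for the result of the CountUsers query.
func (g *bootstrapGate) claim(presented string, userCount int) bool {
	g.mu.Lock()
	defer g.mu.Unlock()

	// Self-disable as soon as any user row exists or the token was consumed.
	if userCount > 0 || len(g.token) == 0 {
		return false
	}
	// Constant-time comparison avoids leaking how many bytes matched.
	if subtle.ConstantTimeCompare(g.token, []byte(presented)) != 1 {
		return false
	}
	// Defence in depth: overwrite and drop the token after a successful claim.
	for i := range g.token {
		g.token[i] = 0
	}
	g.token = nil
	return true
}
```

Holding the mutex across the check and the wipe means two concurrent requests cannot both succeed with the same token.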

3.2 Local user accounts

Surface: /login, /api/auth/login.
Mitigations: Argon2id password hashing with per-deployment parameters; constant-time password comparison; session tokens minted from crypto/rand; session rows store only a hash of the token (the raw token exists only in the cookie).
Rate limiting: not currently enforced at the application layer; the project assumes a reverse proxy throttles login attempts. Recommendation: front the server with Caddy or nginx rate-limit rules in production.
Password policy: 12-character minimum on bootstrap and user-setup paths; no maximum, no rotation, no history. Sufficient for self-hosted ops; tighten in policy if a deployment requires it.
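The "hash-only session row" idea can be illustrated with a short sketch. It assumes a 32-byte random token and SHA-256 as the row hash; the document does not state which hash the server actually uses for session rows, and the function names here are hypothetical.

```go
package main

import (
	"crypto/rand"
	"crypto/sha256"
	"crypto/subtle"
	"encoding/base64"
	"encoding/hex"
)

// newSessionToken returns the raw token (sent to the browser as the cookie
// value) and its digest (the only form that would be stored in the DB).
func newSessionToken() (raw string, digest string, err error) {
	buf := make([]byte, 32)
	if _, err = rand.Read(buf); err != nil {
		return "", "", err
	}
	raw = base64.RawURLEncoding.EncodeToString(buf)
	return raw, hashToken(raw), nil
}

// hashToken computes the hex SHA-256 digest of a raw token (assumed hash).
func hashToken(raw string) string {
	sum := sha256.Sum256([]byte(raw))
	return hex.EncodeToString(sum[:])
}

// tokenMatches compares a presented cookie value against a stored digest.
func tokenMatches(raw, storedDigest string) bool {
	return subtle.ConstantTimeCompare([]byte(hashToken(raw)), []byte(storedDigest)) == 1
}
```

With this layout, a read-only leak of the session table yields digests that cannot be replayed as cookies.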

3.3 OIDC SSO

Surface: /auth/oidc/* — generic OIDC client, JIT user provisioning.
Mitigations: state + nonce per flow; role mapping is server-configured (claims trusted only to identify the user, not pick role); user-disabled gate runs after IdP success.
Residual risk: misconfigured role-mapping rules can promote any IdP user to admin. Recommendation: review cfg.OIDC.RoleMappings carefully.
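To make the residual risk concrete, here is a hypothetical sketch of server-side role mapping. The `roleRule` shape, the use of a "groups" claim and the "highest matching role wins" policy are all assumptions for illustration; the real structure of cfg.OIDC.RoleMappings is not shown in this document.

```go
package main

// roleRule maps one IdP group to one restic-manager role (hypothetical shape).
type roleRule struct {
	Group string // value expected in the IdP "groups" claim (assumed claim name)
	Role  string // "viewer", "operator" or "admin"
}

// roleRank orders roles so the most privileged matching rule wins.
var roleRank = map[string]int{"viewer": 1, "operator": 2, "admin": 3}

// resolveRole returns the highest-ranked role whose rule matches one of the
// user's groups. ok is false when no rule matches, so the caller can deny
// access. Rules naming an unknown role have rank 0 and are never selected.
func resolveRole(rules []roleRule, groups []string) (role string, ok bool) {
	member := make(map[string]bool, len(groups))
	for _, g := range groups {
		member[g] = true
	}
	for _, r := range rules {
		if member[r.Group] && roleRank[r.Role] > roleRank[role] {
			role, ok = r.Role, true
		}
	}
	return role, ok
}
```

In a scheme like this, a single rule that maps a group every IdP user belongs to onto "admin" would promote the whole directory, which is the misconfiguration the recommendation above warns about.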

3.4 Agent enrolment

Surface: /api/agents/enroll (token-authenticated), /api/agents/announce (anonymous, then operator-approves).
Mitigations:
- Token path: one-shot, hashed at rest, 1h TTL; agent receives a fresh long-lived bearer in the response.
- Announce path: agent supplies an Ed25519 public key; operator sees a fingerprint to confirm out-of-band before accepting.
- Bearer tokens are SHA-256 hashed in the DB.
Residual risk: an attacker on the network between operator and target host who intercepts the install snippet can enrol as the target. The install script must be served over TLS in production (the docker-only deployment enables TLS by default; bare-metal deployers must configure it themselves).
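For the announce path, the fingerprint the operator confirms out-of-band could be derived along these lines. The format (an OpenSSH-style "SHA256:" prefix followed by unpadded base64 of the digest) is an assumption; the document does not specify how restic-manager renders fingerprints.

```go
package main

import (
	"crypto/ed25519"
	"crypto/sha256"
	"encoding/base64"
	"errors"
)

// fingerprint renders a short, comparable identifier for an agent's Ed25519
// public key. The output format is assumed, not taken from the project.
func fingerprint(pub ed25519.PublicKey) (string, error) {
	if len(pub) != ed25519.PublicKeySize {
		return "", errors.New("not an Ed25519 public key")
	}
	sum := sha256.Sum256(pub)
	return "SHA256:" + base64.RawStdEncoding.EncodeToString(sum[:]), nil
}
```

The operator would compare the string shown in the UI with the one printed on the target host before accepting the announce request.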

3.5 Agent → server WebSocket

Surface: persistent WS authenticated by agent bearer.
Mitigations: bearer is presented per-connection; server pins the agent fingerprint for the announce flow; messages are envelope-typed and rejected if shape-invalid.
No payload-level signing today — TLS is the integrity boundary. A man-in-the-middle with a valid cert chain could swap messages. Recommendation: pin the server cert via RM_SERVER_CERT_PIN_SHA256 if running over a network you don't fully control.
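A certificate pin of this kind is typically layered on top of normal chain verification. The sketch below assumes the pin is the hex-encoded SHA-256 of the server's leaf certificate in DER form; the actual encoding and hashed object behind RM_SERVER_CERT_PIN_SHA256 are not specified here, and the function names are hypothetical.

```go
package main

import (
	"crypto/sha256"
	"crypto/subtle"
	"crypto/tls"
	"encoding/hex"
	"errors"
)

// pinMatches reports whether SHA-256(der) equals the hex-encoded pin.
func pinMatches(der []byte, pinHex string) bool {
	want, err := hex.DecodeString(pinHex)
	if err != nil || len(want) != sha256.Size {
		return false
	}
	got := sha256.Sum256(der)
	return subtle.ConstantTimeCompare(got[:], want) == 1
}

// pinnedTLSConfig keeps the default chain verification and additionally
// rejects any connection whose leaf certificate does not match the pin.
func pinnedTLSConfig(pinHex string) *tls.Config {
	return &tls.Config{
		MinVersion: tls.VersionTLS12,
		VerifyConnection: func(cs tls.ConnectionState) error {
			if len(cs.PeerCertificates) == 0 {
				return errors.New("no peer certificate presented")
			}
			if !pinMatches(cs.PeerCertificates[0].Raw, pinHex) {
				return errors.New("server certificate does not match pin")
			}
			return nil
		},
	}
}
```

Because the pin check runs in addition to chain validation, an attacker holding a different certificate that chains to a trusted CA would still be rejected.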

3.6 Repo credential lifecycle

Stored encrypted at rest under the AEAD secret key.
Pushed to the agent over the WS on hello, on creds change, and on demand.
Agent persists them encrypted (per-host secret key derived from a value known only to the agent).
Logged surfaces use restic.RedactURL() to strip user:pass@ from URLs before they reach slog.
Plaintext form is constructed only at exec.Command time inside the agent, never stored on a struct field that could be slogged.
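The redaction step can be pictured with the following sketch. It is a hypothetical stand-in, not the implementation of restic.RedactURL(): it assumes restic-style repository strings with an optional backend prefix such as `rest:` in front of a regular URL, and it fails closed when the URL cannot be parsed.

```go
package main

import (
	"net/url"
	"strings"
)

// redactURL strips user:pass@ from a repository string before logging.
// Hypothetical illustration only; the project's helper may differ.
func redactURL(repo string) string {
	// Split off a backend prefix such as "rest:" in "rest:https://...".
	prefix, rest := "", repo
	if i := strings.Index(repo, ":"); i >= 0 && strings.Contains(repo[i+1:], "://") {
		prefix, rest = repo[:i+1], repo[i+1:]
	}
	u, err := url.Parse(rest)
	if err != nil {
		// Fail closed: never echo a string we could not inspect.
		return "[unparseable repository URL]"
	}
	if u.User == nil {
		return repo // nothing to redact
	}
	u.User = nil
	return prefix + u.String()
}
```

For example, `redactURL("rest:https://user:pass@host/repo")` yields `rest:https://host/repo`, while a local path such as `/srv/restic` passes through unchanged.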

3.7 Restore

Operators can restore to any path the agent (running as root) can write.
Cross-host restore (host A's snapshot → host B) is deferred — see F-01. The current single-host restore does not require granting any cross-host privileges.

3.8 Audit log

Append-only writes from the application; SQLite enforces no schema-level immutability.
A compromise of the SQLite file (via OS-level access) can edit the audit log. Recommendation: ship audit entries to an append-only sink (syslog / Loki / Splunk) if tamper-evidence beyond the OS boundary is required.

3.9 Self-update channel (P6)

Agents fetch new binaries via the WS transport from the server.
Binaries are signature-checked by the agent against a key embedded in the existing agent (see internal/fleetupdate/).
Residual risk: a server compromise lets the attacker push code to every agent (running as root). The signing-key compromise window is the same as the server compromise window because both live on the server. Splitting the signing key onto a separate signer is future work (not v1).
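The agent-side check can be sketched as below, assuming a detached Ed25519 signature over the whole binary. The signature scheme is an assumption (the document only says the key is embedded in the existing agent), and none of these names come from internal/fleetupdate/. The `newSigner` helper exists only so the sketch can be exercised; in the setup described above, signing happens on the server.

```go
package main

import (
	"crypto/ed25519"
	"errors"
)

// verifyUpdate returns nil only if sig is a valid signature over binary
// under the trusted public key embedded in the running agent.
func verifyUpdate(trusted ed25519.PublicKey, binary, sig []byte) error {
	if len(trusted) != ed25519.PublicKeySize {
		return errors.New("embedded key is not an Ed25519 public key")
	}
	if !ed25519.Verify(trusted, binary, sig) {
		return errors.New("update signature invalid; refusing to install")
	}
	return nil
}

// newSigner derives a key pair from a 32-byte seed and returns the public
// key plus a signing function. Demo helper only.
func newSigner(seed []byte) (ed25519.PublicKey, func([]byte) []byte) {
	priv := ed25519.NewKeyFromSeed(seed)
	pub := priv.Public().(ed25519.PublicKey)
	return pub, func(msg []byte) []byte { return ed25519.Sign(priv, msg) }
}
```

A check like this protects against tampering in transit or in storage, but not against whoever holds the signing key, which is why the residual risk above ties the signing key to the server compromise window.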

4. Out of scope

Restic itself — its repository format, encryption, and backend protocol are upstream-trusted.
The host OS — root compromise of a host obviously compromises that host's backups.
The backup destination — restic-manager assumes the rest-server / object-store / SFTP target enforces its own auth.
Side-channel attacks on the server process (RAM dump, process tracing).
Physical access to the server's disk.

5. Reporting

Found something we missed? See SECURITY.md for the disclosure process. Coordinated disclosure preferred; the project is maintained by a small team and we'll respond as quickly as we reasonably can.
