v1 readiness: CHANGELOG + threat model + first-run onboarding polish

- CHANGELOG.md: Keep-a-Changelog format, v1.0.0 entry summarising what each phase delivered.
- docs/threat-model.md: structured walkthrough of assets, actors, attack surfaces and residual risks; reviewed against v1.0.0.
- cmd/server/main.go: at first-run startup, print a clickable $RM_BASE_URL/bootstrap URL alongside the existing one-shot bootstrap token (or a fallback hint when RM_BASE_URL is unset).
- web/templates/pages/bootstrap.html: visible "Minimum 12 characters" hint under the password field so the rule is communicated before the operator submits.
- tasks.md: close X-01, X-04, X-05 with notes.
@@ -0,0 +1,126 @@
# Threat model

A short, structured walkthrough of the assets restic-manager
protects, the actors that interact with it, the attack surfaces
exposed, and the mitigations in place. This document is written for
operators considering a deployment and for contributors evaluating
security-sensitive changes. It is **not** a formal certification:
restic-manager has not been third-party audited.

Last reviewed: **2026-05-09** (against v1.0.0).

---

## 1. Assets

In rough order of sensitivity:

| Asset | Why it matters |
|---|---|
| **Restic repository passwords** | Decrypt every backup in the repo. The server holds them encrypted at rest; agents need the plaintext at backup time. |
| **Repository URLs with embedded credentials** (e.g. `rest:https://user:pass@host/repo`) | Same as above: a leaked URL grants access to the repo, so it is handled with the same care as the password. |
| **Agent bearer tokens** | Long-lived credentials authenticating each agent → server WebSocket (WS) connection. Compromise lets an attacker impersonate that host (push fake snapshots, ack fake schedule versions, exfiltrate the repo creds the server pushes back). |
| **Server session cookies** | Browser-side session for human operators. Compromise gives full UI access at the user's role for the cookie's TTL (24h). |
| **Database secret key** | Wraps every encrypted-at-rest field (repo creds, agent enrolment payloads). Disclosure of the key file makes those fields, and therefore the backups, decryptable; rotation requires re-pushing creds to every agent. |
| **Bootstrap / setup tokens** | One-shot and time-limited; they mint admin or invited-user accounts. |
| **Audit log** | Append-only record of admin actions; read-only via the UI. Not protected against OS-level edits to the database file (see 3.8). |
| **Backup data on the wire** | Restic itself encrypts on the agent before sending (see section 4, "Out of scope"). |

---

## 2. Actors

| Actor | Trust |
|---|---|
| **Anonymous internet** | Untrusted. Should not reach the server unless proxied behind auth (see deployment guide). |
| **Authenticated viewer** | Read-only on hosts/jobs/alerts/audit. |
| **Authenticated operator** | Add/remove hosts, edit schedules, run backups/restores, mint enrolment tokens, ack alerts. |
| **Authenticated admin** | All of the above plus user management, role changes, and fleet update controls. Does not include visibility of the database secret key. |
| **Agent** | Trusted to back up and report on its own host only. Cannot read other hosts' creds. Bearer-authenticated. |
| **Restic backend (rest-server / S3 / B2 / etc.)** | Out of scope for this document; assumed to authenticate the credentials presented and not to collude with an attacker. |

---

## 3. Attack surfaces and mitigations

### 3.1 First-run bootstrap

- **Surface**: `/bootstrap` UI + `/api/bootstrap` JSON endpoint.
- **Risk**: race between server start and admin creation. An attacker who reaches the server first can claim admin.
- **Mitigations**:
  - Bootstrap token printed to stderr exactly once; held in memory, not persisted.
  - The UI form on `/bootstrap` uses the in-memory token automatically (no token field for the operator to type or expose).
  - Both surfaces self-disable the moment any user row exists (`CountUsers > 0`).
  - Token is also blanked from process memory after success (defence in depth).
- **Residual risk**: if an operator brings up the server on the public internet before reaching the bootstrap page, an attacker reaching `/bootstrap` first wins. **Recommendation**: bring the server up behind an existing trusted network or with the listener bound to `127.0.0.1` until first-run is complete.
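
The bootstrap mitigations can be pictured with a minimal sketch. It is not the project's implementation: `bootstrapGate`, `newBootstrapGate` and `claim` are hypothetical names, and `userCount` stands in for whatever the real `CountUsers` query returns.

```go
package main

import (
	"crypto/rand"
	"crypto/subtle"
	"encoding/hex"
	"fmt"
	"sync"
)

// bootstrapGate is a hypothetical holder for the one-shot first-run token.
// The token lives only in memory and is never written to disk.
type bootstrapGate struct {
	mu    sync.Mutex
	token []byte // nil once the token has been consumed
}

// newBootstrapGate mints a random token and returns it once, so the caller
// can print it to stderr at startup.
func newBootstrapGate() (*bootstrapGate, string, error) {
	raw := make([]byte, 32)
	if _, err := rand.Read(raw); err != nil {
		return nil, "", err
	}
	tok := hex.EncodeToString(raw)
	return &bootstrapGate{token: []byte(tok)}, tok, nil
}

// claim reports whether a bootstrap attempt may proceed. Any existing user
// row (userCount > 0) disables the path, as does a consumed token.
func (g *bootstrapGate) claim(presented string, userCount int) bool {
	g.mu.Lock()
	defer g.mu.Unlock()
	if userCount > 0 || g.token == nil {
		return false
	}
	if subtle.ConstantTimeCompare(g.token, []byte(presented)) != 1 {
		return false
	}
	// Blank the token after success so it cannot be replayed.
	for i := range g.token {
		g.token[i] = 0
	}
	g.token = nil
	return true
}

func main() {
	gate, tok, err := newBootstrapGate()
	if err != nil {
		panic(err)
	}
	fmt.Println("wrong token:", gate.claim("guess", 0))
	fmt.Println("users exist:", gate.claim(tok, 1))
	fmt.Println("first claim:", gate.claim(tok, 0))
	fmt.Println("replay:", gate.claim(tok, 0))
}
```

The sketch does nothing about the residual risk: whoever presents the token first still wins, which is why the listener should stay on a trusted network until first-run completes.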

### 3.2 Local user accounts

- **Surface**: `/login`, `/api/auth/login`.
- **Mitigations**: Argon2id password hashing with per-deployment params; constant-time password compare; session tokens minted with `crypto/rand`; session rows store only a hash (the raw token lives only in the cookie).
- **Rate limiting**: currently not in place at the application layer; the project assumes a reverse proxy enforces login throttling. **Recommendation**: front the server with `caddy`/`nginx` rate-limit rules in production.
- **Password policy**: 12-character minimum on bootstrap and user-setup paths; no maximum, no rotation, no history. Sufficient for self-hosted ops; enforce a stricter policy out of band if a deployment requires it.
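
The hash-only session row can be illustrated with a short sketch. The function names are hypothetical, and SHA-256 is an assumption here: the text above only says the row stores a hash, not which one.

```go
package main

import (
	"crypto/rand"
	"crypto/sha256"
	"crypto/subtle"
	"encoding/base64"
	"fmt"
)

// newSessionToken returns the raw token (destined for the cookie) and its
// digest (destined for the session row). The raw value is never stored.
// SHA-256 is an assumption made for this sketch.
func newSessionToken() (raw string, digest [32]byte, err error) {
	buf := make([]byte, 32)
	if _, err = rand.Read(buf); err != nil {
		return "", digest, err
	}
	raw = base64.RawURLEncoding.EncodeToString(buf)
	return raw, sha256.Sum256([]byte(raw)), nil
}

// matchesSession hashes the presented cookie value and compares it with the
// stored digest in constant time.
func matchesSession(cookie string, stored [32]byte) bool {
	got := sha256.Sum256([]byte(cookie))
	return subtle.ConstantTimeCompare(got[:], stored[:]) == 1
}

func main() {
	raw, digest, err := newSessionToken()
	if err != nil {
		panic(err)
	}
	fmt.Println("valid cookie:", matchesSession(raw, digest))
	fmt.Println("forged cookie:", matchesSession(raw+"x", digest))
}
```

With this layout a read-only leak of the session table does not hand out usable cookies, because the digest cannot be replayed as a token.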

### 3.3 OIDC SSO

- **Surface**: `/auth/oidc/*`, a generic OIDC client with JIT user provisioning.
- **Mitigations**: state + nonce per flow; role mapping is server-configured (claims are trusted only to identify the user, not to pick a role); the user-disabled gate runs after IdP success.
- **Residual risk**: misconfigured role-mapping rules can promote any IdP user to admin. **Recommendation**: review `cfg.OIDC.RoleMappings` carefully.
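
To make the role-mapping risk concrete, here is a hypothetical sketch of server-side rules keyed on the verified identity. The real shape of `cfg.OIDC.RoleMappings` is not documented here, so the `RoleMapping` struct, the `*@domain` wildcard and the first-match-wins order are all assumptions.

```go
package main

import (
	"fmt"
	"strings"
)

// Role names follow the actors table; the string values are illustrative.
type Role string

const (
	RoleViewer   Role = "viewer"
	RoleOperator Role = "operator"
	RoleAdmin    Role = "admin"
)

// RoleMapping is a hypothetical rule: identities matching Match get Role.
// Match is either an exact e-mail address or "*@domain".
type RoleMapping struct {
	Match string
	Role  Role
}

// matches reports whether the verified e-mail satisfies the pattern.
func matches(pattern, email string) bool {
	if strings.HasPrefix(pattern, "*@") {
		return strings.HasSuffix(email, pattern[1:])
	}
	return pattern == email
}

// resolveRole returns the role of the first matching rule. When nothing
// matches it returns ok == false so the caller can deny the login.
func resolveRole(rules []RoleMapping, email string) (Role, bool) {
	for _, r := range rules {
		if matches(r.Match, email) {
			return r.Role, true
		}
	}
	return "", false
}

func main() {
	rules := []RoleMapping{
		{Match: "alice@example.com", Role: RoleAdmin},
		{Match: "*@example.com", Role: RoleViewer},
	}
	for _, who := range []string{"alice@example.com", "bob@example.com", "eve@evil.test"} {
		role, ok := resolveRole(rules, who)
		fmt.Println(who, role, ok)
	}
}
```

Swapping the wildcard rule's role to `RoleAdmin` is exactly the kind of misconfiguration the residual risk describes: every user the IdP accepts for that domain would become an admin.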

### 3.4 Agent enrolment

- **Surface**: `/api/agents/enroll` (token-authenticated), `/api/agents/announce` (anonymous, then operator-approved).
- **Mitigations**:
  - Token path: one-shot, hashed at rest, 1h TTL; agent receives a fresh long-lived bearer in the response.
  - Announce path: agent supplies an Ed25519 public key; operator sees a fingerprint to confirm out-of-band before accepting.
  - Bearer tokens are SHA-256 hashed in the DB.
- **Residual risk**: an attacker on the network between operator and target host who intercepts the install snippet can enrol *as* the target. The install script must be served over TLS in production (the docker-only deployment enables TLS by default; bare-metal deployers must configure their own).
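
For the announce path, a fingerprint could be derived roughly as follows. This is a sketch under assumptions: the SHA-256 digest and the colon-separated hex layout are illustrative choices, not the format restic-manager actually displays, and `newAgentKey` is a hypothetical helper.

```go
package main

import (
	"crypto/ed25519"
	"crypto/rand"
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"strings"
)

// newAgentKey generates the Ed25519 key pair an agent would announce with.
func newAgentKey() (ed25519.PublicKey, ed25519.PrivateKey, error) {
	return ed25519.GenerateKey(rand.Reader)
}

// fingerprint renders the SHA-256 digest of the public key as
// colon-separated hex pairs (format assumed for this sketch).
func fingerprint(pub ed25519.PublicKey) string {
	sum := sha256.Sum256(pub)
	parts := make([]string, len(sum))
	for i, b := range sum {
		parts[i] = hex.EncodeToString([]byte{b})
	}
	return strings.Join(parts, ":")
}

func main() {
	pub, _, err := newAgentKey()
	if err != nil {
		panic(err)
	}
	// The operator compares this string with the one shown on the host.
	fmt.Println(fingerprint(pub))
}
```

The check only helps if the operator reads the fingerprint from the target host through a separate channel, for example an existing SSH session, rather than from the same network path the announce request took.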

### 3.5 Agent → server WebSocket

- **Surface**: persistent WS authenticated by agent bearer.
- **Mitigations**: bearer is presented per-connection; server pins the agent fingerprint for the announce flow; messages are envelope-typed and rejected if shape-invalid.
- **Residual risk**: there is **no payload-level signing** today; TLS is the integrity boundary. A man-in-the-middle holding a certificate the agent trusts could swap messages. **Recommendation**: pin the server cert via `RM_SERVER_CERT_PIN_SHA256` if running over a network you don't fully control.
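
Envelope typing with shape validation might look like the sketch below. The wire format is an assumption: the `Envelope` and `Heartbeat` structs, their JSON field names and the `"heartbeat"` type tag are invented for illustration.

```go
package main

import (
	"bytes"
	"encoding/json"
	"errors"
	"fmt"
	"io"
)

// Envelope is a hypothetical wire shape: a type tag plus a typed payload.
type Envelope struct {
	Type    string          `json:"type"`
	Payload json.RawMessage `json:"payload"`
}

// Heartbeat is an illustrative payload type.
type Heartbeat struct {
	HostID string `json:"host_id"`
	Seq    uint64 `json:"seq"`
}

var errUnknownType = errors.New("unknown message type")

// strictDecode rejects unknown fields and anything after the first value.
func strictDecode(data []byte, v any) error {
	dec := json.NewDecoder(bytes.NewReader(data))
	dec.DisallowUnknownFields()
	if err := dec.Decode(v); err != nil {
		return err
	}
	if _, err := dec.Token(); err != io.EOF {
		return errors.New("trailing data after JSON value")
	}
	return nil
}

// decodeMessage validates the envelope first, then the payload for the
// declared type. Anything that does not fit is rejected.
func decodeMessage(raw []byte) (any, error) {
	var env Envelope
	if err := strictDecode(raw, &env); err != nil {
		return nil, fmt.Errorf("bad envelope: %w", err)
	}
	switch env.Type {
	case "heartbeat":
		var hb Heartbeat
		if err := strictDecode(env.Payload, &hb); err != nil {
			return nil, fmt.Errorf("bad heartbeat: %w", err)
		}
		if hb.HostID == "" {
			return nil, errors.New("heartbeat: host_id is required")
		}
		return hb, nil
	default:
		return nil, errUnknownType
	}
}

func main() {
	msg, err := decodeMessage([]byte(`{"type":"heartbeat","payload":{"host_id":"h1","seq":7}}`))
	fmt.Println(msg, err)
	_, err = decodeMessage([]byte(`{"type":"shutdown","payload":{}}`))
	fmt.Println(err)
}
```

Shape validation limits what a malformed or hostile message can reach, but it is not a substitute for integrity protection: a well-formed message swapped in transit would still pass.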

### 3.6 Repo credential lifecycle

- Stored encrypted at rest under the AEAD secret key.
- Pushed to the agent over the WS on hello, on creds change, and on demand.
- Agent persists them encrypted (per-host secret key derived from a value known only to the agent).
- Logged surfaces use `restic.RedactURL()` to strip `user:pass@` from URLs before they reach `slog`.
- Plaintext form is constructed only at `exec.Command` time inside the agent, never stored on a struct field that could be logged via `slog`.
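
The redaction step can be sketched as below. This is not the project's `restic.RedactURL()`; `redactURL` is a hypothetical stand-in written against the `rest:https://user:pass@host/repo` form shown in the assets table, using only `net/url` from the standard library.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// redactURL strips userinfo from a restic-style repository location.
// Locations such as "rest:https://user:pass@host/repo" carry a backend
// prefix ("rest:") in front of an ordinary URL, so the prefix is split
// off before parsing.
func redactURL(s string) string {
	i := strings.Index(s, "://")
	if i < 0 {
		return s // no URL authority, so nothing to redact (e.g. "s3:host/bucket")
	}
	start := strings.LastIndex(s[:i], ":") + 1
	prefix, rest := s[:start], s[start:]

	u, err := url.Parse(rest)
	if err != nil {
		// Fail closed: never echo a string that could not be inspected.
		return prefix + "<unparseable URL redacted>"
	}
	if u.User == nil {
		return s
	}
	u.User = nil
	return prefix + u.String()
}

func main() {
	fmt.Println(redactURL("rest:https://user:pass@host/repo"))
	fmt.Println(redactURL("sftp://backup:hunter2@nas.local:2222/srv/restic"))
	fmt.Println(redactURL("s3:s3.amazonaws.com/bucket"))
}
```

Failing closed on parse errors is a deliberate choice in this sketch: a log line with a placeholder is less useful, but it cannot leak a password that the parser failed to locate.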

### 3.7 Restore

- Operators can restore to any path the agent (running as root) can write.
- Cross-host restore (host A's snapshot → host C) is **deferred**; see F-01. The current single-host restore does not require granting any cross-host privileges.

### 3.8 Audit log

- The application only ever appends to the audit table, but SQLite enforces no schema-level immutability.
- Anyone with OS-level access to the SQLite file can edit the audit log. **Recommendation**: ship audit entries to an append-only sink (syslog / Loki / Splunk) if tamper-evidence beyond the OS boundary is required.
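
One way to follow the recommendation is to mirror each entry as a JSON line to an external writer. The sketch below is illustrative only: `AuditEntry`, its fields and `forward` are hypothetical, and `os.Stdout` stands in for a real syslog, Loki or Splunk forwarder.

```go
package main

import (
	"encoding/json"
	"fmt"
	"io"
	"os"
)

// AuditEntry is a hypothetical shape for one audit record.
type AuditEntry struct {
	UnixTime int64  `json:"ts"`
	Actor    string `json:"actor"`
	Action   string `json:"action"`
	Target   string `json:"target"`
}

// encodeEntry renders one entry as a single newline-terminated JSON line,
// a format most log shippers accept.
func encodeEntry(e AuditEntry) (string, error) {
	b, err := json.Marshal(e)
	if err != nil {
		return "", err
	}
	return string(b) + "\n", nil
}

// forward writes the entry to the external sink. Callers would invoke it
// alongside the normal database insert.
func forward(sink io.Writer, e AuditEntry) error {
	line, err := encodeEntry(e)
	if err != nil {
		return err
	}
	_, err = io.WriteString(sink, line)
	return err
}

func main() {
	e := AuditEntry{UnixTime: 1, Actor: "admin", Action: "user.role_change", Target: "bob"}
	if err := forward(os.Stdout, e); err != nil {
		fmt.Fprintln(os.Stderr, "audit forward failed:", err)
	}
}
```

The copy is only as trustworthy as the sink: it needs to live on a system that a compromise of the restic-manager host cannot rewrite.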

### 3.9 Self-update channel (P6)

- Agents fetch new binaries via the WS transport from the server.
- Binaries are signature-checked by the agent against a key embedded in the existing agent (see `internal/fleetupdate/`).
- **Residual risk**: a server compromise lets the attacker push code to every agent (running as root). The signing-key compromise window is the same as the server compromise window because both live on the server. Splitting the signing key onto a separate signer is future work (not v1).
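
The agent-side check could look like the following sketch. The signature scheme is an assumption: this document does not say which algorithm `internal/fleetupdate/` uses, so Ed25519 is chosen here only because it is in the Go standard library. `verifyUpdate` and `demoSign` are hypothetical names.

```go
package main

import (
	"crypto/ed25519"
	"crypto/rand"
	"errors"
	"fmt"
)

var errBadSignature = errors.New("update rejected: signature does not verify")

// verifyUpdate checks a detached signature over the whole binary against
// the public key compiled into the running agent.
func verifyUpdate(trusted ed25519.PublicKey, binary, sig []byte) error {
	// ed25519.Verify panics on a wrong-sized key, so check it first.
	if len(trusted) != ed25519.PublicKeySize {
		return errors.New("trusted key has the wrong size")
	}
	if !ed25519.Verify(trusted, binary, sig) {
		return errBadSignature
	}
	return nil
}

// demoSign creates a throwaway key pair and signs the binary. In a real
// deployment only the public half would be embedded in the agent.
func demoSign(binary []byte) (ed25519.PublicKey, []byte, error) {
	pub, priv, err := ed25519.GenerateKey(rand.Reader)
	if err != nil {
		return nil, nil, err
	}
	return pub, ed25519.Sign(priv, binary), nil
}

func main() {
	binary := []byte("agent build bytes")
	pub, sig, err := demoSign(binary)
	if err != nil {
		panic(err)
	}
	fmt.Println("genuine:", verifyUpdate(pub, binary, sig))
	fmt.Println("tampered:", verifyUpdate(pub, append(binary, 0x00), sig))
}
```

Verification protects against a tampered transport, not against the residual risk above: an attacker who controls the signing key can produce binaries that pass this check.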

---

## 4. Out of scope

- **Restic itself**: its repository format, encryption, and backend protocol are upstream-trusted.
- **The host OS**: root compromise of a host obviously compromises that host's backups.
- **The backup destination**: restic-manager assumes the rest-server / object-store / SFTP target enforces its own auth.
- **Side-channel attacks** on the server process (RAM dump, process tracing).
- **Physical access** to the server's disk.

---

## 5. Reporting

Found something we missed? See `SECURITY.md` for the disclosure
process. Coordinated disclosure preferred; the project is
maintained by a small team and we'll respond as quickly as we
reasonably can.