P5: OSS readiness — docs site, contributor onboarding, e2e harness

P5-01 — Documentation site under docs/book/ rendered with mdBook
(downloaded via Makefile, same static-binary pattern as Tailwind).
Structured chapters: getting started, concepts, operations,
security, reference. `make docs` / `make docs-watch`. Generated
output gitignored.

P5-02 — CONTRIBUTING.md rewritten from placeholder to a full
guide. CODE_OF_CONDUCT.md adapted from Contributor Covenant for a
single-maintainer project. .gitea/issue_template/{bug,feature}.md
and PULL_REQUEST_TEMPLATE.md.

P5-04 — Six README screenshots captured live from a fresh server
bootstrap (login, empty dashboard, add-host, alerts, settings,
audit log). README rewritten to centre the screenshot grid and
link out to the docs site.

P5-05 — SECURITY.md with disclosure policy (3-day ack, 30-day
default window), scope in/out, threat-model summary, operator
hardening checklist. Mirrored as a docs-site chapter.

P5-06 — End-to-end test harness. e2e/compose.e2e.yml brings up
server + sibling Linux agent (alpine + restic) + restic/rest-server.
Agent uses announce-and-approve so Playwright can drive the full
operator flow: bootstrap → login → accept pending → backup →
verify terminal status. Second spec scrapes /metrics to assert
the P6-04 endpoint surface. .gitea/workflows/e2e.yml runs on every
PR; local how-to in docs/e2e.md.
2026-05-07 23:56:02 +01:00
parent ff8a5dbead
commit bb4ed3502d
47 changed files with 2818 additions and 61 deletions
@@ -0,0 +1,110 @@
# Threat model
This page documents what restic-manager defends against, what it
doesn't, and the trust assumptions a deployment is making. The
canonical version lives in [`spec.md`](https://gitea.dcglab.co.uk/steve/restic-manager/src/branch/main/spec.md)
§11; the summary here is shaped for operators rather than
implementers.
## Trust boundaries
```
┌──────────────────────────────────────────┐
│  TRUSTED zone                            │
│  ┌─────────────┐    ┌──────────────┐     │
│  │ Operator's  │    │ Reverse      │     │
│  │ browser     │◄──►│ proxy        │     │  TLS terminates here
│  └─────────────┘    └──────┬───────┘     │
└────────────────────────────┼─────────────┘
                             │ HTTP, plaintext
                             │ (loopback or trusted LAN)
┌────────────────────────────▼─────────────┐
│  Server (control plane)                  │
└────────────┬─────────────────────────────┘
             │ outbound WebSocket (TLS to clients via proxy)
             │   — bearer-authenticated
┌────────────▼──────────────┐
│  Agent (per host)         │  ◄── attacker model: assume one
└────────────┬──────────────┘      endpoint can be compromised
             │ subprocess
          restic ──▶ repository (rest-server / S3 / SFTP / …)
```
## What we defend against
### Network attacker between operator and server
- HTTPS via the reverse proxy is the only operator-facing surface
on a sane deployment.
- `RM_COOKIE_SECURE=true` (default) means the session cookie
refuses to ride a non-HTTPS connection.
- `RM_TRUSTED_PROXY` gates whether `X-Forwarded-*` is honoured;
a request that bypasses the proxy can't spoof the client IP.
### Compromised agent host
- The agent's bearer token can dispatch commands **only on its
own host**. It can't read other hosts' state, dispatch jobs
on other hosts, or escalate within the control plane.
- If you suspect a host compromise:
1. Remove the agent's host row via **Hosts → Delete**
(this cascades to the bearer hash).
2. Rotate the repo credential at the rest-server / object
store side.
3. Review the audit log, which lists every action that bearer ever drove.
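The "only on its own host" rule amounts to an authorisation check that compares the host bound to the bearer with the host a request targets. A minimal sketch, assuming hypothetical names (`Agent`, `authorizeDispatch`, `ErrForbidden`) that are not taken from the codebase:

```go
package main

import (
	"errors"
	"fmt"
)

// Agent is the identity resolved from a bearer token (hypothetical type).
type Agent struct {
	HostID string
}

// ErrForbidden is returned when an agent reaches outside its own host.
var ErrForbidden = errors.New("agent may only act on its own host")

// authorizeDispatch allows an action only when the target host is the one
// the bearer was issued for. An empty HostID is always rejected.
func authorizeDispatch(a Agent, targetHostID string) error {
	if a.HostID == "" || a.HostID != targetHostID {
		return ErrForbidden
	}
	return nil
}

func main() {
	agent := Agent{HostID: "host-a"}
	fmt.Println(authorizeDispatch(agent, "host-a")) // own host: allowed
	fmt.Println(authorizeDispatch(agent, "host-b")) // other host: rejected
}
```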
### DB compromise without the secret key
- Repo credentials are AEAD-encrypted at rest. A DB dump alone
doesn't expose them.
- Agent bearer **hashes** are leaked; that's enough to
authenticate as any agent until you revoke them. Rotation
today is simply "delete + re-enrol".
- Operator passwords are bcrypt-hashed; OIDC users have no
password to leak.
- Session tokens are hashed; an attacker can't replay a
session from a DB dump.
### DB compromise WITH the secret key
The attacker can decrypt every credential. Treat
`secret.key` with the same care as a password manager database.
Back it up to a separate vault, not to the same Docker volume
as the database.
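The difference between the two DB-compromise cases comes down to who holds the AEAD key. The sketch below assumes AES-256-GCM from the Go standard library; the page only says "AEAD", so the cipher choice and the `seal` / `open` helpers are illustrative rather than the project's actual scheme.

```go
package main

import (
	"crypto/aes"
	"crypto/cipher"
	"crypto/rand"
	"errors"
	"fmt"
)

// seal encrypts plaintext with AES-GCM and prepends the random nonce.
func seal(key, plaintext []byte) ([]byte, error) {
	block, err := aes.NewCipher(key)
	if err != nil {
		return nil, err
	}
	gcm, err := cipher.NewGCM(block)
	if err != nil {
		return nil, err
	}
	nonce := make([]byte, gcm.NonceSize())
	if _, err := rand.Read(nonce); err != nil {
		return nil, err
	}
	return gcm.Seal(nonce, nonce, plaintext, nil), nil
}

// open splits off the nonce, then authenticates and decrypts the rest.
func open(key, blob []byte) ([]byte, error) {
	block, err := aes.NewCipher(key)
	if err != nil {
		return nil, err
	}
	gcm, err := cipher.NewGCM(block)
	if err != nil {
		return nil, err
	}
	if len(blob) < gcm.NonceSize() {
		return nil, errors.New("ciphertext too short")
	}
	nonce, ct := blob[:gcm.NonceSize()], blob[gcm.NonceSize():]
	return gcm.Open(nil, nonce, ct, nil)
}

func main() {
	key := make([]byte, 32) // stands in for the contents of secret.key
	if _, err := rand.Read(key); err != nil {
		panic(err)
	}
	blob, err := seal(key, []byte("repo-password"))
	if err != nil {
		panic(err)
	}

	_, errWrong := open(make([]byte, 32), blob) // DB dump, no key
	plain, errRight := open(key, blob)          // DB dump plus key

	fmt.Println("without key:", errWrong != nil)
	fmt.Println("with key:", string(plain), errRight)
}
```

Anyone holding both the stored blob and the key recovers the plaintext, which is why the key belongs in a separate vault from the database volume.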
### Forget/prune as a DoS vector
- The everyday backup credential cannot prune (append-only).
- The admin credential is only pushed to the agent at the
moment of dispatch and discarded after the job ends.
- Compromise of a single agent host does **not** grant prune
rights — at worst the attacker gets fresh write access until
the credential is rotated.
### Operator-side typo or bad copy-paste
- Repo credentials are stored encrypted; mis-typed creds fail
fast on the next `restic` invocation rather than silently
corrupting state.
- NS-03 added auto-init: the first dispatched job after creds
change runs `restic init`, surfaces the error eagerly under
the host's vitals strip if the creds are bad, and resets the
host's `repo_status` so the operator can retry without
hunting through job logs.
## What we don't defend against
- **Insider threat at the maintainer level.** A malicious
maintainer can publish a backdoored container; SBOM /
signing infrastructure (Phase 6 candidate) would help here
but isn't shipped today.
- **Supply chain.** We pin module versions (`go.sum`) and
pin the Tailwind binary's release tag, but a compromise in
one of those upstreams would land here.
- **Side-channel via restic itself.** A bug in restic that
enables snapshot-content disclosure is restic's problem; the
control plane doesn't see snapshot bytes either way.
- **DoS via resource exhaustion** without the recommended
reverse-proxy / rate-limit in front. Don't expose the
server's HTTP port to the public internet directly.