P5-01 — Documentation site under docs/book/ rendered with mdBook
(downloaded via Makefile, same static-binary pattern as Tailwind).
Structured chapters: getting started, concepts, operations,
security, reference. `make docs` / `make docs-watch`. Generated
output gitignored.
P5-02 — CONTRIBUTING.md rewritten from placeholder to a full
guide. CODE_OF_CONDUCT.md adapted from Contributor Covenant for a
single-maintainer project. .gitea/issue_template/{bug,feature}.md
and PULL_REQUEST_TEMPLATE.md.
P5-04 — Six README screenshots captured live from a fresh server
bootstrap (login, empty dashboard, add-host, alerts, settings,
audit log). README rewritten to centre the screenshot grid and
link out to the docs site.
P5-05 — SECURITY.md with disclosure policy (3-day ack, 30-day
default window), scope in/out, threat-model summary, operator
hardening checklist. Mirrored as a docs-site chapter.
P5-06 — End-to-end test harness. e2e/compose.e2e.yml brings up
server + sibling Linux agent (alpine + restic) + restic/rest-server.
Agent uses announce-and-approve so Playwright can drive the full
operator flow: bootstrap → login → accept pending → backup →
verify terminal status. Second spec scrapes /metrics to assert
the P6-04 endpoint surface. .gitea/workflows/e2e.yml runs on every
PR; local how-to in docs/e2e.md.
Repo maintenance
Backups go in; without maintenance, repos grow forever and eventually fall over. restic-manager runs three maintenance operations on a per-host cadence:
| Command | What it does | Default cadence |
|---|---|---|
| forget | Marks snapshots eligible for removal per the retention policy attached to each source group. Cheap; runs append-only. | Daily after the last backup of the day |
| prune | Reclaims space from the repo. Requires the admin credential (write+delete). | Weekly, off-peak |
| check | Verifies repo integrity. Sub-options surface lock state. | Weekly, with --read-data-subset N% to sample pack files |
A new field on each host row, host_repo_maintenance, holds the
cron expressions and last-fire anchors. The maintenance ticker on
the server runs every 60s, finds hosts whose next-fire is due,
and dispatches the right command. The agent's local cron is
only for backups.
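As a rough illustration of the ticker's selection step, the sketch below picks the dispatches that are due on one tick. It assumes the next-fire time has already been derived from the cron expression and last-fire anchor; `MaintenanceRow`, its fields, and `due_dispatches` are hypothetical names, not the project's actual schema or API.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass
class MaintenanceRow:
    """Hypothetical in-memory view of one host_repo_maintenance entry."""
    host_id: str
    command: str         # "forget", "prune", or "check"
    next_fire: datetime  # assumed precomputed from cron expression + last-fire anchor
    online: bool


def due_dispatches(rows, now):
    """Return (host_id, command) pairs whose next-fire has passed.

    Offline hosts are skipped rather than queued, so a missed
    maintenance run waits for a later tick.
    """
    return [(r.host_id, r.command) for r in rows if r.online and r.next_fire <= now]


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    rows = [
        MaintenanceRow("web-1", "prune", now - timedelta(minutes=1), True),
        MaintenanceRow("db-1", "check", now + timedelta(hours=2), True),
    ]
    print(due_dispatches(rows, now))
```

A real ticker would call something like this once per 60-second tick and advance the last-fire anchor for each pair it dispatches.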
Why server-side and not agent-side?
The agent's cron knows about backups because backups are per-source-group. Maintenance is per-repo, not per-source-group, so doing it server-side keeps the per-host wiring simple:
- One ticker, not N agent crons to keep in sync.
- Cancelling a maintenance dispatch is just "don't dispatch the next one" — no agent-side state to clean up.
- Skipping offline hosts is trivial (no queue; only scheduled backups queue into pending_runs).
Forget and the multi-group payload
A single forget job can target several source groups at once.
The wire envelope (ForgetGroups) carries one entry per group,
each with its retention policy. The agent runs N
restic forget --tag <name> --keep-... invocations in sequence,
streams their output, and reports a single terminal status.
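A minimal sketch of how a multi-group payload could be expanded into one restic forget command line per group. The `ForgetGroup` shape, the `keep` mapping, and the status strings are assumptions made for illustration; the real ForgetGroups envelope may be laid out differently. The `--tag` and `--keep-*` flags are standard restic forget options.

```python
from dataclasses import dataclass


@dataclass
class ForgetGroup:
    """Hypothetical entry in the payload: one source group and its retention policy."""
    tag: str
    keep: dict  # e.g. {"daily": 7, "weekly": 4} -> --keep-daily 7 --keep-weekly 4


def forget_argv(group):
    """Build the argv for a single `restic forget` invocation."""
    argv = ["restic", "forget", "--tag", group.tag]
    for period, count in group.keep.items():
        argv += [f"--keep-{period}", str(count)]
    return argv


def terminal_status(exit_codes):
    """Collapse N exit codes into one status (status names are assumed)."""
    return "succeeded" if all(code == 0 for code in exit_codes) else "failed"


if __name__ == "__main__":
    groups = [
        ForgetGroup("etc", {"daily": 7, "weekly": 4}),
        ForgetGroup("home", {"last": 3}),
    ]
    for g in groups:
        print(" ".join(forget_argv(g)))
```

Running the invocations in sequence and reducing their exit codes with a helper like `terminal_status` is one way to arrive at the single terminal status described above.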
Prune and the admin credential
Prune mutates the repo. The everyday append-only credential
cannot prune — that's the whole point of append-only.
restic-manager keeps a second slot per host (kind = 'admin')
for the credential that can.
When a prune is dispatched (cadence-driven or operator-driven):
- Server pushes the admin credential to the agent in a fresh config.update.
- Agent runs restic prune with the merged credential.
- Job finishes; agent discards the admin credential from its in-memory secrets store.
The server never logs the merged URL (see Credentials).
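The credential lifecycle around a prune can be sketched as a scoped resource: the admin credential exists in the agent's memory only for the duration of the job and is dropped even if the job fails. `SecretsStore` and `admin_credential` are hypothetical names for this sketch, not the agent's real types.

```python
from contextlib import contextmanager


class SecretsStore:
    """Hypothetical in-memory secrets store keyed by credential kind."""

    def __init__(self):
        self._slots = {}

    def put(self, kind, value):
        self._slots[kind] = value

    def get(self, kind):
        return self._slots.get(kind)

    def discard(self, kind):
        self._slots.pop(kind, None)


@contextmanager
def admin_credential(store, credential):
    """Hold the admin credential only while the prune job runs."""
    store.put("admin", credential)
    try:
        yield store
    finally:
        # Runs on success and on failure, so the credential never lingers.
        store.discard("admin")


if __name__ == "__main__":
    store = SecretsStore()
    with admin_credential(store, "example-secret"):
        pass  # the prune job would run here
    print(store.get("admin"))
```

Tying the discard to a `finally` block means a crashed or cancelled prune cannot leave the write+delete credential behind.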
Check and lock state
restic check warns about stale locks when it finds them. The
agent ships every check's output back as a repo.stats envelope
and a stream of log lines; if a stale lock is detected, the
Repo page surfaces a banner with an Unlock button. The
operator-only unlock command runs restic unlock and clears
the banner.
unlock has no cadence — it's a manual action, never automatic.
Auto-unlocking would mask the cause (probably a previously
crashed long-running operation) and risk corrupting an
operation the operator has merely lost track of.
Repo stats
After every backup, check, prune, and unlock, the agent runs
restic stats --json --mode raw-data and ships the result as a
repo.stats envelope. The server stores this in
host_repo_stats (latest only) and host_repo_stats_history
(one row per host per day, last-write-wins per column — a
prune-only patch never nulls a backup-time size).
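The last-write-wins-per-column rule can be illustrated with a small merge function: a patch overwrites only the columns it actually carries, so a missing value never erases an earlier one. The column names below are hypothetical, and the real server presumably does this in its database layer rather than in application code.

```python
def merge_stats(existing, patch):
    """Merge a stats patch into the day's row, column by column.

    A column set to None in the patch is treated as "not reported" and
    leaves the stored value untouched.
    """
    merged = dict(existing)
    for column, value in patch.items():
        if value is not None:
            merged[column] = value
    return merged


if __name__ == "__main__":
    # Hypothetical column names, for illustration only.
    after_backup = {"total_size": 1024, "last_prune_at": None}
    prune_patch = {"total_size": None, "last_prune_at": "2024-01-01T03:00:00Z"}
    print(merge_stats(after_backup, prune_patch))
```

Here the prune-only patch fills in the prune timestamp while the backup-time size survives.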
The host detail page surfaces:
- Total size + raw size in the vitals strip.
- Last-check timestamp + colour-coded status.
- Last-prune timestamp.
- 30/90-day repo size trend chart.