docs(spec): clarify staleness vs job-failure alerting for asleep hosts

This commit is contained in:
2026-06-15 20:42:00 +01:00
parent 0c3a0844e4
commit 261b83ec26
@@ -37,8 +37,14 @@ Let an operator mark a host as **not** always-on. Such a host:
whether it missed a scheduled backup and — if so — triggers a
catch-up backup automatically.
4. Still raises a *staleness* alert if it has genuinely gone too long
without any backup (a host left in a drawer), and still raises normal
job-failure alerts for backups that run and fail.
without any backup (a host left in a drawer). This is the only
alert covering an asleep host: while the agent is offline no job
runs, so there is no failure to detect — staleness is the safety
net for "no backups are happening at all."
5. Leaves normal job-failure alerting untouched: a backup that
actually runs (scheduled or catch-up) and fails alerts as it does
today. Failures can only occur while the agent is online and
executing restic.
Default behaviour is unchanged for the entire existing fleet.