diff --git a/docs/specs/2026-06-15-always-on-host-mode-design.md b/docs/specs/2026-06-15-always-on-host-mode-design.md index a10a089..b48c8e6 100644 --- a/docs/specs/2026-06-15-always-on-host-mode-design.md +++ b/docs/specs/2026-06-15-always-on-host-mode-design.md @@ -37,8 +37,14 @@ Let an operator mark a host as **not** always-on. Such a host: whether it missed a scheduled backup and — if so — triggers a catch-up backup automatically. 4. Still raises a *staleness* alert if it has genuinely gone too long - without any backup (a host left in a drawer), and still raises normal - job-failure alerts for backups that run and fail. + without any backup (a host left in a drawer). This is the only + alert covering an asleep host: while the agent is offline no job + runs, so there is no failure to detect — staleness is the safety + net for "no backups are happening at all." +5. Leaves normal job-failure alerting untouched: a backup that + actually runs (scheduled or catch-up) and fails alerts as it does + today. Failures can only occur while the agent is online and + executing restic. Default behaviour is unchanged for the entire existing fleet.