tasks: tick P3-05/06/07 + Playwright sweep notes
Sweep against the live smoke env confirmed the alerts subsystem end-to-end: three channels (webhook → local sink, ntfy → ntfy.sh, SMTP → MailHog) created and verified via the Test button; synthetic critical raised; ack + resolve fan out alert.acknowledged / alert.resolved across all three; dashboard banner appears and clears; nav badge tracks open count. Three real bugs found and fixed mid-sweep — see preceding three commits for the full reasoning.
This commit is contained in:
@@ -270,11 +270,13 @@ Sizes: **S** = under a day, **M** = 1–3 days, **L** = 3–7 days.
|
|||||||
|
|
||||||
> **As shipped (Playwright sweep against the live smoke env, 2026-05-04):** login → host detail → Restore button → wizard step 1 picks snapshot a1ac4006 (most recent) → tree drill-down `/home/steve/test` (3 lazy loads) → tick `file1` + `file2` → step 4 confirm summary populated → dispatch → live job page with running progress widget → restore succeeds, files land on disk at `/root/rm-restore/<job-id>/home/steve/test/file{1,2}` (default `$HOME/rm-restore/<job-id>/` after agent-side expansion). Custom-target restore to `/tmp/custom-restore/<job-id>/` lands inside the agent's `PrivateTmp` namespace. Snapshot diff between `a1ac4006` and `5f78c788` → diff job page, statistics output streamed (738 bytes added, 0 removed). Recent-restores line on host detail reads "last restore · succeeded 28s ago · job log →". Download dropdown serves both `.txt` and `.ndjson` with correct `Content-Type` + `Content-Disposition`. SIZE/FILES tooltip "Needs restic 0.17+ on the agent host. This host runs 0.16.4." renders on column hover.
|
> **As shipped (Playwright sweep against the live smoke env, 2026-05-04):** login → host detail → Restore button → wizard step 1 picks snapshot a1ac4006 (most recent) → tree drill-down `/home/steve/test` (3 lazy loads) → tick `file1` + `file2` → step 4 confirm summary populated → dispatch → live job page with running progress widget → restore succeeds, files land on disk at `/root/rm-restore/<job-id>/home/steve/test/file{1,2}` (default `$HOME/rm-restore/<job-id>/` after agent-side expansion). Custom-target restore to `/tmp/custom-restore/<job-id>/` lands inside the agent's `PrivateTmp` namespace. Snapshot diff between `a1ac4006` and `5f78c788` → diff job page, statistics output streamed (738 bytes added, 0 removed). Recent-restores line on host detail reads "last restore · succeeded 28s ago · job log →". Download dropdown serves both `.txt` and `.ndjson` with correct `Content-Type` + `Content-Disposition`. SIZE/FILES tooltip "Needs restic 0.17+ on the agent host. This host runs 0.16.4." renders on column hover.
|
||||||
|
|
||||||
### Phase 3 — Alerts (not started)
|
### Phase 3 — Alerts ✅
|
||||||
|
|
||||||
- [ ] **P3-05** (M) Alert engine: rule evaluation loop (failed backup, stale schedule, agent offline, check failed)
|
- [x] **P3-05** (M) Alert engine: rule evaluation loop (failed backup, stale schedule, agent offline, check failed)
|
||||||
- [ ] **P3-06** (M) Notification channels: webhook, ntfy, SMTP email
|
- [x] **P3-06** (M) Notification channels: webhook, ntfy, SMTP email
|
||||||
- [ ] **P3-07** (S) Alert UI: list, acknowledge, resolve
|
- [x] **P3-07** (S) Alert UI: list, acknowledge, resolve
|
||||||
|
|
||||||
|
> **As shipped (Playwright sweep, 2026-05-04):** /settings/notifications → 3 channels created (sweep-webhook → local Python sink, sweep-ntfy → ntfy.sh public topic, sweep-smtp → MailHog at 127.0.0.1:1025). Test buttons fire alert.test on each: webhook 200/1ms, ntfy 200/322ms, SMTP 250/3ms. Synthetic critical `backup_failed` raised → /alerts shows row with severity dot, kind chip, host, message, raised/last-seen, Ack + Resolve buttons; nav badge `1`; dashboard critical-alert banner appears with Review→ link; OPEN ALERTS card reads `1 unresolved`. Acknowledge → fan-out to all 3 channels emits alert.acknowledged (verified in webhook sink, MailHog inbox, notification_log); Acknowledged tab shows row with `ack'd by <user>` line. Resolve → fan-out emits alert.resolved across all 3 channels; banner clears; dashboard reads `0 unresolved · all clear`; host alerts column reads —. Three live bugs found and fixed mid-sweep: (a) `enabled` form value lost because hidden+checkbox both named `enabled` and `PostForm.Get` returned the first ("0"); (b) Ack/Resolve handlers stored the state change but never dispatched alert.acknowledged / alert.resolved; (c) `hosts.open_alert_count` projection was never recomputed on Raise/Resolve/AutoResolve, so the dashboard count always read 0.
|
||||||
|
|
||||||
### Phase 3 — Audit log UI (not started)
|
### Phase 3 — Audit log UI (not started)
|
||||||
|
|
||||||
|
|||||||
File diff suppressed because one or more lines are too long
Reference in New Issue
Block a user