# Always-On vs Intermittent Host Mode — Implementation Plan

> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.

**Goal:** Let an operator mark a host as not-always-on so it stops raising offline alerts when it legitimately sleeps, renders a calm "asleep" state, auto-catches-up a missed backup ~1 minute after it reconnects, and still raises a long-threshold staleness alert if it goes too long with no backup.

**Architecture:** A thin policy + presentation layer over the existing online/offline state machine. A new `hosts.always_on` boolean (default 1 = today's behaviour) gates three behaviours: offline-alert suppression + a 7-day staleness alert in the alert engine; an in-memory catch-up scheduler in the HTTP server armed on agent hello and fired from the existing 30s tick; and an "asleep" UI state plus a 24×7 chip. Online/offline tracking, heartbeat, and `pending_runs` are untouched.

**Tech Stack:** Go, SQLite (modernc), `github.com/robfig/cron/v3` (already a dependency), Go `html/template`, Tailwind-in-`input.css`.

**Spec:** `docs/specs/2026-06-15-always-on-host-mode-design.md`

---

## File Structure

- **Create** `internal/store/migrations/0024_hosts_always_on.sql` — add the column.
- **Modify** `internal/store/types.go` — add `Host.AlwaysOn bool`.
- **Modify** `internal/store/hosts.go` — add `always_on` to the 3 host SELECTs + `scanHostRow`; add `SetHostAlwaysOn`.
- **Create** `internal/store/hosts_always_on_test.go` — round-trip + default test.
- **Modify** `internal/alert/engine.go` — staleness threshold constant; suppress offline for intermittent hosts; staleness sweep; resolve staleness on backup success.
- **Modify** `internal/alert/rules.go` — exported `ResolveOnModeChange` helper for the toggle handler.
- **Create** `internal/alert/intermittent_test.go` — suppression + staleness + resolve tests.
- **Create** `internal/server/http/catchup.go` — overdue helper + in-memory catch-up scheduler.
- **Create** `internal/server/http/catchup_test.go` — overdue table tests.
- **Modify** `internal/server/http/server.go` — catch-up map fields on `Server`, init in `New`.
- **Modify** `internal/server/http/host_credentials.go` — arm catch-up in `onAgentHello`.
- **Modify** `cmd/server/main.go` — call `srv.RunCatchupsDue` on the pending-drain tick.
- **Modify** `internal/server/http/ui_handlers.go` — `handleUIHostModeSave` handler.
- **Modify** `internal/server/http/server.go` (routes) — mount `POST /hosts/{id}/mode`.
- **Modify** `web/styles/input.css` — `dot-asleep` token.
- **Modify** `web/templates/partials/host_row.html` — asleep dot + text.
- **Modify** `web/templates/partials/host_chrome.html` — asleep dot/last-seen, 24×7 chip, mode toggle form.
- **Modify** `tasks.md` — record the feature.

---

## Task 1: Schema + store field for `always_on`

**Files:**
- Create: `internal/store/migrations/0024_hosts_always_on.sql`
- Modify: `internal/store/types.go:62-102` (Host struct)
- Modify: `internal/store/hosts.go` (3 SELECTs at lines 41-48, 56-63, 224-231; `scanHostRow` at 261-334)
- Test: `internal/store/hosts_always_on_test.go`

- [ ] **Step 1: Write the migration**

Create `internal/store/migrations/0024_hosts_always_on.sql`:

```sql
-- 0024: distinguish always-on (24x7 server) hosts from intermittent
-- hosts (laptops/workstations that legitimately sleep). Default 1 so
-- every existing and future host keeps today's offline/alert
-- semantics unless explicitly opted out. Column-level ALTER per the
-- repo's migration rules (no table rebuild — hosts has inbound FKs).
ALTER TABLE hosts ADD COLUMN always_on INTEGER NOT NULL DEFAULT 1;
```

- [ ] **Step 2: Add the struct field**

In `internal/store/types.go`, add to the `Host` struct (after `RepoStatusError` at line 101):

```go
	// AlwaysOn is true for 24x7 server hosts (the default). When false
	// the host is intermittent (laptop/workstation): offline alerts are
	// suppressed, the UI shows an "asleep" state, and a missed backup is
	// caught up ~1 min after reconnect. See the always-on-host-mode spec.
	AlwaysOn bool
```

- [ ] **Step 3: Thread `always_on` through reads**

In `internal/store/hosts.go`, append `, always_on` to the SELECT column list in all three queries: `LookupHostByAgentToken` (line 47), `GetHost` (line 62), and `ListHosts` (line 230). Each currently ends `repo_status, repo_status_error` — change to `repo_status, repo_status_error, always_on`.

Then extend `scanHostRow` (line 261): declare `var alwaysOn int` in the var block, change the `Scan(...)` call's final args from `&h.RepoStatus, &h.RepoStatusError)` to `&h.RepoStatus, &h.RepoStatusError, &alwaysOn)`, and after the existing post-scan assignments add:

```go
	h.AlwaysOn = alwaysOn != 0
```

(SQLite stores the boolean as INTEGER; scan into an int and compare against zero to avoid driver bool-coercion surprises.)
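
The integer round trip described above can be sketched in isolation. This is only an illustration: `boolToInt` and `intToBool` are hypothetical names that do not exist in the repo; they mirror what `SetHostAlwaysOn` and `scanHostRow` are expected to do inline.

```go
package main

import "fmt"

// boolToInt mirrors the write side: SQLite has no native boolean
// type, so true/false are stored as 1/0. (Hypothetical helper.)
func boolToInt(b bool) int {
	if b {
		return 1
	}
	return 0
}

// intToBool mirrors the read side: any non-zero value counts as true.
// (Hypothetical helper.)
func intToBool(v int) bool { return v != 0 }

func main() {
	for _, v := range []bool{true, false} {
		stored := boolToInt(v)
		fmt.Printf("%v -> %d -> %v\n", v, stored, intToBool(stored))
	}
}
```

Comparing against zero on the read side means a stray value such as 2 still reads as always-on, which matches the column's default-on intent.
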

- [ ] **Step 4: Add `SetHostAlwaysOn`**

In `internal/store/hosts.go`, after `SetHostTags` (line 379), add:

```go
// SetHostAlwaysOn flips the host's always-on flag. true = 24x7 server
// (default); false = intermittent host (laptop). See the
// always-on-host-mode spec.
func (s *Store) SetHostAlwaysOn(ctx context.Context, hostID string, alwaysOn bool) error {
	v := 0
	if alwaysOn {
		v = 1
	}
	_, err := s.db.ExecContext(ctx,
		`UPDATE hosts SET always_on = ? WHERE id = ?`, v, hostID)
	if err != nil {
		return fmt.Errorf("store: set host always_on: %w", err)
	}
	return nil
}
```

- [ ] **Step 5: Write the round-trip test**

Create `internal/store/hosts_always_on_test.go`. Use the existing test harness pattern — check a sibling test (e.g. `internal/store/hosts_test.go`) for the `newTestStore`/`testStore` helper name and the host-creation helper, and mirror it exactly. The test body:

```go
package store

import (
	"context"
	"testing"
	"time"
)

func TestHostAlwaysOnDefaultAndToggle(t *testing.T) {
	ctx := context.Background()
	st := newTestStore(t) // mirror the helper used by hosts_test.go

	h := Host{
		ID: "h-always-on", Name: "lap", OS: "linux", Arch: "amd64",
		ProtocolVersion: 1, EnrolledAt: time.Now().UTC(),
	}
	if err := st.CreateHost(ctx, h, "tok-hash", "pin"); err != nil {
		t.Fatalf("create host: %v", err)
	}

	got, err := st.GetHost(ctx, h.ID)
	if err != nil {
		t.Fatalf("get host: %v", err)
	}
	if !got.AlwaysOn {
		t.Fatalf("new host should default to always_on=true, got false")
	}

	if err := st.SetHostAlwaysOn(ctx, h.ID, false); err != nil {
		t.Fatalf("set always_on: %v", err)
	}
	got, err = st.GetHost(ctx, h.ID)
	if err != nil {
		t.Fatalf("get host 2: %v", err)
	}
	if got.AlwaysOn {
		t.Fatalf("expected always_on=false after toggle, got true")
	}

	// ListHosts must surface the same value.
	hosts, err := st.ListHosts(ctx)
	if err != nil {
		t.Fatalf("list hosts: %v", err)
	}
	if len(hosts) != 1 || hosts[0].AlwaysOn {
		t.Fatalf("ListHosts should report always_on=false, got %+v", hosts)
	}
}
```

- [ ] **Step 6: Run the test (expect FAIL first if written before code, else PASS)**

Run: `go test ./internal/store/ -run TestHostAlwaysOnDefaultAndToggle -v`
Expected: PASS once Steps 1-4 are in. If you wrote the test first, it fails to compile on `AlwaysOn` / `SetHostAlwaysOn` — that is the expected red.

- [ ] **Step 7: Commit**

```bash
go vet ./internal/store/...
git add internal/store/migrations/0024_hosts_always_on.sql internal/store/types.go internal/store/hosts.go internal/store/hosts_always_on_test.go
git commit -m "feat(store): add hosts.always_on flag (default on)"
```

---

## Task 2: Overdue computation helper

This is a pure function so it can be unit-tested in isolation before the scheduler wires it up. It lives in the new `catchup.go` (the scheduler will follow in Task 3, same file).

**Files:**
- Create: `internal/server/http/catchup.go`
- Test: `internal/server/http/catchup_test.go`

- [ ] **Step 1: Write the failing test**

Create `internal/server/http/catchup_test.go`:

```go
package http

import (
	"testing"
	"time"
)

func TestScheduleOverdue(t *testing.T) {
	mustParse := func(s string) time.Time {
		t.Helper()
		v, err := time.Parse(time.RFC3339, s)
		if err != nil {
			t.Fatalf("parse %q: %v", s, err)
		}
		return v
	}
	daily := "0 2 * * *" // 02:00 every day

	cases := []struct {
		name       string
		cron       string
		lastBackup *time.Time
		now        time.Time
		want       bool
	}{
		{
			name: "never backed up is overdue",
			cron: daily, lastBackup: nil,
			now:  mustParse("2026-06-15T09:00:00Z"),
			want: true,
		},
		{
			name:       "missed last night's window",
			cron:       daily,
			lastBackup: ptrTime(mustParse("2026-06-13T02:05:00Z")),
			now:        mustParse("2026-06-15T09:00:00Z"),
			want:       true,
		},
		{
			name:       "backed up after the most recent window",
			cron:       daily,
			lastBackup: ptrTime(mustParse("2026-06-15T02:05:00Z")),
			now:        mustParse("2026-06-15T09:00:00Z"),
			want:       false,
		},
		{
			name:       "unparseable cron is never overdue",
			cron:       "not a cron",
			lastBackup: nil,
			now:        mustParse("2026-06-15T09:00:00Z"),
			want:       false,
		},
	}
	for _, c := range cases {
		t.Run(c.name, func(t *testing.T) {
			got := scheduleOverdue(c.cron, c.lastBackup, c.now)
			if got != c.want {
				t.Fatalf("scheduleOverdue(%q, %v, %v) = %v, want %v",
					c.cron, c.lastBackup, c.now, got, c.want)
			}
		})
	}
}

func ptrTime(t time.Time) *time.Time { return &t }
```

- [ ] **Step 2: Run the test to verify it fails**

Run: `go test ./internal/server/http/ -run TestScheduleOverdue -v`
Expected: FAIL — `undefined: scheduleOverdue`.

- [ ] **Step 3: Implement `scheduleOverdue`**

Create `internal/server/http/catchup.go` with the helper (the scheduler methods are added in Task 3):

```go
// catchup.go — server-side catch-up for intermittent (non-always-on)
// hosts. When such a host reconnects we wait a short settle window,
// then dispatch a backup for any schedule whose window elapsed while
// the host was asleep. This is separate from pending_runs: a host that
// was asleep never fired its local cron, so no pending row exists.
package http

import (
	"time"
)

// scheduleOverdue reports whether a schedule's most recent expected
// fire is newer than the host's last successful backup — i.e. a window
// passed with no backup. A nil lastBackup means "never backed up" and
// is always overdue (provided the cron parses). An unparseable cron is
// treated as not-overdue so a bad expression can never trigger a
// surprise dispatch. Uses the same cronParser the agent's scheduler
// and schedule validation use, so interpretation is identical.
func scheduleOverdue(cronExpr string, lastBackup *time.Time, now time.Time) bool {
	sched, err := cronParser.Parse(cronExpr)
	if err != nil {
		return false
	}
	if lastBackup == nil {
		return true
	}
	// The first fire after the last backup has already passed, so at
	// least one window went by without a backup.
	next := sched.Next(*lastBackup)
	return !next.After(now)
}
```

- [ ] **Step 4: Run the test to verify it passes**

Run: `go test ./internal/server/http/ -run TestScheduleOverdue -v`
Expected: PASS (all four sub-cases).

- [ ] **Step 5: Commit**

```bash
go vet ./internal/server/http/...
git add internal/server/http/catchup.go internal/server/http/catchup_test.go
git commit -m "feat(catchup): scheduleOverdue helper for missed-window detection"
```

---

## Task 3: Catch-up scheduler (arm on hello, fire on tick)

**Files:**
- Modify: `internal/server/http/server.go:68-93` (Server struct), `:96-112` (New)
- Modify: `internal/server/http/catchup.go` (add scheduler methods)
- Modify: `internal/server/http/host_credentials.go:463-486` (onAgentHello)
- Modify: `cmd/server/main.go:228-229` (pending-drain tick case)

- [ ] **Step 1: Add catch-up state to the Server struct**

In `internal/server/http/server.go`, add fields to `Server` (after `treeCache` at line 92):

```go
	// catchupDueAt tracks intermittent hosts that reconnected and are
	// in their settle window. Keyed hostID → earliest time to evaluate
	// catch-up. Best-effort + in-memory: a server restart simply re-arms
	// on the next hello. Guarded by catchupMu.
	catchupMu    sync.Mutex
	catchupDueAt map[string]time.Time
```

Add `"sync"` and `"time"` to the imports if not already present (check the import block).

- [ ] **Step 2: Initialise the map in New**

In `New` (line 106), add to the `&Server{...}` literal:

```go
	catchupDueAt: make(map[string]time.Time),
```

- [ ] **Step 3: Add scheduler methods to catchup.go**

Append to `internal/server/http/catchup.go`. Add `"context"`, `"log/slog"` to its imports:

```go
// catchupSettle is how long after a reconnect we wait before evaluating
// catch-up, so a laptop that wakes briefly and sleeps again doesn't
// trigger a backup it can't finish. ~1 minute per the spec.
const catchupSettle = 60 * time.Second

// ArmCatchup records that an intermittent host just reconnected and
// should be evaluated for a missed backup after the settle window.
// It does not check the host's mode: callers must pass only
// intermittent hosts (runCatchup re-checks before dispatching).
// Re-arming overwrites the timer (debounce — flapping doesn't stack).
func (s *Server) ArmCatchup(hostID string, now time.Time) {
	s.catchupMu.Lock()
	defer s.catchupMu.Unlock()
	if s.catchupDueAt == nil {
		s.catchupDueAt = make(map[string]time.Time)
	}
	s.catchupDueAt[hostID] = now.Add(catchupSettle)
}

// dueCatchups returns the hostIDs whose settle window has elapsed and
// removes them from the map. Caller evaluates each.
func (s *Server) dueCatchups(now time.Time) []string {
	s.catchupMu.Lock()
	defer s.catchupMu.Unlock()
	var due []string
	for id, at := range s.catchupDueAt {
		if !now.Before(at) {
			due = append(due, id)
			delete(s.catchupDueAt, id)
		}
	}
	return due
}

// RunCatchupsDue is the tick entrypoint. For each host past its settle
// window it dispatches a backup for every enabled schedule that is
// overdue. Skips hosts that bounced back offline, that already have a
// job running, or that turned out to be always-on.
func (s *Server) RunCatchupsDue(ctx context.Context) {
	if s.deps.Hub == nil {
		return
	}
	now := time.Now().UTC()
	for _, hostID := range s.dueCatchups(now) {
		s.runCatchup(ctx, hostID, now)
	}
}

// runCatchup evaluates and dispatches catch-up backups for a single
// host. Per-host logic kept here so RunCatchupsDue reads cleanly.
func (s *Server) runCatchup(ctx context.Context, hostID string, now time.Time) {
	conn := s.deps.Hub.Conn(hostID)
	if conn == nil {
		return // bounced offline during the settle window; re-arms on next hello
	}
	host, err := s.deps.Store.GetHost(ctx, hostID)
	if err != nil {
		slog.Warn("catchup: load host", "host_id", hostID, "err", err)
		return
	}
	if host.AlwaysOn {
		return // mode flipped during settle window
	}
	if host.CurrentJobID != nil {
		return // a job is already running; don't pile on
	}
	schedules, err := s.deps.Store.ListSchedulesByHost(ctx, hostID)
	if err != nil {
		slog.Warn("catchup: list schedules", "host_id", hostID, "err", err)
		return
	}
	for _, sc := range schedules {
		if !sc.Enabled || len(sc.SourceGroupIDs) == 0 {
			continue
		}
		if !scheduleOverdue(sc.CronExpr, host.LastBackupAt, now) {
			continue
		}
		for _, gid := range sc.SourceGroupIDs {
			g, err := s.deps.Store.GetSourceGroup(ctx, hostID, gid)
			if err != nil {
				slog.Warn("catchup: load source group",
					"host_id", hostID, "schedule_id", sc.ID, "group_id", gid, "err", err)
				continue
			}
			if _, derr := s.dispatchBackupForGroupCore(ctx, conn, hostID, sc.ID, g, now); derr != nil {
				// Send failed — host dropped again. Re-arm so the
				// evaluation is retried after another settle window;
				// stop processing this host.
				s.ArmCatchup(hostID, now)
				return
			}
			slog.Info("catchup: dispatched missed backup",
				"host_id", hostID, "schedule_id", sc.ID, "group", g.Name)
		}
	}
}
```

- [ ] **Step 4: Arm catch-up on agent hello**

In `internal/server/http/host_credentials.go`, in `onAgentHello` (line 463), after the `go s.DrainPending(...)` line (485), add:

```go
	// Intermittent hosts that just reconnected may have slept through a
	// backup window. Arm a catch-up evaluation after a settle delay; the
	// pending-drain tick fires it. Always-on hosts never need this.
	if host, err := s.deps.Store.GetHost(ctx, hostID); err == nil && !host.AlwaysOn {
		s.ArmCatchup(hostID, time.Now().UTC())
	}
```

`time` should already be imported in this file (it is used elsewhere); add it if not.

- [ ] **Step 5: Fire catch-up from the pending-drain tick**

In `cmd/server/main.go`, in the `case <-pendingDrainTick.C:` block (line 228), change:

```go
	case <-pendingDrainTick.C:
		srv.DrainAllDue(ctx)
```

to:

```go
	case <-pendingDrainTick.C:
		srv.DrainAllDue(ctx)
		srv.RunCatchupsDue(ctx)
```

- [ ] **Step 6: Build and vet**

Run: `go build ./... && go vet ./...`
Expected: clean build, no vet errors.

- [ ] **Step 7: Commit**

```bash
git add internal/server/http/server.go internal/server/http/catchup.go internal/server/http/host_credentials.go cmd/server/main.go
git commit -m "feat(catchup): arm on hello, fire missed-window backups on tick"
```

---

## Task 4: Alert engine — suppress offline + staleness alert

**Files:**
- Modify: `internal/alert/engine.go:121-153` (handleJobFinished), `:155-174` (handleHostOffline), `:188-216` (tick)
- Modify: `internal/alert/rules.go:13-39` (constants), add exported resolve helper
- Test: `internal/alert/intermittent_test.go`

- [ ] **Step 1: Add the staleness threshold constant**

In `internal/alert/engine.go`, add near the top of the file (after imports, before `JobFinishedEvent`):

```go
// staleBackupThreshold is how long an intermittent host may go without
// a successful backup before we raise a stale_schedule alert. Global
// constant for v1 (may become per-host later). Only intermittent hosts
// are evaluated — always-on hosts' stale_schedule stays a no-op.
const staleBackupThreshold = 7 * 24 * time.Hour
```

- [ ] **Step 2: Suppress the offline alert for intermittent hosts**

In `handleHostOffline` (line 155), after loading the host and before the existing `if host.LastSeenAt == nil { return }` guard, add a mode check. Change:

```go
	if host.LastSeenAt == nil {
		return
	}
	if time.Since(*host.LastSeenAt) < e.agentOfflineFloor {
		return
	}
```

to:

```go
	// Intermittent hosts (laptops) legitimately disappear — never raise
	// agent_offline for them. The stale_schedule sweep in tick() is the
	// only staleness signal for these hosts.
	if !host.AlwaysOn {
		return
	}
	if host.LastSeenAt == nil {
		return
	}
	if time.Since(*host.LastSeenAt) < e.agentOfflineFloor {
		return
	}
```

- [ ] **Step 3: Suppress offline + add staleness in the tick sweep**

In `tick` (line 188), the host loop currently raises agent_offline for every offline host. Replace the whole loop (lines 205-214) with:

```go
	for _, h := range hosts {
		// Intermittent hosts: suppress agent_offline entirely; instead
		// raise stale_schedule when they have gone too long with no
		// successful backup AND they have at least one enabled schedule
		// to be measured against. A nil LastBackupAt (never backed up)
		// has no baseline — onboarding/repo_status covers that case.
		if !h.AlwaysOn {
			if h.LastBackupAt == nil {
				continue
			}
			if now.Sub(*h.LastBackupAt) < staleBackupThreshold {
				continue
			}
			hasEnabled, err := e.hostHasEnabledSchedule(ctx, h.ID)
			if err != nil || !hasEnabled {
				continue
			}
			e.raiseAndNotify(ctx, h.ID, KindStaleSchedule, "", "warning",
				fmt.Sprintf("No backup in %s (threshold %s)",
					roundDur(now.Sub(*h.LastBackupAt)), staleBackupThreshold), now)
			continue
		}
		// Always-on hosts: existing agent_offline re-evaluation.
		if h.Status != "offline" || h.LastSeenAt == nil {
			continue
		}
		if now.Sub(*h.LastSeenAt) >= e.agentOfflineFloor {
			e.raiseAndNotify(ctx, h.ID, KindAgentOffline, "", "warning",
				fmt.Sprintf("Agent offline for %s (threshold %s)",
					roundDur(now.Sub(*h.LastSeenAt)), e.agentOfflineFloor), now)
		}
	}
```

Delete the trailing `// Stale-schedule sweep — no-op in v1.` comment at line 215.

- [ ] **Step 4: Add the `hostHasEnabledSchedule` helper**

In `internal/alert/engine.go`, add at the end of the file:

```go
// hostHasEnabledSchedule reports whether the host has at least one
// enabled backup schedule — the precondition for a stale_schedule
// alert (no schedule = no backup expectation to measure against).
func (e *Engine) hostHasEnabledSchedule(ctx context.Context, hostID string) (bool, error) {
	schedules, err := e.store.ListSchedulesByHost(ctx, hostID)
	if err != nil {
		return false, err
	}
	for _, sc := range schedules {
		if sc.Enabled {
			return true, nil
		}
	}
	return false, nil
}
```

- [ ] **Step 5: Resolve staleness on a successful backup**

In `handleJobFinished` (line 146), the `case "succeeded":` currently resolves only the job-kind alert. For a successful backup, also clear any open stale_schedule. Change:

```go
	case "succeeded":
		e.resolveAndNotify(ctx, ev.HostID, kind, dedupKey, ev.When)
	}
```

to:

```go
	case "succeeded":
		e.resolveAndNotify(ctx, ev.HostID, kind, dedupKey, ev.When)
		if ev.Kind == "backup" {
			// A fresh backup clears staleness for intermittent hosts.
			e.resolveAndNotify(ctx, ev.HostID, KindStaleSchedule, "", ev.When)
		}
	}
```

- [ ] **Step 6: Add an exported mode-change resolve hook**

The HTTP toggle handler (Task 5) needs to clear stale alerts when an operator changes a host's mode. Add to `internal/alert/rules.go` (after `Resolve`, around line 100):

```go
// ResolveOnModeChange clears any open agent_offline and stale_schedule
// alerts for a host whose always-on flag was just toggled. The next
// 60s tick re-raises whichever still applies under the new mode, so
// this is a self-correcting "wipe and let the sweep settle" call.
// Safe to invoke from the HTTP layer (it only touches the store + hub).
func (e *Engine) ResolveOnModeChange(ctx context.Context, hostID string, when time.Time) {
	e.resolveAndNotify(ctx, hostID, KindAgentOffline, "", when)
	e.resolveAndNotify(ctx, hostID, KindStaleSchedule, "", when)
}
```

- [ ] **Step 7: Write the engine tests**

Create `internal/alert/intermittent_test.go`. First inspect an existing engine test (e.g. grep `internal/alert/*_test.go` for how `NewEngine` is constructed with a test store + hub, and the helper that creates a host + schedule). Mirror those helpers. The tests to write:

```go
package alert

import (
	"context"
	"testing"
	"time"
)

// Mirror the construction helpers used by the existing engine tests
// (newTestEngine / test store / host+schedule seeding). Replace the
// placeholder helpers below with the real ones from this package's
// existing _test.go files.

func TestIntermittentHostSuppressesOfflineAlert(t *testing.T) {
	ctx := context.Background()
	e, st := newTestEngine(t) // mirror existing helper

	hostID := seedHost(t, st, false /* alwaysOn */)
	// last seen well past the floor
	touchHostSeen(t, st, hostID, time.Now().Add(-2*time.Hour))
	markHostOffline(t, st, hostID)

	e.handleHostOffline(ctx, hostID)

	if n := openAlertCount(t, st, hostID, KindAgentOffline); n != 0 {
		t.Fatalf("intermittent host should not raise agent_offline, got %d", n)
	}
}

func TestAlwaysOnHostStillRaisesOfflineAlert(t *testing.T) {
	ctx := context.Background()
	e, st := newTestEngine(t)

	hostID := seedHost(t, st, true /* alwaysOn */)
	touchHostSeen(t, st, hostID, time.Now().Add(-2*time.Hour))
	markHostOffline(t, st, hostID)

	e.handleHostOffline(ctx, hostID)

	if n := openAlertCount(t, st, hostID, KindAgentOffline); n != 1 {
		t.Fatalf("always-on host should raise agent_offline, got %d", n)
	}
}

func TestStalenessAlertForIntermittentHost(t *testing.T) {
	ctx := context.Background()
	e, st := newTestEngine(t)

	hostID := seedHost(t, st, false)
	seedEnabledSchedule(t, st, hostID) // "0 2 * * *" with a source group
	setLastBackup(t, st, hostID, time.Now().Add(-8*24*time.Hour))

	e.tick(ctx, time.Now().UTC())

	if n := openAlertCount(t, st, hostID, KindStaleSchedule); n != 1 {
		t.Fatalf("expected one stale_schedule alert, got %d", n)
	}

	// A successful backup clears it.
	e.handleJobFinished(ctx, JobFinishedEvent{
		HostID: hostID, JobID: "j1", Kind: "backup",
		Status: "succeeded", When: time.Now().UTC(),
	})
	if n := openAlertCount(t, st, hostID, KindStaleSchedule); n != 0 {
		t.Fatalf("stale_schedule should resolve after backup, got %d", n)
	}
}

func TestNoStalenessWithoutEnabledSchedule(t *testing.T) {
	ctx := context.Background()
	e, st := newTestEngine(t)

	hostID := seedHost(t, st, false)
	setLastBackup(t, st, hostID, time.Now().Add(-8*24*time.Hour))
	// no schedule seeded

	e.tick(ctx, time.Now().UTC())

	if n := openAlertCount(t, st, hostID, KindStaleSchedule); n != 0 {
		t.Fatalf("no schedule => no staleness alert, got %d", n)
	}
}
```

> **Note for the implementer:** the `newTestEngine`, `seedHost`, `touchHostSeen`, `markHostOffline`, `openAlertCount`, `seedEnabledSchedule`, `setLastBackup` helpers must be replaced with the real equivalents in this package's existing tests. If a needed seeding helper doesn't exist, write it using the `store` methods directly (`CreateHost`, `SetHostAlwaysOn`, `CreateSchedule`, `SetHostLastBackup`, `MarkHostsOfflineStale`, `ListAlerts`). Do NOT invent store methods — all required ones exist as of Task 1.

- [ ] **Step 8: Run the tests**

Run: `go test ./internal/alert/ -v`
Expected: PASS for all four new tests plus the existing suite.

- [ ] **Step 9: Commit**

```bash
go vet ./internal/alert/...
git add internal/alert/engine.go internal/alert/rules.go internal/alert/intermittent_test.go
git commit -m "feat(alert): suppress offline + add staleness alert for intermittent hosts"
```

---

## Task 5: HTTP toggle handler + route

**Files:**
- Modify: `internal/server/http/ui_handlers.go` (new handler near `handleUIHostTagsSave` at line 954)
- Modify: `internal/server/http/server.go:281` (route mount)

- [ ] **Step 1: Add the handler**

In `internal/server/http/ui_handlers.go`, after `handleUIHostTagsSave` (line 984), add:

```go
// handleUIHostModeSave flips a host's always-on flag. An always_on
// field with any non-empty value (a checked checkbox) => always-on;
// absent or empty => intermittent.
// Operator-band; mounted in server.go. On change we clear open
// offline/staleness alerts via the engine so the next sweep re-raises
// only what still applies under the new mode.
func (s *Server) handleUIHostModeSave(w stdhttp.ResponseWriter, r *stdhttp.Request) {
	u := s.requireUIUser(w, r)
	if u == nil {
		return
	}
	hostID := chi.URLParam(r, "id")
	if _, err := s.deps.Store.GetHost(r.Context(), hostID); err != nil {
		stdhttp.NotFound(w, r)
		return
	}
	if err := r.ParseForm(); err != nil {
		stdhttp.Error(w, "bad request", stdhttp.StatusBadRequest)
		return
	}
	alwaysOn := r.PostForm.Get("always_on") != ""
	if err := s.deps.Store.SetHostAlwaysOn(r.Context(), hostID, alwaysOn); err != nil {
		slog.Error("ui host mode: save", "host_id", hostID, "err", err)
		stdhttp.Error(w, "internal", stdhttp.StatusInternalServerError)
		return
	}
	if s.deps.AlertEngine != nil {
		s.deps.AlertEngine.ResolveOnModeChange(r.Context(), hostID, time.Now().UTC())
	}
	_ = s.deps.Store.AppendAudit(r.Context(), store.AuditEntry{
		ID: ulid.Make().String(), UserID: &u.ID, Actor: "user",
		Action:     "host.mode_updated",
		TargetKind: ptr("host"), TargetID: &hostID,
		TS: time.Now().UTC(),
	})
	stdhttp.Redirect(w, r, "/hosts/"+hostID, stdhttp.StatusSeeOther)
}
```
|
||
|
||
- [ ] **Step 2: Mount the route**
|
||
|
||
In `internal/server/http/server.go`, next to the tags route (line 281):
|
||
|
||
```go
|
||
r.Post("/hosts/{id}/tags", s.handleUIHostTagsSave)
|
||
```
|
||
|
||
add directly below:
|
||
|
||
```go
|
||
r.Post("/hosts/{id}/mode", s.handleUIHostModeSave)
|
||
```
|
||
|
||
(Confirm it lands in the same operator-band route group as `/hosts/{id}/tags` — same indentation/block.)
|
||
|
||
- [ ] **Step 3: Build and vet**
|
||
|
||
Run: `go build ./... && go vet ./...`
|
||
Expected: clean.
|
||
|
||
- [ ] **Step 4: Write a handler test**

Add to the existing UI-handler test file (grep `internal/server/http/*_test.go` for the harness that builds a `Server` and does form POSTs against `/hosts/{id}/tags`; mirror it). The test posts to `/hosts/{id}/mode` with and without the `always_on` field and asserts the stored flag:

```go
func TestHandleUIHostModeSave(t *testing.T) {
	srv, st, sess := newUITestServer(t) // mirror tags-save test harness
	hostID := seedHostForUI(t, st)      // mirror existing host seeding

	// Uncheck: form without always_on => intermittent.
	postForm(t, srv, sess, "/hosts/"+hostID+"/mode", map[string]string{})
	h, err := st.GetHost(context.Background(), hostID)
	if err != nil {
		t.Fatalf("GetHost after empty post: %v", err)
	}
	if h.AlwaysOn {
		t.Fatalf("expected always_on=false after empty post")
	}

	// Check: form with always_on=on => always-on.
	postForm(t, srv, sess, "/hosts/"+hostID+"/mode", map[string]string{"always_on": "on"})
	h, err = st.GetHost(context.Background(), hostID)
	if err != nil {
		t.Fatalf("GetHost after checked post: %v", err)
	}
	if !h.AlwaysOn {
		t.Fatalf("expected always_on=true after checked post")
	}
}
```

> Replace `newUITestServer`/`seedHostForUI`/`postForm` with the real harness helpers from the existing UI handler tests.

- [ ] **Step 5: Run the test**

Run: `go test ./internal/server/http/ -run TestHandleUIHostModeSave -v`
Expected: PASS.

- [ ] **Step 6: Commit**

```bash
git add internal/server/http/ui_handlers.go internal/server/http/server.go internal/server/http/*_test.go
git commit -m "feat(http): host mode toggle handler + route (host.mode_updated)"
```

---

## Task 6: UI — asleep state, 24×7 chip, mode toggle

**Files:**
- Modify: `web/styles/input.css` (dot-asleep token)
- Modify: `web/templates/partials/host_row.html`
- Modify: `web/templates/partials/host_chrome.html`

- [ ] **Step 1: Add the `dot-asleep` CSS token**

In `web/styles/input.css`, find the `.dot-offline` definition (grep for `dot-offline`) and add a sibling `.dot-asleep` rule. Match the existing dot pattern; use a calm grey-blue distinct from offline's grey/red. Example (adapt colours to the file's existing tokens):

```css
.dot-asleep { background: var(--ink-fade); opacity: 0.6; }
```

> Inspect the neighbouring `.dot-offline` / `.dot-degraded` rules first and follow their exact shape (size, border, etc.); only the colour/opacity should differ.

- [ ] **Step 2: Rebuild CSS if the project precompiles it**

Check the Makefile for a CSS build step (grep `css` in `Makefile`). If present, run it (e.g. `make css`). If the server serves `input.css` directly, skip.

- [ ] **Step 3: Asleep dot + text in host_row.html**

In `web/templates/partials/host_row.html`, change the status-dot block (lines 6-14). Replace the `{{- else if eq $h.Status "offline" -}}` dot branch:

```html
{{- else if eq $h.Status "offline" -}}
  <span class="dot dot-offline"></span>
```

with:

```html
{{- else if eq $h.Status "offline" -}}
  {{- if $h.AlwaysOn -}}
    <span class="dot dot-offline"></span>
  {{- else -}}
    <span class="dot dot-asleep"></span>
  {{- end -}}
```

Then change the last-seen text branch (lines 28-29):

```html
{{- else if eq $h.Status "offline" -}}
  <span class="text-ink-mute">last seen <span class="mono">{{relTime $h.LastSeenAt}}</span></span>
```

to:

```html
{{- else if eq $h.Status "offline" -}}
  {{- if $h.AlwaysOn -}}
    <span class="text-ink-mute">last seen <span class="mono">{{relTime $h.LastSeenAt}}</span></span>
  {{- else -}}
    <span class="text-ink-mute">asleep · <span class="mono">{{relTime $h.LastSeenAt}}</span> · will catch up on return</span>
  {{- end -}}
```

And the row-action label (lines 55-56):

```html
{{- if eq $h.Status "offline" -}}
  <span class="mono text-xs text-ink-fade">offline</span>
```

to:

```html
{{- if eq $h.Status "offline" -}}
  <span class="mono text-xs text-ink-fade">{{if $h.AlwaysOn}}offline{{else}}asleep{{end}}</span>
```

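The nested branch above can be checked in isolation before touching the real partial. The sketch below renders a cut-down version of the dot logic with Go's `html/template`; `hostView`, `renderDot`, and the `dot-online` class are hypothetical stand-ins for illustration, not names taken from the repo.

```go
package main

import (
	"bytes"
	"html/template"
)

// dotTmpl mirrors the offline/asleep branch from the step above.
// "dot-online" is a placeholder for whatever the real partial renders
// for hosts that are not offline.
const dotTmpl = `{{- if eq .Status "offline" -}}
  {{- if .AlwaysOn -}}
    <span class="dot dot-offline"></span>
  {{- else -}}
    <span class="dot dot-asleep"></span>
  {{- end -}}
{{- else -}}
  <span class="dot dot-online"></span>
{{- end -}}`

// hostView is a hypothetical view model carrying only the two fields
// the branch reads.
type hostView struct {
	Status   string
	AlwaysOn bool
}

// renderDot parses and executes dotTmpl for a single host.
func renderDot(h hostView) (string, error) {
	t, err := template.New("dot").Parse(dotTmpl)
	if err != nil {
		return "", err
	}
	var buf bytes.Buffer
	if err := t.Execute(&buf, h); err != nil {
		return "", err
	}
	return buf.String(), nil
}
```

The `{{-` and `-}}` trim markers strip the surrounding whitespace, so an offline intermittent host yields exactly one `dot-asleep` span and an offline always-on host keeps the existing `dot-offline` span.
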
- [ ] **Step 4: Asleep dot + last-seen in host_chrome.html**

In `web/templates/partials/host_chrome.html`, change the offline dot branch (lines 36-37):

```html
{{else if eq $host.Status "offline"}}
  <span class="dot dot-offline"></span>
```

to:

```html
{{else if eq $host.Status "offline"}}
  {{if $host.AlwaysOn}}
    <span class="dot dot-offline"></span>
  {{else}}
    <span class="dot dot-asleep"></span>
  {{end}}
```

And the last-seen line (lines 90-94):

```html
{{if eq $host.Status "offline"}}
  <span>last seen <span class="mono text-ink-mid">{{relTime $host.LastSeenAt}}</span></span>
{{else}}
  <span>online · last heartbeat <span class="mono text-ink-mid">{{relTime $host.LastSeenAt}}</span></span>
{{end}}
```

to:

```html
{{if eq $host.Status "offline"}}
  {{if $host.AlwaysOn}}
    <span>last seen <span class="mono text-ink-mid">{{relTime $host.LastSeenAt}}</span></span>
  {{else}}
    <span>asleep · last seen <span class="mono text-ink-mid">{{relTime $host.LastSeenAt}}</span> · will catch up on return</span>
  {{end}}
{{else}}
  <span>online · last heartbeat <span class="mono text-ink-mid">{{relTime $host.LastSeenAt}}</span></span>
{{end}}
```

- [ ] **Step 5: Add the 24×7 chip + mode toggle to host_chrome.html**

In the header tags block (lines 42-48), after the tags `edit/add tags` button and before the closing `</div>` at line 48, add the chip (shown only when always-on) and a small toggle button mirroring the tags-editor reveal pattern:

```html
{{if $host.AlwaysOn}}<span class="tag" title="Expected online 24×7 — offline raises an alert">24×7</span>{{end}}
<button type="button" class="text-ink-fade text-[11px] hover:text-ink-mid whitespace-nowrap"
        style="padding: 2px 8px; border: 1px dashed var(--line); border-radius: 3px; cursor: pointer;"
        onclick="document.getElementById('mode-edit-{{$host.ID}}').classList.toggle('hidden')"
        title="Change presence mode">presence</button>
```

Then add the toggle form right after the tags `<form>` block (after line 82, before the `<div class="flex items-center gap-3 mt-3 ...">` at line 83):

```html
{{/* Presence-mode editor — hidden by default; toggled by the
     "presence" button. Checkbox present => always-on (24×7);
     unchecked => intermittent (laptop): no offline alerts, shows
     "asleep", auto-catches-up a missed backup on reconnect. */}}
<form id="mode-edit-{{$host.ID}}" method="post"
      action="/hosts/{{$host.ID}}/mode"
      class="hidden mt-3" style="max-width: 640px;">
  <label class="flex items-center gap-2 text-[12px] text-ink-mid">
    <input type="checkbox" name="always_on" value="on" {{if $host.AlwaysOn}}checked{{end}} />
    Always On — expected online 24×7
  </label>
  <div class="field-help">
    Uncheck for an intermittent host (laptop/workstation): it won’t
    raise offline alerts when asleep, shows an “asleep” state, and
    catches up a missed backup ~1 minute after it reconnects.
  </div>
  <button type="submit" class="btn btn-primary mt-2 whitespace-nowrap">Save presence</button>
</form>
```

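The form depends on standard HTML checkbox behaviour: a checked box is submitted as `always_on=on`, while an unchecked box is left out of the request body entirely. A minimal sketch of that mapping, assuming the handler treats the literal value `on` as always-on; `alwaysOnFromBody` is a hypothetical helper used only for illustration:

```go
package main

import "net/url"

// alwaysOnFromBody decodes a URL-encoded form body and reports whether the
// always_on checkbox was ticked. A missing field means the box was unchecked,
// which this plan interprets as an intermittent host.
func alwaysOnFromBody(body string) (bool, error) {
	form, err := url.ParseQuery(body)
	if err != nil {
		return false, err
	}
	return form.Get("always_on") == "on", nil
}
```

This is why the handler test posts an empty form for the intermittent case instead of sending `always_on=off`: the browser never sends a value for an unchecked box.
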
- [ ] **Step 6: Verify templates parse**

Run: `go build ./... && go test ./internal/server/... -run Template -v` (if a template-render test exists; otherwise rely on the smoke run in Step 7). At minimum: `go build ./...` must pass.

- [ ] **Step 7: Manual smoke (per CLAUDE.md smoke targets)**

```bash
make smoke-deploy
```

Then in a browser (or Playwright): open the dashboard and a host detail page. Toggle a host to intermittent via the "presence" control, confirm the 24×7 chip disappears, and confirm an offline/sleeping intermittent host renders the grey "asleep · … · will catch up on return" line instead of red "offline". Toggle back and confirm the chip returns.

- [ ] **Step 8: Commit**

```bash
git add web/styles/input.css web/templates/partials/host_row.html web/templates/partials/host_chrome.html
git commit -m "feat(ui): asleep state, 24×7 chip, presence toggle for host mode"
```

---

## Task 7: Record in tasks.md + final verification

**Files:**
- Modify: `tasks.md`

- [ ] **Step 1: Add a tasks.md entry**

Add a `[x]` entry under "Next steps from testing" in `tasks.md` (mirroring the NS-07 style: one line plus a short "As shipped" note) describing the always-on/intermittent host mode: `always_on` column (default on), offline-alert suppression + 7-day staleness alert for intermittent hosts, settle-then-catch-up on reconnect, and the asleep UI + 24×7 chip + presence toggle.

- [ ] **Step 2: Full verification**

```bash
go vet ./...
go test ./...
```

Expected: vet clean, all tests green.

- [ ] **Step 3: Commit**

```bash
git add tasks.md
git commit -m "docs(tasks): record always-on/intermittent host mode"
```

---

## Self-Review notes

- **Spec coverage:** §1 data model → Task 1. §2 mechanics unchanged → no task needed (verified untouched). §3 alerts (suppress offline, staleness, resolve-on-backup, resolve-on-toggle) → Task 4 + Task 5 Step 1. §4 catch-up (arm on hello, settle, per-schedule overdue, dispatch, guards) → Tasks 2-3. §5 UI (dot-asleep, asleep text, 24×7 chip, toggle) → Task 6. Testing → tests in Tasks 1-5. Out-of-scope items respected (global 7d const, reconnect-only, no agent-side cron, always-on stale_schedule untouched).
- **Type consistency:** `scheduleOverdue(cronExpr string, *time.Time, time.Time) bool`, `ArmCatchup(hostID string, now time.Time)`, `RunCatchupsDue(ctx)`, `SetHostAlwaysOn(ctx, hostID, bool)`, `ResolveOnModeChange(ctx, hostID, when)`, `Host.AlwaysOn bool` are used consistently across tasks.
- **No invented methods:** every call into existing code (GetHost, ListSchedulesByHost, GetSourceGroup, SetHostLastBackup, ListAlerts, AppendAudit, dispatchBackupForGroupCore, Hub.Conn/Connected) exists in the current tree; `SetHostAlwaysOn` is the only new store method and is defined in Task 1.
- **Test helper caveat:** the alert and HTTP handler tests reference package-local helpers (`newTestEngine`, `newUITestServer`, etc.) that must be matched to the real names in existing `_test.go` files at implementation time. This is flagged inline in each task.