From 20425b336092ff8b660dfb3cdd23050e9daae224 Mon Sep 17 00:00:00 2001
From: Steve Cliff
Date: Thu, 7 May 2026 18:09:25 +0100
Subject: [PATCH] spec: P6-03 repo size trend (sparkline + chart) design

---
 ...2026-05-07-p6-03-repo-size-trend-design.md | 223 ++++++++++++++++++
 1 file changed, 223 insertions(+)
 create mode 100644 docs/superpowers/specs/2026-05-07-p6-03-repo-size-trend-design.md

diff --git a/docs/superpowers/specs/2026-05-07-p6-03-repo-size-trend-design.md b/docs/superpowers/specs/2026-05-07-p6-03-repo-size-trend-design.md
new file mode 100644
index 0000000..d140679
--- /dev/null
+++ b/docs/superpowers/specs/2026-05-07-p6-03-repo-size-trend-design.md
@@ -0,0 +1,223 @@
+# P6-03 — Repo size trend graphs
+
+Sparkline on the dashboard host row + full chart on the host repo
+page, both showing repo growth over time. Closes the last
+operator-visibility gap in Phase 6 alongside Prometheus metrics
+(P6-04).
+
+## Goals
+
+- Operators can see at a glance whether a host's repo is growing,
+  stable, or shrinking, without leaving the dashboard.
+- A second screen on the repo page exposes the same data over a
+  longer window with a snapshot-count overlay so retention
+  behaviour can be eyeballed against size.
+- Zero new client-side dependencies; matches the existing
+  HTMX + server-rendered idiom used everywhere else in the UI.
+
+## Non-goals
+
+- No backfill of historical data. Trend lights up with whatever
+  the agents report from the day this ships.
+- No per-source-group breakdown — repo-level only.
+- No alerting on growth rate (deferred to a future ticket if a
+  user asks).
+- No JSON API surface. Prometheus exposure is P6-04, separate.
+
+## Decisions taken in brainstorming
+
+- **Metrics:** `total_size_bytes` (sparkline + chart) and
+  `snapshot_count` (chart only). Raw size dropped as redundant.
+- **Cadence:** one row per `(host_id, UTC date)`, last-write-wins
+  per column. Bounded at ~365 rows/host/year regardless of job
+  frequency.
+- **Backfill:** none. Pure forward-fill from launch day.
+- **Rendering:** server-rendered inline SVG, no JS library.
+- **Spans:** sparkline fixed at 30 days; chart has `30d | 90d | 1y`
+  range selector, server-rendered swap.
+
+## Schema
+
+New migration `internal/store/migrations/0023_host_repo_stats_history.sql`:
+
+```sql
+CREATE TABLE host_repo_stats_history (
+  host_id          TEXT NOT NULL REFERENCES hosts(id) ON DELETE CASCADE,
+  day              TEXT NOT NULL,  -- 'YYYY-MM-DD' UTC
+  total_size_bytes INTEGER,        -- nullable; partial patches don't overwrite
+  snapshot_count   INTEGER,        -- nullable
+  recorded_at      TEXT NOT NULL,  -- RFC3339Nano of last write touching this row
+  PRIMARY KEY (host_id, day)
+);
+CREATE INDEX host_repo_stats_history_host_day
+  ON host_repo_stats_history(host_id, day DESC);
+```
+
+FK cascade matches every other host-scoped table; deleting a host
+through `Store.DeleteHost` (NS-01) wipes its history automatically.
+
+## Write path
+
+Hook the existing `MsgRepoStats` handler in
+`internal/server/ws/handler.go` (around line 319). After the
+existing `UpsertHostRepoStats(ctx, hostID, patch)` call, append:
+
+```go
+day := time.Now().UTC().Format("2006-01-02")
+if err := deps.Store.UpsertHostRepoStatsHistory(ctx, hostID, day, patch); err != nil {
+	slog.Warn("ws: upsert host repo stats history", "host_id", hostID, "err", err)
+}
+```
+
+A history-write failure is logged and dropped — never blocks the
+main upsert. The partial-update contract that
+`UpsertHostRepoStats` already implements is preserved at the
+history layer:
+
+```sql
+INSERT INTO host_repo_stats_history (host_id, day, total_size_bytes, snapshot_count, recorded_at)
+VALUES (?, ?, ?, ?, ?)
+ON CONFLICT(host_id, day) DO UPDATE SET
+  total_size_bytes = COALESCE(excluded.total_size_bytes, host_repo_stats_history.total_size_bytes),
+  snapshot_count   = COALESCE(excluded.snapshot_count, host_repo_stats_history.snapshot_count),
+  recorded_at      = excluded.recorded_at;
+```
+
+This is critical: the agent's prune handler in
+`internal/agent/runner/runner.go:318` emits a stats patch that
+only carries `LastPruneAt`. Without `COALESCE`, that prune ack
+would null out a `total_size_bytes` we'd already captured from a
+backup earlier the same day.
+
+## Read path
+
+Two new helpers in `internal/store/host_repo_stats_history.go`:
+
+```go
+type RepoStatsHistoryPoint struct {
+	Day            time.Time // 00:00:00 UTC
+	TotalSizeBytes *int64
+	SnapshotCount  *int64
+}
+
+func (s *Store) ListHostRepoStatsHistory(
+	ctx context.Context, hostID string, since time.Time,
+) ([]RepoStatsHistoryPoint, error)
+```
+
+Returns rows ordered by `day` ascending where at least one metric
+is non-null. The renderer connects available points with a
+straight line — there is no explicit gap representation. A host
+that was offline for a week shows a single segment spanning the
+gap, which is the right visual: the repo state didn't change.
+
+## Rendering
+
+New package `internal/web/sparkline`. Pure Go, no template
+dependency:
+
+```go
+type Series struct {
+	Name   string
+	Points []float64 // nil-points represented as math.NaN
+	Stroke string    // CSS color
+}
+
+func RenderSparkline(points []float64, width, height int) template.HTML
+func RenderChart(series []Series, days []time.Time, opts ChartOpts) template.HTML
+```
+
+`RenderChart` produces a 600×220 SVG with:
+
+- Light horizontal gridlines (4 bands).
+- Two y-axes: bytes (left, blue) and count (right, amber). Each
+  series is normalised against its own axis.
+- X-axis labels at start, midpoint, and end of the window.
+- Per-point `<circle>` with a `<title>` for hover tooltips —
+  accessible by default, no JS.
+- Empty state: faint dashed baseline + centered "no data yet"
+  text.
+
+Sparkline is 80×20, single blue polyline, single `<title>` on the
+group element showing `"current → 30d ago"`.
+
+Two new partials:
+
+- `web/templates/partials/repo_size_sparkline.html`
+- `web/templates/partials/repo_size_chart.html`
+
+Both call into the renderer with the appropriate opts. No
+inline `<style>` — colours come from existing Tailwind palette
+classes already used elsewhere (`text-blue-500`, `text-amber-500`).
+
+## UI placement
+
+### Dashboard host row
+
+`web/templates/partials/host_row.html` gains one `<td>` between
+the existing "Repo size" cell and "Snapshots" cell. Width ≈ 88px.
+Cell renders the sparkline partial; if `len(points) < 2` the cell
+shows "—" centred (matches the existing no-data idiom for
+last-backup time in the same partial).
+
+The dashboard's existing 5-second htmx live-refresh
+(`hx-trigger="every 5s ..."` from NS-04) re-renders this cell
+along with the rest of the row. No extra polling.
+
+### Host repo page
+
+`web/templates/pages/host_repo.html` gains a "Trend" panel
+inserted between the existing summary panel and the maintenance
+panel. Panel contains:
+
+- Range pills `30d | 90d | 1y` (anchor links with
+  `hx-get="/hosts/{id}/repo/trend?range=…"` and
+  `hx-target="#repo-trend-chart" hx-swap="outerHTML"`).
+- The chart partial wrapped in `<div id="repo-trend-chart">`.
+- A small legend strip below the chart.
+
+## Endpoints
+
+- `GET /hosts/{id}/repo/trend?range=30d|90d|1y` — admin/operator,
+  htmx fragment, returns the chart partial. Auth reuses the
+  existing host-scoped middleware on the `/hosts/{id}` family.
+  Invalid `range` falls back to 30d.
+
+No new admin-only surface — anyone with read access to the host
+can see the trend.
+
+## Testing
+
+- `internal/store/host_repo_stats_history_test.go` — upsert
+  merges partial patches without nulling; ordering; since-day
+  filter; cascade on host delete.
+- `internal/web/sparkline/sparkline_test.go` — golden SVG files
+  for: empty input, single point, full 30-day series, mixed
+  null points. Goldens live under `testdata/`.
+- `internal/server/http/ui_repo_test.go` — trend panel renders
+  with seeded history; range selector swaps server-side; empty
+  state.
+- `internal/server/http/ui_dashboard_test.go` — host row sparkline
+  cell present and renders SVG when points exist, "—" when not.
+- Smoke after build: dashboard row shows sparkline once two days
+  of data exist; repo page chart toggles cleanly between ranges.
+
+## Migration / rollout
+
+- Schema migration is additive — no risk to existing tables.
+- Write path is best-effort; on schema issue the main repo-stats
+  upsert is unaffected.
+- No agent change required, so no fleet update needed.
+
+## Acceptance
+
+- After two days of operation, the dashboard sparkline shows a
+  visible line for any host that has run a backup or
+  maintenance op on both days.
+- Host repo page renders the trend panel with the snapshot-count
+  overlay; range selector switches view without a full page
+  reload.
+- `go test ./...` and `go vet ./...` clean.
+- Smoke env exercise: backup → sparkline updates; range pills
+  swap; FK cascade verified by deleting a host and checking the
+  history table.