314 Commits

Author SHA1 Message Date
steve 001575ae9c tasks: P6-03 done, repo size trend graphs 2026-05-07 19:20:05 +01:00
steve 28cc55711d test: assert Trend panel renders on full repo page 2026-05-07 19:14:34 +01:00
steve 98cc490ea8 ui: trend panel + range selector on host repo page 2026-05-07 19:10:59 +01:00
steve be4ac02ddd ui: 30d repo-size sparkline on every dashboard host row 2026-05-07 19:02:35 +01:00
steve 6e8a1c5b45 web/sparkline: guard days[i] against shorter days slice in RenderChart 2026-05-07 18:58:33 +01:00
steve e7d25cd704 web/sparkline: two-axis trend chart with hover dots 2026-05-07 18:55:31 +01:00
steve db88c5a7d1 web/sparkline: inline-SVG sparkline renderer (empty / single / multi) 2026-05-07 18:50:23 +01:00
steve bb2a88be24 ws: record daily repo stats history alongside current upsert 2026-05-07 18:46:26 +01:00
steve b9c7ec6ebf store: history table helpers (upsert/list, COALESCE preserves prior values) 2026-05-07 18:43:20 +01:00
steve da518de3e6 store: migration 0023 host_repo_stats_history 2026-05-07 18:39:44 +01:00
steve 55453300b0 Merge pull request 'tidy: project finished backup jobs onto host row + smoke doc tweaks' (#20) from tidy-up-last-backup-projection into main
Reviewed-on: #20
2026-05-07 16:58:16 +00:00
steve 0a75b82c17 fix: project finished backup jobs onto host row + smoke path tweaks
The dashboard's 'Last backup' column reads hosts.last_backup_at /
last_backup_status, but the WS handler only updated hosts.repo_status
on job.finished — backup terminations were silently dropped. Add a
SetHostLastBackup store method and call it from the same job.finished
switch that already handles init jobs.

Also: CLAUDE.md restage block uses /tmp/rm-smoke (the original
default) but the actual dev env runs out of $HOME/smoke. Update the
paths in the doc to match.
2026-05-07 17:55:23 +01:00
steve b60c2c6f6b Merge pull request 'P6-01 + P6-02: agent self-update + fleet update' (#19) from p6-agent-self-update into main
Reviewed-on: #19
2026-05-07 16:49:25 +00:00
steve 1909f71f90 tasks: mark P6-01 + P6-02 done with as-shipped block 2026-05-06 22:33:33 +01:00
steve dddff10b99 agent unit: allow writes to /usr/local/bin for self-update
Smoke caught this: ProtectSystem=full mounts /usr read-only so the
agent couldn't write its own .new staging file or atomic-rename over
the running binary. Adding /usr/local/bin to ReadWritePaths is the
minimum diff that lets self-update work; the whole-dir grant is
required because os.Rename needs write on the parent directory.
2026-05-06 22:32:50 +01:00
steve 39304b08d0 ui: dashboard hosts-behind tile + filter
- Add ?updates=behind query filter and the matching dashboardFilter
  field; round-trips through encode/parse.
- Compute UpdatesBehind on the dashboard view-model (online + version
  trailing the server) and surface as an amber hero tile that links
  to the filtered list.
- Test exercising the new filter case.
2026-05-06 22:20:54 +01:00
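
The UpdatesBehind count described above (online hosts whose agent version trails the server) could look roughly like this. The `host` struct and `countUpdatesBehind` are hypothetical names, and plain string inequality is an assumption; the real view-model may compare versions differently.

```go
package main

import "fmt"

// host is a hypothetical, trimmed-down host row for this sketch.
type host struct {
	Online       bool
	AgentVersion string
}

// countUpdatesBehind counts online hosts whose agent version differs from
// the server version. Inequality is used as a stand-in for "trailing".
func countUpdatesBehind(hosts []host, serverVersion string) int {
	n := 0
	for _, h := range hosts {
		if h.Online && h.AgentVersion != serverVersion {
			n++
		}
	}
	return n
}

func main() {
	// Sample data invented for illustration.
	hosts := []host{
		{Online: true, AgentVersion: "1.0.0"},
		{Online: true, AgentVersion: "1.1.0"},
		{Online: false, AgentVersion: "1.0.0"}, // offline hosts are not counted
	}
	fmt.Println(countUpdatesBehind(hosts, "1.1.0"))
}
```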
steve 9bcd8bc5fe ui: update chip + per-host button
- Surface UpdateAvailable + TargetVersion on the dashboard host row,
  the host_chrome header, and the JSON Host shape.
- New host_update_chip partial renders an amber out-of-date pill
  next to the agent-version display when the host's agent trails
  the server.
- Host detail right-rail gains an admin-only Update agent button
  (disabled when host is offline or already updating).
- New .update-chip and .btn-amber CSS tokens; tailwind output
  refreshed.
2026-05-06 22:20:40 +01:00
steve e6cfb1cd9f ui: fleet update page + endpoints
- POST /api/fleet/update, POST /api/fleet-updates/{id}/cancel,
  GET /api/fleet-updates/{id} (admin-only).
- GET /settings/fleet-update + /partial for htmx polling.
- Renders idle / running / terminal states with per-host progress.
- Tests cover happy path, derive-host-ids, conflict, cancel, get,
  and RBAC.
2026-05-06 22:20:03 +01:00
steve 9d5775fb47 p6-01/02: agent self-update + fleet update server cluster
- alert: update_failed (per-host, dedup=hostID) + fleet_update_halted
  (system-scoped, host_id NULL via new RaiseOrTouchSystem helper).
- ws: UpdateWatcher tracks in-flight command.update dispatches and
  reconciles them against incoming hello envelopes — success path
  marks the job succeeded and auto-resolves the alert; 90s timeout
  marks the job failed and raises update_failed.
- http: POST /api/hosts/{id}/update (admin-only JSON) + the HTMX
  /hosts/{id}/update form variant. Pre-checks: host exists, online,
  agent_version != current, no running update job. Refactored core
  into Server.dispatchHostUpdate so the fleet worker can share it
  without going through HTTP.
- fleetupdate: rolling worker iterating through host slots, halting
  on first failure and raising fleet_update_halted. Polling-based
  version-match (re-read hosts.agent_version every 1s up to 95s) —
  no extra plumbing into the WS hello path. At-most-one-running is
  enforced at the store layer (ErrFleetUpdateRunning).
- cmd/server: wire UpdateWatcher and FleetWorker into the main
  goroutine; the worker uses a small serverDispatcher adapter that
  delegates back into Server.DispatchHostUpdate.

Tests: watcher (success/timeout/mismatch/late-hello), HTTP endpoint
(happy + four pre-check branches + RBAC), worker (two-host happy,
timeout-halt, host-offline-halt, already-at-target skip, cancel
mid-run, double-Start guard).
2026-05-06 22:03:50 +01:00
steve c37954aa3f store: migrations 0021+0022 + fleet_updates CRUD 2026-05-06 21:47:54 +01:00
steve efed96f67a agent: command.update handler + updater package (Linux + Windows) 2026-05-06 21:42:50 +01:00
steve f31f6edde7 http: expose GET /api/version 2026-05-06 21:39:13 +01:00
steve 516c50fa16 version: build-time version package + Makefile ldflags wiring 2026-05-06 21:38:35 +01:00
steve a8256f5aff tasks: rewrite P6-01/02 around server-bundled agent self-update
The original plan was apt repo + Chocolatey package. The P5-03 Docker
pivot bundled matching agent binaries into the server image and
exposes them via /agent/binary, so 'update agent' now collapses to
're-fetch from your own server'. No third-party packaging or signing
infra needed. P6-01 drops to S; P6-02 keeps the dashboard reporting
+ fleet-update UX but points at the new mechanism.
2026-05-06 21:08:22 +01:00
steve ab7fee0ae7 ci(release): use DEV_TOKEN for registry login
The auto-issued GITHUB_TOKEN lacks write:package scope on this Gitea
instance, so the v0.9.0 tag build failed at docker login. Switch to
the user-level DEV_TOKEN secret which has the correct scope.
v0.9.0
2026-05-06 19:05:54 +01:00
steve ed276813f0 Merge pull request 'testing: bootstrap UI, agent reliability, NS-01..04 + alert username' (#18) from ns-batch-host-ops into main 2026-05-05 21:09:17 +00:00
steve 02e4ef7544 testing: bootstrap UI, agent reliability, NS-01..04 + alert username
Smooths the rough edges that came up exercising a live deployment.

First-run bootstrap UI: /bootstrap renders a username + password form
that uses the in-memory token directly (operator no longer copies it
out of the log); /login redirects there while bootstrap is available.

Agent reliability: failJob synthetic envelopes so command.run early
returns no longer hang the server-side job; runtime probe of restic
restore --help drives --no-ownership instead of version sniffing
(0.18.x had it removed). Server unit re-shaped: ProtectSystem=full
plus ReadWritePaths=/etc/restic-manager, no ProtectHome — restore
can now write anywhere a user might want.

Restore wizard: default target is /root/rm-restore/<job-id>/ with
clearer help text. Re-init confirm input uses .field (was .input,
which doesn't exist — text was invisible).

NS-01 host delete: store DeleteHost, admin-band /hosts/{id}/delete
with hostname-confirm danger zone, audit, FK cascade, live WS close.

NS-02 enrollment-token recovery: outstanding-tokens panel on
/hosts/new, regenerate (preserves attachments) and revoke handlers
+ audit, store-level ListOutstandingEnrollmentTokens and
DeleteEnrollmentToken.

NS-03 repo init / probe surface: migration 0020 adds
hosts.repo_status + repo_status_error; WS handler projects every
init job's outcome onto the host row (idempotent already-initialised
collapses to ready); creds-save resets status and dispatches a fresh
probe; /hosts/{id}/repo/probe retry endpoint with banner.

NS-04 dashboard live + sort + filter: query-string filter
(q/status/repo_status/tag/sort/dir), 5s htmx live poll mirroring the
alerts pattern with a localStorage live toggle, sortable column
headers, filter row + clear.

Alerts page: ack'd-by line resolves user_id ULID to username.

Compose.yaml ignored — host-specific.
2026-05-05 22:03:15 +01:00
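
The runtime probe mentioned under agent reliability (inspect `restic restore --help` for `--no-ownership` rather than sniffing the version) might be sketched like this. `helpMentionsFlag` and `probeFlag` are hypothetical names, and the help text in `main` is made up for illustration; it is not real restic output.

```go
package main

import (
	"context"
	"fmt"
	"os/exec"
	"strings"
)

// helpMentionsFlag reports whether help text lists flag as a standalone
// token (optionally followed by "=value").
func helpMentionsFlag(help, flag string) bool {
	for _, f := range strings.Fields(help) {
		f = strings.TrimRight(f, ",.")
		if f == flag || strings.HasPrefix(f, flag+"=") {
			return true
		}
	}
	return false
}

// probeFlag runs "<bin> <args...> --help" and checks the output for flag.
// It is not called in main so the sketch runs without restic installed.
func probeFlag(ctx context.Context, bin, flag string, args ...string) (bool, error) {
	full := append(append([]string{}, args...), "--help")
	out, err := exec.CommandContext(ctx, bin, full...).CombinedOutput()
	if err != nil {
		return false, err
	}
	return helpMentionsFlag(string(out), flag), nil
}

func main() {
	// Invented sample help text, NOT copied from restic.
	sample := "Flags:\n      --no-ownership   skip ownership\n  -t, --target string  directory\n"
	fmt.Println(helpMentionsFlag(sample, "--no-ownership"))
}
```

A call such as `probeFlag(ctx, "restic", "--no-ownership", "restore")` would then decide whether the flag is passed, independent of the installed version.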
steve ddb46e16b6 Merge pull request 'P5-03 + P5-07: docker-only release path & reference deployment' (#17) from p5-03-docker-release into main
Reviewed-on: #17
2026-05-05 16:36:08 +00:00
steve e8913943f9 p5-07: reference deployment (server-only compose + reverse-proxy docs)
The reverse proxy is assumed to live outside this project (Caddy,
nginx, Traefik, whatever the operator already runs). The reference
compose stands up only the server: image-pinned via RM_VERSION,
named volume for operator state, localhost-bound so the proxy
reaches it on loopback.

docs/reverse-proxy.md covers what the proxy must forward — the
X-Forwarded-* headers, Host, and Connection: upgrade for the agent
WebSocket and live-log streams — plus the RM_TRUSTED_PROXY CIDR
rule that gates header trust. Worked examples for Caddy, nginx
(with the websocket upgrade map + 1h proxy_read_timeout for live
logs), and Traefik.
2026-05-05 17:15:00 +01:00
steve fb978ad10c p5-03: docker-only release path (drop goreleaser)
Single public deliverable per tag: a multi-arch server image, with
cross-compiled agent binaries + install scripts + the systemd unit
baked under /opt/restic-manager/dist/. The /agent/binary and
/install/* handlers fall back from <DataDir>/... to that read-only
path so a fresh container Just Works without first-run staging;
operators can still drop a custom build into <DataDir>/ to override
per-host.

Architecture rationale: agent distribution already routes through
the running server, so the release surface mirrors that — there's
no second source of truth to keep in sync.

Workflow .gitea/workflows/release.yml triggers on v*.*.* tag-push
(fan-out :vX.Y.Z / :X.Y / :X, plus :latest once MAJOR>=1) and
workflow_dispatch (snapshot tag only). Pushes to the Gitea
container registry on this instance.

Both binaries grow main.commit + main.date ldflag targets. Makefile
and Dockerfile fill them; release workflow forwards from gitea.sha
plus a UTC timestamp.

Spec : docs/superpowers/specs/2026-05-05-p5-03-docker-only-release.md
Plan : docs/superpowers/plans/2026-05-05-p5-03-docker-only-release.md
2026-05-05 15:18:48 +01:00
steve 9abdedf40a Merge pull request 'P4-05: OIDC login (generic, JIT-provisioned)' (#16) from p4-05-oidc into main
Reviewed-on: #16
2026-05-05 13:46:23 +00:00
steve 2e1961beee oidc: merge userinfo claims; tick P4-05 in tasks.md
Authelia (and many other IdPs) only put `sub` in the ID token by
default, surfacing `preferred_username`/`email`/`groups` from the
userinfo endpoint. Fetch userinfo after id_token verification and
fold its claims into the parsed claim map; the id_token claims
remain authoritative on conflict so the signed assertion still
wins.

Live sweep against https://auth.dcglab.co.uk verified all four
flows: rm-admin → admin JIT, rm-operator → operator JIT (RBAC
denies admin pages), rm-viewer → viewer JIT (RBAC denies operator
pages), rm-other → no_role_match banner with no row created.
Returning rm-admin sign-in resolves to the same row by sub.
Screenshots in _diag/p4-05-sweep/.
2026-05-05 14:06:28 +01:00
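
The claim-merging rule in the commit above (fold userinfo claims into the parsed claim map, with id_token claims winning on conflict) reduces to an ordered map overlay. `mergeClaims` is a hypothetical name and the claim values are invented; the actual client wraps go-oidc and is not shown here.

```go
package main

import "fmt"

// mergeClaims returns a new map holding the userinfo claims overlaid by the
// id_token claims, so the signed assertion wins whenever both define a key.
func mergeClaims(idToken, userinfo map[string]any) map[string]any {
	merged := make(map[string]any, len(idToken)+len(userinfo))
	for k, v := range userinfo {
		merged[k] = v
	}
	for k, v := range idToken {
		merged[k] = v // id_token is authoritative on conflict
	}
	return merged
}

func main() {
	// Sample claims invented for illustration.
	idToken := map[string]any{"sub": "abc123"}
	userinfo := map[string]any{
		"sub":                "conflicting-value",
		"preferred_username": "alice",
		"groups":             []string{"rm-admin"},
	}
	merged := mergeClaims(idToken, userinfo)
	fmt.Println(merged["sub"], merged["preferred_username"])
}
```

Writing the userinfo entries first and the id_token entries second keeps the precedence rule in one place and leaves both input maps unmodified.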
steve e0989e1cef server: build OIDC client at startup; sweep oidc_state on alert tick 2026-05-05 13:45:52 +01:00
steve fce7245a51 ui(users): oidc chip on list + readonly fields on edit for OIDC users 2026-05-05 13:42:57 +01:00
steve 5154b24fab ui: login page — SSO button + oidc_error banner 2026-05-05 13:40:13 +01:00
steve 1cf9cb752f http: local-login rejects auth_source='oidc' users 2026-05-05 13:37:07 +01:00
steve d2ffc98f3c http: logout — 303 to end_session_endpoint with id_token_hint for OIDC sessions 2026-05-05 13:34:47 +01:00
steve 1fd9dce8a2 http: GET /auth/oidc/callback — JIT-provision, refresh, deny paths 2026-05-05 13:30:00 +01:00
steve 746324e65a http: GET /auth/oidc/login — generate state/PKCE, redirect to IdP 2026-05-05 13:26:06 +01:00
steve ede014e85b oidc: test stub IdP + happy-path exchange test 2026-05-05 13:23:16 +01:00
steve 4594e563ef oidc: client wrapper around go-oidc — discovery, exchange, claim parse 2026-05-05 13:20:08 +01:00
steve db2fcdd52e config: OIDCConfig — YAML + env overlay with defaults 2026-05-05 13:18:01 +01:00
steve e2976a42e6 store: oidc_state CRUD + 5-minute cleanup 2026-05-05 13:15:45 +01:00
steve 14be63510c store: round-trip IDToken on sessions for RP-initiated logout 2026-05-05 13:14:27 +01:00
steve 70aa22e87e store: GetUserByOIDCSubject + scanUser auth_source/oidc_subject 2026-05-05 13:12:11 +01:00
steve 154b57a4cd store: extend User with AuthSource/OIDCSubject; Session with IDToken 2026-05-05 13:09:49 +01:00
steve c5b29b88b9 store: migration 0019 — users.auth_source/oidc_subject + sessions.id_token + oidc_state 2026-05-05 13:08:15 +01:00
steve 1df072a211 Merge pull request 'Phase 4 — P4-07: per-host tags + dashboard chip-row filter' (#15) from p4-07-host-tags into main
Reviewed-on: #15
2026-05-05 10:55:11 +00:00
steve 2421d5d389 ui(tags): edit-button label, Save-tags width, persistent help text 2026-05-05 11:23:36 +01:00
steve 168059ae45 feat(hosts): per-host tags edit + dashboard chip-row filter (P4-07) 2026-05-05 11:16:09 +01:00