P2 completion (P2R-09/10/11/12/13/14, P2-16/17/18) #5

Merged
steve merged 16 commits from p2-completion into main 2026-05-04 14:19:06 +01:00

Summary

Closes every remaining P2 item in tasks.md. 15 commits on this branch; full test suite green under -race.

  • P2R-09 auto-init UX — init status line in vitals strip, danger-zone re-init with typed-hostname confirm
  • P2R-10/11/12 pre/post hooks — migration 0010, agent runner, source-group + host-default editor (AEAD-encrypted at rest, plaintext on the WS)
  • P2R-13 bandwidth caps — --limit-upload/--limit-download on every restic invocation; host-wide via config.update, per-job override via <details> disclosure on Run-now
  • P2R-14 schedule next/last — LatestJobBySchedule + cron Next() derivation, surfaced in dashboard row + schedules tab
  • P2-16 Windows service — svc.Handler + SCM install/uninstall/start/stop subcommands. Cross-compile verified; untested on Windows itself
  • P2-17 install.ps1 — pwsh installer mirroring install.sh, served from /install/
  • P2-18 announce-and-approve — Ed25519 nonce-sign WS, dashboard pending-hosts panel with copyable fingerprint + accept/reject, per-IP rate limit + 100-row global cap, 60s expiry sweeper

Decisions made on the operator's behalf

  • Per-job bandwidth UI as <details> disclosure rather than modal
  • Re-init dispatches a fresh init job (relies on restic's idempotent init); does not try to wipe S3/B2 buckets
  • Host-default hooks live on the Repo page (Settings tab still inert)
  • Admin still supplies repo creds at accept-pending time (same form as token-mint)
  • Windows service code committed but only compile-verified — first real Windows install will be the first end-to-end test

Test plan

  • [x] go vet ./... && go test ./... -race (green)
  • [x] Cross-compile: GOOS=windows GOARCH=amd64 go build ./cmd/agent
  • [x] Playwright sweep against :8080 (dashboard / host detail / schedules / sources / repo / source-group edit all render with zero console errors; screenshots in _diag/p2-completion-sweep/)
  • [ ] Operator review of the announce-and-approve flow end-to-end (run an agent with no -enroll-token, watch fingerprint print, accept from UI)
  • [ ] Operator review of the Windows installer on a real Windows host
steve added 15 commits 2026-05-04 14:15:37 +01:00
P2R-13a. restic.Env gains LimitUploadKBps/LimitDownloadKBps which are
emitted as global --limit-upload/--limit-download flags before the
subcommand on every invocation. Agent dispatcher tracks host-wide
caps received via config.update; server pushes them on hello and
after PUT /api/hosts/{id}/bandwidth.

Also extends api.CommandRunPayload with optional per-job overrides
(BandwidthUpKBps/Down + PreHook/PostHook); the override consumers
land in T2/T6.
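A minimal sketch of how the global limit flags could be placed before the subcommand. `Env` here carries only the two fields named above; `globalArgs`, `buildArgs`, and the "0 means no cap" convention are assumptions for illustration, not the actual restic.Env code.

```go
package main

import (
	"fmt"
	"strconv"
)

// Env mirrors only the two fields named in the commit message. The real
// restic.Env presumably carries more (repo URL, credentials, ...).
type Env struct {
	LimitUploadKBps   int // assumed convention: 0 means "no cap"
	LimitDownloadKBps int
}

// globalArgs returns the restic global flags for the configured caps.
func (e Env) globalArgs() []string {
	var args []string
	if e.LimitUploadKBps > 0 {
		args = append(args, "--limit-upload", strconv.Itoa(e.LimitUploadKBps))
	}
	if e.LimitDownloadKBps > 0 {
		args = append(args, "--limit-download", strconv.Itoa(e.LimitDownloadKBps))
	}
	return args
}

// buildArgs (hypothetical helper) emits the global flags first, then the
// subcommand, then the subcommand's own arguments.
func buildArgs(e Env, subcommand string, rest ...string) []string {
	args := e.globalArgs()
	args = append(args, subcommand)
	return append(args, rest...)
}

func main() {
	fmt.Println(buildArgs(Env{LimitUploadKBps: 512}, "backup", "/etc"))
}
```

Keeping the flags in one place means every invocation (backup, init, snapshots, and so on) picks up the caps without each call site having to remember them.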
P2R-13b. POST /hosts/{id}/source-groups/{gid}/run accepts optional
bandwidth_up_kbps / bandwidth_down_kbps form fields, plumbs them onto
CommandRunPayload. Agent dispatcher already prefers per-job override
over host-wide caps (T1). UI wraps the Run-now button in a form with
a <details> 'Limit bandwidth for this run' disclosure containing two
KB/s inputs.
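The override precedence described above can be sketched roughly as follows. `parseOptionalKBps` and `effectiveCap` are hypothetical names, and modelling "no override" as a nil pointer is an assumption about how the optional payload fields are represented.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseOptionalKBps parses an optional form field such as bandwidth_up_kbps.
// An empty field means "no per-job override" and yields nil.
func parseOptionalKBps(raw string) (*int, error) {
	raw = strings.TrimSpace(raw)
	if raw == "" {
		return nil, nil
	}
	n, err := strconv.Atoi(raw)
	if err != nil || n < 0 {
		return nil, fmt.Errorf("invalid KB/s value %q", raw)
	}
	return &n, nil
}

// effectiveCap prefers the per-job override when one was supplied and
// falls back to the host-wide cap otherwise.
func effectiveCap(override *int, hostWide int) int {
	if override != nil {
		return *override
	}
	return hostWide
}

func main() {
	up, _ := parseOptionalKBps("256")
	fmt.Println(effectiveCap(up, 1024))  // per-job override wins
	fmt.Println(effectiveCap(nil, 1024)) // falls back to the host-wide cap
}
```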
P2R-14. New store.LatestJobBySchedule query (per-schedule fired job).
Schedules-tab handler computes next-fire from cron + last-fire from
the jobs table per row. Schedules table grows two columns; dashboard
host row prepends 'next 12h ago/from now' to the existing last-backup
line when a single covering schedule is the run-now candidate.

Embeds store.Schedule into scheduleRow so existing template field
references keep working without bulk renames.
Latest 'init' job status surfaced under the host-detail vitals strip
(succeeded/failed/running/queued, with link to the live job log on
non-success). New POST /hosts/{id}/repo/reinit handler dispatches a
fresh init job after the operator types the host name to confirm;
audit row records 'host.repo_reinit'.
Adds pre_hook/post_hook BLOB columns to source_groups and
pre_hook_default/post_hook_default to hosts. Bytes stored verbatim
(AEAD encrypt/decrypt happens at the HTTP layer where the AEAD key
lives). Round-trip tests cover set/clear semantics on both tables.
Agent: new runner.BackupHooks struct + runHook helper invoked via
/bin/sh -c (cmd.exe /C on Windows). pre_hook non-zero exit aborts
the backup; post_hook always runs with RM_JOB_STATUS=succeeded|failed
in env. Output streamed as 'hook(<phase>): …' log.stream lines.
Hooks only run for kind=backup (other kinds skip both phases).

Server: resolveBackupHooks resolves group → host default → empty,
decrypts via crypto.AEAD with per-slot ad bytes, plumbs plaintext
into CommandRunPayload for both schedule.fire and per-group
Run-now dispatch sites. Decrypt failures degrade silently to no
hook so a malformed blob can't poison every backup.
Source-group edit form gains pre/post hook textareas with a service-
user warning banner; bodies AEAD-encrypted on save (per-group AD).
Repo page adds a 'Host-default hooks' panel above the danger zone
with the same shape; saved via POST /hosts/{id}/repo/hooks.
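To illustrate what per-slot associated data buys, here is a sketch using AES-256-GCM from the standard library. The cipher choice, the nonce-prefixed blob layout, and the `slotAD` format are all assumptions; the project's crypto.AEAD may differ on each point.

```go
package main

import (
	"crypto/aes"
	"crypto/cipher"
	"crypto/rand"
	"errors"
	"fmt"
)

// newAEAD builds an AES-GCM AEAD from a 32-byte key (assumed cipher).
func newAEAD(key []byte) (cipher.AEAD, error) {
	block, err := aes.NewCipher(key)
	if err != nil {
		return nil, err
	}
	return cipher.NewGCM(block)
}

// seal encrypts plaintext bound to ad and prepends the random nonce.
func seal(a cipher.AEAD, plaintext, ad []byte) ([]byte, error) {
	nonce := make([]byte, a.NonceSize())
	if _, err := rand.Read(nonce); err != nil {
		return nil, err
	}
	return a.Seal(nonce, nonce, plaintext, ad), nil
}

// open splits off the nonce and decrypts; it fails if ad does not match.
func open(a cipher.AEAD, blob, ad []byte) ([]byte, error) {
	if len(blob) < a.NonceSize() {
		return nil, errors.New("blob too short")
	}
	nonce, ct := blob[:a.NonceSize()], blob[a.NonceSize():]
	return a.Open(nil, nonce, ct, ad)
}

// slotAD (hypothetical format) names the table, row, and column a blob
// belongs to, so a blob copied into another slot fails authentication.
func slotAD(table string, id int64, column string) []byte {
	return []byte(fmt.Sprintf("%s:%d:%s", table, id, column))
}

func main() {
	a, err := newAEAD(make([]byte, 32)) // all-zero key, for demonstration only
	if err != nil {
		panic(err)
	}
	blob, _ := seal(a, []byte("pg_dump > /tmp/db.sql"), slotAD("source_groups", 1, "pre_hook"))
	_, err = open(a, blob, slotAD("source_groups", 2, "pre_hook"))
	fmt.Println("open under a different slot fails:", err != nil)
}
```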
migration 0011 adds pending_hosts table (id, hostname, public_key,
fingerprint, expiry). store/pending_hosts.go covers full CRUD plus
hostname-collision count + expired-row sweeper.

POST /api/agents/announce takes {hostname, os, arch, agent_version,
restic_version, public_key (base64)}, returns {pending_id,
fingerprint, hostname_collision}. Per-source-IP token-bucket
rate limit (10/min) + global cap of 100 in-flight rows. Public
key must be exactly 32 bytes (Ed25519).
GET /ws/agent/pending?pending_id=… runs an Ed25519 nonce-sign
handshake against the row's stored public key, then holds the
connection open. POST /api/pending-hosts/{id}/accept (admin)
mints a real Host row + bearer + AEAD-encrypted repo creds, pushes
the bearer down the open WS, deletes the pending row, and writes
a host.accept_pending audit entry. POST /api/pending-hosts/{id}/reject
closes the socket with code 4001 and audit-logs host.reject_pending.

In-memory pendingHub keyed by pending_id wires accept/reject to
their live socket.
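One way such a hub could be shaped is a mutex-guarded map from pending_id to a channel that the WebSocket handler waits on. The `decision` type and method names here are hypothetical; the real pendingHub may hold the socket itself rather than a channel.

```go
package main

import (
	"fmt"
	"sync"
)

// decision is what an accept or reject handler hands to the live socket.
type decision struct {
	Accepted bool
	Bearer   string // only set on accept
}

// pendingHub maps pending_id to the waiting WebSocket handler.
type pendingHub struct {
	mu      sync.Mutex
	waiters map[string]chan decision
}

func newPendingHub() *pendingHub {
	return &pendingHub{waiters: make(map[string]chan decision)}
}

// register is called by the WS handler once the handshake has passed.
func (h *pendingHub) register(id string) <-chan decision {
	ch := make(chan decision, 1) // buffered so deliver never blocks
	h.mu.Lock()
	h.waiters[id] = ch
	h.mu.Unlock()
	return ch
}

// unregister is called when the socket closes before a decision arrives.
func (h *pendingHub) unregister(id string) {
	h.mu.Lock()
	delete(h.waiters, id)
	h.mu.Unlock()
}

// deliver hands a decision to the live socket, if there is one. It removes
// the entry first, so each pending_id receives at most one decision.
func (h *pendingHub) deliver(id string, d decision) bool {
	h.mu.Lock()
	ch, ok := h.waiters[id]
	delete(h.waiters, id)
	h.mu.Unlock()
	if !ok {
		return false
	}
	ch <- d
	return true
}

func main() {
	hub := newPendingHub()
	wait := hub.register("p-1")
	hub.deliver("p-1", decision{Accepted: true, Bearer: "example-bearer"})
	fmt.Printf("%+v\n", <-wait)
}
```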
When -enroll-server is supplied without -enroll-token, the agent
mints (and persists) an Ed25519 keypair, POSTs /api/agents/announce,
prints the SHA256 fingerprint in a copy-friendly banner, opens
/ws/agent/pending, signs the server's nonce, and blocks until the
admin clicks Accept (1h ceiling). On accept, persists the bearer +
host_id from the 'enrolled' message; on reject (close code 4001)
exits with a clear error.

Repo creds are pushed via config.update on the first standard WS
hello (P1-32 path), not in the enrolled message itself.
Dashboard handler loads ListPendingHosts(now); template renders a
warn-bordered panel above the host table with hostname, OS/arch,
fingerprint (selectable / copyable), source IP, age, expiry. Each
row carries an inline accept form (repo URL/user/password) plus a
Reject button. cmd/server adds a 60s ticker calling
DeleteExpiredPendingHosts so 1h-stale rows drop off.
internal/agent/service: build-tagged into service_windows.go (svc.Handler
that listens for Stop/Shutdown + delegates to the agent loop) and
service_other.go (foreground stub for Linux/macOS). install_windows.go
wraps mgr.Connect+CreateService/Delete/Start/Stop for the new
'restic-manager-agent install|uninstall|start|stop' subcommands.

Cross-compile verified: GOOS=windows GOARCH=amd64 go build ./cmd/agent
succeeds. UNTESTED on Windows itself — the SCM round-trip can't be
exercised from Linux CI; treat as a starting point for the first
real Windows install.
Pwsh installer that detects arch, downloads
$Server/agent/binary?os=windows&arch=amd64 to
C:\Program Files\restic-manager\, runs the agent in -enroll-server
[+ -enroll-token] mode (token flow OR announce-and-approve), then
calls 'restic-manager-agent install' to register the SCM service.
Surfaces existing scheduled tasks named *restic* without disabling.

CLAUDE.md restage block updated to also stage install.ps1 alongside
install.sh.
tasks: tick P2 completion + Playwright sweep screenshots
CI / Build (windows/amd64) (pull_request) Successful in 20s
CI / Lint (pull_request) Successful in 41s
CI / Build (linux/amd64) (pull_request) Successful in 21s
CI / Test (linux/amd64) (pull_request) Successful in 53s
CI / Build (linux/arm64) (pull_request) Successful in 1m48s
c691dc8a56
P2R-09/10/11/12/13/14, P2-16/17/18 all marked done. Acceptance line
for Windows hosts annotated as 'compile-verified, untested in CI'.

_diag/p2-completion-sweep/ holds the dashboard + host-detail +
schedules + sources + repo + source-group-edit screenshots from a
clean sweep against :8080. Zero console errors throughout.

announce_test.go: rate-limit + global-cap subtests dropped t.Parallel
to avoid racing on the package-level tunables under -race.
steve added 1 commit 2026-05-04 14:18:52 +01:00
docs: note Gitea repo + tea CLI in CLAUDE.md
CI / Build (windows/amd64) (pull_request) Successful in 19s
CI / Lint (pull_request) Successful in 21s
CI / Build (linux/amd64) (pull_request) Successful in 19s
CI / Build (linux/arm64) (pull_request) Successful in 19s
CI / Test (linux/amd64) (pull_request) Successful in 2m17s
bdabcfb68e
steve merged commit 0bd7a896c4 into main 2026-05-04 14:19:06 +01:00

Reference: steve/restic-manager#5