a2398d0b667a9191e7f4a617d4950ea18d0ae7bb
33 Commits
| Author | SHA1 | Message | Date | |
|---|---|---|---|---|
|
|
a2398d0b66 |
P3 follow-up: log download (txt + ndjson) on the live job page
The diff job's full output streams to the standard live job log page,
which can be a lot of text the operator wants to grep through or paste
into a ticket. Add a Download button.
Source of truth is the persisted job_logs table — works any time
(running or finished) and doesn't need to pause the live WS stream.
The download is 'everything the server has up to right now'; if the
operator wants a fuller snapshot of a still-running job, they hit
Download again.
- New endpoint GET /api/jobs/{id}/log.{txt,ndjson} (chi {format}
matcher constrained to the two known suffixes). Auth via session
cookie. 404 on unknown job.
- internal/server/http/job_download.go writeLogsText emits a small
header + 'HH:MM:SS.mmm TAG payload' rows mirroring what the live
page shows. writeLogsNDJSON emits one self-contained {seq,ts,stream,
payload} JSON object per line — appending stays valid (each line
stands alone), and the whole file pipes cleanly into jq. NDJSON is
newline-delimited JSON; not the same as a JSON array.
- web/templates/pages/job_detail.html grows two header buttons:
'Download log' (txt) + '.ndjson' ghost variant for tooling.
Tests cover the txt format (header + per-row shape), the ndjson
format (each line round-trips through json.Unmarshal), unknown job
404, unauthenticated 401.
|
||
|
|
e22b41d452 |
P3 sweep fixes: snap-row CSS, tree expand, --no-ownership drop, target path
Bug fixes from the Playwright sweep against the live smoke server:
1. Snapshot-picker layout. The .snap-row class was used in the wireframe
but never landed in web/styles/input.css; rows rendered as vertical
blocks instead of a 6-column grid. Added the token (mirrors host-row
shape with restore-specific column widths).
2. Tree expansion. hx-target='closest .tree-row + .tree-children' isn't
a valid HTMX selector — modifiers don't chain. Replaced HTMX-driven
expansion with a small window.__rmTreeToggle helper that uses plain
fetch + .tree-pair wrapper structure for trivial sibling lookup.
Caches loaded state per node.
3. --no-ownership flag dropped. Restic 0.17 introduced --no-ownership;
0.16 rejects it ('unknown flag') before doing any work. Since the
agent runs as root in the systemd unit, restored files keep their
original uid/gid either way and the parent dir is root-owned, so
the 'cp without sudo' rationale doesn't hold. Drop the flag entirely.
4. Default target dir moved to /var/lib/restic-manager/restore. The
systemd unit pins ReadWritePaths to /etc/restic-manager +
/var/lib/restic-manager (with ProtectSystem=strict making the rest
of /var read-only); writes to /var/restic-restore failed with
'read-only file system'.
5. Confirm summary HTML escaping. defaultTarget JS literal evaluates
to a string with literal angle brackets; insertion into innerHTML
must escape them. Added an inline HTML-escape pass.
tasks.md ticked for the Restore sub-phase with a sweep summary
covering the live end-to-end test.
|
||
|
|
1111124573 |
P3-09 + P3-X3: snapshot diff + recent-restores line
P3-09 — snapshot diff dispatcher.
- POST /api/hosts/{id}/snapshots/diff (and the unprefixed HTMX-form
variant) takes {snapshot_a, snapshot_b}, validates both belong to
the host (long id / short id / prefix match), checks the agent is
online, mints a JobDiff, ships command.run with DiffPayload, writes
a host.snapshot_diff audit row, returns HX-Redirect to the live
job page (or JSON {job_id, job_url} for REST callers).
- Two-snapshot guard: POSTing diff(a,a) returns 422.
- UI: small panel on the host_detail right rail (visible when the
host has 2+ snapshots) with two short-id inputs and a Diff button.
Output renders on the standard live job page where the operator
reads the per-line diff text directly.
P3-X3 — recent-restores line.
- hostChromeData grows RestoreStatus / RestoreAt / RestoreJobID
populated via store.LatestJobByKind(host_id, 'restore') (already
exists, used by the init line).
- host_chrome.html renders a small line below the existing init-status
one with status-coloured copy + a link to the job log. Hidden when
no restore has ever run on this host.
Tests:
- diff_test covers happy path (correct DiffPayload + HX-Redirect),
same-id rejection (422), unknown-id rejection (422). Adds a
seedTwoSnapshots helper since ReplaceHostSnapshots is atomic-swap
(calling seedSnapshot twice would only leave the second).
Restage block (CLAUDE.md) deferred to the end of the restore phase.
|
||
|
|
6e47efc146 |
P3-01/02/03: restore wizard backend + templates + restore-shaped job page
End-to-end wizard from /hosts/{id}/restore (or per-snapshot deep link
/hosts/{id}/snapshots/{sid}/restore) → tree-browse → dispatch →
restore-shaped live job page.
Backend (internal/server/http/ui_restore.go):
- GET handlers render the four-step wizard against the wireframe shape
in docs/superpowers/specs/2026-05-04-p3-restore-design.md.
- HTMX tree partial endpoint hits fetchTreeWithCache (P3-X2) so each
directory expansion is a sub-second cached lookup after the first
miss.
- POST validates: snapshot_id non-empty, ≥1 absolute path, in-place
mode requires confirm_hostname == host name, agent online. On error
re-renders the wizard with the operator's input intact. Happy path
mints a job_id, computes the new-directory target as
/var/restic-restore/<job-id>/ (operator can't escape the prefix —
server picks it), creates the job row, ships command.run with
kind=restore + RestorePayload, writes a host.restore audit row,
returns HX-Redirect (or 303) to the live job page.
Templates:
- host_restore.html: single-page progressively-enabled wizard matching
_diag/p3-restore-wizard wireframe. Form-state-driven JS computes a
running tally of selected paths and the step-4 confirm summary
client-side; the server re-renders on validation failure with form
fields preserved.
- partials/tree_node.html: recursive HTMX-served tree fragment.
- Top-level Restore button on host_detail right rail + per-snapshot
Restore action on snapshot rows replace the previous P3-stub.
Restore-shaped job page (job_detail.html):
- Progress widget rendered as a panel rather than a bare strip when
the job is active.
- Current-file display under the bar, updated from log.stream stdout
lines that look like absolute paths. Hidden for non-restore kinds.
Migration 0012:
- Add restore + diff to the jobs.kind CHECK. Rebuild required (SQLite
can't ALTER CHECK in place); follows the safe pattern from 0005.
Defensive: stash job_logs into a temp table before the rebuild and
INSERT OR IGNORE back afterwards so even if SQLite cascades on
DROP TABLE jobs the log history survives.
Tests:
- ui_restore_test covers GET step-1 render, GET pre-selected snapshot
summary card, POST missing snapshot, POST missing paths, POST
in-place wrong-hostname rejection (no command.run leaks to the
agent), POST happy path (HX-Redirect + correct payload + audit
row), POST against offline host returns 503.
Restage block (CLAUDE.md) deferred to the end of the restore phase.
|
||
|
|
bbdf631a01 |
ui+server: P2-18d pending hosts dashboard panel + expiry sweeper
Dashboard handler loads ListPendingHosts(now); template renders a warn-bordered panel above the host table with hostname, OS/arch, fingerprint (selectable / copyable), source IP, age, expiry. Each row carries an inline accept form (repo URL/user/password) plus a Reject button. cmd/server adds a 60s ticker calling DeleteExpiredPendingHosts so 1h-stale rows drop off. |
||
|
|
1d3661470f |
ui: P2R-12 hook editor — source-group form + host-default Repo section
Source-group edit form gains pre/post hook textareas with a service-
user warning banner; bodies AEAD-encrypted on save (per-group AD).
Repo page adds a 'Host-default hooks' panel above the danger zone
with the same shape; saved via POST /hosts/{id}/repo/hooks.
|
||
|
|
cce3cd8384 |
ui: P2R-09 auto-init UX — init line in chrome + danger-zone re-init
Latest 'init' job status surfaced under the host-detail vitals strip
(succeeded/failed/running/queued, with link to the live job log on
non-success). New POST /hosts/{id}/repo/reinit handler dispatches a
fresh init job after the operator types the host name to confirm;
audit row records 'host.repo_reinit'.
|
||
|
|
93ab0ae84f |
ui+server: schedule next-run / last-run on dashboard + schedules tab
P2R-14. New store.LatestJobBySchedule query (per-schedule fired job). Schedules-tab handler computes next-fire from cron + last-fire from the jobs table per row. Schedules table grows two columns; dashboard host row prepends 'next 12h ago/from now' to the existing last-backup line when a single covering schedule is the run-now candidate. Embeds store.Schedule into scheduleRow so existing template field references keep working without bulk renames. |
||
|
|
6589f23313 |
ui+server: per-job bandwidth override on Run-now
P2R-13b. POST /hosts/{id}/source-groups/{gid}/run accepts optional
bandwidth_up_kbps / bandwidth_down_kbps form fields, plumbs them onto
CommandRunPayload. Agent dispatcher already prefers per-job override
over host-wide caps (T1). UI wraps the Run-now button in a form with
a <details> 'Limit bandwidth for this run' disclosure containing two
KB/s inputs.
|
||
|
|
77a8590e3a |
ui: hx-swap none on Run-now + truthful save banner + tailwind rebuild
Add hx-swap="none" to the three Run-now buttons (check/prune/unlock) in host_repo.html to match the existing pattern on host_sources.html and host_schedules.html. Fix all-blank admin-credentials save to redirect without ?saved= query string so no false-positive banner is shown; strengthen the corresponding test to assert Location has no ?saved=. Rebuild CSS bundle via Tailwind to pick up max-w-[640px] JIT class. |
||
|
|
46ec123f95 |
ui: Slice E — admin creds form + run-now buttons + repo health panel
- hostRepoPage gains AdminURL/AdminUsername/HasAdminPassword, Online,
and StatsView (pre-dereferenced projection of host_repo_stats).
- loadHostRepoPage loads the admin slot (tolerating ErrNotFound),
hub.Connected, and stats (tolerating ErrNotFound).
- renderRepoPage gains an adminErr parameter; all callers updated.
- handleUIAdminCredentialsSave / handleUIAdminCredentialsDelete added
(form-POST handlers mirroring the repo-creds pattern, with audit).
- Routes /hosts/{id}/admin-credentials POST and /delete POST registered.
- Template: Admin credentials form after Connection, Run-now HTMX
buttons after Maintenance, Repo health stats panel in right rail.
- Tests: 9 new tests covering rendering, disabled states, save/delete
round-trips, audit rows, and idempotent delete.
|
||
|
|
fab99b4a38 |
P2R-02 slice 5: dashboard row Run-now uses covering schedule
Replace the placeholder 'Open →' link with a per-host Run-now
decision computed server-side once per render:
* If the host has exactly one enabled schedule whose source-group
set covers every group on the host → primary 'Run all groups'
button (HX-POST to that schedule's /run endpoint, fires every
backup the host knows about in one click).
* Otherwise (zero matches, multiple matches, or any ambiguity) →
ghost 'Open →' link to /hosts/{id}/sources, where the operator
picks per-group from the source-group rows.
dashboardPage.Hosts moves from []store.Host to []dashboardHostRow
to carry the precomputed RunAllScheduleID; host_row.html now reads
.Host.* and .RunAllScheduleID. Two extra store calls per host on
dashboard render — fine at fleet sizes we care about; if we ever
need to support thousands of hosts we'll batch these queries.
|
||
|
|
d62b173712 |
P2R-02 slice 4: Repo tab — connection / bandwidth / maintenance
Three independent forms on /hosts/{id}/repo so saving one section
doesn't disturb the others:
* Connection: edits repo URL, username, password (pre-filled from
the redacted GET /api/hosts/{id}/repo-credentials view; password
field shows masked stored-creds placeholder; blank password = keep
existing). On save, encrypts and pushes config.update to a
connected agent.
* Bandwidth: host-wide upload/download caps (KB/s; blank = no cap)
written via store.SetHostBandwidth. New REST endpoint
PUT /api/hosts/{id}/bandwidth for JSON callers.
* Maintenance: forget/prune/check cadences + check subset %, with
per-row enabled toggles. Reuses cronParser for validation;
auto-seeds the row if a host pre-dates the migration.
Right-rail surfaces repo size, snapshot count, snapshots-by-tag
breakdown (counted from existing snapshot tag rows), and an
'untagged snapshots are left alone' note.
Danger-zone re-init button is rendered but disabled with a hint
pointing at P2R-09 (real implementation lands there).
Validation re-renders the page with the relevant form's banner and
all other section state intact. Successful saves redirect with a
?saved=<section> query param so the page surfaces a small ✓ saved
indicator on the relevant form.
ci.yml: bump golangci-lint-action v6→v7 (separate change picked up
in this commit).
|
||
|
|
8b91d3037c |
P2R-02 follow-up: Run-now works on disabled schedules with confirm
CI / Test (linux/amd64) (pull_request) Successful in 33s
CI / Lint (pull_request) Failing after 15s
CI / Build (windows/amd64) (pull_request) Successful in 22s
CI / Build (linux/amd64) (pull_request) Successful in 23s
CI / Build (linux/arm64) (pull_request) Successful in 23s
Surface the Run-now button on every schedule when the host is online,
not just enabled ones. Disabled rows render the button as a non-primary
style + a HX-confirm dialog ("This schedule is paused — running it now
won't change that. Fire it once anyway?"); enabled rows keep the
zero-friction primary button.
Server-side, Run-now no longer short-circuits on !Enabled — it
dispatches the source groups inline rather than via dispatchScheduledJob
(which always bails on disabled schedules, since cron-tick semantics
are different from explicit operator intent). The audit-log entry
inside dispatchBackupForGroup still records every fire.
|
||
|
|
64d2fcf7a3 |
P2R-02 follow-up: clickable rows on Sources/Schedules + cron-preset tooltips
CI / Test (linux/amd64) (pull_request) Successful in 1m57s
CI / Lint (pull_request) Failing after 15s
CI / Build (windows/amd64) (pull_request) Successful in 22s
CI / Build (linux/amd64) (pull_request) Successful in 22s
CI / Build (linux/arm64) (pull_request) Successful in 22s
Aligns Sources and Schedules tab rows with the dashboard's row-click
UX: whole-row click navigates to the row's edit page (mirroring
.host-row.clickable). Drops the redundant Edit buttons; Run-now and
Delete remain in .row-action cells that sit above the row-link
overlay via z-index.
Schedule edit form's cron preset chips now carry human-readable
title= tooltips ("Every day at 03:00", "Every Sunday at 03:00", etc).
tasks.md gets a binding row-design rule covering all current and
future list-row templates, and the P2R-02 entry is split into the
six slices already agreed with the operator (slices 1–3 marked
done, 4 next).
|
||
|
|
67ca769686 |
P2R-02 slice 3: Schedules tab — slim list, new/edit form, delete, Run-now
CI / Test (linux/amd64) (pull_request) Failing after 44s
CI / Lint (pull_request) Failing after 13s
CI / Build (windows/amd64) (pull_request) Successful in 19s
CI / Build (linux/amd64) (pull_request) Successful in 19s
CI / Build (linux/arm64) (pull_request) Successful in 25s
Schedules list: status (enabled/paused) + cron + source-group tags + actions (Run-now when enabled+online, Edit, Delete). Run-now reuses dispatchScheduledJob — same path real cron fires take, so each referenced source group runs as its own backup with its own tag. Falls back to a 409 if the agent is offline. Schedule new/edit form: cron input with five preset chips (quick-pick @hourly / nightly / 6h / weekly / monthly), source-group multi-pick rendered as styled checkbox cards (visual state tracks the underlying box via a tiny inline script), enabled toggle. No paths/excludes/retention/kind on the schedule itself — those live on source groups now. Server-side validation re-renders with the operator's input + ticked groups intact. Every successful mutation calls pushScheduleSetAsync. Adds .schd-row, .preset-chip, .picker styles. |
||
|
|
dede74fd3a |
P2R-02 slice 2 follow-up: refuse to delete a host's last source group
CI / Test (linux/amd64) (pull_request) Failing after 45s
CI / Lint (pull_request) Failing after 12s
CI / Build (windows/amd64) (pull_request) Successful in 19s
CI / Build (linux/amd64) (pull_request) Successful in 19s
CI / Build (linux/arm64) (pull_request) Successful in 23s
Belt-and-braces: the UI now disables the Delete button when a group is the only one on the host (with a tooltip explaining why), and the server-side handler returns 409 if a curl/form-replay tries anyway. Every host needs at least one source group to be backup-able, so the 'last group on a fresh host' case is a meaningful accident to guard against. |
||
|
|
0ed9c3d1ec |
P2R-02 slice 2: Sources tab — list, new/edit form, delete, Run-now
Sources tab now lists every source group on the host with per-row
counts (used-by-N-schedules, snapshot count by tag), the v4
conflict tag (keep-* dimension that has no compatible cadence),
and Run-now / Edit / Delete actions. Run-now reuses the existing
HTMX-aware /hosts/{id}/source-groups/{gid}/run handler.
New /hosts/{id}/sources/new and /sources/{gid}/edit form: name +
includes/excludes textareas + the 3×2 keep-* retention grid +
retry-on-offline knobs. Server-side validation re-renders with the
operator's input intact; the inline conflict banner shows above the
retention grid when ConflictDimension is set.
Delete blocks (UI + server) when the group is referenced by any
schedule. Every successful mutation calls pushScheduleSetAsync so
an online agent re-arms within seconds.
Adds .src-row and .keep-cell to input.css for the row + retention
grid layout.
|
||
|
|
a535822ff3 |
P2R-02 slice 1: host-detail sub-tab skeleton
Extract header/vitals/sub-tabs into a host_chrome partial that every host-detail tab page renders. Sources / Schedules / Repo go from inert divs to real <a> links backed by stub pages that share the chrome and a 'coming next' body — slices 2/3/4 fill them in. Also re-establishes the version indicator (host_schedule_version vs agent's applied_schedule_version) in the header. Drops the legacy fat-schedule list/edit templates that referenced fields removed by the P2 redesign (Manual / Paths / RetentionPolicy on Schedule); the new templates land in slice 3. |
||
|
|
813158b3d6 |
P2 redesign · phase 2.5: tasks.md rewrite + UI patch-up
The store rewrite in
|
||
|
|
fdecde0d5c |
P2-05: forget command with retention policy
End-to-end forget plumbing — operator can create a forget schedule with keep-* values, agent runs restic forget --keep-* … on the schedule's cron (or via per-row Run-now), snapshot list shrinks, UI updates. * api.CommandRunPayload gains retention_policy json.RawMessage so the agent doesn't need a typed copy of the server-side struct. * restic.ForgetPolicy mirrors restic's --keep-* flags. Empty() reports zero dimensions; restic wrapper RunForget refuses to run an empty policy (would delete every snapshot). Does NOT pass --prune — pruning lives behind a separate admin-only credential (P2-06); forget just rewrites the snapshot index. * runner.RunForget mirrors RunBackup's envelope shape so the live log viewer works without special-casing. On success triggers reportSnapshots (forget shrinks the index, the host's snapshot count almost certainly changed). * cmd/agent dispatcher handles MsgCommandRun with kind=forget, decodes RetentionPolicy from the wire, builds restic.ForgetPolicy. * Server dispatchScheduleNow marshals the schedule's RetentionPolicy into the wire payload for kind=forget jobs. Refuses to dispatch a forget schedule with empty retention. * validateSchedule rejects kind=forget without at least one keep-* dimension (new error code: missing_retention). * UI schedule edit form gains a Kind dropdown (backup or forget; immutable on edit). Paths block toggles by kind via inline data-kind attributes. Form help-text explains the prune separation. Other kinds (prune, check, unlock) deferred to P2-06..08; the Kind dropdown only offers backup and forget today. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
f62a90b4b3 |
ui: stop Run-now buttons wrapping to two lines
Three sites: * Schedules list per-row Run-now / Edit / Delete column was 1fr next to a 1.3fr retention column — too narrow for the three buttons. Pin the action column to 240px and add whitespace-nowrap to each button so the layout can't squeeze them onto two lines regardless. * Dashboard host_row Run-now button got whitespace-nowrap + for the same reason inside the 92px action column. * Host detail header "Run backup now" — the words so the button never breaks across lines if the header gets crowded. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
72d8081b0d |
Add-host: default repo username to hostname; always show htpasswd snippet
The pending page suppressed the htpasswd snippet when repo_username was blank — but with --private-repos the username is required for auth, and operators routinely leave the field blank assuming the system will pick something sensible. * handleUIAddHostPost defaults repo_username to the typed hostname when blank. Matches what --private-repos expects (URL path segment == username). * pending_host.html: snippet now renders whenever a password is present (always true after the generate-on-blank logic landed earlier). * Form help-text updated to describe the default explicitly. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
8a05969953 |
Add-host: durable pending page + polled awaiting-agent panel
Two issues from a smoke session:
1. The awaiting-agent panel never refreshed — operator had to go
back to the dashboard to see the host had connected.
2. Generated passwords were displayed only on the POST response.
Navigating away (or even an accidental tab close) lost them
permanently, so the operator couldn't update the rest-server's
htpasswd.
Both are the same fix: convert the POST-rendered transient
"result state" into a durable GET page at /hosts/pending/{token}.
* New route GET /hosts/pending/{token} renders the install-command +
htpasswd snippet view. Password is decrypted from the (still-
encrypted-at-rest) token row on every render — operator can
refresh, bookmark, navigate away and come back. Once the agent
enrols, the page redirects to /hosts/{id}; once the token
expires, redirect to /hosts/new.
* New route GET /hosts/pending/{token}/awaiting returns a polled
HTML fragment that the pending page swaps in every 2s via HTMX.
States: awaiting (keep polling) | connected (show "Open host →"
+ "View schedules" CTAs, polling stops) | expired (mint-new
link, polling stops). Polling stops naturally because only the
awaiting state's wrapper carries the hx-trigger attribute.
* POST /hosts/new now 303-redirects to /hosts/pending/{token}
on success; validation errors keep re-rendering the form with
banner.
Supporting changes:
* New store helper Store.GetEnrollmentTokenStatus(tokenHash) for
the polling endpoint — returns {expires_at, consumed_at,
consumed_host} in one round-trip without dragging in the
attachments-decryption path.
* New ui.Renderer.RenderPartial(w, name, data) for HTMX fragment
responses (no layout wrap). Picks an arbitrary page's template
set as the lookup point — every page parses the full common-
paths list, so they all see every partial.
* add_host.html stripped to form-only; pending_host.html owns the
result-state UI; awaiting_agent.html is the polled partial.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
|
|
148e61b33b |
P2-04.5: kill host.default_paths in favour of manual schedules
Two independent path lists for "what does this host back up?" was
a real divergence footgun — operator types one set at Add-host time
and a different set into a schedule, both end up in the same repo,
the snapshot history looks fine until restore. Resolution: drop
host.default_paths entirely; add a `manual` flag on schedules.
A manual schedule has paths/excludes/tags/retention like any other
but no cron — it fires only via per-schedule Run-now. Single source
of truth for what gets backed up.
Schema (migration 0007):
* schedules.manual INTEGER NOT NULL DEFAULT 0.
* For every host with non-empty default_paths, seed a manual
schedule with those paths and bump host_schedule_version.
* ALTER TABLE hosts DROP COLUMN default_paths.
* ALTER TABLE enrollment_tokens RENAME COLUMN default_paths
TO initial_paths.
Original draft of this migration rebuilt hosts via the
create-new + drop-old + rename-new pattern. With foreign_keys=ON
(set in the connection DSN), DROP TABLE on the parent fired
ON DELETE CASCADE on every child of hosts(id) — schedules /
jobs / snapshots / host_credentials all wiped on the smoke env
when I tried it. SQLite 3.35+ supports column-level ALTERs
directly, so we skip the rebuild dance and avoid the cascade
trap. Six lines of SQL instead of sixty, no FK risk.
Run-now rewiring:
* New `dispatchScheduleNow(hostID, scheduleID, conn?)` helper
unifies the agent-driven path (cron fire → schedule.fire →
OnScheduleFire callback) and the UI-driven path (operator
clicks Run-now on a schedule row). Conn arg is optional; nil
falls back to Hub.Send.
* New POST /hosts/{id}/schedules/{sid}/run endpoint — per-row
Run-now button on the schedules list.
* Dashboard's per-host Run-now (handleUIRunBackup) now picks the
host's only enabled manual schedule, falls back to the only
enabled schedule, else returns "pick one in Schedules tab".
Keeps one-click for the common case.
Agent:
* Scheduler skips manual schedules in cron build (silent — they're
a normal data shape, not an error).
* Wire Schedule struct gains Manual flag.
* Schedule.fire flow unchanged — the agent only ever fires
non-manual schedules anyway.
UI:
* Add-host form retitled "Initial schedule · manual" so the
operator knows the paths become an editable schedule under
the Schedules tab. Result page calls out the manual schedule
+ points at Host > Schedules.
* Schedule edit form: "Manual schedule" checkbox at the top of
the When section; toggling it hides/shows the cron field via
inline JS. Server-side validator skips the cron requirement
when manual=true.
* Schedule list shows a "manual" tag under the status pill and
renders the When column as "— run-now only —" for manual rows.
Each row gets a Run-now button when the schedule is enabled
and the host is online.
Tests + go test ./... green.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
|
|
160d788bae |
P2-04: schedule editor UI
Closes the schedule foundations slice — operator can now drive the
plumbing P2-01..03 landed without touching the JSON API.
* New routes:
- GET /hosts/{id}/schedules (list)
- GET /hosts/{id}/schedules/new (create form)
- POST /hosts/{id}/schedules/new (create)
- GET /hosts/{id}/schedules/{sid}/edit (edit form)
- POST /hosts/{id}/schedules/{sid}/edit (update)
- POST /hosts/{id}/schedules/{sid}/delete (delete, confirm-then-redirect)
* List view (web/templates/pages/schedules_list.html):
status, cron, paths, retention summary, tags, edit/delete buttons.
Header shows "version N · agent in sync" or "agent at vM" when the
push hasn't been ack'd yet — backed by host_schedule_version +
applied_schedule_version. Empty-state CTA points at /schedules/new.
* Create/edit form (web/templates/pages/schedule_edit.html, shared):
cron expression with five quick-pick presets (daily 3am / every 6h
/ @hourly / weekly Sun / monthly 1st), paths textarea (one per
line), excludes textarea, tags (comma-separated), retention as six
numeric fields (mirrors restic's --keep-* flags one-for-one),
bandwidth caps, enabled toggle. Side panel explains the
reconciliation flow so the operator knows what saving actually
does. Validation errors re-render with operator's input intact.
* internal/server/http/ui_schedules.go owns the handlers; reuses
the same validateSchedule + pushScheduleSetAsync used by the JSON
API path. Each save audit-logs schedule.created / schedule.updated
/ schedule.deleted (matching the JSON API actions).
* store.RetentionPolicy gains a Summary() method ("last=7, d=14,
w=4" or "—"). Used by the list view's table cell so templates
don't have to do any conditional retention rendering.
* Two new template helpers: list (string varargs → []string, used
for the cron preset row) and joinComma (sibling to joinDot for
the rare list that wants commas). RetentionPolicy.Summary covers
the schedule-list case but the helpers are general.
* host_detail.html secondary tabs row converted from inert <div>s
into <a> links. Snapshots active by default; Schedules now points
at the new page. Jobs/Repo/Settings remain inert until their
P2 owners ship.
Hooks UI deferred to P2-15 (lands with the hook execution path).
Single-kind UI (backup only) by design — other kinds get a UI when
their job dispatch lands in P2-05..08.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
|
|
ee3ee241ea |
P1 polish: agent-as-root, init-repo flow, rest creds passthrough, UX fixes
Cohesive batch from a smoke-test session against a real rest-server.
Themed bullets:
* Agent runs as root, sandboxed via systemd. CapabilityBoundingSet
drops to CAP_DAC_READ_SEARCH + restore caps; ProtectSystem=strict
with ReadWritePaths confined to /etc + /var/lib/restic-manager;
NoNewPrivileges blocks escalation. Install script no longer
creates a service user. spec.md §4.2 / §14.1 / §14.3 explain the
rationale (matches UrBackup / Veeam / Bareos defaults; trying to
back up "everything" as an unprivileged user creates silent skips
on /home, /root, /var/lib/* with no upside vs the threat model
the agent already implies).
* Init-repo end-to-end. New JobKind="init" wired through agent
runner, restic.Env.RunInit, server dispatcher, and a UI button
(red "Initialise repo" in the run-now panel). hosts.repo_initialised_at
flips on init success, on backup success, or on a non-empty
snapshots.report. The "Run now" / "Init" / "Retry" branching now
drives both the dashboard host row and the host-detail panel.
Migrations 0004 (column), 0005 (jobs.kind CHECK widened — using
the safe create-new-then-rename pattern; first version corrupted
job_logs.job_id FK), 0006 (cleans up job_logs FK on already-
affected DBs).
* rest-server creds embedded at exec time only. restic.Env gains
RepoUsername; mergeRestCreds() builds the user:pass@-prefixed URL
inside envSlice() and never assigns it back to the struct, so
nothing slog-able ever sees the cleartext form. RedactURL helper
for any future surface that needs to log a URL safely. Both
helpers tested.
* Add-host UX. Repo password is now optional — server mints a
24-byte URL-safe random one and surfaces it once, alongside an
htpasswd snippet ("echo PASS | htpasswd -B -i ... USERNAME") so
the operator pastes one command on the rest-server host and one
on the endpoint. Result page also links the install snippet at
/install/install.sh (was /install.sh — 404'd before) and pipes
to bash (not sh — script uses set -o pipefail and other
bashisms; on Debian/Ubuntu sh is dash).
* Late-subscriber race in JobHub. A fast-failing job could finish
(DB write + Broadcast) before the browser's HX-Redirect → page
load → WS-connect path completed, so the JS sat forever waiting
on a job.finished that already passed. JobHub split into
Register + Send + Run; handleJobStream now subscribes first,
re-fetches the job, and sends a synthetic job.finished if the
state is already terminal.
* HTMX error visibility. New toast partial listens to
htmx:responseError and surfaces the response body as a
bottom-right toast — every server-side validation error now
becomes visible without per-handler JS wiring. Also handles
custom rm:toast events for future server-pushed notifications
via the HX-Trigger header. Themed via existing CSS vars.
* Dashboard rows are now whole-row clickable to host detail
(CSS card-link pattern: absolute-positioned anchor + .row-action
z-index restoration so the action button stays clickable).
"View →" on a running job links to /jobs/<id> rather than
/hosts/<id> since the row click already covers the host page.
* "Run first" / "Run first backup" → "Run now" everywhere for
consistency.
* runbook (docs/e2e-smoke.md) updated — live-log streaming step
now reflects P1-26; mentions the browser-driven Run-now flow.
* _diag/dump-creds — moved out of cmd/ so go build doesn't pick
it up; .gitignore now excludes /_diag/ entirely.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
|
|
12b72e7dde |
P1 polish: Host.default_paths interim + restic env hygiene + job_id JS quoting
Two fixes that close the loop on dashboard run-now and harden the
agent's restic invocation.
Default paths (interim until P2-01 schedules):
- 0003 migration adds default_paths TEXT NOT NULL DEFAULT '[]'
to hosts and to enrollment_tokens.
- Operator types paths in the Add-host form (textarea, one per
line). They ride on the enrol_token row alongside the
encrypted creds (paths aren't secret — plain JSON column).
- On consume, ConsumeEnrollmentToken still just burns the token;
the new GetEnrollmentTokenAttachments returns both the
re-bindable creds and the path list in one round trip, the
handler transfers them onto the new host row inside CreateHost.
- The dashboard's Run-now and host-detail's "Run backup now"
button now read Host.DefaultPaths and pass them to dispatchJob.
A host with no default paths returns 400 with a friendly
"no paths set" message instead of dispatching a doomed
`restic backup` with no positional args.
- Doc comments explicitly call this out as a Phase 1 interim —
schedules supersede.
Restic env hygiene:
- envSlice() previously omitted HOME / XDG_CACHE_HOME, which
bit the smoke runs whenever the agent was launched outside
systemd (restic refused to start: "neither $XDG_CACHE_HOME
nor $HOME are defined"). Now both are set explicitly: prefer
Env.ExtraEnv overrides, fall back to the agent process's own
HOME, and finally to /var/lib/restic-manager.
- Comment makes the env policy explicit: parent's RESTIC_* /
AWS_* / B2_* env is filtered out by design — control-plane
is the unambiguous source of truth.
JS bug fix in the live log page:
- {{$job.ID | printf "%q"}} produced a literal-quoted JS string,
which then went into the WS URL as ".../jobs/"<ID>"/stream"
→ 404. Switched to '{{$job.ID}}' inside the literal so
html/template's auto-escape does the right thing. Verified
end-to-end: dashboard "Run now" → live progress + log lines
arrive over the WS → succeeded pill renders.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
|
|
bd434bd1d0 |
P1-26: live job log viewer + WS browser fan-out hub
Closes the P1-21 remainder.
internal/server/ws/jobhub.go — new JobHub. Per-job_id set of
subscribers; each gets a 64-deep buffered channel with a writer
goroutine. Broadcast is non-blocking: if a subscriber is slow,
its channel fills and messages are dropped for that subscriber
only — the agent's read loop is never blocked by a stuck browser.
The agent dispatchAgentMessage path mirrors job.started /
job.progress / log.stream / job.finished envelopes onto the hub
in addition to its existing persistence work. The wire shape is
the same end-to-end, so client-side JS switches on env.type the
same way Go code does.
GET /api/jobs/{id}/stream is the browser endpoint. Auth via
session cookie (HTTP layer); upgrade; subscribe; pump until
context closes.
GET /jobs/{id} renders the live log page. Three states (queued/
running/succeeded/failed) drive the header pill, the progress
bar block, the failure summary panel, and the action button
(Cancel job while running, Back to host afterwards). Already-
persisted log lines are server-rendered on initial load; new
lines arrive over the WS and append to #log-stream. Auto-scrolls
unless the user scrolls up (a "⇢ Follow" pill re-attaches).
On job.finished the page reloads after 600ms to pick up the
final-state header rendered server-side.
POST /hosts/{id}/run-backup now sets HX-Redirect → /jobs/{job_id}
on success so HTMX lands the operator straight on the live log.
For non-HTMX callers (curl / plain form post) it 303s to the
same target.
store.ListJobLogs returns persisted log lines for initial render
on page load.
Browser-verified end-to-end: enrol → run a real backup against a
sibling restic/rest-server → live progress + 11 log lines stream
in → succeeded pill + final stats land after page reload.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
|
|
26a2b85e13 |
P1-25: host detail page (snapshots tab default)
GET /hosts/{id} renders the v1 host detail layout:
- persistent header: status dot (pulse if a job is in flight),
monospace name, tags, plus a metadata strip (os/arch, agent
version, restic version, "last seen Xs ago" or "online · last
heartbeat …").
- vitals strip: four tiles for last backup (status + relative
time), repo size, snapshot count, open alerts.
- sub-tabs: Snapshots is active; Jobs / Repo / Settings are
visible but inert until P2.
- snapshot table: short id, time (absolute), paths joined with
" · ", size, file count, restore button (disabled — wires up
in P3).
- right rail: run-now stack (backup live, forget/prune/check/
unlock disabled with the Phase tag), danger-zone remove panel
(also disabled for now).
Empty state: when a host has no snapshots yet, the table replaces
itself with a "no snapshots yet" prompt that includes the run-now
button (provided the agent is online).
Pagination cap of 50 most-recent snapshots; full pagination lands
when fleet sizes demand it.
Template helpers grew: comma() now accepts int / int32 / int64 so
templates don't fight Go's type inference; joinDot() concatenates
a []string with " · "; absTime() formats time.Time as
YYYY-MM-DD HH:MM:SS; the existing relTime() already accepts T or
*T after P1-27.
Browser-verified end-to-end with seeded fixture data.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
|
|
dad8c7fe99 |
P1-27: Add host flow — form + minted-token result page
GET /hosts/new renders the focused two-column form (hostname,
tags, repo URL/username/password). POST /hosts/new validates,
mints a one-time token via the new mintEnrollmentToken helper —
shared with the existing JSON /api/enrollment-tokens endpoint —
and re-renders the same page in result state showing:
- the install command with RM_SERVER + RM_TOKEN filled in (and
an inline copy-to-clipboard button),
- an "awaiting agent connection" panel with the hostname
pre-filled,
- a troubleshooting list pointing at the most common reasons
the agent doesn't appear,
- back-to-dashboard / add-another-host links.
publicURL() resolves RM_BASE_URL first, falling back to scheme +
Host on the inbound request — useful for local smoke without a
proxy.
Browser-verified end-to-end: form submit → token minted → install
command renders with the right values from the form input.
template fn formatRelTime now accepts time.Time *or* *time.Time
so templates can pass either without fighting Go's lack of an
address-of operator.
Deferred: download-preconfigured-installer (a templated .sh with
the values baked in) — copy-paste covers v1; nice-to-have later.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
|
|
ee16bc7ce7 |
P1-24: live dashboard — fleet summary tiles + host table
Server-rendered HTML view backed by:
- new store.FleetSummary aggregating host counts + repo bytes +
snapshot total + open alerts + last-24h job rollup in two queries.
- GET /api/hosts (JSON list of hosts in the dashboard projection).
- GET /api/fleet/summary (JSON aggregate, same shape as above).
The HTML page (web/templates/pages/dashboard.html) renders the four
summary tiles + host table directly from store data — no separate
fetch. Per-row state colour comes from .host-row.{degraded,failed,
offline} which paint a 3px left edge so problem hosts are scannable
without reading. HTMX is loaded into the base layout so per-row
"Run now" buttons can hx-post to /hosts/{id}/run-backup, a thin
HTML wrapper that funnels into a new dispatchJob helper shared
with the JSON /api/hosts/{id}/jobs endpoint.
Empty state (zero hosts) collapses to the "no hosts yet" prompt
with the + Add host CTA — matches the v1 mockup.
Template helpers (internal/server/ui/funcs.go) added for byte
formatting (412 GB / 3.7 TB), relative time (3m ago / 2d ago), and
comma grouping (1,847). Pure Go, no template-magic dependency.
Browser-verified end-to-end with seeded fixture data: five hosts
across all four states render with correct dots, accents, last-
backup pills, sizes, snapshot counts, alerts, tags, and the right
action button (Run now / Retry / Run first / View → / offline).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
|
|
229f89fee2 |
P1-23 / P1-28: base layout, login, session-aware nav + Tailwind build
P1-28: Tailwind standalone CLI wired into the Makefile. `make tailwind` downloads the pinned v3.4.17 binary into bin/tailwindcss (gitignored), builds web/styles/input.css → web/static/css/styles.css. `make build` now runs the CSS pass first; `make tailwind-watch` for dev. Output is embedded in the binary via web.FS — single static binary, no Node. The CSS source carries every component class the v1 mockups defined (status dots, buttons, host row, log viewer, progress bar, fields, chips, snippet panel, empty state) so screens that land later can just reach for them. P1-23: html/template tree at web/templates with two layouts (base with chrome, chromeless for login + bootstrap), one nav partial, and two pages (dashboard placeholder, login). internal/server/ui parses the tree at startup; ui_handlers.go in the http package wires: GET / dashboard (303 → /login when unauthed) GET /login sign-in form POST /login consume form, mint session cookie, 303 → / POST /logout drop cookie, 303 → /login GET /static/* embedded Tailwind bundle The HTML login flow shares store/session logic with /api/auth/login via a new authenticateAndSession helper — same security guarantees, two surface representations (HTML form / JSON). Verified end-to-end: bootstrap → form-login → authed dashboard → sign-out → 303 cycle works in the browser; Tailwind output emits only the component classes referenced in the live templates (9.6kB minified). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |