restic-manager

Author	SHA1	Message	Date
steve	c6f73f790d	ci: pull ci-runner-go from zot registry	2026-05-15 19:51:02 +00:00
steve	068f08d96d	ci: migrate release workflow to zot registry	2026-05-15 19:50:50 +00:00
steve	68276810ec	e2e: dump error-context.md to log on failure + bump upload-artifact The Playwright run produces error-context.md per failed test with a full DOM snapshot — useful for triaging UI test failures without round-tripping through downloaded artifacts. Cat it into the workflow log on failure. Also bump actions/upload-artifact v3 → v4. v3 uploads still return success on this Gitea runner but the artifacts don't surface through the API or UI; v4 is the correct version per the workflow header note.	2026-05-08 21:41:38 +01:00
steve	e8804922b5	e2e: extract Playwright report via docker cp instead of bind mount When the runner job runs inside a container, compose's relative `./playwright/playwright-report` resolves to a path that exists only inside the runner container, so the host's docker daemon silently bind-mounts an empty dir and the report never lands anywhere we can read. Drop the bind mounts; keep the playwright container around (--name e2e-pw, no --rm); after the test, `docker cp` the report and traces out into the runner's workspace volume so upload-artifact has something real to upload. The new test-results directory (Playwright traces, screenshots, videos) is also included so failure post-mortem doesn't need a re-run.	2026-05-08 21:36:09 +01:00
steve	a8026608ae	ci: force bash as default shell in container jobs When jobs run with `container:` set, Gitea Actions defaults to `sh -e` (dash on Ubuntu), so `set -euo pipefail` fails with "Illegal option -o pipefail". Pinning bash workflow-wide matches what the runner used pre-container and keeps existing scripts portable.	2026-05-08 21:10:33 +01:00
steve	6c23bdbe63	ci: run jobs in ci-runner-go container Pin every job to gitea.dcglab.co.uk/steve/ci-runner-go:2026-05-08 so Go, Node, and Docker tooling are already installed when the job starts. Drops three actions/setup-go invocations from ci.yml (redundant — Go is on PATH) and inherits Buildx + Compose v2 in e2e.yml and release.yml without per-job apt-installs. Recipe lives in steve/ci. Bump the date pin in lockstep across the three workflows when picking up a fresher image (e.g. when the Go floor moves).	2026-05-08 21:06:38 +01:00
steve	a087321570	e2e: build playwright image with --profile test --pull Without --profile test, `docker compose build` skips the playwright service (profiles: [test]) and the image is built on-demand by `compose run` instead. Across CI runs the Gitea runner caches the resulting tag, so a Dockerfile FROM bump (v1.50.0 → v1.59.1) is masked by the cached image — the container ends up with old browser binaries and Playwright's own version-mismatch check fails the suite. Pull base images on every build so the FROM tag wins.	2026-05-08 20:15:21 +01:00
steve	af2cb292b8	e2e: run health probe + Playwright on the compose network Gitea's act-style runners execute workflow steps inside a runner container, so compose's host port-publish (127.0.0.1:8080:8080) is not reachable from the steps. PR #23's e2e job timed out waiting for the server even though the container was up and listening. Move both the health probe and the Playwright run onto rmnet so they address the server as http://server:8080: * health probe: docker run --rm --network e2e_rmnet curlimages/curl * Playwright: new mcr.microsoft.com/playwright-based image, added as a profile-gated `playwright` service in compose.e2e.yml, invoked via `docker compose run --rm playwright`. Drops the setup-node + npm install runner steps.	2026-05-08 20:08:23 +01:00
steve	bb4ed3502d	P5: OSS readiness — docs site, contributor onboarding, e2e harness P5-01 — Documentation site under docs/book/ rendered with mdBook (downloaded via Makefile, same static-binary pattern as Tailwind). Structured chapters: getting started, concepts, operations, security, reference. `make docs` / `make docs-watch`. Generated output gitignored. P5-02 — CONTRIBUTING.md rewritten from placeholder to a full guide. CODE_OF_CONDUCT.md adapted from Contributor Covenant for a single-maintainer project. .gitea/issue_template/{bug,feature}.md and PULL_REQUEST_TEMPLATE.md. P5-04 — Six README screenshots captured live from a fresh server bootstrap (login, empty dashboard, add-host, alerts, settings, audit log). README rewritten to centre the screenshot grid and link out to the docs site. P5-05 — SECURITY.md with disclosure policy (3-day ack, 30-day default window), scope in/out, threat-model summary, operator hardening checklist. Mirrored as a docs-site chapter. P5-06 — End-to-end test harness. e2e/compose.e2e.yml brings up server + sibling Linux agent (alpine + restic) + restic/rest-server. Agent uses announce-and-approve so Playwright can drive the full operator flow: bootstrap → login → accept pending → backup → verify terminal status. Second spec scrapes /metrics to assert the P6-04 endpoint surface. .gitea/workflows/e2e.yml runs on every PR; local how-to in docs/e2e.md.	2026-05-08 20:08:23 +01:00
steve	ab7fee0ae7	ci(release): use DEV_TOKEN for registry login The auto-issued GITHUB_TOKEN lacks write:package scope on this Gitea instance, so the v0.9.0 tag build failed at docker login. Switch to the user-level DEV_TOKEN secret which has the correct scope.	2026-05-06 19:05:54 +01:00
steve	fb978ad10c	p5-03: docker-only release path (drop goreleaser) Single public deliverable per tag: a multi-arch server image, with cross-compiled agent binaries + install scripts + the systemd unit baked under /opt/restic-manager/dist/. The /agent/binary and /install/* handlers fall back from <DataDir>/... to that read-only path so a fresh container Just Works without first-run staging; operators can still drop a custom build into <DataDir>/ to override per-host. Architecture rationale: agent distribution already routes through the running server, so the release surface mirrors that — there's no second source of truth to keep in sync. Workflow .gitea/workflows/release.yml triggers on v*.*.* tag-push (fan-out :vX.Y.Z / :X.Y / :X, plus :latest once MAJOR>=1) and workflow_dispatch (snapshot tag only). Pushes to the Gitea container registry on this instance. Both binaries grow main.commit + main.date ldflag targets. Makefile and Dockerfile fill them; release workflow forwards from gitea.sha plus a UTC timestamp. Spec : docs/superpowers/specs/2026-05-05-p5-03-docker-only-release.md Plan : docs/superpowers/plans/2026-05-05-p5-03-docker-only-release.md	2026-05-05 15:18:48 +01:00
steve	b2983aed52	ci: shard test job + cheap argon2 in test mode Test job was wall-clocked by `internal/server/http` (~156s on the self-hosted runner under -race). Two changes here cut that: 1. Matrix-shard the test job by package group: server-http, store, and "rest" (everything else, computed via `go list \| grep -v`). Each shard runs on its own runner so the heavy package isn't CPU-starved by siblings. 2. `auth.HashPassword` drops to cheap argon2id params (8 KiB / 1 iter / 1 lane) when `testing.Testing()` returns true. Production params are unchanged. VerifyPassword reads params from the encoded hash so cheap-params hashes verify identically — no test call sites need to change.	2026-05-05 08:40:50 +01:00
steve	e73c4bd96c	infra: remove provision-gitea-runner.sh (now lives with the infra team) The runner-provisioning script has been handed off to the infra agent, who will own it going forward. ci.yml's header comment is updated to point at "the infra team owns the script" rather than the in-repo path, but the runner expectations themselves stay the same — workflows still rely on the persistent volumes, pre-cloned actions, and host-installed golangci-lint that any compliant provisioning produces.	2026-05-04 10:19:09 +01:00
steve	bd460d7532	ci+infra: provisioning script for gitea runners + drop setup-go cache scripts/provision-gitea-runner.sh is a one-shot, idempotent host setup for an act_runner LXC. It mounts persistent host volumes for GOMODCACHE / GOCACHE / act-clones, pre-pulls the runner image, pre-clones the common GitHub actions, installs golangci-lint, and sets up a nightly cron to refresh the lot. Generic — no per-project state. With those persistent volumes in place, `cache: true` on actions/setup-go becomes a net negative — the action keeps tar-ing / un-tar-ing GOMODCACHE+GOCACHE through the Gitea cache backend on every job, adding ~10s per job and overwriting the volume contents. Drop it from all three jobs in ci.yml. Add a header comment block explaining the runner-side expectations and the Go version / build matrix / upload-artifact context for anyone reading later.	2026-05-04 09:40:27 +01:00
steve	d9c8da139c	ci: bump golangci-lint to v2.5.0 (Go 1.25-built binary) The v2.1.6 release binary is built with Go 1.24, and golangci-lint refuses to load a config targeting a newer toolchain than itself ('Go language version (go1.24) used to build golangci-lint is lower than the targeted Go version (1.25.0)'). go.mod is on 1.25, so the binary needs to be too. Locally this didn't bite because 'go install …@v2.1.6' compiled v2.1.6 against the local Go 1.25 toolchain; CI uses the prebuilt release tarball which carries the build-time Go version. v2.5.0 is the first v2.x line built with Go 1.25 — pin in lockstep with go.mod going forward.	2026-05-03 21:29:02 +01:00
steve	b6f8de1dcc	lint: drive baseline to zero, drop only-new-issues gate Cleanup pass over the repo so CI can enforce lint going forward without the only-new-issues escape hatch: * gofumpt -w across the tree (31 hits, all formatting) * misspell --fix (25 hits, US-locale spelling) — but reverted on api.JobCancelled = "cancelled" since that literal is the wire + DB CHECK constraint value, plus matched the case in store/fleet.go back to "cancelled" and added //nolint:misspell on both for the next time someone reaches for the auto-fix * Wrap every `defer rows.Close()` / `defer stmt.Close()` / `defer res.Body.Close()` in `defer func() { _ = .Close() }()` to satisfy errcheck without losing the close itself * websocket.Dial callers (1 prod, 4 tests) now capture + close the upgrade response Body — coder/websocket can return res with a nil Body on success, so the test deferred-closes guard against that * Annotate the two genuine-by-design nilerr cases with //nolint comments explaining why nil-on-error is the contract (cookie missing = no session; ctx cancelled mid-backoff = clean shutdown) * Add brief godoc on the 10 exported const groups + types that revive flagged (api.HostOS/HostArch/JobKind/JobStatus/LogStream/ ErrorCode, restic.EventKind, store.Role, web.FS) * Drop the unused (*Server).userByID method * Inline the unparam baseView(active) — every UI page is under the dashboard primary nav today Result: `golangci-lint run ./...` reports 0 issues. CI lint job no longer needs only-new-issues: true; X-06 follow-up entry in tasks.md removed.	2026-05-03 16:15:17 +01:00
steve	41c3ec7c6f	ci: migrate .golangci.yml to v2 schema + only-new-issues gate The bump from golangci-lint-action@v6 → v7 (which downloads the v2.x binary) was blocking CI lint with 'unsupported version of the configuration: ""' because .golangci.yml was still in the v1 schema. Migrate the config to v2: * version: "2" prelude * disable-all → default: none * linters-settings → linters.settings * gofumpt + goimports move into formatters.enable + formatters.settings * exclude-rules move into linters.exclusions.rules * gosimple drops (folded into staticcheck in v2) Fix the four lint hits in the new P2R-02 code: * host_bandwidth.go: convert hostBandwidthRequest directly to hostBandwidthView via type conversion (S1016) * ui_repo.go: drop unparam savedSection + status arguments from renderRepoPage (always "" / always 422 — split GET render from validation-fail render) * ui_schedules.go: gofumpt formatting on the scheduleEditPage struct Add only-new-issues: true to the lint job. The repo carries ~90 pre-existing findings (gofumpt drift × 31, misspell × 25, missing godoc × 10, bodyclose × 6, errcheck × 12, …) accumulated before lint was actually wired into CI. Without this gate, every PR would fail on baseline noise instead of its own changes. Track the cleanup as X-06 in tasks.md so the gate is temporary.	2026-05-03 15:00:24 +01:00
steve	9ac5088fde	P2R-02 slice 4: Repo tab — connection / bandwidth / maintenance Three independent forms on /hosts/{id}/repo so saving one section doesn't disturb the others: * Connection: edits repo URL, username, password (pre-filled from the redacted GET /api/hosts/{id}/repo-credentials view; password field shows masked stored-creds placeholder; blank password = keep existing). On save, encrypts and pushes config.update to a connected agent. * Bandwidth: host-wide upload/download caps (KB/s; blank = no cap) written via store.SetHostBandwidth. New REST endpoint PUT /api/hosts/{id}/bandwidth for JSON callers. * Maintenance: forget/prune/check cadences + check subset %, with per-row enabled toggles. Reuses cronParser for validation; auto-seeds the row if a host pre-dates the migration. Right-rail surfaces repo size, snapshot count, snapshots-by-tag breakdown (counted from existing snapshot tag rows), and an 'untagged snapshots are left alone' note. Danger-zone re-init button is rendered but disabled with a hint pointing at P2R-09 (real implementation lands there). Validation re-renders the page with the relevant form's banner and all other section state intact. Successful saves redirect with a ?saved=<section> query param so the page surfaces a small ✓ saved indicator on the relevant form. ci.yml: bump golangci-lint-action v6→v7 (separate change picked up in this commit).	2026-05-03 12:14:03 +01:00
steve	84914fd6c5	ci: only trigger on PRs into main Drop the push-to-main trigger; main is fast-forward only via PR, so the post-merge run was redundant.	2026-05-03 11:25:13 +01:00
steve	c019633b77	ci: fix race-trip in enrollment fixture + bump golangci-lint to v2.1.6 - host_credentials_test.go's CreateEnrollmentToken fixture passed 1<<20 as the TTL (third arg, time.Duration) — that's ~1ms in nanoseconds. Local non-race runs finished inside the window, but -race overhead blew the deadline so the token was already expired by the time GetEnrollmentTokenAttachments / ConsumeEnrollmentToken ran. Use time.Hour instead, which matches the spirit of a per-test fixture. - Lint pin v1.61.0 was built against Go 1.23 and refuses to load a config targeting newer toolchains. go.mod is on 1.25, so the lint step exited 3 ('the Go language version used to build golangci-lint is lower than the targeted Go version'). Bumping to v2.1.6, which supports Go 1.25. Both failures showed up only on the Gitea runner because local make target runs go test without -race and lint hadn't been re-run after the go.mod toolchain bump.	2026-05-03 11:13:22 +01:00
steve	f55747a281	phase 1 foundations: api types, store, crypto, auth Lands the bottom three layers of Phase 1: P1-08 internal/api: protocol_version + envelope + every WS message shape from spec.md §6.2 (Hello, Heartbeat, Job, Schedule, etc). Wire-format tests pin the JSON shape so a rename here breaks tests instead of silently breaking the agent. P1-02 + P1-03 internal/store: SQLite via modernc.org/sqlite, embed.FS + a tiny version table for hand-rolled migrations. 0001_initial.sql covers every table from spec.md §5 plus enrollment_tokens and host_schedule_version. Typed accessors for users / sessions / enrollment / audit. WAL + foreign_keys + busy_timeout on by default. P1-06 internal/crypto: XChaCha20-Poly1305 AEAD wrapper with per-message random nonce. Key file lifecycle (generate + refuse-to-overwrite, load with size validation). Optional additionalData binds ciphertext to the row that owns it. P1-04 internal/auth (partial — passwords + tokens; sessions middleware lands with the HTTP handlers): argon2id following RFC 9106 (64 MiB / t=3 / p=4 / 32B), constant-time verify. HashToken stores SHA-256 of session/agent/enrollment tokens so a stolen DB doesn't hand over credentials. Build floor moves to Go 1.25 (modernc.org/sqlite v1.50+ requires it); CI + Dockerfile + README updated. Markdown lint diagnostics on tasks.md cleared. All packages tested. ~70 new tests pass in <1s. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-01 00:24:40 +01:00
steve	25aa001135	phase 0: project bootstrap P0-01 Go module + cmd/server + cmd/agent skeletons + internal/ tree P0-02 LICENSE (PolyForm NC 1.0.0), README, CONTRIBUTING P0-03 golangci-lint, pre-commit, .editorconfig, .gitignore P0-04 Gitea Actions CI: test (race+coverage), lint, cross-platform build matrix P0-05 Dockerfile.server (multi-stage, distroless/static), docker-compose.yml P0-06 Makefile with build/test/lint/fmt/run/release targets build, vet, test, and cross-compile to linux/{amd64,arm64} + windows/amd64 all verified locally. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-01 00:03:59 +01:00

22 Commits