Cohesive batch from a smoke-test session against a real rest-server.
Themed bullets:
* Agent runs as root, sandboxed via systemd. CapabilityBoundingSet
drops to CAP_DAC_READ_SEARCH + restore caps; ProtectSystem=strict
with ReadWritePaths confined to /etc + /var/lib/restic-manager;
NoNewPrivileges blocks escalation. Install script no longer
creates a service user. spec.md §4.2 / §14.1 / §14.3 explain the
rationale (matches UrBackup / Veeam / Bareos defaults; trying to
back up "everything" as an unprivileged user creates silent skips
on /home, /root, /var/lib/* with no upside vs the threat model
the agent already implies).
* Init-repo end-to-end. New JobKind="init" wired through agent
runner, restic.Env.RunInit, server dispatcher, and a UI button
(red "Initialise repo" in the run-now panel). hosts.repo_initialised_at
flips on init success, on backup success, or on a non-empty
snapshots.report. The "Run now" / "Init" / "Retry" branching now
drives both the dashboard host row and the host-detail panel.
Migrations 0004 (column), 0005 (jobs.kind CHECK widened — using
the safe create-new-then-rename pattern; first version corrupted
job_logs.job_id FK), 0006 (cleans up job_logs FK on already-
affected DBs).
* rest-server creds embedded at exec time only. restic.Env gains
RepoUsername; mergeRestCreds() builds the user:pass@-prefixed URL
inside envSlice() and never assigns it back to the struct, so
nothing slog-able ever sees the cleartext form. RedactURL helper
for any future surface that needs to log a URL safely. Both
helpers tested.
* Add-host UX. Repo password is now optional — server mints a
24-byte URL-safe random one and surfaces it once, alongside an
htpasswd snippet ("echo PASS | htpasswd -B -i ... USERNAME") so
the operator pastes one command on the rest-server host and one
on the endpoint. Result page also links the install snippet at
/install/install.sh (was /install.sh — 404'd before) and pipes
to bash (not sh — script uses set -o pipefail and other
bashisms; on Debian/Ubuntu sh is dash).
* Late-subscriber race in JobHub. A fast-failing job could finish
(DB write + Broadcast) before the browser's HX-Redirect → page
load → WS-connect path completed, so the JS sat forever waiting
on a job.finished that already passed. JobHub split into
Register + Send + Run; handleJobStream now subscribes first,
re-fetches the job, and sends a synthetic job.finished if the
state is already terminal.
* HTMX error visibility. New toast partial listens to
htmx:responseError and surfaces the response body as a
bottom-right toast — every server-side validation error now
becomes visible without per-handler JS wiring. Also handles
custom rm:toast events for future server-pushed notifications
via the HX-Trigger header. Themed via existing CSS vars.
* Dashboard rows are now whole-row clickable to host detail
(CSS card-link pattern: absolute-positioned anchor + .row-action
z-index restoration so the action button stays clickable).
"View →" on a running job links to /jobs/<id> rather than
/hosts/<id> since the row click already covers the host page.
* "Run first" / "Run first backup" → "Run now" everywhere for
consistency.
* runbook (docs/e2e-smoke.md) updated — live-log streaming step
now reflects P1-26; mentions the browser-driven Run-now flow.
* _diag/dump-creds — moved out of cmd/ so go build doesn't pick
it up; .gitignore now excludes /_diag/ entirely.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
End-to-end smoke test (P1-34)
A runbook for verifying the Phase 1 happy path against a real
restic/rest-server. Run this on any Linux host with Docker; nothing
here touches your real Proxmox cluster or Unraid storage.
The test exercises:
- Operator mints an enrollment token with repo creds (P1-32).
- Agent enrols, server burns the token, host_credentials row lands.
- Agent connects over WS, server pushes config.update containing the decrypted creds before the agent sees any command.
- Agent persists creds into secrets.enc (P1-33).
- Run-now backup against the live restic/rest-server.
- snapshots.report updates the per-host projection.
- GET /api/hosts/{id}/snapshots returns the new snapshot.
Total time: ~5 minutes on a warm machine.
Prereqs
- Docker + Docker Compose
- restic v0.16+ on the host running the agent (the agent deliberately does not install it; see spec §4.2)
- curl, jq
Layout
Everything lives under /tmp/rm-smoke/. Nothing escapes it; remove the
directory to clean up.
/tmp/rm-smoke/
├── compose.yaml # rest-server + control-plane
├── data/ # control-plane SQLite + secret key
│ └── agent-binaries/ # built agent binaries served by /agent/binary
├── rest/ # rest-server data volume
│ └── htpasswd
└── agent/ # this host plays the part of an endpoint
├── etc/ # → bind-mounted as /etc/restic-manager
└── var-lib/ # → bind-mounted as /var/lib/restic-manager
1. Build the binaries
mkdir -p /tmp/rm-smoke/data/agent-binaries
cd ~/src/restic-manager
make build
cp bin/restic-manager-agent /tmp/rm-smoke/data/agent-binaries/restic-manager-agent-linux-amd64
The server's /agent/binary?os=linux&arch=amd64 resolves to that path.
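The filename encodes the target platform. A minimal sketch, assuming the server looks up binaries as restic-manager-agent-<os>-<arch> (inferred from the single linux/amd64 example above, not confirmed for other platforms); agent_binary_name is a hypothetical helper, not part of the project:

```shell
# Hypothetical helper: derive the served filename from an os/arch pair.
# Assumes the restic-manager-agent-<os>-<arch> pattern generalises
# beyond the linux/amd64 example in this runbook.
agent_binary_name() {
  local os="${1:-linux}" arch="${2:-amd64}"
  printf 'restic-manager-agent-%s-%s\n' "$os" "$arch"
}

# Usage: copy the freshly built agent under the name the server expects.
# cp bin/restic-manager-agent \
#   "/tmp/rm-smoke/data/agent-binaries/$(agent_binary_name linux amd64)"
```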
2. Compose the stack
/tmp/rm-smoke/compose.yaml:
services:
rest-server:
image: restic/rest-server:latest
restart: unless-stopped
environment:
- OPTIONS=--no-auth # smoke-test only; real deploys use --append-only + htpasswd
ports:
# Mapped to 8100 because most dev boxes already have something
# on 8000. Use any free port; just keep the URLs below in sync.
- "127.0.0.1:8100:8000"
volumes:
- ./rest:/data
control-plane:
image: ghcr.io/dcglab/restic-manager:dev # or build locally; see §1
restart: unless-stopped
ports:
- "127.0.0.1:8080:8080"
volumes:
- ./data:/data
environment:
- RM_LISTEN=:8080
- RM_DATA_DIR=/data
- RM_BASE_URL=http://127.0.0.1:8080
- RM_SECRET_KEY_FILE=/data/secret.key
- RM_COOKIE_SECURE=false # smoke-test only — we're on plain HTTP
For local-only smoke: skip the image and run the server straight from
the binary instead, pointing at /tmp/rm-smoke/data:
RM_LISTEN=:8080 RM_DATA_DIR=/tmp/rm-smoke/data \
RM_SECRET_KEY_FILE=/tmp/rm-smoke/data/secret.key \
RM_COOKIE_SECURE=false \
./bin/restic-manager-server
Either way, watch stderr for the bootstrap token — printed on first run, used in the next step.
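If you script the later steps, it can help to wait until the server actually answers before calling it. A minimal sketch of a retry loop; wait_for is a hypothetical helper, and probing the root URL is an assumption (any HTTP response, even an error page, counts as "up" here):

```shell
# Hypothetical helper: retry a command until it succeeds or attempts run out.
# Usage: wait_for <attempts> <delay-seconds> <command...>
wait_for() {
  local attempts="$1" delay="$2"
  shift 2
  local i=0
  while [ "$i" -lt "$attempts" ]; do
    if "$@" >/dev/null 2>&1; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}

# Example: wait up to roughly 30 s for the control plane to accept requests.
# curl exits non-zero while the connection is still refused.
# wait_for 30 1 curl -s -o /dev/null http://127.0.0.1:8080/
```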
3. Bootstrap the admin account
BOOTSTRAP_TOKEN='<paste from server logs>'
curl -s -X POST http://127.0.0.1:8080/api/bootstrap \
-H 'content-type: application/json' \
-d "{\"token\":\"$BOOTSTRAP_TOKEN\",\"username\":\"admin\",\"password\":\"correct horse battery staple\"}"
4. Mint an enrollment token (with repo creds)
curl -s -c /tmp/rm-smoke/cookies -X POST http://127.0.0.1:8080/api/auth/login \
-H 'content-type: application/json' \
-d '{"username":"admin","password":"correct horse battery staple"}'
ENROLL=$(curl -s -b /tmp/rm-smoke/cookies -X POST http://127.0.0.1:8080/api/enrollment-tokens \
-H 'content-type: application/json' \
-d '{
"hostname":"smoke-host",
"repo_url":"rest:http://127.0.0.1:8100/smoke/",
"repo_username":"",
"repo_password":"smoke-pw"
}')
TOKEN=$(echo "$ENROLL" | jq -r .token)
echo "token: $TOKEN"
If the server rejects with missing_field, you forgot
repo_url/repo_password — both are required (P1-32).
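Because jq -r prints the literal string null when a key is absent, a rejected request leaves TOKEN set to "null" rather than empty. A small guard, sketched here with a hypothetical require_value helper, makes that failure obvious before the enrol step:

```shell
# Hypothetical guard: fail when a value extracted with `jq -r` is missing.
# jq -r prints the literal string "null" for an absent key.
require_value() {
  local name="$1" value="$2"
  if [ -z "$value" ] || [ "$value" = "null" ]; then
    echo "missing $name; inspect the raw server response" >&2
    return 1
  fi
}

# Usage after minting the token:
# require_value token "$TOKEN" || echo "$ENROLL"
```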
5. Initialise the rest-server repo
restic/rest-server will lazy-create the path on first write, but
restic itself wants the repo initialised:
RESTIC_PASSWORD=smoke-pw \
restic -r rest:http://127.0.0.1:8100/smoke/ init
6. Pretend to be a fresh endpoint
The agent will write agent.yaml + secrets.enc under
/tmp/rm-smoke/agent/etc and /tmp/rm-smoke/agent/var-lib. We point
both at those dirs to keep the smoke run isolated from your real
/etc/restic-manager.
mkdir -p /tmp/rm-smoke/agent/etc /tmp/rm-smoke/agent/var-lib
CONFIG=/tmp/rm-smoke/agent/etc/agent.yaml
# Pre-write the secrets path so we don't hit the system default.
cat > "$CONFIG" <<EOF
secrets_path: /tmp/rm-smoke/agent/var-lib/secrets.enc
EOF
# Enroll. This call talks to the server, returns the persistent
# bearer, and writes server_url/host_id/agent_token back into
# agent.yaml. secrets_key is minted on the first normal start,
# and secrets.enc is empty until the first config.update push lands.
./bin/restic-manager-agent \
-config "$CONFIG" \
-enroll-server http://127.0.0.1:8080 \
-enroll-token "$TOKEN"
# Read off the host_id for later steps.
HOST_ID=$(grep host_id "$CONFIG" | awk '{print $2}' | tr -d '"')
echo "host id: $HOST_ID"
After enrolment, agent.yaml should contain host_id: (a ULID),
agent_token:, and server_url:. It will not contain
secrets_key: yet — that's minted on the first non-enroll start
of the agent (next step). It should not contain repo_url:
or repo_password: (those never appear in plaintext on disk).
cat "$CONFIG"
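The expectations above can also be checked mechanically instead of by eye. A minimal sketch, assuming each key sits at the start of its own line as "key:" (the file layout is not shown in this runbook, so treat that as an assumption); check_agent_yaml is a hypothetical helper:

```shell
# Hypothetical check for the post-enrolment agent.yaml described above.
# Assumes top-level keys appear at the start of a line as "key:".
check_agent_yaml() {
  local file="$1" key status=0
  # Keys that enrolment should have written.
  for key in host_id agent_token server_url; do
    if ! grep -q "^${key}:" "$file"; then
      echo "missing expected key: $key" >&2
      status=1
    fi
  done
  # Keys that must never appear in plaintext on disk.
  for key in repo_url repo_password; do
    if grep -q "^${key}:" "$file"; then
      echo "unexpected plaintext key: $key" >&2
      status=1
    fi
  done
  return "$status"
}

# Usage: check_agent_yaml "$CONFIG"
```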
7. Run the agent
In a second terminal:
./bin/restic-manager-agent -config /tmp/rm-smoke/agent/etc/agent.yaml
You should see, in order:
agent starting host_id=01H… server=http://127.0.0.1:8080 …
ws agent connected protocol_version=…
ws agent: repo credentials updated via config.update
That last line confirms slice 1 + 2 of P1-32/33: the server pushed
the encrypted creds, the agent decrypted, persisted to secrets.enc,
and is now ready to back up. secrets.enc should now exist and be
0600. agent.yaml should now also contain a freshly-minted
secrets_key: (base64-encoded 32 bytes).
ls -l /tmp/rm-smoke/agent/var-lib/secrets.enc
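To assert the permission bits rather than read them off the ls output, something like the following works on Linux. It is a sketch using GNU stat; is_owner_only is a hypothetical name:

```shell
# Hypothetical check: succeed only when the file mode is exactly 600.
# Uses GNU stat (-c), which matches the Linux hosts this runbook targets.
is_owner_only() {
  [ "$(stat -c '%a' "$1")" = "600" ]
}

# Usage:
# is_owner_only /tmp/rm-smoke/agent/var-lib/secrets.enc \
#   || echo "secrets.enc is not 0600" >&2
```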
8. Run a backup
Back in the first terminal:
JOB=$(curl -s -b /tmp/rm-smoke/cookies -X POST \
"http://127.0.0.1:8080/api/hosts/$HOST_ID/jobs" \
-H 'content-type: application/json' \
-d '{"kind":"backup","args":["/etc/hostname","/etc/os-release"]}')
JOB_ID=$(echo "$JOB" | jq -r .job_id)
echo "job: $JOB_ID"
The agent terminal will show restic chugging through two tiny files;
the server terminal will log the lifecycle (mark job started /
mark job finished / snapshots refreshed count=1).
For a browser-driven version of the same flow, log in at
http://127.0.0.1:8080/ and click Run now on the host row — the
button posts to /hosts/{id}/run-backup and the response sets
HX-Redirect to the live log page, which subscribes to
/api/jobs/{id}/stream (P1-26) and tails job.progress / log.stream
until job.finished flips it to the final header.
9. Confirm the snapshot
curl -s -b /tmp/rm-smoke/cookies \
"http://127.0.0.1:8080/api/hosts/$HOST_ID/snapshots" | jq
Expect one snapshot with the two paths and a non-zero size_bytes.
10. Verify the redacted credential view (sanity)
curl -s -b /tmp/rm-smoke/cookies \
"http://127.0.0.1:8080/api/hosts/$HOST_ID/repo-credentials" | jq
Expect {"repo_url":"rest:http://127.0.0.1:8100/smoke/","has_password":true}.
The password is never returned over this endpoint.
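A scripted version of this sanity check can simply confirm that the known smoke password does not appear anywhere in the response body. A minimal sketch; response_hides is a hypothetical helper and does a plain substring match, nothing JSON-aware:

```shell
# Hypothetical check: succeed only if the secret is absent from the body.
response_hides() {
  local body="$1" secret="$2"
  case "$body" in
    *"$secret"*) return 1 ;;
    *) return 0 ;;
  esac
}

# Usage:
# BODY=$(curl -s -b /tmp/rm-smoke/cookies \
#   "http://127.0.0.1:8080/api/hosts/$HOST_ID/repo-credentials")
# response_hides "$BODY" smoke-pw && echo "password not echoed back"
```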
11. Edit creds + verify push-on-update
curl -s -b /tmp/rm-smoke/cookies -X PUT \
"http://127.0.0.1:8080/api/hosts/$HOST_ID/repo-credentials" \
-H 'content-type: application/json' \
-d '{"repo_password":"new-smoke-pw"}'
Agent terminal should log repo credentials updated via config.update
again. (Backups will then fail until you also update the rest-server
auth — but that proves the push path is live.)
Cleanup
docker compose -f /tmp/rm-smoke/compose.yaml down -v
rm -rf /tmp/rm-smoke
What this runbook does NOT cover
These are intentionally out of scope for Phase 1; revisit when the relevant tasks land:
- TLS termination at a reverse proxy (covered by P5-07 reference deployment)
- Append-only restic creds + separate prune credential (P2-06)
- Cancellation (P2)
- Schedule-driven backups (P2-01 onwards)
- Windows agent (P2-16/17)