ee3ee241ea
Cohesive batch from a smoke-test session against a real rest-server.
Themed bullets:
* Agent runs as root, sandboxed via systemd. CapabilityBoundingSet
drops to CAP_DAC_READ_SEARCH + restore caps; ProtectSystem=strict
with ReadWritePaths confined to /etc + /var/lib/restic-manager;
NoNewPrivileges blocks escalation. Install script no longer
creates a service user. spec.md §4.2 / §14.1 / §14.3 explain the
rationale (matches UrBackup / Veeam / Bareos defaults; trying to
back up "everything" as an unprivileged user creates silent skips
on /home, /root, /var/lib/* with no upside vs the threat model
the agent already implies).
* Init-repo end-to-end. New JobKind="init" wired through agent
runner, restic.Env.RunInit, server dispatcher, and a UI button
(red "Initialise repo" in the run-now panel). hosts.repo_initialised_at
flips on init success, on backup success, or on a non-empty
snapshots.report. The "Run now" / "Init" / "Retry" branching now
drives both the dashboard host row and the host-detail panel.
Migrations 0004 (column), 0005 (jobs.kind CHECK widened — using
the safe create-new-then-rename pattern; first version corrupted
job_logs.job_id FK), 0006 (cleans up job_logs FK on already-
affected DBs).
* rest-server creds embedded at exec time only. restic.Env gains
RepoUsername; mergeRestCreds() builds the user:pass@-prefixed URL
inside envSlice() and never assigns it back to the struct, so
nothing slog-able ever sees the cleartext form. RedactURL helper
for any future surface that needs to log a URL safely. Both
helpers tested.
* Add-host UX. Repo password is now optional — server mints a
24-byte URL-safe random one and surfaces it once, alongside an
htpasswd snippet ("echo PASS | htpasswd -B -i ... USERNAME") so
the operator pastes one command on the rest-server host and one
on the endpoint. Result page also links the install snippet at
/install/install.sh (was /install.sh — 404'd before) and pipes
to bash (not sh — script uses set -o pipefail and other
bashisms; on Debian/Ubuntu sh is dash).
* Late-subscriber race in JobHub. A fast-failing job could finish
(DB write + Broadcast) before the browser's HX-Redirect → page
load → WS-connect path completed, so the JS sat forever waiting
on a job.finished that already passed. JobHub split into
Register + Send + Run; handleJobStream now subscribes first,
re-fetches the job, and sends a synthetic job.finished if the
state is already terminal.
* HTMX error visibility. New toast partial listens to
htmx:responseError and surfaces the response body as a
bottom-right toast — every server-side validation error now
becomes visible without per-handler JS wiring. Also handles
custom rm:toast events for future server-pushed notifications
via the HX-Trigger header. Themed via existing CSS vars.
* Dashboard rows are now whole-row clickable to host detail
(CSS card-link pattern: absolute-positioned anchor + .row-action
z-index restoration so the action button stays clickable).
"View →" on a running job links to /jobs/<id> rather than
/hosts/<id> since the row click already covers the host page.
* "Run first" / "Run first backup" → "Run now" everywhere for
consistency.
* runbook (docs/e2e-smoke.md) updated — live-log streaming step
now reflects P1-26; mentions the browser-driven Run-now flow.
* _diag/dump-creds — moved out of cmd/ so go build doesn't pick
it up; .gitignore now excludes /_diag/ entirely.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
285 lines
8.5 KiB
Markdown
# End-to-end smoke test (P1-34)

A runbook for verifying the Phase 1 happy path against a real
`restic/rest-server`. Run this on any Linux host with Docker; nothing
here touches your real Proxmox cluster or Unraid storage.

The test exercises:

1. Operator mints an enrollment token **with repo creds** (P1-32).
2. Agent enrols, server burns the token, `host_credentials` row lands.
3. Agent connects over WS, server pushes `config.update` containing
   the decrypted creds **before** the agent sees any command.
4. Agent persists creds into `secrets.enc` (P1-33).
5. Run-now backup against the live `restic/rest-server`.
6. `snapshots.report` updates the per-host projection.
7. `GET /api/hosts/{id}/snapshots` returns the new snapshot.

Total time: ~5 minutes on a warm machine.

---

## Prereqs

- Docker + Docker Compose
- `restic` v0.16+ on the host running the agent (the agent does **not**
  install it; that's a deliberate design choice, see spec §4.2)
- `curl`, `jq`

## Layout

Everything lives under `/tmp/rm-smoke/`. Nothing escapes it; remove the
directory to clean up.

```
/tmp/rm-smoke/
├── compose.yaml          # rest-server + control-plane
├── data/                 # control-plane SQLite + secret key
│   └── agent-binaries/   # built agent binaries served by /agent/binary
├── rest/                 # rest-server data volume
│   └── htpasswd
└── agent/                # this host plays the part of an endpoint
    ├── etc/              # → bind-mounted as /etc/restic-manager
    └── var-lib/          # → bind-mounted as /var/lib/restic-manager
```

## 1. Build the binaries

```sh
mkdir -p /tmp/rm-smoke/data/agent-binaries
cd ~/src/restic-manager
make build
cp bin/restic-manager-agent /tmp/rm-smoke/data/agent-binaries/restic-manager-agent-linux-amd64
```

The server's `/agent/binary?os=linux&arch=amd64` resolves to that path.

## 2. Compose the stack

`/tmp/rm-smoke/compose.yaml`:

```yaml
services:
  rest-server:
    image: restic/rest-server:latest
    restart: unless-stopped
    environment:
      - OPTIONS=--no-auth   # smoke-test only; real deploys use --append-only + htpasswd
    ports:
      # Mapped to 8100 because most dev boxes already have something
      # on 8000. Use any free port; just keep the URLs below in sync.
      - "127.0.0.1:8100:8000"
    volumes:
      - ./rest:/data

  control-plane:
    image: ghcr.io/dcglab/restic-manager:dev   # or build locally; see §1
    restart: unless-stopped
    ports:
      - "127.0.0.1:8080:8080"
    volumes:
      - ./data:/data
    environment:
      - RM_LISTEN=:8080
      - RM_DATA_DIR=/data
      - RM_BASE_URL=http://127.0.0.1:8080
      - RM_SECRET_KEY_FILE=/data/secret.key
      - RM_COOKIE_SECURE=false   # smoke-test only; we're on plain HTTP
```

Bring the stack up with
`docker compose -f /tmp/rm-smoke/compose.yaml up -d`.

For a local-only smoke run, skip the control-plane image and run the
server straight from the binary instead, pointing it at
`/tmp/rm-smoke/data`:

```sh
RM_LISTEN=:8080 RM_DATA_DIR=/tmp/rm-smoke/data \
  RM_SECRET_KEY_FILE=/tmp/rm-smoke/data/secret.key \
  RM_COOKIE_SECURE=false \
  ./bin/restic-manager-server
```

Either way, watch the server's stderr for the **bootstrap token**
(with Compose, `docker compose -f /tmp/rm-smoke/compose.yaml logs control-plane`).
It is printed on first run and used in the next step.

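Before moving on, it can help to confirm that both services are answering. The helper below is a minimal sketch and not part of restic-manager: `wait_for_http` is a hypothetical name, and it assumes that any HTTP response at all (even a 401 or 404) means the service is up.

```sh
# Hypothetical helper: poll a URL until it answers or we run out of tries.
# Any HTTP response counts as "up"; only connection failures are retried.
wait_for_http() {
  url=$1
  tries=${2:-30}
  while [ "$tries" -gt 0 ]; do
    # -s: quiet, -o /dev/null: discard the body, --max-time: cap each attempt
    if curl -s -o /dev/null --max-time 2 "$url"; then
      return 0
    fi
    tries=$((tries - 1))
    sleep 1
  done
  echo "gave up waiting for $url" >&2
  return 1
}
```

Typical use would be `wait_for_http http://127.0.0.1:8080/ && wait_for_http http://127.0.0.1:8100/`, adjusting the ports if you changed the mappings.
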
## 3. Bootstrap the admin account

```sh
BOOTSTRAP_TOKEN='<paste from server logs>'
curl -s -X POST http://127.0.0.1:8080/api/bootstrap \
  -H 'content-type: application/json' \
  -d "{\"token\":\"$BOOTSTRAP_TOKEN\",\"username\":\"admin\",\"password\":\"correct horse battery staple\"}"
```

## 4. Mint an enrollment token (with repo creds)

```sh
# Log in and keep the session cookie for the later API calls.
curl -s -c /tmp/rm-smoke/cookies -X POST http://127.0.0.1:8080/api/auth/login \
  -H 'content-type: application/json' \
  -d '{"username":"admin","password":"correct horse battery staple"}'

ENROLL=$(curl -s -b /tmp/rm-smoke/cookies -X POST http://127.0.0.1:8080/api/enrollment-tokens \
  -H 'content-type: application/json' \
  -d '{
    "hostname":"smoke-host",
    "repo_url":"rest:http://127.0.0.1:8100/smoke/",
    "repo_username":"",
    "repo_password":"smoke-pw"
  }')
TOKEN=$(echo "$ENROLL" | jq -r .token)
echo "token: $TOKEN"
```

If the server rejects the request with `missing_field`, you forgot
`repo_url` or `repo_password`; both are required (P1-32).

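`jq -r .token` prints the literal string `null` when the field is absent, so a rejected request can leave `$TOKEN` looking non-empty. A small guard, sketched here under the hypothetical name `require_value`, stops the run before the agent tries to enrol with a bad token.

```sh
# Hypothetical guard: fail loudly when a jq-extracted value is empty or "null".
require_value() {
  name=$1
  value=$2
  if [ -z "$value" ] || [ "$value" = "null" ]; then
    echo "error: $name is missing; inspect the server response" >&2
    return 1
  fi
}
```

For example, `require_value token "$TOKEN" || echo "$ENROLL"` prints the raw server response whenever the token could not be extracted.
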
## 5. Initialise the rest-server repo

`restic/rest-server` will lazy-create the path on first write, but
restic itself wants the repo initialised:

```sh
RESTIC_PASSWORD=smoke-pw \
  restic -r rest:http://127.0.0.1:8100/smoke/ init
```

## 6. Pretend to be a fresh endpoint

The agent writes `agent.yaml` under `/tmp/rm-smoke/agent/etc` and
`secrets.enc` under `/tmp/rm-smoke/agent/var-lib`. Pointing both at
those dirs keeps the smoke run isolated from your real
`/etc/restic-manager`.

```sh
mkdir -p /tmp/rm-smoke/agent/etc /tmp/rm-smoke/agent/var-lib
CONFIG=/tmp/rm-smoke/agent/etc/agent.yaml

# Pre-write the secrets path so we don't hit the system default.
cat > "$CONFIG" <<EOF
secrets_path: /tmp/rm-smoke/agent/var-lib/secrets.enc
EOF

# Enroll. This call talks to the server, returns the persistent
# bearer, and writes server_url/host_id/agent_token back into
# agent.yaml. secrets_key is minted on the first normal start
# (step 7), and secrets.enc is empty until the first
# config.update push lands.
./bin/restic-manager-agent \
  -config "$CONFIG" \
  -enroll-server http://127.0.0.1:8080 \
  -enroll-token "$TOKEN"

# Read off the host_id for later steps.
HOST_ID=$(grep host_id "$CONFIG" | awk '{print $2}' | tr -d '"')
echo "host id: $HOST_ID"
```

After enrolment, `agent.yaml` should contain `host_id:` (a ULID),
`agent_token:`, and `server_url:`. It will **not** contain
`secrets_key:` yet; that is minted on the first non-enroll start
of the agent (next step). It should **not** contain `repo_url:`
or `repo_password:` (those never appear in plaintext on disk).

```sh
cat "$CONFIG"
```

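The expectations above can also be checked mechanically rather than by eye. This is a sketch only: `check_agent_yaml` is a hypothetical helper, and it assumes every key sits at the top level of `agent.yaml`, starting in column 0.

```sh
# Hypothetical check for the post-enrolment agent.yaml described above.
# Assumes top-level keys start at the beginning of a line.
check_agent_yaml() {
  file=$1
  # Keys the enrol step should have written.
  for key in host_id agent_token server_url; do
    grep -q "^${key}:" "$file" || { echo "missing ${key}" >&2; return 1; }
  done
  # Keys that must never be on disk in plaintext.
  for key in repo_url repo_password; do
    if grep -q "^${key}:" "$file"; then
      echo "unexpected ${key} in ${file}" >&2
      return 1
    fi
  done
}
```

Run it as `check_agent_yaml "$CONFIG" && echo "agent.yaml looks right"`.
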
## 7. Run the agent

In a second terminal:

```sh
./bin/restic-manager-agent -config /tmp/rm-smoke/agent/etc/agent.yaml
```

You should see, in order:

```
agent starting host_id=01H… server=http://127.0.0.1:8080 …
ws agent connected protocol_version=…
ws agent: repo credentials updated via config.update
```

That last line confirms slice 1 + 2 of P1-32/33: the server pushed
the encrypted creds, the agent decrypted, persisted to `secrets.enc`,
and is now ready to back up. `secrets.enc` should now exist with mode
0600. `agent.yaml` should now also contain a freshly-minted
`secrets_key:` (base64-encoded 32 bytes).

```sh
ls -l /tmp/rm-smoke/agent/var-lib/secrets.enc
```

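If you would rather assert the permission bits than read them off `ls -l`, something like the following works on Linux. `check_secret_mode` is a hypothetical helper, and it relies on GNU `stat`, so it will not work unchanged on BSD or macOS.

```sh
# Hypothetical check: the file exists and its permission bits are exactly 600.
# Uses GNU stat (-c '%a'), which prints the mode in octal.
check_secret_mode() {
  file=$1
  [ -f "$file" ] || { echo "$file does not exist" >&2; return 1; }
  mode=$(stat -c '%a' "$file")
  if [ "$mode" != "600" ]; then
    echo "$file has mode $mode, expected 600" >&2
    return 1
  fi
}
```

Run it as `check_secret_mode /tmp/rm-smoke/agent/var-lib/secrets.enc`.
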
## 8. Run a backup

Back in the first terminal:

```sh
JOB=$(curl -s -b /tmp/rm-smoke/cookies -X POST \
  "http://127.0.0.1:8080/api/hosts/$HOST_ID/jobs" \
  -H 'content-type: application/json' \
  -d '{"kind":"backup","args":["/etc/hostname","/etc/os-release"]}')
JOB_ID=$(echo "$JOB" | jq -r .job_id)
echo "job: $JOB_ID"
```

The agent terminal will show restic chugging through two tiny files;
the server terminal will log the lifecycle (`mark job started` /
`mark job finished` / `snapshots refreshed count=1`).

For a browser-driven version of the same flow, log in at
`http://127.0.0.1:8080/` and click **Run now** on the host row. The
button posts to `/hosts/{id}/run-backup`, and the response sets
`HX-Redirect` to the live log page, which subscribes to
`/api/jobs/{id}/stream` (P1-26) and tails `job.progress` / `log.stream`
until `job.finished` flips it to the final header.

## 9. Confirm the snapshot

```sh
curl -s -b /tmp/rm-smoke/cookies \
  "http://127.0.0.1:8080/api/hosts/$HOST_ID/snapshots" | jq
```

Expect one snapshot with the two paths and a non-zero `size_bytes`.

## 10. Verify the redacted credential view (sanity)

```sh
curl -s -b /tmp/rm-smoke/cookies \
  "http://127.0.0.1:8080/api/hosts/$HOST_ID/repo-credentials" | jq
```

Expect `{"repo_url":"rest:http://127.0.0.1:8100/smoke/","has_password":true}`.
The password is never returned over this endpoint.

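The same expectation can be written as a `jq` assertion. This is a sketch under one assumption: that a leaked password would show up under the `repo_password` key (the name used in the request bodies elsewhere in this runbook). `check_redacted` is a hypothetical helper name.

```sh
# Hypothetical check: read the JSON response on stdin and confirm that
# has_password is true and no repo_password field came back.
# jq -e exits non-zero when the expression evaluates to false or null.
check_redacted() {
  jq -e '.has_password == true and (has("repo_password") | not)' >/dev/null
}
```

Pipe the response into it, for example `curl -s -b /tmp/rm-smoke/cookies "http://127.0.0.1:8080/api/hosts/$HOST_ID/repo-credentials" | check_redacted && echo "redacted OK"`.
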
## 11. Edit creds + verify push-on-update

```sh
curl -s -b /tmp/rm-smoke/cookies -X PUT \
  "http://127.0.0.1:8080/api/hosts/$HOST_ID/repo-credentials" \
  -H 'content-type: application/json' \
  -d '{"repo_password":"new-smoke-pw"}'
```

The agent terminal should log `repo credentials updated via config.update`
again. (Backups will then fail until you also update the rest-server
auth, but the log line proves the push path is live.)

## Cleanup

```sh
docker compose -f /tmp/rm-smoke/compose.yaml down -v
rm -rf /tmp/rm-smoke
```

---

## What this runbook does NOT cover

These are intentionally out of scope for Phase 1; revisit when the
relevant tasks land:

- TLS termination at a reverse proxy (covered by P5-07 reference deployment)
- Append-only restic creds + separate prune credential (P2-06)
- Cancellation (P2)
- Schedule-driven backups (P2-01 onwards)
- Windows agent (P2-16/17)