2418e585db
The smoke runbook caught a real bug: ConsumeEnrollmentToken was
inserting into host_credentials (FK -> hosts) inside the same tx as
the token burn, but the host row didn't exist yet, because CreateHost
runs in the *next* statement. The agent saw a generic 401 with no
clue why.

Fix: drop the host_credentials insert from ConsumeEnrollmentToken;
the HTTP handler now does Consume -> CreateHost ->
SetHostCredentials. SetHostCredentials failure is logged loudly
but doesn't fail the enrol; the operator recovers via PUT
/api/hosts/{id}/repo-credentials.

Adds slog.Warn lines on both 401 paths in handleAgentEnroll so the
underlying cause is visible in server logs (the wire response stays
generic to avoid leaking which step failed).

Test: TestEnrollmentTransfersRepoCreds rewritten to mirror the new
order (consume -> create host -> SetHostCredentials).

Runbook (docs/e2e-smoke.md): rest-server moved off 8000 (commonly
in use); URLs use trailing slash on the rest path; clarified that
secrets_key is minted on first agent start, not at enrol time.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

# End-to-end smoke test (P1-34)

A runbook for verifying the Phase 1 happy path against a real
`restic/rest-server`. Run this on any Linux host with Docker; nothing
here touches your real Proxmox cluster or Unraid storage.

The test exercises:

1. Operator mints an enrollment token **with repo creds** (P1-32).
2. Agent enrols, server burns the token, host_credentials row lands.
3. Agent connects over WS, server pushes `config.update` containing
   the decrypted creds **before** the agent sees any command.
4. Agent persists creds into `secrets.enc` (P1-33).
5. Run-now backup against the live `restic/rest-server`.
6. `snapshots.report` updates the per-host projection.
7. `GET /api/hosts/{id}/snapshots` returns the new snapshot.

Total time: ~5 minutes on a warm machine.

---

## Prereqs

- Docker + Docker Compose
- `restic` v0.16+ on the host running the agent (the agent does **not**
  install it; that's a deliberate design choice, see spec §4.2)
- `curl`, `jq`

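
The Prereqs list can be checked up front. A minimal sketch, assuming the tools are expected on `PATH` under their usual names; `missing_tools` and `preflight` are hypothetical helper names, not part of restic-manager:

```sh
# Print the name of every tool from the argument list that is not on PATH.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Check the tools named in the Prereqs list (hypothetical wrapper).
preflight() {
  missing=$(missing_tools docker restic curl jq)
  if [ -n "$missing" ]; then
    # Unquoted on purpose so the names print on one line.
    echo "missing tools:" $missing >&2
    return 1
  fi
}
```

Run `preflight` before step 1. It only checks presence; the `restic` version still has to be confirmed by hand with `restic version`.
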
## Layout

Everything lives under `/tmp/rm-smoke/`. Nothing escapes it; remove the
directory to clean up.

```
/tmp/rm-smoke/
├── compose.yaml         # rest-server + control-plane
├── data/                # control-plane SQLite + secret key
│   └── agent-binaries/  # built agent binaries served by /agent/binary
├── rest/                # rest-server data volume
│   └── htpasswd
└── agent/               # this host plays the part of an endpoint
    ├── etc/             # → bind-mounted as /etc/restic-manager
    └── var-lib/         # → bind-mounted as /var/lib/restic-manager
```

## 1. Build the binaries

```sh
mkdir -p /tmp/rm-smoke/data/agent-binaries
cd ~/src/restic-manager
make build
cp bin/restic-manager-agent /tmp/rm-smoke/data/agent-binaries/restic-manager-agent-linux-amd64
```

The server's `/agent/binary?os=linux&arch=amd64` resolves to that path.

## 2. Compose the stack

`/tmp/rm-smoke/compose.yaml`:

```yaml
services:
  rest-server:
    image: restic/rest-server:latest
    restart: unless-stopped
    environment:
      - OPTIONS=--no-auth   # smoke-test only; real deploys use --append-only + htpasswd
    ports:
      # Mapped to 8100 because most dev boxes already have something
      # on 8000. Use any free port; just keep the URLs below in sync.
      - "127.0.0.1:8100:8000"
    volumes:
      - ./rest:/data

  control-plane:
    image: ghcr.io/dcglab/restic-manager:dev   # or build locally; see §1
    restart: unless-stopped
    ports:
      - "127.0.0.1:8080:8080"
    volumes:
      - ./data:/data
    environment:
      - RM_LISTEN=:8080
      - RM_DATA_DIR=/data
      - RM_BASE_URL=http://127.0.0.1:8080
      - RM_SECRET_KEY_FILE=/data/secret.key
      - RM_COOKIE_SECURE=false   # smoke-test only; we're on plain HTTP
```

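
If a different host port is chosen for rest-server, the URLs in later steps have to change with it. One way to keep them in sync is to define them once and reuse the variables. This is only a sketch: the variable names `REST_PORT`, `SERVER_URL`, and `REPO_URL` are hypothetical, and only the values come from this runbook.

```sh
# Hypothetical variable names; the values mirror the compose file above.
REST_PORT=8100                                        # host port mapped to rest-server's 8000
SERVER_URL="http://127.0.0.1:8080"                    # control-plane base URL
REPO_URL="rest:http://127.0.0.1:${REST_PORT}/smoke/"  # trailing slash kept on the rest path

echo "repo:   $REPO_URL"
echo "server: $SERVER_URL"
```

The later `curl` and `restic` commands can then use `$SERVER_URL` and `$REPO_URL` in place of the literal addresses.
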
For local-only smoke: skip the image and run the server straight from
the binary instead, pointing at `/tmp/rm-smoke/data`:

```sh
RM_LISTEN=:8080 RM_DATA_DIR=/tmp/rm-smoke/data \
  RM_SECRET_KEY_FILE=/tmp/rm-smoke/data/secret.key \
  RM_COOKIE_SECURE=false \
  ./bin/restic-manager-server
```

Either way, watch stderr for the **bootstrap token**, which is printed
on first run and used in the next step.

## 3. Bootstrap the admin account

```sh
BOOTSTRAP_TOKEN='<paste from server logs>'
curl -s -X POST http://127.0.0.1:8080/api/bootstrap \
  -H 'content-type: application/json' \
  -d "{\"token\":\"$BOOTSTRAP_TOKEN\",\"username\":\"admin\",\"password\":\"correct horse battery staple\"}"
```

## 4. Mint an enrollment token (with repo creds)

```sh
curl -s -c /tmp/rm-smoke/cookies -X POST http://127.0.0.1:8080/api/auth/login \
  -H 'content-type: application/json' \
  -d '{"username":"admin","password":"correct horse battery staple"}'

ENROLL=$(curl -s -b /tmp/rm-smoke/cookies -X POST http://127.0.0.1:8080/api/enrollment-tokens \
  -H 'content-type: application/json' \
  -d '{
    "hostname":"smoke-host",
    "repo_url":"rest:http://127.0.0.1:8100/smoke/",
    "repo_username":"",
    "repo_password":"smoke-pw"
  }')
TOKEN=$(echo "$ENROLL" | jq -r .token)
echo "token: $TOKEN"
```

If the server rejects the request with `missing_field`, `repo_url` or
`repo_password` is missing from the body; both are required (P1-32).

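
When the request is rejected, the response has no `token` field and `jq -r .token` prints the literal string `null`, which would otherwise be passed to the agent as if it were a token. A small guard, sketched here with the hypothetical helper name `require_token`, catches that before step 6:

```sh
# Fail early when the enrollment response carried no usable token.
# `jq -r .token` prints the literal string "null" when the key is absent.
require_token() {
  case "$1" in
    ""|null)
      echo "no enrollment token in response" >&2
      return 1
      ;;
  esac
}
```

Usage: `require_token "$TOKEN" || echo "$ENROLL"` prints the raw server response whenever the token is missing, which is where a `missing_field` error would show up.
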
## 5. Initialise the rest-server repo

`restic/rest-server` will lazy-create the path on first write, but
restic itself wants the repo initialised:

```sh
RESTIC_PASSWORD=smoke-pw \
  restic -r rest:http://127.0.0.1:8100/smoke/ init
```

## 6. Pretend to be a fresh endpoint

The agent will write `agent.yaml` + `secrets.enc` under
`/tmp/rm-smoke/agent/etc` and `/tmp/rm-smoke/agent/var-lib`. We point
both at those dirs to keep the smoke run isolated from your real
`/etc/restic-manager`.

```sh
mkdir -p /tmp/rm-smoke/agent/etc /tmp/rm-smoke/agent/var-lib
CONFIG=/tmp/rm-smoke/agent/etc/agent.yaml

# Pre-write the secrets path so we don't hit the system default.
cat > "$CONFIG" <<EOF
secrets_path: /tmp/rm-smoke/agent/var-lib/secrets.enc
EOF

# Enroll. This call talks to the server, receives the persistent
# bearer, and writes server_url/host_id/agent_token back into
# agent.yaml. secrets_key is not written here; it is minted on the
# first normal agent start (step 7). secrets.enc is empty until the
# first config.update push lands.
./bin/restic-manager-agent \
  -config "$CONFIG" \
  -enroll-server http://127.0.0.1:8080 \
  -enroll-token "$TOKEN"

# Read off the host_id for later steps.
HOST_ID=$(awk '/^host_id:/ {print $2}' "$CONFIG" | tr -d '"')
echo "host id: $HOST_ID"
```

After enrolment, `agent.yaml` should contain `host_id:` (a ULID),
`agent_token:`, and `server_url:`. It will **not** contain
`secrets_key:` yet; that's minted on the first non-enroll start
of the agent (next step). It should **not** contain `repo_url:`
or `repo_password:` (those never appear in plaintext on disk).

```sh
cat "$CONFIG"
```

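
The expectations for `agent.yaml` after enrolment can be checked mechanically instead of by eye. This is a rough sketch, assuming each key sits at the start of its own line in the file; `check_agent_yaml` is a hypothetical helper name:

```sh
# check_agent_yaml FILE
# Succeeds when FILE has the keys enrolment should write and none of the
# keys that must not be present at this point.
check_agent_yaml() {
  file=$1
  status=0
  # Keys that enrolment is expected to have written.
  for key in host_id agent_token server_url; do
    grep -q "^${key}:" "$file" || { echo "missing: $key" >&2; status=1; }
  done
  # Keys that must not be in the file right after enrolment.
  for key in secrets_key repo_url repo_password; do
    if grep -q "^${key}:" "$file"; then
      echo "unexpected: $key" >&2
      status=1
    fi
  done
  return "$status"
}
```

Usage: `check_agent_yaml "$CONFIG" && echo "agent.yaml looks right"`. After step 7 the `secrets_key` check is expected to trip, since the key is minted on the first normal start.
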
## 7. Run the agent

In a second terminal:

```sh
./bin/restic-manager-agent -config /tmp/rm-smoke/agent/etc/agent.yaml
```

You should see, in order:

```
agent starting host_id=01H… server=http://127.0.0.1:8080 …
ws agent connected protocol_version=…
ws agent: repo credentials updated via config.update
```

That last line confirms slice 1 + 2 of P1-32/33: the server pushed
the encrypted creds, and the agent decrypted them, persisted them to
`secrets.enc`, and is now ready to back up. `secrets.enc` should now
exist with mode 0600, and `agent.yaml` should now also contain a
freshly minted `secrets_key:` (base64-encoded 32 bytes).

```sh
ls -l /tmp/rm-smoke/agent/var-lib/secrets.enc
```

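
Both post-start expectations (file mode 0600 and a `secrets_key:` entry) can be asserted in one go. A sketch under two assumptions: `stat -c` is the GNU coreutils form, which fits the Linux host this runbook targets, and `secrets_key:` starts its own line in `agent.yaml`. The helper name `check_secrets` is hypothetical.

```sh
# check_secrets SECRETS_FILE CONFIG_FILE
# Succeeds when SECRETS_FILE has mode 600 and CONFIG_FILE has a secrets_key entry.
check_secrets() {
  secrets=$1
  config=$2
  # GNU stat prints the octal permission bits with %a.
  mode=$(stat -c '%a' "$secrets") || return 1
  [ "$mode" = "600" ] || { echo "secrets.enc mode is $mode, want 600" >&2; return 1; }
  grep -q '^secrets_key:' "$config" || { echo "secrets_key missing from agent.yaml" >&2; return 1; }
}
```

Usage: `check_secrets /tmp/rm-smoke/agent/var-lib/secrets.enc /tmp/rm-smoke/agent/etc/agent.yaml`.
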
## 8. Run a backup

Back in the first terminal:

```sh
JOB=$(curl -s -b /tmp/rm-smoke/cookies -X POST \
  "http://127.0.0.1:8080/api/hosts/$HOST_ID/jobs" \
  -H 'content-type: application/json' \
  -d '{"kind":"backup","args":["/etc/hostname","/etc/os-release"]}')
JOB_ID=$(echo "$JOB" | jq -r .job_id)
echo "job: $JOB_ID"
```

The agent terminal will show restic chugging through two tiny files;
the server terminal will log the lifecycle (`mark job started` /
`mark job finished` / `snapshots refreshed count=1`).

## 9. Confirm the snapshot

```sh
curl -s -b /tmp/rm-smoke/cookies \
  "http://127.0.0.1:8080/api/hosts/$HOST_ID/snapshots" | jq
```

Expect one snapshot with the two paths and a non-zero `size_bytes`.

## 10. Verify the redacted credential view (sanity)

```sh
curl -s -b /tmp/rm-smoke/cookies \
  "http://127.0.0.1:8080/api/hosts/$HOST_ID/repo-credentials" | jq
```

Expect `{"repo_url":"rest:http://127.0.0.1:8100/smoke/","has_password":true}`.
The password is never returned over this endpoint.

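
The redaction can also be asserted rather than eyeballed. The sketch below takes the response body as a string and checks it against the expected shape quoted in this step; `assert_redacted` is a hypothetical helper, and it relies on simple string matching rather than real JSON parsing.

```sh
# assert_redacted JSON SECRET
# Succeeds when JSON reports has_password:true and contains neither a
# repo_password field nor the SECRET value itself.
assert_redacted() {
  body=$1
  secret=$2
  case "$body" in
    *'"repo_password"'*|*"$secret"*)
      echo "credential leaked in response" >&2
      return 1
      ;;
  esac
  # Strip whitespace so pretty-printed and compact JSON both match.
  compact=$(printf '%s' "$body" | tr -d ' \t\n')
  case "$compact" in
    *'"has_password":true'*) return 0 ;;
    *) echo "has_password is not true" >&2; return 1 ;;
  esac
}
```

Usage: capture the response with `BODY=$(curl -s -b /tmp/rm-smoke/cookies "http://127.0.0.1:8080/api/hosts/$HOST_ID/repo-credentials")` and run `assert_redacted "$BODY" smoke-pw`.
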
## 11. Edit creds + verify push-on-update

```sh
curl -s -b /tmp/rm-smoke/cookies -X PUT \
  "http://127.0.0.1:8080/api/hosts/$HOST_ID/repo-credentials" \
  -H 'content-type: application/json' \
  -d '{"repo_password":"new-smoke-pw"}'
```

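The agent terminal should log `repo credentials updated via config.update`
again. (Backups will then fail, because the repo initialised in step 5
still uses `smoke-pw`; they recover once the repository password is
changed to match, for example with `restic key passwd`. The log line
alone proves the push path is live.)
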
## Cleanup

```sh
docker compose -f /tmp/rm-smoke/compose.yaml down -v
rm -rf /tmp/rm-smoke
```

---

## What this runbook does NOT cover

These are intentionally out of scope for Phase 1; revisit when the
relevant tasks land:

- TLS termination at a reverse proxy (covered by P5-07 reference deployment)
- Append-only restic creds + separate prune credential (P2-06)
- Live job log streaming in a browser (P1-21 remainder; needs the UI)
- Cancellation (P2)
- Schedule-driven backups (P2-01 onwards)
- Windows agent (P2-16/17)