restic-manager/docs/e2e-smoke.md
steve 44feb708bc fix: enrollment FK race + log-when-rejected; runbook fixes from dry-run
The smoke runbook caught a real bug: ConsumeEnrollmentToken was
inserting into host_credentials (FK -> hosts) inside the same tx as
the token burn, but the host row didn't exist yet — CreateHost
runs in the *next* statement. The agent saw a generic 401 with no
clue why.

Fix: drop the host_credentials insert from ConsumeEnrollmentToken;
the HTTP handler now does Consume -> CreateHost ->
SetHostCredentials. SetHostCredentials failure is logged loudly
but doesn't fail the enrol — operator recovers via PUT
/api/hosts/{id}/repo-credentials.

Adds slog.Warn lines on both 401 paths in handleAgentEnroll so the
underlying cause is visible in server logs (the wire response stays
generic to avoid leaking which step failed).

Test: TestEnrollmentTransfersRepoCreds rewritten to mirror the new
order (consume -> create host -> SetHostCredentials).

Runbook (docs/e2e-smoke.md): rest-server moved off 8000 (commonly
in use); URLs use trailing slash on the rest path; clarified that
secrets_key is minted on first agent start, not at enrol time.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-01 14:01:59 +01:00

End-to-end smoke test (P1-34)

A runbook for verifying the Phase 1 happy path against a real restic/rest-server. Run this on any Linux host with Docker; nothing here touches your real Proxmox cluster or Unraid storage.

The test exercises:

  1. Operator mints an enrollment token with repo creds (P1-32).
  2. Agent enrols, server burns the token, host_credentials row lands.
  3. Agent connects over WS, server pushes config.update containing the decrypted creds before the agent sees any command.
  4. Agent persists creds into secrets.enc (P1-33).
  5. Run-now backup against the live restic/rest-server.
  6. snapshots.report updates the per-host projection.
  7. GET /api/hosts/{id}/snapshots returns the new snapshot.

Total time: ~5 minutes on a warm machine.


Prereqs

  • Docker + Docker Compose
  • restic v0.16+ on the host running the agent (the agent does not install it; that's a deliberate design choice — see spec §4.2)
  • curl, jq
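
The prerequisites above can be checked up front. This is a minimal sketch, not part of restic-manager: the function names missing_tools, version_ge and preflight are illustrative, and it assumes restic version prints the version number as its second word.

```sh
# Print the name of every tool from the argument list that is not on PATH.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || echo "$tool"
  done
}

# Succeed when version $1 is at least version $2 (compares major.minor only).
version_ge() {
  have_major=${1%%.*}; have_rest=${1#*.}; have_minor=${have_rest%%.*}
  want_major=${2%%.*}; want_rest=${2#*.}; want_minor=${want_rest%%.*}
  [ "$have_major" -gt "$want_major" ] ||
    { [ "$have_major" -eq "$want_major" ] && [ "$have_minor" -ge "$want_minor" ]; }
}

# Check the tools this runbook relies on, then the restic version.
preflight() {
  missing=$(missing_tools docker restic curl jq)
  if [ -n "$missing" ]; then
    echo "missing tools:" $missing >&2
    return 1
  fi
  docker compose version >/dev/null 2>&1 ||
    { echo "docker compose plugin not found" >&2; return 1; }
  have=$(restic version | awk '{print $2}')
  version_ge "$have" 0.16 ||
    { echo "restic $have is older than 0.16" >&2; return 1; }
}
```

Run preflight once before step 1; it prints what is missing and returns non-zero.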

Layout

Everything lives under /tmp/rm-smoke/. Nothing escapes it; remove the directory to clean up.

/tmp/rm-smoke/
├── compose.yaml            # rest-server + control-plane
├── data/                   # control-plane SQLite + secret key
│   └── agent-binaries/     # built agent binaries served by /agent/binary
├── rest/                   # rest-server data volume
│   └── htpasswd
└── agent/                  # this host plays the part of an endpoint
    ├── etc/                # → bind-mounted as /etc/restic-manager
    └── var-lib/            # → bind-mounted as /var/lib/restic-manager
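
The tree above can be created in one go before anything else runs. The later steps create data/agent-binaries and agent/ themselves; creating rest/ up front is an optional extra, on the assumption that Docker would otherwise create the bind-mount source itself (typically owned by root). make_layout is an illustrative helper name.

```sh
# Create the smoke-test tree under the given root (default /tmp/rm-smoke).
make_layout() {
  root=${1:-/tmp/rm-smoke}
  mkdir -p \
    "$root/data/agent-binaries" \
    "$root/rest" \
    "$root/agent/etc" \
    "$root/agent/var-lib"
}
```

Calling make_layout with no argument uses /tmp/rm-smoke, matching the paths in the rest of this runbook.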

1. Build the binaries

mkdir -p /tmp/rm-smoke/data/agent-binaries
cd ~/src/restic-manager
make build
cp bin/restic-manager-agent /tmp/rm-smoke/data/agent-binaries/restic-manager-agent-linux-amd64

The server's /agent/binary?os=linux&arch=amd64 resolves to that path.
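
For hosts that are not linux/amd64, the target file name can be derived rather than typed. This sketch assumes the server uses the pattern restic-manager-agent-OS-ARCH for every platform, which is extrapolated from the single linux/amd64 example above and not confirmed; all three function names are illustrative.

```sh
# Map `uname -m` output to the Go-style architecture name.
go_arch() {
  case "$1" in
    x86_64|amd64)  echo amd64 ;;
    aarch64|arm64) echo arm64 ;;
    *)             echo "$1" ;;
  esac
}

# File name the server is assumed to look for under agent-binaries/.
agent_binary_name() {
  echo "restic-manager-agent-$1-$2"
}

# Name for the machine this runs on, for use as the cp target in step 1.
local_agent_binary_name() {
  os=$(uname -s | tr '[:upper:]' '[:lower:]')
  agent_binary_name "$os" "$(go_arch "$(uname -m)")"
}
```

On the host used in this runbook, local_agent_binary_name should yield the same name as the cp command above.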

2. Compose the stack

/tmp/rm-smoke/compose.yaml:

services:
  rest-server:
    image: restic/rest-server:latest
    restart: unless-stopped
    environment:
      - OPTIONS=--no-auth   # smoke-test only; real deploys use --append-only + htpasswd
    ports:
      # Mapped to 8100 because most dev boxes already have something
      # on 8000. Use any free port; just keep the URLs below in sync.
      - "127.0.0.1:8100:8000"
    volumes:
      - ./rest:/data

  control-plane:
    image: ghcr.io/dcglab/restic-manager:dev   # or build locally; see §1
    restart: unless-stopped
    ports:
      - "127.0.0.1:8080:8080"
    volumes:
      - ./data:/data
    environment:
      - RM_LISTEN=:8080
      - RM_DATA_DIR=/data
      - RM_BASE_URL=http://127.0.0.1:8080
      - RM_SECRET_KEY_FILE=/data/secret.key
      - RM_COOKIE_SECURE=false   # smoke-test only — we're on plain HTTP

For a local-only smoke run, skip the control-plane image and run the server straight from the binary built in step 1, pointing it at /tmp/rm-smoke/data:

RM_LISTEN=:8080 RM_DATA_DIR=/tmp/rm-smoke/data \
RM_SECRET_KEY_FILE=/tmp/rm-smoke/data/secret.key \
RM_COOKIE_SECURE=false \
./bin/restic-manager-server

Either way, watch stderr for the bootstrap token — printed on first run, used in the next step.
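
Bringing the stack up and waiting for it to answer can be scripted. This is a sketch under assumptions: retry, http_up and start_stack are illustrative helpers, and "up" here only means that something answers HTTP on the published port, whatever the status code.

```sh
# Run a command up to $1 times, sleeping $2 seconds between attempts.
retry() {
  attempts=$1; delay=$2; shift 2
  n=1
  while ! "$@"; do
    [ "$n" -ge "$attempts" ] && return 1
    n=$((n + 1))
    sleep "$delay"
  done
}

# Succeed once anything answers HTTP at the given URL (any status code).
http_up() {
  curl -s -o /dev/null "$1"
}

# Start the compose stack and wait for both published ports.
start_stack() {
  docker compose -f /tmp/rm-smoke/compose.yaml up -d &&
    retry 30 1 http_up http://127.0.0.1:8100/ &&
    retry 30 1 http_up http://127.0.0.1:8080/
}
```

With the compose variant, the control-plane's stderr (and so the bootstrap token) should be visible via docker compose -f /tmp/rm-smoke/compose.yaml logs control-plane.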

3. Bootstrap the admin account

BOOTSTRAP_TOKEN='<paste from server logs>'
curl -s -X POST http://127.0.0.1:8080/api/bootstrap \
  -H 'content-type: application/json' \
  -d "{\"token\":\"$BOOTSTRAP_TOKEN\",\"username\":\"admin\",\"password\":\"correct horse battery staple\"}"

4. Mint an enrollment token (with repo creds)

curl -s -c /tmp/rm-smoke/cookies -X POST http://127.0.0.1:8080/api/auth/login \
  -H 'content-type: application/json' \
  -d '{"username":"admin","password":"correct horse battery staple"}'

ENROLL=$(curl -s -b /tmp/rm-smoke/cookies -X POST http://127.0.0.1:8080/api/enrollment-tokens \
  -H 'content-type: application/json' \
  -d '{
    "hostname":"smoke-host",
    "repo_url":"rest:http://127.0.0.1:8100/smoke/",
    "repo_username":"",
    "repo_password":"smoke-pw"
  }')
TOKEN=$(echo "$ENROLL" | jq -r .token)
echo "token: $TOKEN"

If the server rejects with missing_field, you forgot repo_url/repo_password — both are required (P1-32).
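
A rejected request leaves TOKEN set to the literal string null, because jq -r prints null for an absent key. A small guard catches that before the enrol step; require_value is an illustrative helper, not part of the project.

```sh
# Fail with a message when a value extracted by `jq -r` is empty or "null".
require_value() {
  name=$1; value=$2
  if [ -z "$value" ] || [ "$value" = "null" ]; then
    echo "$name is missing; check the server response" >&2
    return 1
  fi
}
```

For example, require_value TOKEN "$TOKEN" || echo "$ENROLL" prints the raw server response when no token came back.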

5. Initialise the rest-server repo

restic/rest-server will lazy-create the path on first write, but restic itself wants the repo initialised:

RESTIC_PASSWORD=smoke-pw \
restic -r rest:http://127.0.0.1:8100/smoke/ init

6. Pretend to be a fresh endpoint

The agent will write agent.yaml under /tmp/rm-smoke/agent/etc and secrets.enc under /tmp/rm-smoke/agent/var-lib. Pointing it at those directories keeps the smoke run isolated from your real /etc/restic-manager.

mkdir -p /tmp/rm-smoke/agent/etc /tmp/rm-smoke/agent/var-lib
CONFIG=/tmp/rm-smoke/agent/etc/agent.yaml

# Pre-write the secrets path so we don't hit the system default.
cat > "$CONFIG" <<EOF
secrets_path: /tmp/rm-smoke/agent/var-lib/secrets.enc
EOF

# Enroll. This call talks to the server, receives the persistent
# bearer, and writes server_url/host_id/agent_token back into
# agent.yaml. secrets_key is not written here; it is minted on the
# first normal start (step 7). secrets.enc is empty until the first
# config.update push lands.
./bin/restic-manager-agent \
  -config "$CONFIG" \
  -enroll-server http://127.0.0.1:8080 \
  -enroll-token "$TOKEN"

# Read off the host_id for later steps.
HOST_ID=$(awk '$1 == "host_id:" {print $2}' "$CONFIG" | tr -d '"')
echo "host id: $HOST_ID"

After enrolment, agent.yaml should contain host_id: (a ULID), agent_token:, and server_url:. It will not contain secrets_key: yet — that's minted on the first non-enroll start of the agent (next step). It should not contain repo_url: or repo_password: (those never appear in plaintext on disk).

cat "$CONFIG"
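
The expectations above can be asserted instead of eyeballed. This sketch assumes every key sits at the top level of agent.yaml (at the start of a line); check_enrolled_config is an illustrative name.

```sh
# Verify agent.yaml after enrolment: required keys present, secrets absent.
check_enrolled_config() {
  file=$1
  for key in host_id agent_token server_url; do
    grep -q "^$key:" "$file" || { echo "missing $key" >&2; return 1; }
  done
  for key in secrets_key repo_url repo_password; do
    if grep -q "^$key:" "$file"; then
      echo "unexpected $key" >&2
      return 1
    fi
  done
}
```

Run check_enrolled_config "$CONFIG" right after the enrol command. Once the agent has started in step 7, secrets_key is expected, so this check only applies before that point.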

7. Run the agent

In a second terminal:

./bin/restic-manager-agent -config /tmp/rm-smoke/agent/etc/agent.yaml

You should see, in order:

agent starting host_id=01H… server=http://127.0.0.1:8080 …
ws agent connected protocol_version=…
ws agent: repo credentials updated via config.update

That last line confirms slices 1 and 2 of P1-32/33: the server pushed the creds in config.update, and the agent persisted them to secrets.enc and is now ready to back up. secrets.enc should now exist with mode 0600, and agent.yaml should now also contain a freshly minted secrets_key: (base64-encoded 32 bytes).

ls -l /tmp/rm-smoke/agent/var-lib/secrets.enc
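
The 0600 expectation can be checked without reading ls output. A minimal sketch using POSIX find; is_mode_600 is an illustrative helper name.

```sh
# Succeed when the argument is a regular file with permission bits exactly 0600.
is_mode_600() {
  [ -n "$(find "$1" -prune -type f -perm 600 2>/dev/null)" ]
}
```

For example, is_mode_600 /tmp/rm-smoke/agent/var-lib/secrets.enc && echo ok. The freshly minted key can be confirmed the same way with grep -q '^secrets_key:' /tmp/rm-smoke/agent/etc/agent.yaml, assuming the key is written at the top level of the file.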

8. Run a backup

Back in the first terminal:

JOB=$(curl -s -b /tmp/rm-smoke/cookies -X POST \
  "http://127.0.0.1:8080/api/hosts/$HOST_ID/jobs" \
  -H 'content-type: application/json' \
  -d '{"kind":"backup","args":["/etc/hostname","/etc/os-release"]}')
JOB_ID=$(echo "$JOB" | jq -r .job_id)
echo "job: $JOB_ID"

The agent terminal will show restic chugging through two tiny files; the server terminal will log the lifecycle (mark job started / mark job finished / snapshots refreshed count=1).

9. Confirm the snapshot

curl -s -b /tmp/rm-smoke/cookies \
  "http://127.0.0.1:8080/api/hosts/$HOST_ID/snapshots" | jq

Expect one snapshot with the two paths and a non-zero size_bytes.

10. Verify the redacted credential view (sanity)

curl -s -b /tmp/rm-smoke/cookies \
  "http://127.0.0.1:8080/api/hosts/$HOST_ID/repo-credentials" | jq

Expect {"repo_url":"rest:http://127.0.0.1:8100/smoke/","has_password":true}. The password is never returned over this endpoint.
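
The same expectation as a scripted check. This sketch matches on compact JSON text (as produced by jq -c .) and takes the password to look for as an argument; check_redacted is an illustrative helper, not a project command.

```sh
# Succeed when a response body reports has_password and does not leak the secret.
check_redacted() {
  body=$1; secret=$2
  case "$body" in
    *"$secret"*) echo "password leaked in response" >&2; return 1 ;;
  esac
  case "$body" in
    *'"has_password":true'*) return 0 ;;
    *) echo "has_password is not true" >&2; return 1 ;;
  esac
}
```

Pipe the curl output above through jq -c . into a variable, then call check_redacted "$BODY" smoke-pw.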

11. Edit creds + verify push-on-update

curl -s -b /tmp/rm-smoke/cookies -X PUT \
  "http://127.0.0.1:8080/api/hosts/$HOST_ID/repo-credentials" \
  -H 'content-type: application/json' \
  -d '{"repo_password":"new-smoke-pw"}'

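The agent terminal should log repo credentials updated via config.update again, which proves the push path is live. (Backups will then fail until the repository password is changed to match, for example with restic key passwd; in this --no-auth setup the mismatch is with the restic repository password set at init in step 5, not with rest-server auth.)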

Cleanup

docker compose -f /tmp/rm-smoke/compose.yaml down -v
rm -rf /tmp/rm-smoke

What this runbook does NOT cover

These are intentionally out of scope for Phase 1; revisit when the relevant tasks land:

  • TLS termination at a reverse proxy (covered by P5-07 reference deployment)
  • Append-only restic creds + separate prune credential (P2-06)
  • Live job log streaming in a browser (P1-21 remainder; needs the UI)
  • Cancellation (P2)
  • Schedule-driven backups (P2-01 onwards)
  • Windows agent (P2-16/17)