Added new AI focused document for host onboarding
CI / Test (rest) (pull_request) Successful in 7s
CI / Test (store) (pull_request) Successful in 9s
CI / Lint (pull_request) Successful in 18s
CI / Build (windows/amd64) (pull_request) Successful in 15s
CI / Build (linux/amd64) (pull_request) Successful in 7s
CI / Build (linux/arm64) (pull_request) Successful in 51s
CI / Test (server-http) (pull_request) Successful in 1m29s
e2e / Playwright vs docker-compose (pull_request) Successful in 1m27s
@@ -0,0 +1,249 @@
# Onboarding a new host: agent instructions

How an automation agent (with a username and password for the
restic-manager server) brings a new host fully online.

The flow involves two roles:

- **Controller side**: the agent calls JSON APIs on the
  restic-manager server. It needs network reach to the server, plus
  a username and password.
- **Target side**: the host being onboarded runs the install
  script, which calls back to the server with the one-time token.

If the agent is *both* sides (e.g. it can SSH into the target),
it does steps 1–2 against the server and steps 3–4 against the
target. If the agent only controls the server, it stops at
step 2 and hands the install snippet to whoever owns the target.

---

## Conventions

- Base URL: `$RM_SERVER` (e.g. `https://restic.lab.example`).
- Session cookie jar: persist `rm_session` between calls.
- All request/response bodies are JSON unless noted.
- On any non-2xx status, the response body is
  `{"code": "...", "message": "..."}`.
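These conventions can be wrapped in one small helper. The following is a minimal sketch, not something shipped with restic-manager: `rm_api` and the `RM_JAR` variable (path of the cookie jar) are hypothetical names, and it assumes `curl` is installed. It sends and updates the cookie jar on every call, prints the body on a 2xx status, and otherwise prints the error body to stderr.

```bash
# rm_api METHOD PATH [JSON_BODY]   (hypothetical helper)
# Needs RM_SERVER (base URL) and RM_JAR (cookie jar path) to be set.
rm_api() {
  local method="$1" path="$2" body="${3:-}" tmp status
  tmp=$(mktemp)
  # Read and update the cookie jar so rm_session persists between calls.
  set -- -sS -o "$tmp" -w '%{http_code}' -X "$method" -b "$RM_JAR" -c "$RM_JAR"
  if [ -n "$body" ]; then
    set -- "$@" -H 'Content-Type: application/json' -d "$body"
  fi
  # curl prints only the status code; the body goes to the temp file.
  status=$(curl "$@" "$RM_SERVER$path") || status=000
  case "$status" in
    2??) cat "$tmp"; rm -f "$tmp" ;;
    *)   echo "HTTP $status: $(cat "$tmp")" >&2   # body is {"code": ..., "message": ...}
         rm -f "$tmp"; return 1 ;;
  esac
}
```

Typical use would be `rm_api GET /api/hosts`, or `rm_api POST /api/auth/login "$body"` with a JSON body built beforehand.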

---

## 1. Login

```
POST $RM_SERVER/api/auth/login
Content-Type: application/json

{"username": "...", "password": "..."}
```

→ 200 with `{"user_id": "...", "role": "..."}` and a
`Set-Cookie: rm_session=...` header (HttpOnly, 24h TTL). Persist the
cookie and reuse it on every subsequent call.

Required role for the next step: **operator** or **admin**.
A viewer-only login can read but cannot mint tokens.

The session expires after 24 hours. On a 401 from a later call, log in again.
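Because the session lapses after 24 hours, a long-running agent may want to retry once after a 401. The sketch below shows one way to do that under stated assumptions: `rm_login`, `rm_get`, `RM_JAR`, `RM_USER` and `RM_PASS` are illustrative names, and `curl` and `jq` are assumed to be installed.

```bash
# Hypothetical helpers; need RM_SERVER, RM_USER, RM_PASS and RM_JAR (cookie jar path).
rm_login() {
  # jq builds the body so special characters in the password stay valid JSON.
  curl -fsS -c "$RM_JAR" -H 'Content-Type: application/json' \
    -d "$(jq -nc --arg u "$RM_USER" --arg p "$RM_PASS" '{username:$u, password:$p}')" \
    "$RM_SERVER/api/auth/login" >/dev/null
}

# rm_get PATH: GET with the session cookie; on a 401, log in again and retry once.
rm_get() {
  local path="$1" tmp status attempt
  tmp=$(mktemp)
  for attempt in 1 2; do
    status=$(curl -sS -o "$tmp" -w '%{http_code}' -b "$RM_JAR" "$RM_SERVER$path") || status=000
    if [ "$status" = 401 ] && [ "$attempt" = 1 ]; then
      rm_login || break      # login itself failed: give up rather than loop
      continue
    fi
    break
  done
  case "$status" in
    2??) cat "$tmp"; rm -f "$tmp" ;;
    *)   echo "HTTP $status: $(cat "$tmp")" >&2; rm -f "$tmp"; return 1 ;;
  esac
}
```

Retrying only once keeps a viewer-role or bad-password login from turning into an endless loop.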

---

## 2. Mint an enrolment token

```
POST $RM_SERVER/api/enrollment-tokens
Cookie: rm_session=...
Content-Type: application/json

{
  "hostname": "newhost.example",
  "tags": ["prod", "london"],                       // optional
  "repo_url": "rest:https://rest.example/newhost",
  "repo_username": "...",                           // optional, for rest-server / S3
  "repo_password": "...",                           // optional
  "initial_paths": ["/etc", "/home", "/var/lib"]    // optional; default source group
}
```

The `//` annotations are documentation only; JSON has no comment
syntax, so leave them out of a real request body.

→ 200 with:

```json
{ "token": "<RAW_ONE_TIME_TOKEN>", "expires_at": "2026-05-09T..." }
```

**Capture `token` immediately: the server only stores its hash
and will never return the raw value again.** The TTL is 1 hour.

The repo creds you provided are encrypted under the token hash
and pre-attached to the host. The restic-manager agent on the target
fetches and stores them at enrol time; you will not need to push
them again.

If you lose the token before the install runs, mint a new one
(the existing one becomes irrelevant; you can leave it to expire
or revoke it via the UI).
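Since the raw token is returned exactly once, it can help to persist it in the same step that mints it. This is a minimal sketch with hypothetical names (`mint_token`, `RM_JAR`), assuming `curl` and `jq`; it writes the token to a file and prints `expires_at` so the caller knows the deadline.

```bash
# mint_token BODY_JSON TOKEN_FILE   (hypothetical helper)
# Needs RM_SERVER and RM_JAR (cookie jar holding rm_session).
mint_token() {
  local body="$1" out="$2" resp token
  resp=$(curl -fsS -b "$RM_JAR" -H 'Content-Type: application/json' \
           -d "$body" "$RM_SERVER/api/enrollment-tokens") || return 1
  token=$(printf '%s' "$resp" | jq -r '.token // empty')
  [ -n "$token" ] || { echo "response carried no token" >&2; return 1; }
  # umask 077 makes a newly created file owner-only; an existing file keeps its mode.
  ( umask 077; printf '%s\n' "$token" > "$out" )
  printf '%s' "$resp" | jq -r '.expires_at'
}
```

A call such as `mint_token "$body" ./enrol.token` leaves the token on disk for step 3; delete the file once the install has run.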

---

## 3. Install on the target host

The install script is hosted by the server itself. Run it on the
target:

### Linux

```
curl -fsSL "$RM_SERVER/install/install.sh" | \
  sudo RM_SERVER="$RM_SERVER" RM_TOKEN='<RAW_ONE_TIME_TOKEN>' bash
```

What it does, end-to-end:

1. detects arch (amd64 / arm64)
2. downloads `$RM_SERVER/agent/binary?os=linux&arch=<arch>` to
   `/usr/local/bin/restic-manager-agent`
3. creates `/etc/restic-manager/` and `/var/lib/restic-manager/`
   (root:root, 0700)
4. calls `POST /api/agents/enroll` with the token; the server returns
   the persistent agent bearer token and `host_id`, which are written to
   `/etc/restic-manager/agent.env`
5. installs the systemd unit, then runs `daemon-reload` and `enable --now`
6. surfaces any pre-existing restic cron/timer entries so the
   operator can decide whether to disable them (the script does
   *not* touch them automatically)

The script is idempotent. Re-running it on an already-enrolled host
is a no-op unless `RM_FORCE_REENROLL=1` is set.

The agent runs as **root** by design: a fleet backup needs to
read every file on the system. See
`deploy/install/restic-manager-agent.service` for the rationale.
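When the automation agent also controls the target over SSH, the Linux install can be driven remotely. The sketch below is one possible shape, not part of restic-manager: `install_cmd` and `install_over_ssh` are hypothetical names, and it assumes key-based SSH, passwordless `sudo` on the target, and values that contain no single quote.

```bash
# install_cmd SERVER TOKEN: prints the one-liner to run on the target.
# Values are wrapped in single quotes, so they must not contain a single quote.
install_cmd() {
  printf "curl -fsSL '%s/install/install.sh' | sudo RM_SERVER='%s' RM_TOKEN='%s' bash\n" \
    "$1" "$1" "$2"
}

# install_over_ssh SSH_TARGET: runs the installer remotely (hypothetical wrapper).
# Needs RM_SERVER and RM_TOKEN; assumes key-based SSH and passwordless sudo.
install_over_ssh() {
  ssh "$1" "$(install_cmd "$RM_SERVER" "$RM_TOKEN")"
}
```

Keeping the command builder separate from the SSH call means the same string can be handed to a human operator when the agent has no access to the target.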

### Windows

```
# Run in an elevated PowerShell; the script reads both values from the environment.
$env:RM_SERVER = "https://restic.lab.example"
$env:RM_TOKEN  = "<RAW_ONE_TIME_TOKEN>"
iwr "$env:RM_SERVER/install/install.ps1" -UseBasicParsing | iex
# (or download the script and run it)
```

Same flow, but it lays down a Windows service instead of a systemd unit.

### Manual / non-script enrolment

If the install script can't be used, the wire-level enrol call is:

```
POST $RM_SERVER/api/agents/enroll
Content-Type: application/json

{
  "token": "<RAW_ONE_TIME_TOKEN>",
  "hostname": "newhost.example",
  "os": "linux",              // linux | windows
  "arch": "amd64",            // amd64 | arm64
  "agent_version": "...",
  "restic_version": "..."
}
```

→ 200 with
`{"host_id": "...", "agent_token": "...", "cert_pin_sha256": "..."}`.

The `agent_token` goes into `/etc/restic-manager/agent.env` as
`RM_AGENT_TOKEN=...`; subsequent agent → server traffic uses
`Authorization: Bearer $RM_AGENT_TOKEN`.
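For a manual enrolment, the last step is writing the bearer token to disk. A minimal sketch follows; `write_agent_env` and `enroll.json` are hypothetical names, and only `RM_AGENT_TOKEN` is written because that is the only key this document names (the install script may store more).

```bash
# write_agent_env AGENT_TOKEN [ENV_FILE]   (hypothetical helper)
# Writes only RM_AGENT_TOKEN; other keys the install script may add are not covered.
write_agent_env() {
  local token="$1" file="${2:-/etc/restic-manager/agent.env}"
  [ -n "$token" ] || { echo "empty agent token" >&2; return 1; }
  # umask 077 makes a newly created file owner-only (0600).
  ( umask 077; printf 'RM_AGENT_TOKEN=%s\n' "$token" > "$file" )
}

# Usage, with the enrol response saved in enroll.json:
#   write_agent_env "$(jq -r .agent_token enroll.json)"
```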

---

## 4. Verify the host is healthy

Poll until both conditions below are true. Cap the wait at about 5 minutes.

```
GET $RM_SERVER/api/hosts
Cookie: rm_session=...
```

→ array of host objects. Find the one with the matching hostname
and check:

- `"status": "online"`: the agent is connected to the WS heartbeat
- `"repo_status": "ready"`: `restic init` (or existing-config
  detection) completed successfully
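The polling rule can be sketched as a loop with a 5-minute cap. This is an illustration under assumptions: the function names, `RM_JAR`, and the 10-second interval are made up here, each host object is assumed to expose its name in a `hostname` field, and stopping early on `init_failed` is a design choice rather than documented behaviour.

```bash
# fetch_host_state HOSTNAME: prints "<status> <repo_status>" for the matching host.
# Assumes each host object carries a "hostname" field.
fetch_host_state() {
  curl -fsS -b "$RM_JAR" "$RM_SERVER/api/hosts" |
    jq -r --arg h "$1" '.[] | select(.hostname == $h) | "\(.status) \(.repo_status)"'
}

# wait_for_host HOSTNAME [TIMEOUT_SECS] [INTERVAL_SECS]
# Returns 0 when online and ready, 1 on init_failed, 2 on timeout.
wait_for_host() {
  local host="$1" timeout="${2:-300}" interval="${3:-10}" waited=0 state=
  while [ "$waited" -lt "$timeout" ]; do
    state=$(fetch_host_state "$host" || true)
    case "$state" in
      "online ready")  return 0 ;;
      *" init_failed") echo "repo init failed for $host" >&2; return 1 ;;
    esac
    sleep "$interval"
    waited=$((waited + interval))
  done
  echo "timed out waiting for $host (last state: ${state:-unknown})" >&2
  return 2
}
```

Distinct exit codes let the caller tell a credential problem (1) from a host that never connected (2).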

If `repo_status` settles on `"init_failed"`, the repo creds are
wrong or the repo URL is unreachable from the target. Inspect
the matching job log:

```
GET $RM_SERVER/api/hosts/<host_id>/jobs     (most recent init job)
GET $RM_SERVER/api/jobs/<job_id>            (full output)
```

Fix the creds with a creds-update call (see Settings → Repo in
the UI for the exact route, which is currently form-only) or revoke
the host and start over.

---

## 5. (Optional) configure schedules

A new host gets one default source group covering `initial_paths`
(or `/etc` and `/home` if you didn't pass any) and **no schedule**.
Backups won't run until either:

- a schedule is attached (cron expression, retention, etc.), or
- you trigger an on-demand run via the source-group "Run now"
  endpoint.

These are not yet exposed cleanly as JSON-only routes; if the
automation agent needs them, look at `internal/server/http/schedules*.go`
and `internal/server/http/source_groups*.go`. Most are
JSON-capable; some are form-only with HTML 303 responses.

---

## Failure modes: quick reference

| Symptom | Likely cause | Fix |
|---|---|---|
| `401` on `/api/enrollment-tokens` | session expired or viewer role | re-login as operator+ |
| `install.sh` fails at "enrol" with HTTP 410 | token expired (>1h) or already used | mint a fresh token |
| Host shows `status=offline` after install | systemd unit didn't start; firewall blocks WS | `systemctl status restic-manager-agent`, check `$RM_SERVER` reachability |
| `repo_status=init_failed` | bad repo creds or URL | inspect init job log; fix creds; retry probe via `/hosts/{id}/repo/probe` |
| Token list grows with stale rows | normal; they expire after 1h | optional cleanup via `/hosts/enrollment-tokens/{hash}/revoke` |
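An agent can encode the table as a lookup so its logs carry the suggested fix alongside the failure. This is a sketch only: `suggest_fix` and its symptom keys are illustrative labels chosen here, not values the server returns.

```bash
# suggest_fix SYMPTOM: prints the fix from the table for a known symptom.
# The symptom keys are illustrative labels, not values returned by the server.
suggest_fix() {
  case "$1" in
    mint-401)         echo "re-login as operator or admin" ;;
    enroll-410)       echo "mint a fresh token" ;;
    host-offline)     echo "run systemctl status restic-manager-agent and check server reachability" ;;
    repo-init-failed) echo "inspect the init job log, fix creds, retry the probe" ;;
    *)                echo "unknown symptom: $1" >&2; return 1 ;;
  esac
}
```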

---

## Minimum reproducible script

```bash
#!/usr/bin/env bash
set -euo pipefail
# Required inputs; the script aborts if any is unset or empty.
: "${RM_SERVER:?}" "${RM_USER:?}" "${RM_PASS:?}" "${RM_HOSTNAME:?}" \
  "${RM_REPO_URL:?}" "${RM_REPO_USER:?}" "${RM_REPO_PASS:?}"

# Cookie jar holding rm_session; removed on exit.
JAR=$(mktemp)
trap 'rm -f "$JAR"' EXIT

# 1. login (jq builds the body so quotes in the password stay valid JSON)
curl -fsS -c "$JAR" -H 'Content-Type: application/json' \
  -d "$(jq -nc --arg u "$RM_USER" --arg p "$RM_PASS" \
        '{username:$u, password:$p}')" \
  "$RM_SERVER/api/auth/login" >/dev/null

# 2. mint token
TOKEN=$(curl -fsS -b "$JAR" -H 'Content-Type: application/json' \
  -d "$(jq -nc \
        --arg h "$RM_HOSTNAME" --arg u "$RM_REPO_USER" \
        --arg p "$RM_REPO_PASS" --arg r "$RM_REPO_URL" \
        '{hostname:$h, repo_url:$r, repo_username:$u, repo_password:$p}')" \
  "$RM_SERVER/api/enrollment-tokens" | jq -r .token)

# 3. emit the install snippet for the target machine
cat <<EOF
Run on $RM_HOSTNAME (as root):

  curl -fsSL $RM_SERVER/install/install.sh | \\
    sudo RM_SERVER=$RM_SERVER RM_TOKEN=$TOKEN bash
EOF
```