c8ead66f08
Cohesive batch from a smoke-test session against a real rest-server.
Themed bullets:
* Agent runs as root, sandboxed via systemd. CapabilityBoundingSet
drops to CAP_DAC_READ_SEARCH + restore caps; ProtectSystem=strict
with ReadWritePaths confined to /etc + /var/lib/restic-manager;
NoNewPrivileges blocks escalation. Install script no longer
creates a service user. spec.md §4.2 / §14.1 / §14.3 explain the
rationale (matches UrBackup / Veeam / Bareos defaults; trying to
back up "everything" as an unprivileged user creates silent skips
on /home, /root, /var/lib/* with no upside vs the threat model
the agent already implies).
* Init-repo end-to-end. New JobKind="init" wired through agent
runner, restic.Env.RunInit, server dispatcher, and a UI button
(red "Initialise repo" in the run-now panel). hosts.repo_initialised_at
flips on init success, on backup success, or on a non-empty
snapshots.report. The "Run now" / "Init" / "Retry" branching now
drives both the dashboard host row and the host-detail panel.
Migrations 0004 (column), 0005 (jobs.kind CHECK widened — using
the safe create-new-then-rename pattern; first version corrupted
job_logs.job_id FK), 0006 (cleans up job_logs FK on already-
affected DBs).
* rest-server creds embedded at exec time only. restic.Env gains
RepoUsername; mergeRestCreds() builds the user:pass@-prefixed URL
inside envSlice() and never assigns it back to the struct, so
nothing slog-able ever sees the cleartext form. RedactURL helper
for any future surface that needs to log a URL safely. Both
helpers tested.
* Add-host UX. Repo password is now optional — server mints a
24-byte URL-safe random one and surfaces it once, alongside an
htpasswd snippet ("echo PASS | htpasswd -B -i ... USERNAME") so
the operator pastes one command on the rest-server host and one
on the endpoint. Result page also links the install snippet at
/install/install.sh (was /install.sh — 404'd before) and pipes
to bash (not sh — script uses set -o pipefail and other
bashisms; on Debian/Ubuntu sh is dash).
* Late-subscriber race in JobHub. A fast-failing job could finish
(DB write + Broadcast) before the browser's HX-Redirect → page
load → WS-connect path completed, so the JS sat forever waiting
on a job.finished that already passed. JobHub split into
Register + Send + Run; handleJobStream now subscribes first,
re-fetches the job, and sends a synthetic job.finished if the
state is already terminal.
* HTMX error visibility. New toast partial listens to
htmx:responseError and surfaces the response body as a
bottom-right toast — every server-side validation error now
becomes visible without per-handler JS wiring. Also handles
custom rm:toast events for future server-pushed notifications
via the HX-Trigger header. Themed via existing CSS vars.
* Dashboard rows are now whole-row clickable to host detail
(CSS card-link pattern: absolute-positioned anchor + .row-action
z-index restoration so the action button stays clickable).
"View →" on a running job links to /jobs/<id> rather than
/hosts/<id> since the row click already covers the host page.
* "Run first" / "Run first backup" → "Run now" everywhere for
consistency.
* runbook (docs/e2e-smoke.md) updated — live-log streaming step
now reflects P1-26; mentions the browser-driven Run-now flow.
* _diag/dump-creds — moved out of cmd/ so go build doesn't pick
it up; .gitignore now excludes /_diag/ entirely.
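The "creds embedded at exec time only" bullet can be sketched as follows. This is a minimal illustration, not the project's code: `withRestCreds` and `redactRepoURL` are hypothetical stand-ins for `mergeRestCreds()` and `RedactURL`, and the sketch assumes repo URLs use restic's `rest:` prefix in front of an ordinary HTTP(S) URL. Only `net/url` calls from the standard library are used.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

const restPrefix = "rest:" // restic's REST backend prefix

// withRestCreds (hypothetical) returns repoURL with user:pass@ embedded.
// The result is meant to be placed straight into the child process
// environment and never stored on a long-lived struct.
func withRestCreds(repoURL, username, password string) (string, error) {
	if username == "" || !strings.HasPrefix(repoURL, restPrefix) {
		return repoURL, nil // nothing to merge, or not a rest-server repo
	}
	u, err := url.Parse(strings.TrimPrefix(repoURL, restPrefix))
	if err != nil {
		return "", fmt.Errorf("parse repo URL: %w", err)
	}
	u.User = url.UserPassword(username, password)
	return restPrefix + u.String(), nil
}

// redactRepoURL (hypothetical) returns a form of raw that is safe to log.
// url.URL.Redacted replaces any password with "xxxxx".
func redactRepoURL(raw string) string {
	prefix, rest := "", raw
	if strings.HasPrefix(raw, restPrefix) {
		prefix, rest = restPrefix, strings.TrimPrefix(raw, restPrefix)
	}
	u, err := url.Parse(rest)
	if err != nil {
		// Parse errors echo the input, so return a fixed placeholder instead.
		return "<unparseable URL>"
	}
	return prefix + u.Redacted()
}
```

Under these assumptions, `withRestCreds("rest:https://backup.example.com:8000/host1", "host1", "s3cret")` yields `rest:https://host1:s3cret@backup.example.com:8000/host1`, and passing that through `redactRepoURL` yields `rest:https://host1:xxxxx@backup.example.com:8000/host1`.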
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
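The late-subscriber fix described in the JobHub bullet follows a subscribe-first, then re-check pattern. The sketch below illustrates that pattern only; `Hub`, `Event`, `Register`, `Broadcast`, and `SubscribeWithCatchUp` are hypothetical names, and the `isTerminal` callback stands in for re-fetching the job from the database.

```go
package main

import "sync"

// Event is a hypothetical hub message.
type Event struct {
	JobID string
	Type  string // e.g. "job.finished"
}

// Hub is a hypothetical fan-out hub keyed by job ID.
type Hub struct {
	mu   sync.Mutex
	subs map[string][]chan Event
}

func NewHub() *Hub { return &Hub{subs: make(map[string][]chan Event)} }

// Register adds a subscriber for jobID and returns its buffered channel.
func (h *Hub) Register(jobID string) chan Event {
	ch := make(chan Event, 8)
	h.mu.Lock()
	h.subs[jobID] = append(h.subs[jobID], ch)
	h.mu.Unlock()
	return ch
}

// Broadcast delivers ev to the current subscribers of ev.JobID.
// With no subscriber registered yet the event is simply lost,
// which is the race a fast-failing job can trigger.
func (h *Hub) Broadcast(ev Event) {
	h.mu.Lock()
	defer h.mu.Unlock()
	for _, ch := range h.subs[ev.JobID] {
		select {
		case ch <- ev:
		default: // slow subscriber: drop rather than block the hub
		}
	}
}

// SubscribeWithCatchUp registers first and only then re-checks job state.
// If the job is already terminal it queues a synthetic job.finished so the
// client is not left waiting for an event that has already passed.
func SubscribeWithCatchUp(h *Hub, jobID string, isTerminal func(jobID string) bool) chan Event {
	ch := h.Register(jobID)
	if isTerminal(jobID) {
		select {
		case ch <- Event{JobID: jobID, Type: "job.finished"}:
		default:
		}
	}
	return ch
}
```

Because registration happens before the state check, a job that finishes in between can produce both the real and the synthetic event, so a consumer of this sketch should treat `job.finished` as idempotent.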
219 lines
7.5 KiB
Go
package api

import (
	"encoding/json"
	"time"
)

// HostOS / HostArch are constrained string types. The store stores them
// raw, but agent metadata collection should populate them from these
// constants so we don't end up with both "linux" and "Linux" rows.
type HostOS string

const (
	OSLinux   HostOS = "linux"
	OSWindows HostOS = "windows"
)

type HostArch string

const (
	ArchAmd64 HostArch = "amd64"
	ArchArm64 HostArch = "arm64"
)

// HelloPayload is the agent's first message after WS auth. The server
// upserts a Host row, marks it online, and (if protocol_version is
// acceptable) responds with a config.update + schedule.set burst.
type HelloPayload struct {
	ProtocolVersion int       `json:"protocol_version"`
	AgentVersion    string    `json:"agent_version"`
	ResticVersion   string    `json:"restic_version"`
	Hostname        string    `json:"hostname"`
	OS              HostOS    `json:"os"`
	Arch            HostArch  `json:"arch"`
	BootTime        time.Time `json:"boot_time,omitempty"`
}

// HeartbeatPayload is sent by the agent every 30s. It carries no data
// beyond the send timestamp today; presence is the signal. Future
// fields (load, free disk) can land here without bumping
// protocol_version.
type HeartbeatPayload struct {
	SentAt time.Time `json:"sent_at"`
}

// JobKind is the operation an agent is being asked to run, or just ran.
type JobKind string

const (
	JobBackup JobKind = "backup"
	JobInit   JobKind = "init"
	JobForget JobKind = "forget"
	JobPrune  JobKind = "prune"
	JobCheck  JobKind = "check"
	JobUnlock JobKind = "unlock"
)

// JobStatus is the lifecycle state of a job.
type JobStatus string

const (
	JobQueued    JobStatus = "queued"
	JobRunning   JobStatus = "running"
	JobSucceeded JobStatus = "succeeded"
	JobFailed    JobStatus = "failed"
	JobCancelled JobStatus = "cancelled"
)

// CommandRunPayload is the server → agent dispatch for a run-now job.
type CommandRunPayload struct {
	JobID string   `json:"job_id"`
	Kind  JobKind  `json:"kind"`
	Args  []string `json:"args,omitempty"`
}

// CommandCancelPayload is the server → agent cancel signal.
type CommandCancelPayload struct {
	JobID string `json:"job_id"`
}

// CommandResultPayload acks a command.run dispatch (the agent has
// accepted the job and persisted it locally). This is *not* the job
// completion; job.finished signals that.
type CommandResultPayload struct {
	JobID    string `json:"job_id"`
	Accepted bool   `json:"accepted"`
	Error    string `json:"error,omitempty"`
}

// JobStartedPayload reports that the agent has begun execution.
type JobStartedPayload struct {
	JobID     string    `json:"job_id"`
	Kind      JobKind   `json:"kind"`
	StartedAt time.Time `json:"started_at"`
}

// JobProgressPayload is the agent's periodic status while a job is
// running. The field set is chosen to match what restic --json emits
// for `backup`; other kinds populate the subset that makes sense.
type JobProgressPayload struct {
	JobID         string  `json:"job_id"`
	PercentDone   float64 `json:"percent_done"`
	FilesDone     int64   `json:"files_done"`
	TotalFiles    int64   `json:"total_files"`
	BytesDone     int64   `json:"bytes_done"`
	TotalBytes    int64   `json:"total_bytes"`
	ETASeconds    int64   `json:"eta_seconds"`
	ThroughputBps int64   `json:"throughput_bps"`
}

// JobFinishedPayload reports the job's terminal state.
type JobFinishedPayload struct {
	JobID      string          `json:"job_id"`
	Status     JobStatus       `json:"status"`
	ExitCode   int             `json:"exit_code"`
	FinishedAt time.Time       `json:"finished_at"`
	Stats      json.RawMessage `json:"stats,omitempty"` // restic summary blob
	Error      string          `json:"error,omitempty"`
}

// LogStreamLine is one entry of the live job log.
type LogStreamLine struct {
	JobID   string    `json:"job_id"`
	Seq     int64     `json:"seq"`
	TS      time.Time `json:"ts"`
	Stream  LogStream `json:"stream"`
	Payload string    `json:"payload"`
}

// LogStream identifies which channel a log line came from.
type LogStream string

const (
	LogStdout LogStream = "stdout"
	LogStderr LogStream = "stderr"
	LogEvent  LogStream = "event" // parsed restic --json event
)

// SnapshotsReportPayload carries the agent's full snapshot list, sent
// after each successful backup so the server can refresh its projection.
type SnapshotsReportPayload struct {
	Snapshots []Snapshot `json:"snapshots"`
}

// Snapshot is the projection mirrored from `restic snapshots --json`.
// SizeBytes / FileCount come from the embedded summary block on
// restic 0.16+; older clients leave them at zero (the UI degrades
// gracefully).
type Snapshot struct {
	ID        string    `json:"id"`       // long restic snapshot ID
	ShortID   string    `json:"short_id"` // 8-hex-char form
	Time      time.Time `json:"time"`
	Hostname  string    `json:"hostname"`
	Paths     []string  `json:"paths"`
	Tags      []string  `json:"tags,omitempty"`
	SizeBytes int64     `json:"size_bytes,omitempty"`
	FileCount int64     `json:"file_count,omitempty"`
}

// RepoStatsPayload carries periodic repo health facts that the agent
// derives from `restic stats` and lock-file inspection.
type RepoStatsPayload struct {
	SizeBytes       int64     `json:"size_bytes"`
	SnapshotCount   int       `json:"snapshot_count"`
	DedupRatio      float64   `json:"dedup_ratio"`
	LastCheckAt     time.Time `json:"last_check_at,omitempty"`
	LastCheckStatus string    `json:"last_check_status,omitempty"`
	LockState       string    `json:"lock_state"` // locked|unlocked
}

// Schedule is the agent-facing view of a Schedule row. (Server-side
// CRUD shapes live in the http handlers; this is what gets pushed.)
type Schedule struct {
	ID              string          `json:"id"`
	Kind            JobKind         `json:"kind"`
	CronExpr        string          `json:"cron_expr"`
	Paths           []string        `json:"paths,omitempty"`
	Excludes        []string        `json:"excludes,omitempty"`
	Tags            []string        `json:"tags,omitempty"`
	RetentionPolicy json.RawMessage `json:"retention_policy,omitempty"`
	Options         json.RawMessage `json:"options,omitempty"`
	PreHook         string          `json:"pre_hook,omitempty"`
	PostHook        string          `json:"post_hook,omitempty"`
	Enabled         bool            `json:"enabled"`
}

// ScheduleSetPayload is the full canonical schedule list the server
// pushes for a host. The agent reconciles its local cron and replies
// with a ScheduleAckPayload carrying the same Version.
type ScheduleSetPayload struct {
	Version   int64      `json:"version"`
	Schedules []Schedule `json:"schedules"`
}

// ScheduleAckPayload confirms that the agent has applied a given version.
type ScheduleAckPayload struct {
	Version   int64     `json:"version"`
	AppliedAt time.Time `json:"applied_at"`
}

// ConfigUpdatePayload is the per-host config the server pushes
// (currently just repo connection details). Empty fields mean "leave
// existing alone"; to clear something, send an explicit zero value.
type ConfigUpdatePayload struct {
	RepoURL        string `json:"repo_url,omitempty"`
	RepoPassword   string `json:"repo_password,omitempty"` // sensitive
	RepoUsername   string `json:"repo_username,omitempty"`
	RepoCredential string `json:"repo_credential,omitempty"` // sensitive (for rest server basic auth)
	HookShell      string `json:"hook_shell,omitempty"`
}

// AgentUpdateAvailablePayload is informational only; the agent does
// NOT self-update. See spec.md §4.2 for the package-manager-based
// update model.
type AgentUpdateAvailablePayload struct {
	LatestVersion string `json:"latest_version"`
	PackageURL    string `json:"package_url"` // apt repo / choco source
	Changelog     string `json:"changelog,omitempty"`
}