Files
restic-manager/internal/api/messages.go
T
steve c8ead66f08 P1 polish: agent-as-root, init-repo flow, rest creds passthrough, UX fixes
Cohesive batch from a smoke-test session against a real rest-server.
Themed bullets:

* Agent runs as root, sandboxed via systemd. CapabilityBoundingSet
  drops to CAP_DAC_READ_SEARCH + restore caps; ProtectSystem=strict
  with ReadWritePaths confined to /etc + /var/lib/restic-manager;
  NoNewPrivileges blocks escalation. Install script no longer
  creates a service user. spec.md §4.2 / §14.1 / §14.3 explain the
  rationale (matches UrBackup / Veeam / Bareos defaults; trying to
  back up "everything" as an unprivileged user creates silent skips
  on /home, /root, /var/lib/* with no upside vs the threat model
  the agent already implies).

* Init-repo end-to-end. New JobKind="init" wired through agent
  runner, restic.Env.RunInit, server dispatcher, and a UI button
  (red "Initialise repo" in the run-now panel). hosts.repo_initialised_at
  flips on init success, on backup success, or on a non-empty
  snapshots.report. The "Run now" / "Init" / "Retry" branching now
  drives both the dashboard host row and the host-detail panel.
  Migrations 0004 (column), 0005 (jobs.kind CHECK widened — using
  the safe create-new-then-rename pattern; first version corrupted
  job_logs.job_id FK), 0006 (cleans up job_logs FK on already-
  affected DBs).

* rest-server creds embedded at exec time only. restic.Env gains
  RepoUsername; mergeRestCreds() builds the user:pass@-prefixed URL
  inside envSlice() and never assigns it back to the struct, so
  nothing slog-able ever sees the cleartext form. RedactURL helper
  for any future surface that needs to log a URL safely. Both
  helpers tested.

* Add-host UX. Repo password is now optional — server mints a
  24-byte URL-safe random one and surfaces it once, alongside an
  htpasswd snippet ("echo PASS | htpasswd -B -i ... USERNAME") so
  the operator pastes one command on the rest-server host and one
  on the endpoint. Result page also links the install snippet at
  /install/install.sh (was /install.sh — 404'd before) and pipes
  to bash (not sh — script uses set -o pipefail and other
  bashisms; on Debian/Ubuntu sh is dash).

* Late-subscriber race in JobHub. A fast-failing job could finish
  (DB write + Broadcast) before the browser's HX-Redirect → page
  load → WS-connect path completed, so the JS sat forever waiting
  on a job.finished that already passed. JobHub split into
  Register + Send + Run; handleJobStream now subscribes first,
  re-fetches the job, and sends a synthetic job.finished if the
  state is already terminal.

* HTMX error visibility. New toast partial listens to
  htmx:responseError and surfaces the response body as a
  bottom-right toast — every server-side validation error now
  becomes visible without per-handler JS wiring. Also handles
  custom rm:toast events for future server-pushed notifications
  via the HX-Trigger header. Themed via existing CSS vars.

* Dashboard rows are now whole-row clickable to host detail
  (CSS card-link pattern: absolute-positioned anchor + .row-action
  z-index restoration so the action button stays clickable).
  "View →" on a running job links to /jobs/<id> rather than
  /hosts/<id> since the row click already covers the host page.

* "Run first" / "Run first backup" → "Run now" everywhere for
  consistency.

* runbook (docs/e2e-smoke.md) updated — live-log streaming step
  now reflects P1-26; mentions the browser-driven Run-now flow.

* _diag/dump-creds — moved out of cmd/ so go build doesn't pick
  it up; .gitignore now excludes /_diag/ entirely.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-02 11:02:12 +01:00

219 lines
7.5 KiB
Go

package api
import (
"encoding/json"
"time"
)
// HostOS / HostArch are constrained string types. The store stores them
// raw, but agent metadata collection should populate them from these
// constants so we don't end up with both "linux" and "Linux" rows.
type HostOS string
const (
OSLinux HostOS = "linux"
OSWindows HostOS = "windows"
)
type HostArch string
const (
ArchAmd64 HostArch = "amd64"
ArchArm64 HostArch = "arm64"
)
// HelloPayload is the agent's first message after WS auth. The server
// upserts a Host row, marks it online, and (if protocol_version is
// acceptable) responds with a config.update + schedule.set burst.
type HelloPayload struct {
ProtocolVersion int `json:"protocol_version"`
AgentVersion string `json:"agent_version"`
ResticVersion string `json:"restic_version"`
Hostname string `json:"hostname"`
OS HostOS `json:"os"`
Arch HostArch `json:"arch"`
BootTime time.Time `json:"boot_time,omitempty"`
}
// HeartbeatPayload is sent by the agent every 30s. It carries no data
// today; presence is the signal. Future fields (load, free disk) can
// land here without bumping protocol_version.
type HeartbeatPayload struct {
SentAt time.Time `json:"sent_at"`
}
// JobKind is the operation an agent is being asked to run, or just ran.
type JobKind string
const (
JobBackup JobKind = "backup"
JobInit JobKind = "init"
JobForget JobKind = "forget"
JobPrune JobKind = "prune"
JobCheck JobKind = "check"
JobUnlock JobKind = "unlock"
)
// JobStatus is the lifecycle state of a job.
type JobStatus string
const (
JobQueued JobStatus = "queued"
JobRunning JobStatus = "running"
JobSucceeded JobStatus = "succeeded"
JobFailed JobStatus = "failed"
JobCancelled JobStatus = "cancelled"
)
// CommandRunPayload is the server → agent dispatch for a run-now job.
type CommandRunPayload struct {
JobID string `json:"job_id"`
Kind JobKind `json:"kind"`
Args []string `json:"args,omitempty"`
}
// CommandCancelPayload is the server → agent cancel signal.
type CommandCancelPayload struct {
JobID string `json:"job_id"`
}
// CommandResultPayload acks a command.run dispatch (the agent has
// accepted the job and persisted it locally) — this is *not* the job
// completion. job.finished signals that.
type CommandResultPayload struct {
JobID string `json:"job_id"`
Accepted bool `json:"accepted"`
Error string `json:"error,omitempty"`
}
// JobStartedPayload — agent has begun execution.
type JobStartedPayload struct {
JobID string `json:"job_id"`
Kind JobKind `json:"kind"`
StartedAt time.Time `json:"started_at"`
}
// JobProgressPayload — agent's periodic status while a job is running.
// Field set chosen to match what restic --json emits for `backup`;
// other kinds populate the subset that makes sense.
type JobProgressPayload struct {
JobID string `json:"job_id"`
PercentDone float64 `json:"percent_done"`
FilesDone int64 `json:"files_done"`
TotalFiles int64 `json:"total_files"`
BytesDone int64 `json:"bytes_done"`
TotalBytes int64 `json:"total_bytes"`
ETASeconds int64 `json:"eta_seconds"`
ThroughputBps int64 `json:"throughput_bps"`
}
// JobFinishedPayload — agent reports terminal state.
type JobFinishedPayload struct {
JobID string `json:"job_id"`
Status JobStatus `json:"status"`
ExitCode int `json:"exit_code"`
FinishedAt time.Time `json:"finished_at"`
Stats json.RawMessage `json:"stats,omitempty"` // restic summary blob
Error string `json:"error,omitempty"`
}
// LogStreamLine is one entry of the live job log.
type LogStreamLine struct {
JobID string `json:"job_id"`
Seq int64 `json:"seq"`
TS time.Time `json:"ts"`
Stream LogStream `json:"stream"`
Payload string `json:"payload"`
}
// LogStream identifies which channel a log line came from.
type LogStream string
const (
LogStdout LogStream = "stdout"
LogStderr LogStream = "stderr"
LogEvent LogStream = "event" // parsed restic --json event
)
// SnapshotsReportPayload — agent dumps its full snapshot list after
// each successful backup, so the server can refresh its projection.
type SnapshotsReportPayload struct {
Snapshots []Snapshot `json:"snapshots"`
}
// Snapshot is the projection mirrored from `restic snapshots --json`.
// SizeBytes / FileCount come from the embedded summary block on
// restic 0.16+; older clients leave them at zero (the UI degrades
// gracefully).
type Snapshot struct {
ID string `json:"id"` // long restic snapshot ID
ShortID string `json:"short_id"` // 8-hex-char form
Time time.Time `json:"time"`
Hostname string `json:"hostname"`
Paths []string `json:"paths"`
Tags []string `json:"tags,omitempty"`
SizeBytes int64 `json:"size_bytes,omitempty"`
FileCount int64 `json:"file_count,omitempty"`
}
// RepoStatsPayload — agent reports periodic repo health facts derived
// from `restic stats` and lock-file inspection.
type RepoStatsPayload struct {
SizeBytes int64 `json:"size_bytes"`
SnapshotCount int `json:"snapshot_count"`
DedupRatio float64 `json:"dedup_ratio"`
LastCheckAt time.Time `json:"last_check_at,omitempty"`
LastCheckStatus string `json:"last_check_status,omitempty"`
LockState string `json:"lock_state"` // locked|unlocked
}
// Schedule is the agent-facing view of a Schedule row. (Server-side
// CRUD shapes live in the http handlers; this is what gets pushed.)
type Schedule struct {
ID string `json:"id"`
Kind JobKind `json:"kind"`
CronExpr string `json:"cron_expr"`
Paths []string `json:"paths,omitempty"`
Excludes []string `json:"excludes,omitempty"`
Tags []string `json:"tags,omitempty"`
RetentionPolicy json.RawMessage `json:"retention_policy,omitempty"`
Options json.RawMessage `json:"options,omitempty"`
PreHook string `json:"pre_hook,omitempty"`
PostHook string `json:"post_hook,omitempty"`
Enabled bool `json:"enabled"`
}
// ScheduleSetPayload — server pushes the full canonical schedule list
// for a host. Agent reconciles its local cron and replies with
// ScheduleAckPayload carrying the same Version.
type ScheduleSetPayload struct {
Version int64 `json:"version"`
Schedules []Schedule `json:"schedules"`
}
// ScheduleAckPayload — agent confirms it has applied a given version.
type ScheduleAckPayload struct {
Version int64 `json:"version"`
AppliedAt time.Time `json:"applied_at"`
}
// ConfigUpdatePayload — server pushes per-host config (currently just
// repo connection details). Empty fields mean "leave existing alone";
// to clear something, send an explicit zero value.
type ConfigUpdatePayload struct {
RepoURL string `json:"repo_url,omitempty"`
RepoPassword string `json:"repo_password,omitempty"` // sensitive
RepoUsername string `json:"repo_username,omitempty"`
RepoCredential string `json:"repo_credential,omitempty"` // sensitive (for rest server basic auth)
HookShell string `json:"hook_shell,omitempty"`
}
// AgentUpdateAvailablePayload — informational only; the agent does
// NOT self-update. See spec.md §4.2 for the package-manager-based
// update model.
type AgentUpdateAvailablePayload struct {
LatestVersion string `json:"latest_version"`
PackageURL string `json:"package_url"` // apt repo / choco source
Changelog string `json:"changelog,omitempty"`
}