c8ead66f08
Cohesive batch from a smoke-test session against a real rest-server.
Themed bullets:
* Agent runs as root, sandboxed via systemd. CapabilityBoundingSet
drops to CAP_DAC_READ_SEARCH + restore caps; ProtectSystem=strict
with ReadWritePaths confined to /etc + /var/lib/restic-manager;
NoNewPrivileges blocks escalation. Install script no longer
creates a service user. spec.md §4.2 / §14.1 / §14.3 explain the
rationale (matches UrBackup / Veeam / Bareos defaults; trying to
back up "everything" as an unprivileged user creates silent skips
on /home, /root, /var/lib/* with no upside vs the threat model
the agent already implies).
* Init-repo end-to-end. New JobKind="init" wired through agent
runner, restic.Env.RunInit, server dispatcher, and a UI button
(red "Initialise repo" in the run-now panel). hosts.repo_initialised_at
flips on init success, on backup success, or on a non-empty
snapshots.report. The "Run now" / "Init" / "Retry" branching now
drives both the dashboard host row and the host-detail panel.
Migrations 0004 (column), 0005 (jobs.kind CHECK widened — using
the safe create-new-then-rename pattern; first version corrupted
job_logs.job_id FK), 0006 (cleans up job_logs FK on already-
affected DBs).
* rest-server creds embedded at exec time only. restic.Env gains
RepoUsername; mergeRestCreds() builds the user:pass@-prefixed URL
inside envSlice() and never assigns it back to the struct, so
nothing slog-able ever sees the cleartext form. RedactURL helper
for any future surface that needs to log a URL safely. Both
helpers tested.
* Add-host UX. Repo password is now optional — server mints a
24-byte URL-safe random one and surfaces it once, alongside an
htpasswd snippet ("echo PASS | htpasswd -B -i ... USERNAME") so
the operator pastes one command on the rest-server host and one
on the endpoint. Result page also links the install snippet at
/install/install.sh (was /install.sh — 404'd before) and pipes
to bash (not sh — script uses set -o pipefail and other
bashisms; on Debian/Ubuntu sh is dash).
* Late-subscriber race in JobHub. A fast-failing job could finish
(DB write + Broadcast) before the browser's HX-Redirect → page
load → WS-connect path completed, so the JS sat forever waiting
on a job.finished that already passed. JobHub split into
Register + Send + Run; handleJobStream now subscribes first,
re-fetches the job, and sends a synthetic job.finished if the
state is already terminal.
* HTMX error visibility. New toast partial listens to
htmx:responseError and surfaces the response body as a
bottom-right toast — every server-side validation error now
becomes visible without per-handler JS wiring. Also handles
custom rm:toast events for future server-pushed notifications
via the HX-Trigger header. Themed via existing CSS vars.
* Dashboard rows are now whole-row clickable to host detail
(CSS card-link pattern: absolute-positioned anchor + .row-action
z-index restoration so the action button stays clickable).
"View →" on a running job links to /jobs/<id> rather than
/hosts/<id> since the row click already covers the host page.
* "Run first" / "Run first backup" → "Run now" everywhere for
consistency.
* runbook (docs/e2e-smoke.md) updated — live-log streaming step
now reflects P1-26; mentions the browser-driven Run-now flow.
* _diag/dump-creds — moved out of cmd/ so go build doesn't pick
it up; .gitignore now excludes /_diag/ entirely.
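
The "creds embedded at exec time only" bullet above can be sketched roughly as follows. This is an illustration under stated assumptions, not the project's code: repoEnv, RestPassword, withRestCreds and redactRepoURL are hypothetical stand-ins for restic.Env, mergeRestCreds() and RedactURL, and the "rest:" prefix handling assumes the usual restic rest-backend URL form. The value receiver on envSlice means the credentialed URL cannot be written back to the struct.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// repoEnv is a hypothetical stand-in for restic.Env. It only ever
// holds the credential-free repository URL.
type repoEnv struct {
	RepoURL      string
	RepoUsername string
	RestPassword string // assumed field name for the rest-server password
}

// withRestCreds returns repo with user:pass@ embedded when repo is a
// "rest:" URL. Other backends, or an empty user, pass through unchanged.
func withRestCreds(repo, user, pass string) (string, error) {
	if !strings.HasPrefix(repo, "rest:") || user == "" {
		return repo, nil
	}
	u, err := url.Parse(strings.TrimPrefix(repo, "rest:"))
	if err != nil {
		return "", err
	}
	u.User = url.UserPassword(user, pass)
	return "rest:" + u.String(), nil
}

// envSlice builds the child-process environment. The receiver is a
// value, so the cleartext URL lives only in the returned slice.
func (e repoEnv) envSlice() ([]string, error) {
	full, err := withRestCreds(e.RepoURL, e.RepoUsername, e.RestPassword)
	if err != nil {
		return nil, err
	}
	return []string{"RESTIC_REPOSITORY=" + full}, nil
}

// redactRepoURL masks any password so the URL is safe to log.
func redactRepoURL(raw string) string {
	prefix := ""
	if strings.HasPrefix(raw, "rest:") {
		prefix, raw = "rest:", strings.TrimPrefix(raw, "rest:")
	}
	u, err := url.Parse(raw)
	if err != nil {
		return "<unparseable URL>" // never echo input that may hold a secret
	}
	return prefix + u.Redacted()
}

func main() {
	e := repoEnv{
		RepoURL:      "rest:https://backup.example:8000/host1/",
		RepoUsername: "host1",
		RestPassword: "s3cret",
	}
	env, err := e.envSlice()
	if err != nil {
		panic(err)
	}
	// Log only the redacted form of whatever was handed to the child.
	fmt.Println(redactRepoURL(strings.TrimPrefix(env[0], "RESTIC_REPOSITORY=")))
	// prints rest:https://host1:xxxxx@backup.example:8000/host1/
}
```
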
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
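
The late-subscriber fix described in the JobHub bullet (subscribe first, re-fetch the job, send a synthetic job.finished if it is already terminal) could look something like the sketch below. Everything here is a toy written for illustration: hub, jobStore, Event, the status strings and subscribe are assumptions, not the project's JobHub or store API.

```go
package main

import (
	"fmt"
	"sync"
)

// Event is a hypothetical job-stream message.
type Event struct {
	Type  string
	JobID string
}

// hub is a toy broadcaster. Like most pub/sub hubs it does not replay
// events to subscribers that arrive late.
type hub struct {
	mu   sync.Mutex
	subs map[string][]chan Event
}

func newHub() *hub { return &hub{subs: make(map[string][]chan Event)} }

// Register adds a buffered subscriber channel for jobID.
func (h *hub) Register(jobID string) chan Event {
	ch := make(chan Event, 8)
	h.mu.Lock()
	h.subs[jobID] = append(h.subs[jobID], ch)
	h.mu.Unlock()
	return ch
}

// Broadcast delivers ev to the subscribers registered right now.
func (h *hub) Broadcast(ev Event) {
	h.mu.Lock()
	defer h.mu.Unlock()
	for _, ch := range h.subs[ev.JobID] {
		select {
		case ch <- ev:
		default: // drop instead of blocking on a slow subscriber
		}
	}
}

// jobStore is a hypothetical stand-in for the jobs table.
type jobStore struct {
	mu     sync.Mutex
	status map[string]string
}

func (s *jobStore) Set(id, st string) {
	s.mu.Lock()
	s.status[id] = st
	s.mu.Unlock()
}

func (s *jobStore) Status(id string) string {
	s.mu.Lock()
	defer s.mu.Unlock()
	return s.status[id]
}

func isTerminal(st string) bool { return st == "succeeded" || st == "failed" }

// subscribe registers BEFORE reading job state. Any finish that lands
// after Register arrives via Broadcast; any finish that landed earlier
// is caught by the re-fetch and replayed as a synthetic event.
func subscribe(h *hub, s *jobStore, jobID string) chan Event {
	ch := h.Register(jobID)
	if isTerminal(s.Status(jobID)) {
		select {
		case ch <- Event{Type: "job.finished", JobID: jobID}:
		default:
		}
	}
	return ch
}

func main() {
	h := newHub()
	s := &jobStore{status: map[string]string{"j1": "running"}}

	// The job fails fast, before any browser has connected.
	s.Set("j1", "failed")
	h.Broadcast(Event{Type: "job.finished", JobID: "j1"})

	// The late subscriber still learns the outcome.
	ch := subscribe(h, s, "j1")
	fmt.Println((<-ch).Type) // prints job.finished
}
```

One consequence of this ordering is that a job finishing between Register and the re-fetch can produce two job.finished events, so the client side has to treat the event as idempotent.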
178 lines
5.9 KiB
Go
// Package http hosts the chi-based REST handlers for the control
// plane. The Server type owns the router, the handlers, and the
// graceful-shutdown lifecycle.
package http

import (
	"context"
	"errors"
	stdhttp "net/http"
	"time"

	"github.com/go-chi/chi/v5"
	"github.com/go-chi/chi/v5/middleware"

	"gitea.dcglab.co.uk/steve/restic-manager/internal/crypto"
	"gitea.dcglab.co.uk/steve/restic-manager/internal/server/config"
	"gitea.dcglab.co.uk/steve/restic-manager/internal/server/ui"
	"gitea.dcglab.co.uk/steve/restic-manager/internal/server/ws"
	"gitea.dcglab.co.uk/steve/restic-manager/internal/store"
)

// Deps bundles every collaborator the HTTP server depends on. Wired up
// in cmd/server; tests pass a pared-down Deps with fakes.
type Deps struct {
	Cfg    config.Config
	Store  *store.Store
	AEAD   *crypto.AEAD
	Hub    *ws.Hub
	JobHub *ws.JobHub
	UI     *ui.Renderer
	// Version is the binary's build version, surfaced in the chrome.
	// Empty falls back to "dev".
	Version string
	// BootstrapToken (optional, populated only on first run) is the raw
	// admin-bootstrap token printed in the server logs. While set, the
	// /bootstrap endpoint accepts it to create the first admin user.
	BootstrapToken string
}

// Server is the running HTTP server.
type Server struct {
	srv  *stdhttp.Server
	deps Deps
}

// New builds a configured but not-yet-started server.
func New(deps Deps) *Server {
	r := chi.NewRouter()

	// Built-in middleware: request ID for log correlation, recovery
	// (don't crash the process on a panic in a handler), and request
	// logging. realIP (iff a trusted proxy is configured) is not wired
	// in here.
	r.Use(middleware.RequestID)
	r.Use(middleware.Recoverer)
	r.Use(requestLogger)

	// Health endpoint: unauthenticated, no audit, deliberately cheap.
	r.Get("/healthz", func(w stdhttp.ResponseWriter, _ *stdhttp.Request) {
		w.WriteHeader(stdhttp.StatusNoContent)
	})

	s := &Server{deps: deps}
	s.routes(r)

	s.srv = &stdhttp.Server{
		Addr:              deps.Cfg.Listen,
		Handler:           r,
		ReadHeaderTimeout: 10 * time.Second,
		IdleTimeout:       60 * time.Second,
		// No write timeout (0 disables it): WS upgrades and live log
		// streams need long-lived writes.
		WriteTimeout: 0,
	}
	return s
}

// routes wires the API tree. Subtrees live in this file by area so a
// reader can scan one place and see the surface.
func (s *Server) routes(r chi.Router) {
	r.Route("/api", func(r chi.Router) {
		r.Post("/auth/login", s.handleLogin)
		r.Post("/auth/logout", s.handleLogout)
		r.Post("/bootstrap", s.handleBootstrap)

		// Agent enrollment (open endpoint: the token is the credential).
		r.Post("/agents/enroll", s.handleAgentEnroll)

		// Operator → server (authenticated). Spec.md §6.1's
		// /hosts/{id}/enrollment-token (regenerate) lands when the
		// host page can call it; for now just the create endpoint.
		r.Post("/enrollment-tokens", s.handleCreateEnrollmentToken)

		// Fleet read endpoints; these back the dashboard.
		r.Get("/hosts", s.handleListHosts)
		r.Get("/fleet/summary", s.handleFleetSummary)

		// Run-now: dispatch a job to a host's agent.
		r.Post("/hosts/{id}/jobs", s.handleRunNow)

		// Snapshot projection (refreshed by the agent after each backup).
		r.Get("/hosts/{id}/snapshots", s.handleListHostSnapshots)

		// Repo credentials: operator can edit after enrollment. The
		// initial set is supplied at token-mint time (see enrollment.go).
		// GET returns a redacted view (URL, username, has_password).
		r.Get("/hosts/{id}/repo-credentials", s.handleGetHostCredentials)
		r.Put("/hosts/{id}/repo-credentials", s.handleSetHostCredentials)
	})

	// Agent ↔ server WebSocket. Bearer-authenticated inside the handler.
	if s.deps.Hub != nil {
		r.Mount("/ws/agent", ws.AgentHandler(ws.HandlerDeps{
			Hub:     s.deps.Hub,
			Store:   s.deps.Store,
			JobHub:  s.deps.JobHub,
			OnHello: s.onAgentHello,
		}))
	}

	// Agent binaries + install scripts. Open endpoints: content is
	// unprivileged on its own, gating happens via the enrollment
	// token. See agent_assets.go.
	r.Get("/agent/binary", s.handleAgentBinary)
	r.Get("/install/*", s.handleInstallAsset)

	// Static assets (Tailwind CSS bundle, future favicon).
	r.Mount("/static/", staticHandler())

	// HTML UI. The renderer is required: fail loud if the binary
	// was built without templates (impossible in practice given
	// embed, but guards bad test wiring).
	if s.deps.UI != nil {
		r.Get("/", s.handleUIDashboard)
		r.Get("/login", s.handleUILoginGet)
		r.Post("/login", s.handleUILoginPost)
		r.Post("/logout", s.handleUILogoutPost)
		// HTMX action endpoint for "Run now" buttons on the dashboard.
		r.Post("/hosts/{id}/run-backup", s.handleUIRunBackup)
		// HTMX action endpoint for the red "Initialise repo" button
		// shown in the run-now panel until the repo is confirmed init'd.
		r.Post("/hosts/{id}/init-repo", s.handleUIInitRepo)
		// Add host flow.
		r.Get("/hosts/new", s.handleUIAddHostGet)
		r.Post("/hosts/new", s.handleUIAddHostPost)
		// Host detail (Snapshots tab is the default).
		r.Get("/hosts/{id}", s.handleUIHostDetail)
		// Live job log.
		r.Get("/jobs/{id}", s.handleUIJobDetail)
	}

	// Browser job-log stream (separate from /ws/agent so the auth
	// layer is session-cookie not bearer). Mounted regardless of
	// whether the UI is up; JSON callers may also subscribe.
	if s.deps.JobHub != nil {
		r.Get("/api/jobs/{id}/stream", s.handleJobStream)
	}
}

// Start begins listening. Blocks until ListenAndServe returns
// (typically only on Shutdown). The server is HTTP-only by design;
// production deployments terminate TLS at a reverse proxy in front.
func (s *Server) Start() error {
	err := s.srv.ListenAndServe()
	if errors.Is(err, stdhttp.ErrServerClosed) {
		return nil
	}
	return err
}

// Shutdown stops accepting new connections and waits up to ctx.Deadline
// for in-flight handlers to finish.
func (s *Server) Shutdown(ctx context.Context) error {
	return s.srv.Shutdown(ctx)
}

// Addr returns the configured listen address, exactly as passed in
// Cfg.Listen. If the caller passes :0 to get a random port, this still
// returns ":0", not the port the kernel picked.
func (s *Server) Addr() string { return s.srv.Addr }