steve 7cc17813a9 p5-03: docker-only release path (drop goreleaser)
Single public deliverable per tag: a multi-arch server image, with
cross-compiled agent binaries + install scripts + the systemd unit
baked under /opt/restic-manager/dist/. The /agent/binary and
/install/* handlers fall back from <DataDir>/... to that read-only
path so a fresh container Just Works without first-run staging;
operators can still drop a custom build into <DataDir>/ to override
per-host.

Architecture rationale: agent distribution already routes through
the running server, so the release surface mirrors that — there's
no second source of truth to keep in sync.

Workflow .gitea/workflows/release.yml triggers on v*.*.* tag-push
(fan-out :vX.Y.Z / :X.Y / :X, plus :latest once MAJOR>=1) and
workflow_dispatch (snapshot tag only). Pushes to the Gitea
container registry on this instance.

Both binaries grow main.commit + main.date ldflag targets. Makefile
and Dockerfile fill them; release workflow forwards from gitea.sha
plus a UTC timestamp.

Spec : docs/superpowers/specs/2026-05-05-p5-03-docker-only-release.md
Plan : docs/superpowers/plans/2026-05-05-p5-03-docker-only-release.md
2026-05-05 15:18:48 +01:00


P5-03 — Docker-only release path

Status: approved 2026-05-05. Pivots P5-03 away from goreleaser + binary archives toward a single Docker image as the only public deliverable.

Goal

One artifact per tag: the restic-manager server image, multi-arch (linux amd64 + arm64), published to the Gitea container registry of this self-hosted instance. The image bakes in cross-compiled agent binaries (linux amd64, linux arm64, windows amd64), the install scripts, and the systemd unit at a read-only image path. The running server distributes those agents and scripts via its existing /agent/binary and /install/* endpoints; operators on N hosts never download a release artifact directly.

Source builds via make build remain a first-class path for anyone who wants binaries.

Non-goals

  • Standalone binary archives (.tar.gz, .zip) on the release page.
  • darwin / windows-arm64 agent targets — neither is service-tested.
  • goreleaser. Not used.
  • cosign, SBOM, in-toto, minisign. Re-promote when we ship binaries outside an image (Phase 6 candidate).
  • GHCR / GitHub mirror. Single source of truth = Gitea.

Decisions captured (with one-line rationale)

| ID | Decision | Why |
|----|----------|-----|
| D1 | One artifact: server Docker image | Architecture already routes agent distribution through the server (/agent/binary); release surface should mirror that. |
| D2 | Trigger: tag-push (v*.*.*) plus workflow_dispatch | Tag for real cuts; dispatch for snapshot iteration without polluting tag history. |
| D3 | Build matrix: linux amd64+arm64 server image; agent cross-compiles for linux amd64+arm64+windows amd64 | Mirrors the existing CI build matrix; nothing ships that hasn't been service-tested. |
| D4 | Image-baked, separate path (/opt/restic-manager/dist/); HTTP handler reads <DataDir>/... first, falls back to /opt/... | Volume stays purely operator state; image content is immutable per tag; eliminates the smoke-env "stale agent" footgun in production. |
| D5 | Tag fan-out: vX.Y.Z, X.Y, X, latest, with latest held back until v1.0.0 | Standard rolling-minor pattern; pre-1.0 forces explicit pinning. |
| D6 | Snapshot tag: :snapshot-<shortsha>, never moves latest | Operator can never accidentally pull an unblessed build. |
| D7 | Version embedding via -ldflags: main.version, main.commit, main.date on both cmd/server and cmd/agent | Server already had version; add commit/date to both for parity and traceability. |
| D8 | Registry: Gitea container registry on this instance, under <host>/<owner>/restic-manager | One source of truth, no external creds. |
| D9 | Integrity: a SHA256SUMS file + the manifest digest in the release notes; nothing else | Image is the unit of trust; pull-by-digest is the verification primitive. |
| D10 | P1-31 (signed binaries) stays deferred | Re-promote the day we ship binaries outside an image. |

Image layout

Multi-stage Dockerfile (extends today's deploy/Dockerfile.server):

build stage (golang:1.25-alpine):
    cross-compile cmd/server for $TARGETARCH (linux)
    cross-compile cmd/agent for linux/amd64
    cross-compile cmd/agent for linux/arm64
    cross-compile cmd/agent for windows/amd64
    (CGO_ENABLED=0 throughout — pure-Go SQLite)

final stage (gcr.io/distroless/static-debian12:nonroot):
    /usr/local/bin/restic-manager-server                   (matches image arch)
    /opt/restic-manager/dist/agent-binaries/
        restic-manager-agent-linux-amd64
        restic-manager-agent-linux-arm64
        restic-manager-agent-windows-amd64.exe
    /opt/restic-manager/dist/install/
        install.sh
        install.ps1
        restic-manager-agent.service

Everything under /opt/restic-manager/dist/ is owned by root:root. Directories, the agent binaries, and install.sh are mode 0755 (the script is kept executable for install paths that use curl ... | sh semantics); the unit file and install.ps1 are mode 0644.

<DataDir> keeps holding only operator state: restic-manager.db, secret.key, secrets.enc, audit/, tls/. Nothing the image owns gets written into the volume.
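
As an illustration of the naming scheme in the layout above, here is a minimal sketch that maps an OS/arch pair to the baked-in file name. The helper name agentBinaryName and the constant bundledDir are hypothetical; the real handler may derive names differently.

```go
package main

import (
	"fmt"
	"path"
)

// bundledDir is the read-only image path from the layout above.
const bundledDir = "/opt/restic-manager/dist"

// agentBinaryName maps an OS/arch pair to the file name used under
// agent-binaries/. It reports false for targets the image does not ship.
// Hypothetical helper, written for illustration only.
func agentBinaryName(goos, goarch string) (string, bool) {
	switch goos + "/" + goarch {
	case "linux/amd64", "linux/arm64":
		return "restic-manager-agent-" + goos + "-" + goarch, true
	case "windows/amd64":
		return "restic-manager-agent-windows-amd64.exe", true
	}
	return "", false
}

func main() {
	targets := [][2]string{{"linux", "arm64"}, {"windows", "amd64"}, {"darwin", "arm64"}}
	for _, t := range targets {
		name, ok := agentBinaryName(t[0], t[1])
		if !ok {
			fmt.Printf("%s/%s: not shipped in the image\n", t[0], t[1])
			continue
		}
		// In-image paths are always slash-separated, so path.Join is used.
		fmt.Println(path.Join(bundledDir, "agent-binaries", name))
	}
}
```

Targets outside the three listed builds (darwin, windows-arm64) are reported as not shipped, which matches the non-goals.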

Server-side handler change

internal/server/http/agent_assets.go today reads from <DataDir>/agent-binaries/<name> and <DataDir>/install/<name>.

Change: if the file isn't present in <DataDir>, fall back to /opt/restic-manager/dist/<subpath>/<name>. The fallback path is a new server-config field defaulted to /opt/restic-manager/dist, overridable via RM_BUNDLED_ASSETS_DIR for tests and source-build deployments. If neither path resolves, return 404 (existing binary_not_published / not_found body unchanged).

This means:

  • A fresh container without any operator-staged overrides serves the baked-in agents. No first-run setup needed.
  • An operator can still drop a custom-built agent into <DataDir>/agent-binaries/ to override the image's copy (handy for pre-release agent testing without rebuilding the server image).
  • Source-build dev (bin/restic-manager-server running out of the working tree) still works exactly as today — the fallback dir is configurable, and the <DataDir> path remains the primary lookup.

Tests cover four cases: (a) DataDir hit, (b) fallback hit, (c) DataDir hit shadows fallback, (d) neither — 404.
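
The lookup order and the four test cases can be sketched as follows. This is not the code in agent_assets.go: resolveAsset and checkFourCases are hypothetical names, and the sketch assumes a missing file is reported as fs.ErrNotExist, which the handler would turn into the existing 404 body.

```go
package main

import (
	"errors"
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
)

// resolveAsset returns the first regular file found for subpath/name,
// trying dataDir before bundledDir. Hypothetical helper.
func resolveAsset(dataDir, bundledDir, subpath, name string) (string, error) {
	// Refuse names with path components so a request cannot leave the directory.
	if name == "" || name == "." || name == ".." || name != filepath.Base(name) {
		return "", fs.ErrNotExist
	}
	for _, root := range []string{dataDir, bundledDir} {
		if root == "" {
			continue
		}
		p := filepath.Join(root, subpath, name)
		info, err := os.Stat(p)
		if err == nil && info.Mode().IsRegular() {
			return p, nil
		}
		if err != nil && !errors.Is(err, fs.ErrNotExist) {
			return "", err // surface permission or I/O errors instead of hiding them
		}
	}
	return "", fs.ErrNotExist
}

// checkFourCases runs cases (a) to (d) from the text against temp directories.
func checkFourCases() error {
	data, err := os.MkdirTemp("", "rm-data-")
	if err != nil {
		return err
	}
	defer os.RemoveAll(data)
	bundled, err := os.MkdirTemp("", "rm-dist-")
	if err != nil {
		return err
	}
	defer os.RemoveAll(bundled)

	write := func(root, name string) error {
		dir := filepath.Join(root, "install")
		if err := os.MkdirAll(dir, 0o755); err != nil {
			return err
		}
		return os.WriteFile(filepath.Join(dir, name), []byte("x"), 0o644)
	}
	for _, f := range []struct{ root, name string }{
		{data, "only-data.sh"}, {bundled, "only-dist.sh"}, {data, "both.sh"}, {bundled, "both.sh"},
	} {
		if err := write(f.root, f.name); err != nil {
			return err
		}
	}

	cases := []struct{ name, wantRoot string }{
		{"only-data.sh", data},    // (a) DataDir hit
		{"only-dist.sh", bundled}, // (b) fallback hit
		{"both.sh", data},         // (c) DataDir hit shadows fallback
		{"missing.sh", ""},        // (d) neither: not found
	}
	for _, c := range cases {
		got, err := resolveAsset(data, bundled, "install", c.name)
		if c.wantRoot == "" {
			if !errors.Is(err, fs.ErrNotExist) {
				return fmt.Errorf("%s: want not-exist, got %v", c.name, err)
			}
			continue
		}
		want := filepath.Join(c.wantRoot, "install", c.name)
		if err != nil || got != want {
			return fmt.Errorf("%s: got %q, %v; want %q", c.name, got, err, want)
		}
	}
	return nil
}

func main() {
	if err := checkFourCases(); err != nil {
		fmt.Println("FAIL:", err)
		os.Exit(1)
	}
	fmt.Println("all four cases pass")
}
```

Returning errors other than not-exist immediately is a design choice of the sketch: a permission problem in <DataDir> then shows up as an error rather than silently serving the image's copy.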

Versioning

Both binaries grow commit and date ldflag-targets next to the existing version:

var (
    version = "dev"
    commit  = "none"
    date    = "unknown"
)

Dockerfile gains ARG VERSION, ARG COMMIT, ARG DATE, all ""-defaulted; the go build line passes them via -ldflags. The release workflow fills them from ${{ gitea.ref_name }}, ${{ gitea.sha }}, and a UTC ISO-8601 timestamp.
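
A minimal sketch of how the three variables could be surfaced, assuming a hypothetical versionLine helper. The build command in the comment uses the standard linker flag -X importpath.name=value; the version value shown is only an example. Because the build args default to "", an unset arg may overwrite a compiled-in default with an empty string, so the sketch falls back to the defaults in that case.

```go
package main

import "fmt"

// Set at link time, for example (example values):
//
//	go build -ldflags "-X main.version=v0.1.0 -X main.commit=7cc17813a9 -X main.date=2026-05-05T14:18:48Z" ./cmd/server
var (
	version = "dev"
	commit  = "none"
	date    = "unknown"
)

// orDefault returns def when v is empty, which can happen if the linker
// was handed an empty -X value.
func orDefault(v, def string) string {
	if v == "" {
		return def
	}
	return v
}

// versionLine formats the build metadata for a --version flag or a
// startup log line. Hypothetical helper.
func versionLine() string {
	return fmt.Sprintf("restic-manager %s (commit %s, built %s)",
		orDefault(version, "dev"), orDefault(commit, "none"), orDefault(date, "unknown"))
}

func main() {
	fmt.Println(versionLine())
}
```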

Snapshot builds (workflow_dispatch) compute VERSION=0.0.0-snapshot-${SHORTSHA} and tag the image as :snapshot-${SHORTSHA} only. They never touch latest or any vX.Y.Z tag.

Workflow (.gitea/workflows/release.yml)

name: Release

on:
  push:
    tags: ['v[0-9]+.[0-9]+.[0-9]+']
  workflow_dispatch:

env:
  IMAGE: gitea.dcglab.co.uk/${{ gitea.repository }}

jobs:
  image:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: docker/setup-qemu-action@v3
      - uses: docker/setup-buildx-action@v3
      - uses: docker/login-action@v3
        with:
          registry: gitea.dcglab.co.uk
          username: ${{ gitea.actor }}
          password: ${{ secrets.GITHUB_TOKEN }}
      - name: compute tags
        id: meta
        run: |
          # tag-push  → :vX.Y.Z, :X.Y, :X (plus :latest only if X >= 1)
          # dispatch  → :snapshot-<shortsha>
          ...
      - uses: docker/build-push-action@v6
        with:
          context: .
          file: deploy/Dockerfile.server
          platforms: linux/amd64,linux/arm64
          push: true
          tags: ${{ steps.meta.outputs.tags }}
          build-args: |
            VERSION=${{ steps.meta.outputs.version }}
            COMMIT=${{ gitea.sha }}
            DATE=${{ steps.meta.outputs.date }}

The compute tags step:

  • For push:tags: extract vMAJOR.MINOR.PATCH. Always emit :vMAJOR.MINOR.PATCH, :MAJOR.MINOR, :MAJOR. Emit :latest only when MAJOR >= 1.
  • For workflow_dispatch: emit :snapshot-<shortsha>. Nothing else.
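
The two rules above can be expressed as a small sketch. The workflow step itself is elided in this spec and would most likely be shell; Go is used here only to pin down the logic. The function names are hypothetical, and the 7-character short SHA is an assumption (the spec does not fix a length).

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
)

// semverTag matches refs of the form vMAJOR.MINOR.PATCH.
var semverTag = regexp.MustCompile(`^v(\d+)\.(\d+)\.(\d+)$`)

// releaseTags returns the image tags for a tag-push ref such as "v1.4.2":
// always vX.Y.Z, X.Y and X, plus latest only when MAJOR >= 1.
func releaseTags(ref string) ([]string, error) {
	m := semverTag.FindStringSubmatch(ref)
	if m == nil {
		return nil, fmt.Errorf("ref %q is not vMAJOR.MINOR.PATCH", ref)
	}
	major, err := strconv.Atoi(m[1])
	if err != nil {
		return nil, err
	}
	tags := []string{ref, m[1] + "." + m[2], m[1]}
	if major >= 1 {
		tags = append(tags, "latest")
	}
	return tags, nil
}

// snapshot returns the VERSION build arg and the single image tag for a
// workflow_dispatch build. The short SHA length of 7 is an assumption.
func snapshot(sha string) (version, tag string, err error) {
	if len(sha) < 7 {
		return "", "", fmt.Errorf("sha %q is too short", sha)
	}
	short := sha[:7]
	return "0.0.0-snapshot-" + short, "snapshot-" + short, nil
}

func main() {
	for _, ref := range []string{"v0.3.0", "v1.4.2"} {
		tags, err := releaseTags(ref)
		fmt.Println(ref, tags, err)
	}
	version, tag, err := snapshot("7cc17813a9")
	fmt.Println(version, tag, err)
}
```

Note that a pre-1.0 tag such as v0.3.0 still yields the rolling :0.3 and :0 tags; only :latest is withheld.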

No release-asset upload step yet — the GHCR-equivalent registry push is the deliverable. A future iteration may attach a SHA256SUMS file to a Gitea release object once tea release create is wired in; that's not in scope for the first cut.

Tests / verification

  1. go vet ./... (CLAUDE.md rule, runs locally pre-commit).
  2. go test ./internal/server/http/... covers the new fallback logic.
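  3. Local manual smoke: docker build -f deploy/Dockerfile.server . produces an image; docker run --rm <image> starts the server (publish the listen port, e.g. -p 8080:8080, so the host can reach it); curl 'http://127.0.0.1:8080/agent/binary?os=linux&arch=amd64' serves bytes (the URL is quoted so the shell does not treat & as a background operator); curl http://127.0.0.1:8080/install/install.sh serves the script.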
  4. Release workflow itself is exercised on first tag-push; until then, workflow_dispatch is the smoke test.

Operator-facing changes

  • README.md install snippet becomes docker run -v rm-data:/var/lib/restic-manager ... gitea.dcglab.co.uk/<owner>/restic-manager:vX.Y.Z. Pre-1.0 releases are pinned by exact tag; no :latest is published.
  • The CLAUDE.md "restage" block is dev-only (smoke env runs the server out of bin/). Production users on the image never see it.
  • RM_BUNDLED_ASSETS_DIR is documented in the server config reference (defaults to /opt/restic-manager/dist).
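
For the RM_BUNDLED_ASSETS_DIR override, a minimal sketch of the default-plus-environment lookup might look like this. The function and constant names are hypothetical, and treating an empty value as "unset" is an assumption of the sketch rather than documented behaviour.

```go
package main

import (
	"fmt"
	"os"
)

// defaultBundledAssetsDir is the image path the spec names as the default.
const defaultBundledAssetsDir = "/opt/restic-manager/dist"

// bundledAssetsDir returns RM_BUNDLED_ASSETS_DIR when it is set to a
// non-empty value and the default otherwise. getenv is injected so tests
// can supply their own environment. Hypothetical helper.
func bundledAssetsDir(getenv func(string) string) string {
	if v := getenv("RM_BUNDLED_ASSETS_DIR"); v != "" {
		return v
	}
	return defaultBundledAssetsDir
}

func main() {
	// Source-build example: RM_BUNDLED_ASSETS_DIR=./dist bin/restic-manager-server
	fmt.Println(bundledAssetsDir(os.Getenv))
}
```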

Risks / footguns

  • Image size growth. Three agent binaries (~15-20 MB each stripped) add ~50 MB. Acceptable; we're already shipping a distroless server. Watch the trajectory once Phase 4 alerting is in.
  • Dockerfile cross-compile multiplies build time on the runner. Pure-Go means each leg is just a go build; total stage time should stay under 60s on the self-hosted runner.
  • ARG VERSION leakage. The current Dockerfile already accepts ARG VERSION=dev; we're tightening, not loosening.
  • Operator overriding <DataDir>/agent-binaries/<name> with a stale binary will silently shadow the image's copy. Documented in the server config reference; this is a feature (lets operators hot-patch a pre-release agent) not a bug.

Out of scope (tracked for follow-up)

  • Cosign / SBOM / in-toto provenance — defer to Phase 6 with the rest of the supply-chain hardening.
  • GHCR mirror — defer until P5-01 docs site goes public.
  • tea release create integration — pending until we have something worth attaching beyond the image digest.