smoke env: systemd --user unit + Make targets so the dev server outlives shell tool boundaries

Spent half an evening fighting a smoke server that kept getting SIGTERM'd
mid-iteration. Root cause: backgrounded processes spawned from sandboxed
shell tool calls don't outlive the parent — even with nohup + disown.

Fix: hand the server to user-systemd as a transient unit so its lifecycle
is owned by the user's systemd manager, not by whichever bash subprocess
started it.
New Make targets:

  make smoke-restart   build server + (re)launch as systemd --user unit
  make smoke-status    show unit status
  make smoke-logs      tail $HOME/smoke/server.log
  make smoke-stop      stop the unit
  make smoke-deploy    full rebuild + restage agent assets + restart

Documents the workflow in CLAUDE.md so the next session doesn't relitigate.
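
For reference, the core of the fix is a single `systemd-run --user` call. The following is a minimal sketch of that idea, not the Makefile's recipe: `launch_transient` and the `DRY_RUN` switch are hypothetical names introduced here for illustration, while the `systemd-run` flags are the same ones the `smoke-restart` target passes.

```shell
#!/bin/sh
# Sketch: start a long-running command as a transient systemd --user unit
# instead of backgrounding it from the current shell.
# launch_transient and DRY_RUN are hypothetical names for this illustration.

launch_transient() (
  unit="$1"
  log="$2"
  shift 2
  # Rebuild the positional parameters as the full systemd-run command line,
  # keeping each argument intact even if it contains spaces.
  set -- systemd-run --user "--unit=$unit" \
    "--property=StandardOutput=append:$log" \
    "--property=StandardError=append:$log" \
    "$@"
  if [ "${DRY_RUN:-0}" = "1" ]; then
    # Print the command instead of running it.
    printf '%s\n' "$*"
  else
    "$@"
  fi
)

# Example, printing the command rather than launching anything:
#   DRY_RUN=1 launch_transient restic-manager-smoke \
#     "$HOME/smoke/server.log" "$PWD/bin/restic-manager-server"
```

With `DRY_RUN=1` the function only prints the command, which makes it easy to check the unit name and log path before anything is launched.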
2026-05-07 22:55:36 +01:00
parent 51192c3603
commit a28bda2031
2 changed files with 77 additions and 3 deletions
+23 -2
@@ -81,8 +81,29 @@ RM_COOKIE_SECURE=false \
./bin/restic-manager-server >> $HOME/smoke/server.log 2>&1 &
```
A `make smoke-deploy` target that bundles all of this would be a
good follow-up.
## Smoke server: use the Make targets, not raw `nohup`
The smoke server runs as a transient `systemd --user` unit named
`restic-manager-smoke.service` so it survives any sandbox or
process-group boundary that would otherwise SIGTERM a backgrounded
process. Use the Make targets:
```
make smoke-restart # rebuild server + (re)launch as systemd --user unit
make smoke-status # systemctl --user status
make smoke-logs # tail $HOME/smoke/server.log
make smoke-stop # stop the unit
make smoke-deploy # full rebuild + restage agent assets + restart
```
`./bin/restic-manager-server &` from inside a Bash tool call gets
killed when the tool call exits, so don't do that. If the unit fails to
start, check `systemctl --user status restic-manager-smoke` and
`$HOME/smoke/server.log` for the diagnosis.
`smoke-deploy` does NOT touch `/usr/local/bin/restic-manager-agent`
on this dev box; if your change requires the live agent here to
update, run the agent restage block above by hand.
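
The two failure checks named above can be bundled into one helper. This is a sketch under assumptions: `smoke_diagnose` is a hypothetical name, not a target or script in this repo, and the 20-line tail length is arbitrary. The unit name and log path are the ones documented above.

```shell
#!/bin/sh
# Sketch: print the unit status and the end of the server log in one go.
# smoke_diagnose is a hypothetical helper name for this illustration.

smoke_diagnose() (
  unit="${1:-restic-manager-smoke}"
  log="${2:-$HOME/smoke/server.log}"

  echo "== unit status: $unit"
  if command -v systemctl >/dev/null 2>&1; then
    # status exits non-zero for inactive or failed units; that is expected here.
    systemctl --user status --no-pager "$unit" 2>&1 || true
  else
    echo "systemctl not found"
  fi

  echo "== last lines of $log"
  if [ -r "$log" ]; then
    tail -n 20 "$log"
  else
    echo "log not readable: $log"
  fi
)
```

Calling `smoke_diagnose` with no arguments uses the documented unit name and log path. Pass a different unit or log file as the first and second argument to inspect something else.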
## Migrations: prefer column-level ALTERs over table rebuilds
+54 -1
@@ -24,7 +24,18 @@ TAILWIND_URL := https://github.com/tailwindlabs/tailwindcss/releases/downlo
TAILWIND_INPUT := web/styles/input.css
TAILWIND_OUTPUT := web/static/css/styles.css
.PHONY: help build server agent test test-race lint fmt tidy clean run-server run-agent docker release tailwind tailwind-watch setup hooks
.PHONY: help build server agent test test-race lint fmt tidy clean run-server run-agent docker release tailwind tailwind-watch setup hooks smoke-restart smoke-stop smoke-status smoke-logs smoke-deploy
# ---- smoke-env tooling -------------------------------------------------
# The smoke server runs as a transient user-systemd unit so it survives
# bash-tool boundaries and shell restarts. Use `make smoke-restart`
# any time you've rebuilt the server. `make smoke-deploy` is the full
# rebuild + restage + restart workflow described in CLAUDE.md.
SMOKE_UNIT := restic-manager-smoke
SMOKE_DATA_DIR := $(HOME)/smoke/data
SMOKE_LOG_FILE := $(HOME)/smoke/server.log
SMOKE_BASE_URL := http://127.0.0.1:8080
SMOKE_LISTEN := :8080
help:
@grep -E '^[a-zA-Z_-]+:.*?## ' $(MAKEFILE_LIST) | awk 'BEGIN{FS=":.*?## "};{printf " \033[36m%-14s\033[0m %s\n",$$1,$$2}'
@@ -94,6 +105,48 @@ docker: ## Build the server Docker image
--build-arg DATE=$(DATE) \
-t $(DOCKER_IMAGE):$(DOCKER_TAG) .
smoke-restart: server ## (Re)start the smoke server as a transient user-systemd unit
@systemctl --user stop $(SMOKE_UNIT) >/dev/null 2>&1 || true
@systemctl --user reset-failed $(SMOKE_UNIT) >/dev/null 2>&1 || true
@echo "==> launching $(SMOKE_UNIT)"
systemd-run --user --unit=$(SMOKE_UNIT) \
--setenv=RM_LISTEN=$(SMOKE_LISTEN) \
--setenv=RM_DATA_DIR=$(SMOKE_DATA_DIR) \
--setenv=RM_BASE_URL=$(SMOKE_BASE_URL) \
--setenv=RM_SECRET_KEY_FILE=$(SMOKE_DATA_DIR)/secret.key \
--setenv=RM_COOKIE_SECURE=false \
--property=StandardOutput=append:$(SMOKE_LOG_FILE) \
--property=StandardError=append:$(SMOKE_LOG_FILE) \
--property=Restart=on-failure \
$(CURDIR)/$(SERVER_BIN)
@for i in 1 2 3 4 5; do \
curl -fsS -o /dev/null $(SMOKE_BASE_URL)/api/version 2>/dev/null && \
{ echo "==> smoke server up: $$(curl -s $(SMOKE_BASE_URL)/api/version)"; exit 0; }; \
sleep 1; \
done; \
echo "!! smoke server did not respond on $(SMOKE_BASE_URL) — check $(SMOKE_LOG_FILE)" >&2; \
systemctl --user status --no-pager $(SMOKE_UNIT) || true; \
exit 1
smoke-stop: ## Stop the smoke server
systemctl --user stop $(SMOKE_UNIT) || true
@systemctl --user reset-failed $(SMOKE_UNIT) >/dev/null 2>&1 || true
smoke-status: ## Show status of the smoke server
@systemctl --user status --no-pager $(SMOKE_UNIT) 2>&1 | head -20 || true
smoke-logs: ## Tail the smoke server log
tail -n 50 $(SMOKE_LOG_FILE)
smoke-deploy: build smoke-restart ## Rebuild + restage agent into smoke + restart server (full per-CLAUDE.md cycle)
@echo "==> restaging agent + install assets into $(SMOKE_DATA_DIR)"
cp $(AGENT_BIN) $(SMOKE_DATA_DIR)/agent-binaries/restic-manager-agent-linux-amd64
cp deploy/install/install.sh $(SMOKE_DATA_DIR)/install/install.sh
cp deploy/install/install.ps1 $(SMOKE_DATA_DIR)/install/install.ps1
cp deploy/install/restic-manager-agent.service $(SMOKE_DATA_DIR)/install/restic-manager-agent.service
@echo "==> NOTE: this dev box's installed agent at /usr/local/bin/restic-manager-agent is NOT updated by this target."
@echo " Run the agent restage block in CLAUDE.md if your change touches agent code or the unit file."
release: ## Cross-compile for all supported platforms
@mkdir -p $(BIN_DIR)
@for target in linux/amd64 linux/arm64 windows/amd64; do \