Bite-sized TDD tasks across 7 slices (A schema, B config, C OIDC
client core + stub IdP, D login + callback, E logout + local-login
rejection, F UI, G wiring + Authelia sweep). Each task is one
commit with concrete code blocks and test cases — no placeholders.
Refs spec at docs/superpowers/specs/2026-05-05-p4-05-oidc-design.md.
Authelia bundle for the sweep stashed at /tmp/rm-smoke/oidc.env.
Confirmed claim name from the lab IdP is 'groups' (not 'roles' as
the original spec assumed). Default the role_claim config field to
'groups' which also matches Keycloak and Authentik out of the box.
Add a 'display_name' field so the SSO button can read 'Sign in with
Authelia' rather than the generic 'SSO'.
Two new gotchas captured:
- Authelia 4.39+ 'sub' is an opaque UUID, not the username; the
  locked design already keys on sub + reads preferred_username
  for display, so this is just documentation.
- end_session_endpoint isn't always published (Authelia
  config-dependent); the locked logout flow already degrades cleanly.
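The logout degradation noted above could look roughly like this. It is a minimal sketch, not the project's handler: `logoutRedirect`, its parameters, and the `/login` fallback path are hypothetical names chosen for illustration. Only `id_token_hint` and `post_logout_redirect_uri` come from the OIDC RP-initiated logout convention.

```go
package main

import (
	"fmt"
	"net/url"
)

// logoutRedirect returns where to send the browser once the local session
// has been destroyed. endSessionEndpoint is whatever the provider's
// discovery document published (possibly empty); idToken is the value
// stored on the session row. The "/login" fallback is an assumption.
func logoutRedirect(endSessionEndpoint, idToken, postLogoutURI string) string {
	if endSessionEndpoint == "" {
		// Provider publishes no end_session_endpoint: local logout only.
		return "/login"
	}
	u, err := url.Parse(endSessionEndpoint)
	if err != nil {
		return "/login"
	}
	q := u.Query()
	if idToken != "" {
		q.Set("id_token_hint", idToken)
	}
	if postLogoutURI != "" {
		q.Set("post_logout_redirect_uri", postLogoutURI)
	}
	u.RawQuery = q.Encode()
	return u.String()
}

func main() {
	fmt.Println(logoutRedirect("", "tok", "https://app.example/login"))
	fmt.Println(logoutRedirect("https://idp.example/logout", "tok", "https://app.example/login"))
}
```

Either way the local session is already gone by the time the redirect is chosen, so a missing endpoint only means the IdP session survives.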
Brainstormed shape locked: JIT-provision local rows on first OIDC
sign-in (auth_source='oidc'), YAML-only config (no UI), 'roles'
claim with deny-on-no-match default, preferred_username with email
fallback, refuse on local-user collision, single provider, login
page shows SSO above password (break-glass), front-channel logout
only, role re-evaluation at login only.
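A sketch of the deny-on-no-match role mapping and the preferred_username/email fallback described above. The function names, the mapping shape, and the "highest role wins" tie-break are assumptions for illustration; the spec may resolve multiple matches differently.

```go
package main

import "fmt"

// roleRank orders the three roles. Picking the highest-ranked match is an
// assumption made for this sketch.
var roleRank = map[string]int{"viewer": 1, "operator": 2, "admin": 3}

// mapRole returns the highest-ranked role granted by any claim value.
// ok is false when nothing matches, which the caller treats as a denied
// login (deny-on-no-match).
func mapRole(claimValues []string, mapping map[string]string) (role string, ok bool) {
	best := 0
	for _, v := range claimValues {
		r, found := mapping[v]
		if !found {
			continue
		}
		if rank := roleRank[r]; rank > best {
			best, role = rank, r
		}
	}
	return role, best > 0
}

// displayName prefers preferred_username and falls back to email.
func displayName(preferredUsername, email string) string {
	if preferredUsername != "" {
		return preferredUsername
	}
	return email
}

func main() {
	// Hypothetical mapping from IdP claim values to local roles.
	mapping := map[string]string{"rm-admins": "admin", "rm-ops": "operator"}
	role, ok := mapRole([]string{"staff", "rm-ops"}, mapping)
	fmt.Println(role, ok)
	fmt.Println(displayName("", "alice@example.com"))
}
```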
Migration 0019: users.auth_source + users.oidc_subject (partial
unique index), sessions.id_token (for end_session id_token_hint),
oidc_state table for the OAuth round-trip state, swept on the
existing alert-engine tick.
Composes with the user-management work from P4-03/04: admin can
disable OIDC users like local ones; last-admin guard catches IdP
role-mapping mistakes; audit trail covers JIT-provision via
user.created with auth_source payload + new user.oidc_login /
user.oidc_login_blocked actions.
Out of scope (deferred): back-channel logout, multi-provider,
UI-driven role mapping, refresh tokens / mid-session re-eval.
Pull the operator-experience polish out of Phase 4 so a working v1
ships sooner. Phase 4 keeps RBAC + user mgmt (already done), OIDC,
and host tags. Deferred items renumbered as P6-01..P6-05:
P4-01 → P6-01 apt + Chocolatey update delivery
P4-02 → P6-02 agent-version-behind-server tracking on dashboard
P4-06 → P6-03 repo size trend graphs
P4-08 → P6-04 Prometheus /metrics endpoint
P4-09 → P6-05 Grafana dashboard JSON + integration docs
None of these gate getting the system into production. They land
after Phase 5 (OSS readiness) in the new Phase 6.
Phase 4 remaining: P4-05 (OIDC login) + P4-07 (per-host tags +
dashboard filtering).
Live Playwright + curl sweep on the smoke env exercised the full
user-management lifecycle:
admin add user → setup link generated → curl-as-new-user fetches
/setup (200, username on page) → POSTs password → 303 to / with
Set-Cookie → 200 on dashboard, 200 on /settings/account,
**403 on /settings/users** (admin-only) → admin disables → next
request is **401** + session row count drops to 0 → audit log
reflects user.created + user.setup_completed.
Three-role middleware enforces band gates; admin is fail-closed
default. Setup tokens are sha256-hashed at rest with 1h expiry;
expired tokens are swept on the alert engine's 60s tick. Last-admin
guard rejects disable + demote of the only enabled admin. Self-
service password change at /settings/account is reachable by every
role.
Adds GET/POST handlers for /settings/account in the viewer band
(any authenticated user), account.html template with current-password
field suppressed when must_change_password is set, and audits the
change via AppendAudit.
Adds handleUIUserNewGet, handleUIUserNewPost, handleUIUserSetupLinkGet
to ui_users.go; creates web/templates/pages/user_edit.html (multi-mode
new/edit/setup-link); wires three routes in the admin band of server.go.
Replaces the 501 stub with the full handler: validates the token and
password, hashes and stores the password, deletes the setup token,
mints an 8-hour session cookie, appends a user.setup_completed audit
entry, and redirects to /. Adds TestSetupPostHappyPath covering the
full round-trip including normal-login verification after setup.
Routes are now structured into Public / Viewer / Operator / Admin bands
using requireRole middleware. Job log stream and download moved into the
Viewer band. healthz moved from New() into routes() with the other
public endpoints.
Bite-sized TDD tasks across 7 slices (A schema, B middleware,
C session re-validation, D setup-token flow, E user CRUD API,
F UI, G wiring + sweep). Each task is one commit with concrete
code blocks and test cases — no placeholders.
Refs spec at docs/superpowers/specs/2026-05-05-p4-03-04-rbac-user-mgmt-design.md.
Brainstormed shape locked: chi route-group middleware, fail-closed
admin default; setup-token flow with 1h single-use tokens
(sha256-hashed at rest, raw shown to admin once); disable-only user
lifecycle with last-admin guard; self-service /settings/account
password change for every role; email field on users (metadata
only in v1); session re-validation on every authenticated request so
disable / role change land immediately.
Locked decisions captured in §Role taxonomy, §Schema changes,
§Setup-token flow, §RBAC enforcement, §Last-admin self-protection.
Deferred items in §Out of scope (OIDC, SMTP email-the-link,
hard delete, lockout).
Migrations 0017 (users extensions) + 0018 (user_setup_tokens)
are both column-level ALTERs per CLAUDE.md preference.
Test job wall-clock time was dominated by `internal/server/http`
(~156s on the self-hosted runner under -race). Two changes here cut
that:
1. Matrix-shard the test job by package group: server-http, store,
   and "rest" (everything else, computed via `go list | grep -v`).
   Each shard runs on its own runner so the heavy package isn't
   CPU-starved by siblings.
2. `auth.HashPassword` drops to cheap argon2id params (8 KiB / 1
   iter / 1 lane) when `testing.Testing()` returns true. Production
   params are unchanged. VerifyPassword reads params from the
   encoded hash so cheap-params hashes verify identically; no test
   call sites need to change.
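To illustrate why verification is unaffected, here is a sketch of the parameter selection and of reading parameters back out of a PHC-style encoded hash. It does not call argon2 itself. The `params` type, the function names, and the production values shown are placeholders (the source does not state them); only the 8 KiB / 1 iter / 1 lane test values and the use of `testing.Testing()` (Go 1.21+) come from the change described above.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
	"testing"
)

type params struct {
	memoryKiB uint32
	iters     uint32
	lanes     uint8
}

var (
	prodParams = params{memoryKiB: 64 * 1024, iters: 3, lanes: 2} // placeholder values
	testParams = params{memoryKiB: 8, iters: 1, lanes: 1}         // cheap params from the commit
)

// pickParams chooses hashing cost; the caller passes testing.Testing().
func pickParams(underTest bool) params {
	if underTest {
		return testParams
	}
	return prodParams
}

// encode builds a PHC-style string; salt and key are already base64 text.
func encode(p params, salt, key string) string {
	return fmt.Sprintf("$argon2id$v=19$m=%d,t=%d,p=%d$%s$%s", p.memoryKiB, p.iters, p.lanes, salt, key)
}

// parseParams recovers the cost parameters from an encoded hash, which is
// what lets a verifier handle cheap and production hashes identically.
func parseParams(encoded string) (params, error) {
	var p params
	parts := strings.Split(encoded, "$")
	if len(parts) != 6 || parts[1] != "argon2id" {
		return p, errors.New("not an argon2id encoded hash")
	}
	_, err := fmt.Sscanf(parts[3], "m=%d,t=%d,p=%d", &p.memoryKiB, &p.iters, &p.lanes)
	return p, err
}

func main() {
	p := pickParams(testing.Testing())
	got, err := parseParams(encode(p, "c2FsdA", "a2V5"))
	fmt.Println(got == p, err)
}
```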