From 814e49cb93ef0956acae1caca56b0e1b1ab0abfd Mon Sep 17 00:00:00 2001 From: Steve Cliff Date: Tue, 5 May 2026 12:04:09 +0100 Subject: [PATCH] =?UTF-8?q?spec:=20P4-05=20=E2=80=94=20OIDC=20login=20desi?= =?UTF-8?q?gn?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Brainstormed shape locked: JIT-provision local rows on first OIDC sign-in (auth_source='oidc'), YAML-only config (no UI), 'roles' claim with deny-on-no-match default, preferred_username with email fallback, refuse on local-user collision, single provider, login page shows SSO above password (break-glass), front-channel logout only, role re-evaluation at login only. Migration 0019: users.auth_source + users.oidc_subject (partial unique index), sessions.id_token (for end_session id_token_hint), oidc_state table for the OAuth round-trip state, swept on the existing alert-engine tick. Composes with the user-management work from P4-03/04: admin can disable OIDC users like local; last-admin guard catches IdP role- mapping mistakes; audit trail covers JIT-provision via user.created with auth_source payload + new user.oidc_login / user.oidc_login_blocked actions. Out of scope (deferred): back-channel logout, multi-provider, UI-driven role mapping, refresh tokens / mid-session re-eval. 
--- .../specs/2026-05-05-p4-05-oidc-design.md | 212 ++++++++++++++++++ 1 file changed, 212 insertions(+) create mode 100644 docs/superpowers/specs/2026-05-05-p4-05-oidc-design.md diff --git a/docs/superpowers/specs/2026-05-05-p4-05-oidc-design.md b/docs/superpowers/specs/2026-05-05-p4-05-oidc-design.md new file mode 100644 index 0000000..eb9f9b7 --- /dev/null +++ b/docs/superpowers/specs/2026-05-05-p4-05-oidc-design.md @@ -0,0 +1,212 @@ +# P4-05 — OIDC Login Design + +> **Date:** 2026-05-05 +> **Status:** brainstorm complete; ready for plan +> **Closes:** P4-05 (OIDC login) + +## Goal + +Wire OpenID Connect authentication as a sign-in path alongside the existing local-user system, so a deployment that already has an IdP (Authelia, Authentik, Keycloak, Okta, Auth0, etc.) can use it for restic-manager logins. + +## Architecture + +OIDC sits on top of the local-user system rather than replacing it. The first time a user signs in via OIDC the server **just-in-time provisions** a local user row marked `auth_source='oidc'`, with role derived from the IdP's `roles` claim. Subsequent sign-ins look up the same row by stable `oidc_subject` and refresh role + email from the latest claims. Once the row exists it behaves like any other local user — admin can disable it, force-logout, see it in audit logs, etc. — except password-login is rejected because there's no password. + +The Authorization Code flow (with PKCE) is implemented against the discovered well-known config of a single configured issuer. Front-channel logout: clicking Sign out drops the local session + redirects the browser to the IdP's `end_session_endpoint` (when advertised). Back-channel logout deferred. 
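A minimal Go sketch of the PKCE material behind the Authorization Code flow, assuming the 32-byte `state` and 64-byte `code_verifier` sizes from the login-start step. The helper names `randomToken` and `challengeS256` are hypothetical, not existing restic-manager functions. RFC 7636 defines the S256 challenge as the unpadded base64url encoding of the verifier's SHA-256 digest, so the sketch uses `base64.RawURLEncoding` rather than standard base64.

```go
package main

import (
	"crypto/rand"
	"crypto/sha256"
	"encoding/base64"
	"fmt"
)

// randomToken returns n cryptographically random bytes encoded as
// unpadded base64url. Hypothetical helper for state and code_verifier.
func randomToken(n int) (string, error) {
	b := make([]byte, n)
	if _, err := rand.Read(b); err != nil {
		return "", err
	}
	return base64.RawURLEncoding.EncodeToString(b), nil
}

// challengeS256 derives the PKCE code_challenge from a code_verifier
// using the S256 method: base64url(sha256(verifier)) without padding.
func challengeS256(verifier string) string {
	sum := sha256.Sum256([]byte(verifier))
	return base64.RawURLEncoding.EncodeToString(sum[:])
}

func main() {
	state, err := randomToken(32)
	if err != nil {
		panic(err)
	}
	verifier, err := randomToken(64)
	if err != nil {
		panic(err)
	}
	// Encoded lengths: 43 chars for state, 86 for the verifier
	// (inside RFC 7636's 43..128 range), 43 for the challenge.
	fmt.Println(len(state), len(verifier), len(challengeS256(verifier)))
}
```

The verifier stays server-side until the token exchange; only the challenge and `state` travel in the redirect.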
+ +## Locked decisions + +| Decision | Pick | +|---|---| +| User lifecycle | **B** — JIT-provision local rows on first OIDC login (`auth_source='oidc'`, `oidc_subject`) | +| Role mapping config | **A** — YAML/env, claim name `roles`, default = deny on no-match | +| Username source | `preferred_username`, fallback to `email` | +| Username collision with existing local user | **Refuse** with clear remediation message | +| Provider config | **Single provider** — `providers:` array can come later | +| Login page layout | SSO button **above** password form; password form labelled "or sign in with a local account" | +| OIDC users + password login | **Disabled** — `auth_source='oidc'` rows have empty `password_hash`; password form rejects them | +| Logout shape | **Front-channel only** — drop session + redirect to `end_session_endpoint` when advertised | +| Role re-evaluation | **At login only** — claims read at the OIDC callback; admin can disable mid-session locally | + +## Schema changes + +Migration 0019 — `users` extensions for OIDC bookkeeping: + +```sql +ALTER TABLE users ADD COLUMN auth_source TEXT NOT NULL DEFAULT 'local' + CHECK (auth_source IN ('local', 'oidc')); +ALTER TABLE users ADD COLUMN oidc_subject TEXT; + +CREATE UNIQUE INDEX users_oidc_subject ON users(oidc_subject) + WHERE oidc_subject IS NOT NULL; +``` + +Both column-level ALTERs (CLAUDE.md preference). The unique partial index defends the JIT-lookup invariant (one row per IdP subject) without blocking multiple rows with NULL oidc_subject (the local users). 
+ +## Configuration + +```yaml +# server config — extend existing config struct +oidc: + issuer: https://auth.example.com # well-known config discovered from this + client_id: restic-manager + client_secret: ${RM_OIDC_CLIENT_SECRET} # or via _FILE + scopes: [openid, profile, email, roles] # 'roles' usually means a custom scope + role_claim: roles # default if absent + role_mapping: + rm-admins: admin + rm-operators: operator + rm-viewers: viewer + # Optional — auto-derived from BaseURL if absent. + redirect_url: https://rm.example.com/auth/oidc/callback +``` + +Env-var overrides: `RM_OIDC_ISSUER`, `RM_OIDC_CLIENT_ID`, `RM_OIDC_CLIENT_SECRET`, `RM_OIDC_CLIENT_SECRET_FILE`. Mapping is YAML-only (env doesn't fit a multi-key string→string map cleanly). + +When `oidc.issuer` is empty or missing, OIDC is disabled (current behaviour). No restart-toggle UI; this is a deploy-time setting. + +## Auth flow + +### Login start + +`GET /auth/oidc/login` — only mounted when OIDC is configured. + +1. Generate `state` (32 random bytes, base64url) and `code_verifier` (64 random bytes, base64url without padding); compute `code_challenge = base64url(sha256(code_verifier))`, also unpadded, as the S256 method requires. +2. Store `(state, code_verifier, created_at)` in a new ephemeral table (or in memory with a 5-minute TTL; see "State table" below). +3. Redirect to `?response_type=code&client_id=...&redirect_uri=...&scope=...&state=...&code_challenge=...&code_challenge_method=S256`. + +### Callback + +`GET /auth/oidc/callback?code=...&state=...` — also OIDC-only mount. + +1. Validate `state` against the stored value (one-shot — delete row on read). Reject if missing/expired/already used. +2. Exchange `code` + `code_verifier` for tokens at `token_endpoint`. +3. Validate the `id_token` JWT: signature against the JWKS endpoint, `iss`, `aud`, `exp`, `iat`, `nonce` (if used). +4. Extract `sub`, `preferred_username`, `email`, and the configured `role_claim` (default `roles`). +5. Pick username: `preferred_username` if non-empty, else `email`. 
Lowercase / trim per the existing local-user rules. +6. Pick role: first match in `role_mapping` against the array of role-claim values. **No match → deny with a clear error page**, no row created. +7. Look up user by `oidc_subject`. Three cases: + - **Found** — refresh `email`, `role`, `last_login_at`. Don't touch `username` (changing it would break audit trails; if the IdP changes the username, that's an operator concern). Log `user.oidc_login`. + - **Not found, username free** — INSERT row with `auth_source='oidc'`, `oidc_subject=`, `password_hash=''`, `must_change_password=0`. Log `user.created` with payload `{"auth_source":"oidc"}` + `user.oidc_login`. + - **Not found, username taken by a local user** — render an error page: "This OIDC user (``) wants to sign in as `alice`, but a local user with that name already exists. Ask your administrator to either rename / remove the local user, or exclude this user from the OIDC mapping." 403, no row created. Log `user.oidc_login_blocked`. +8. Drop a session cookie + `MarkUserLogin` (the existing helper). +9. Redirect to `/`. + +### Logout + +`POST /logout` (existing handler) — augmented: + +1. Look up the session before deletion (we need the user row to know if they're an OIDC user). +2. Delete the session as today. +3. If the user is `auth_source='oidc'` AND the discovered `end_session_endpoint` is non-empty → 303 to `?id_token_hint=&post_logout_redirect_uri=/login`. Otherwise → existing 303 to `/login`. + +We need to keep the latest `id_token` per session to drive `id_token_hint`. Stash it in a new `sessions.id_token TEXT` column (one column-level ALTER on migration 0019 alongside the user columns), populated only for OIDC sessions. + +## State table + +Two reasonable shapes for the short-lived state used during the OAuth round-trip: + +- **In-memory map** with a 5-minute TTL sweeper. Simpler, but multi-process deployments lose it (no multi-process today, but Phase 5 OSS readiness might add). 
+- **`oidc_state` table** — `(state_hash PK, code_verifier, created_at)`, swept on the same 60s alert-engine tick that already handles setup-token cleanup. + +I'll go with the **table**. Costs ~3 lines in the existing cleanup tick, behaves correctly under restarts, and survives a future scale-out. Migration 0019 includes: + +```sql +CREATE TABLE oidc_state ( + state_hash TEXT PRIMARY KEY, -- sha256(state) hex; raw state never persisted + code_verifier TEXT NOT NULL, + created_at TEXT NOT NULL +); +CREATE INDEX oidc_state_created ON oidc_state(created_at); +``` + +## Login-page UI + +`/login` template branches based on `view.OIDCEnabled`: + +- **OIDC off** → current layout (just the password form). +- **OIDC on** → a `Sign in with ` button at the top, then a faint divider line, then the existing password form labelled "Or sign in with a local account". Provider name comes from a new optional config `oidc.display_name` (defaults to "SSO"). + +Failed-OIDC redirects (no role match, username collision, IdP error) land on `/login?oidc_error=` with a small banner above the buttons. + +## Audit actions + +New entries in the action vocabulary: + +- `user.oidc_login` (target_kind=user, target_id=user_id, payload `{"sub":"…"}`) +- `user.oidc_login_blocked` (target_kind=user, target_id=oidc_subject when no row was created, payload `{"username":"…", "reason":"username_taken|no_role_match|other"}`) +- `user.created` already exists; OIDC's first-time provisioning fires this with payload `{"auth_source":"oidc"}` so the audit log distinguishes admin-created from JIT-provisioned rows. + +## User-management UI changes + +Small additions, not new screens: + +- **Users list** — Status column adds a small `oidc` chip when `auth_source='oidc'` so admin can see at a glance which rows came from JIT-provisioning. Sortable by auth_source via the same sortable-headers pattern (lands as a small follow-up if anyone asks; out of scope for v1). 
+- **Add user form** — unchanged in v1; both paths stay open. If an `oidc.disable_local_users` flag becomes a real ask later, the form would be disabled when OIDC is the only auth path, with a hint: "User provisioning is handled by your OIDC provider; users appear here on first sign-in." +- **Edit user form** — when `auth_source='oidc'`: + - Username field disabled (changing it would just be undone on next OIDC login) + - Role dropdown disabled, with a hint: "Role is managed by your OIDC provider's `roles` claim mapping. Edit the mapping in server config to change." + - Email field disabled (refreshed from IdP on each login) + - **Disable / Enable / Force logout** still work — disabling an OIDC user kicks their session and rejects future OIDC logins ("user disabled by administrator") + - **Regenerate setup link** hidden — there's no setup token for OIDC users +- **Login UI** — password form rejects users with `auth_source='oidc'` ("This account uses single sign-on. Click the SSO button above.") + +## Middleware / handler changes + +- **Routes**: new public-band entries `GET /auth/oidc/login`, `GET /auth/oidc/callback`. Skipped entirely when OIDC isn't configured (`s.deps.OIDC == nil`). +- **Logout handler** augmented to fetch the user row + decide between local logout (303 → `/login`) and OIDC logout (303 → `end_session_endpoint`). +- **Login handler** rejects `auth_source='oidc'` users with the SSO-prompt error. +- **Last-admin guard** — already covers OIDC users naturally because they live in the `users` table. The role-from-claims path could create a "every admin gets demoted to operator" situation if the IdP's claim mapping is wrong; the guard rejects that demotion at the moment it'd be applied (returns the user to the login page with `oidc_error=role_change_blocked` and an audit entry; admin must fix the mapping or promote a local admin first). + +## Implementation outline + +1. 
**Schema** — migration 0019 (users.auth_source + oidc_subject, sessions.id_token, oidc_state table) +2. **Config** — extend `internal/server/config` with the OIDC block + env-var overrides; load JWKS lazily +3. **Discovery + JWKS** — small helper that fetches `/.well-known/openid-configuration` once at startup, caches `authorization_endpoint`, `token_endpoint`, `end_session_endpoint`, `jwks_uri`. JWKS refreshed on first failed verification. +4. **Login start handler** — `/auth/oidc/login` +5. **Callback handler** — `/auth/oidc/callback`, with the four claim-resolution branches +6. **Logout handler augmentation** — branch on `auth_source` +7. **Login form rejection** — local-user password form rejects OIDC accounts +8. **State cleanup** — extend the alert engine's existing cleanup tick +9. **UI** — `oidc` chip on users list, disabled fields on edit-form for OIDC users, login page SSO button + error banner +10. **Tests** — config parse tests; happy-path callback test using a fake IdP (httptest server with a hand-rolled discovery doc + JWKS); username-collision test; no-role-match test; logout test +11. **Sweep** — full Playwright walk against an actual IdP (Authelia in a Docker container) — admin gets in via OIDC, role mapping works, logout redirects through IdP, OIDC user can't password-login + +## Test strategy + +The IdP is the hard part to test cleanly. Two layers: + +- **Unit / integration tests** use a stub OIDC provider built into the test harness — `httptest.Server` exposing `.well-known/openid-configuration`, a token endpoint that signs minted JWTs with a test ECDSA key, and a JWKS endpoint serving the public key. This covers every code path without a real IdP. Pattern: each test mints its own claims and runs the callback against the stub. 
+- **Smoke env** runs against a real Authelia container (existing `compose.smoke.yaml`-style file or one-liner `docker run`) for the final sweep — confirms the discovery doc isn't being misread, real JWT verification works, real `end_session_endpoint` redirect works. + +## Out of scope (deferred) + +- **Multi-provider** support (`providers:` array) +- **Back-channel logout** (OpenID Connect Back-Channel Logout 1.0) — schema isn't blocked from adding it later +- **UI-driven role mapping** (config-only in v1) +- **Refresh tokens / mid-session role re-evaluation** — login-only refresh in v1 +- **`oidc.disable_local_users`** flag — both paths stay open in v1 +- **OIDC user dashboard chip / badges** beyond the small `oidc` indicator on the users list +- **Per-user "auth source" filter on the users list** — sortable headers cover most of the use case + +## Risks / gotchas + +- **JWKS key rotation** — refresh on first failed verification is the standard fix; document the cache TTL (1h) in the config block. +- **Clock skew** — accept `iat`/`exp` with a 60s leeway; matches what most OIDC libraries do. +- **End-session 404 / not advertised** — degrade gracefully; just drop the session and 303 to `/login`. Don't 500 the logout because the IdP doesn't implement RP-initiated logout. +- **Username changes at the IdP** — silently keep the local username (matches our locked decision: subject is the stable key, username is display-only). Document. +- **Role claim is sometimes a string, sometimes an array, sometimes a comma-separated string** depending on IdP — normalise into `[]string` before mapping. Authelia/Keycloak emit arrays; some custom setups emit strings; handle both. +- **Password-form bypass for OIDC users via /api/auth/login (JSON)** — same rejection rule applies, not just the HTML form. 
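The role-claim normalisation mentioned in the risks above could look roughly like this. It is a sketch only: `normalizeRoles` is a hypothetical name, and it assumes the ID token payload has already been verified and decoded with `encoding/json` into a `map[string]any`, so arrays arrive as `[]any`. Only comma separation is handled; an IdP that emits space-separated roles would need an extra case.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// normalizeRoles converts a decoded role claim into a []string.
// It accepts a JSON array of strings, a single string, or a
// comma-separated string; anything else yields an empty slice.
func normalizeRoles(claim any) []string {
	var raw []string
	switch v := claim.(type) {
	case string:
		raw = strings.Split(v, ",")
	case []any:
		for _, item := range v {
			// Non-string array members are skipped rather than coerced.
			if s, ok := item.(string); ok {
				raw = append(raw, s)
			}
		}
	case []string:
		raw = v
	}
	roles := make([]string, 0, len(raw))
	for _, r := range raw {
		if r = strings.TrimSpace(r); r != "" {
			roles = append(roles, r)
		}
	}
	return roles
}

func main() {
	payloads := []string{
		`{"roles":["rm-admins","rm-viewers"]}`,
		`{"roles":"rm-admins, rm-viewers"}`,
		`{"roles":"rm-operators"}`,
		`{}`,
	}
	for _, p := range payloads {
		var claims map[string]any
		if err := json.Unmarshal([]byte(p), &claims); err != nil {
			panic(err)
		}
		fmt.Printf("%q\n", normalizeRoles(claims["roles"]))
	}
}
```

A missing claim and an empty array both normalise to an empty slice, which lines up with the deny-on-no-match default: no roles means no mapping hit.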
+ +## Acceptance + +- [ ] An OIDC user with `roles: ["rm-admins"]` can sign in, becomes an admin, is visible in `/settings/users` with an `oidc` chip +- [ ] Same user signing in again resolves to the same row (no duplicate) +- [ ] Same user with `roles: ["something-else"]` is denied, lands on `/login?oidc_error=no_role_match` with a banner, no row created +- [ ] OIDC user can't password-login through `/login` or `/api/auth/login` +- [ ] Admin disables an OIDC user → next OIDC login is rejected, existing session bounced (existing disable-mid-session) +- [ ] Sign out as an OIDC user → 303 to IdP's end-session URL (when advertised); no end-session URL → 303 to `/login` +- [ ] OIDC config absent → password login works exactly as today (zero behavioural change) +- [ ] Username collision: a local `alice` exists, OIDC user with `preferred_username=alice` and a different `sub` → blocked at sign-in with the clear error page +- [ ] Last-admin guard refuses to demote the only enabled admin even if the IdP's role mapping says otherwise +- [ ] All existing tests pass; new test suite covers the four claim-resolution branches and logout
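For the test suite, the four claim-resolution branches can be expressed as a pure decision function and covered table-style. This is a hedged sketch, not the planned handler: `resolve`, the `outcome` type, and its values are hypothetical names, and the disabled-user and last-admin checks are deliberately left out.

```go
package main

import "fmt"

// outcome names one of the four callback branches (hypothetical labels).
type outcome string

const (
	denyNoRoleMatch    outcome = "deny_no_role_match"
	refreshExisting    outcome = "refresh_existing"
	jitProvision       outcome = "jit_provision"
	blockUsernameTaken outcome = "block_username_taken"
)

// resolve picks the branch from three facts the callback would look up:
// whether any role claim matched the mapping, whether a row with this
// oidc_subject exists, and whether the chosen username is already taken.
// Role matching is checked first, mirroring the spec's step order.
func resolve(roleMatched, subjectKnown, usernameTaken bool) outcome {
	switch {
	case !roleMatched:
		return denyNoRoleMatch
	case subjectKnown:
		return refreshExisting
	case usernameTaken:
		return blockUsernameTaken
	default:
		return jitProvision
	}
}

func main() {
	cases := []struct {
		role, known, taken bool
	}{
		{false, false, false},
		{true, true, true},
		{true, false, false},
		{true, false, true},
	}
	for _, c := range cases {
		fmt.Println(c.role, c.known, c.taken, "->", resolve(c.role, c.known, c.taken))
	}
}
```

Keeping the decision separate from database and HTTP work means each branch can be asserted without the stub IdP, leaving the `httptest` layer to cover the end-to-end path.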