initial setup ready
@@ -0,0 +1,721 @@
<!doctype html>
<html lang="en">
<head>
<meta charset="utf-8">
<title>restic-manager · Phase 0 wireframes</title>
<style>
  /* Wireframe-grade only. No brand. No polish.
     Purpose: confirm information architecture & API coverage
     before locking spec.md §6.1 (REST) and §6.2 (WS) shapes. */

  :root {
    --ink: #1a1a1a;
    --mute: #666;
    --line: #999;
    --soft: #ddd;
    --bg: #f5f5f4;
    --panel: #fff;
    --note: #b45309; /* annotations only, single accent so they read as "meta" */
  }

  * { box-sizing: border-box; }

  html, body {
    margin: 0;
    background: var(--bg);
    color: var(--ink);
    font: 13px/1.5 ui-monospace, "SF Mono", Menlo, Consolas, monospace;
  }

  .page {
    max-width: 1200px;
    margin: 32px auto;
    padding: 0 24px;
  }

  h1, h2, h3, h4 { font-weight: 600; margin: 0; }

  .doc-header {
    border-bottom: 1px solid var(--line);
    padding-bottom: 16px;
    margin-bottom: 32px;
  }
  .doc-header h1 { font-size: 18px; }
  .doc-header p { color: var(--mute); margin: 8px 0 0; max-width: 760px; }

  /* ---- screen frame ---- */
  .screen {
    background: var(--panel);
    border: 1px dashed var(--line);
    margin: 48px 0;
    position: relative;
  }
  .screen-label {
    position: absolute;
    top: -10px; left: 16px;
    background: var(--bg);
    padding: 0 8px;
    font-size: 11px;
    text-transform: uppercase;
    letter-spacing: 0.1em;
    color: var(--mute);
  }
  .screen-body { padding: 32px; }

  /* ---- block primitives ---- */
  .box {
    border: 1px dashed var(--line);
    padding: 12px;
    background: var(--panel);
  }
  .box.solid { border-style: solid; border-color: var(--soft); }
  .box.placeholder {
    background: repeating-linear-gradient(
      45deg, transparent 0 8px, #f0efee 8px 16px
    );
    color: var(--mute);
    text-align: center;
    padding: 24px 12px;
  }

  .row { display: flex; gap: 12px; }
  .row > * { flex: 1; }
  .stack { display: flex; flex-direction: column; gap: 12px; }
  .grid-3 { display: grid; grid-template-columns: repeat(3, 1fr); gap: 16px; }
  .grid-2 { display: grid; grid-template-columns: repeat(2, 1fr); gap: 16px; }

  .label { color: var(--mute); font-size: 11px; text-transform: uppercase; letter-spacing: 0.05em; }
  .value { font-size: 14px; }
  .small { font-size: 11px; color: var(--mute); }
  .strong { font-weight: 600; }
  .pill { display: inline-block; border: 1px solid var(--line); padding: 1px 8px; font-size: 11px; }
  .btn { display: inline-block; border: 1px solid var(--ink); padding: 4px 12px; font-size: 12px; background: var(--panel); cursor: pointer; }
  .btn.ghost { border-color: var(--line); color: var(--mute); }
  .btn.danger { border-style: dashed; }

  table { width: 100%; border-collapse: collapse; }
  th, td { text-align: left; padding: 6px 8px; border-bottom: 1px dashed var(--soft); font-weight: normal; }
  th { color: var(--mute); font-size: 11px; text-transform: uppercase; letter-spacing: 0.05em; }

  /* ---- annotation callouts ----
     Every element that depends on a backend source carries a [src] tag
     so we can audit spec.md §6 coverage in one pass. */
  .src {
    display: inline-block;
    margin-left: 6px;
    padding: 1px 6px;
    font-size: 10px;
    color: var(--note);
    border: 1px solid var(--note);
    border-radius: 2px;
    vertical-align: middle;
    white-space: nowrap;
  }
  .src::before { content: ""; }

  /* margin annotation lane */
  .annotated { display: grid; grid-template-columns: 1fr 280px; gap: 24px; }
  .ann-lane { font-size: 11px; color: var(--note); }
  .ann-lane h4 { color: var(--note); font-size: 11px; text-transform: uppercase; letter-spacing: 0.05em; margin-bottom: 8px; }
  .ann-lane ul { margin: 0; padding-left: 16px; }
  .ann-lane li { margin-bottom: 6px; line-height: 1.45; }

  /* ---- top app chrome ---- */
  .chrome {
    border-bottom: 1px solid var(--soft);
    padding: 12px 32px;
    display: flex; align-items: center; gap: 24px;
    background: var(--panel);
  }
  .chrome .logo { font-weight: 600; }
  .chrome nav { display: flex; gap: 16px; color: var(--mute); }
  .chrome nav .active { color: var(--ink); border-bottom: 1px solid var(--ink); }
  .chrome .right { margin-left: auto; color: var(--mute); font-size: 12px; }

  /* ---- tabs ---- */
  .tabs {
    display: flex;
    gap: 0;
    border-bottom: 1px solid var(--soft);
    margin-bottom: 24px;
  }
  .tabs a {
    padding: 8px 16px;
    border: 1px dashed var(--line);
    border-bottom: none;
    margin-right: -1px;
    color: var(--mute);
    text-decoration: none;
    background: var(--bg);
  }
  .tabs a.active {
    color: var(--ink);
    background: var(--panel);
    border-style: solid;
    border-color: var(--soft);
  }

  /* status dots — unstyled, just outline */
  .dot { display: inline-block; width: 8px; height: 8px; border: 1px solid var(--ink); border-radius: 50%; vertical-align: middle; margin-right: 4px; }
  .dot.off { background: var(--panel); }
  .dot.ok { background: var(--ink); }
  .dot.degraded { background: repeating-linear-gradient(45deg, var(--ink) 0 2px, transparent 2px 4px); }

  /* log stream */
  .log {
    background: #111;
    color: #ddd;
    font-size: 12px;
    line-height: 1.5;
    padding: 12px 16px;
    height: 320px;
    overflow: auto;
    border: 1px solid var(--soft);
    white-space: pre-line; /* one log entry per source line; without this the entries run together */
  }
  .log .ts { color: #888; }
  .log .err { color: #f88; }

  /* progress bar */
  .progress {
    background: var(--soft);
    height: 8px;
    position: relative;
    overflow: hidden;
  }
  .progress > span {
    display: block;
    background: var(--ink);
    height: 100%;
    width: 38%;
  }

  /* annotations bullet style */
  details summary { cursor: pointer; color: var(--note); font-size: 11px; }
  details[open] { margin-bottom: 8px; }

  /* TOC */
  .toc { background: var(--panel); border: 1px solid var(--soft); padding: 16px 20px; margin-bottom: 32px; }
  .toc ol { margin: 8px 0 0; padding-left: 20px; }
  .toc a { color: var(--ink); }

  /* findings */
  .findings { border: 1px solid var(--note); padding: 16px 20px; margin-top: 48px; background: #fffbeb; }
  .findings h3 { color: var(--note); margin-bottom: 12px; }
  .findings ol { padding-left: 20px; margin: 0; }
  .findings li { margin-bottom: 8px; }
  .findings code { background: rgba(180,83,9,.08); padding: 1px 4px; }
</style>
</head>
<body>

<div class="page">

  <header class="doc-header">
    <h1>restic-manager · Phase 0 wireframes</h1>
    <p>
      Low-fidelity wireframes for Phase 1/2 screens. Purpose: confirm the data each
      screen needs before the API in spec.md §6.1 and the WS messages in §6.2 are
      locked in. Grayscale on purpose — visual design is deferred to Phase 5
      (and a focused hi-fi pass on the restore wizard in Phase 3).
    </p>
    <p>
      <span class="src">[GET /api/...]</span> tags mark REST data sources.
      <span class="src">[WS: ...]</span> tags mark WebSocket message dependencies.
      Open the “Findings” section at the bottom for spec gaps.
    </p>
  </header>

  <nav class="toc">
    <strong>Screens</strong>
    <ol>
      <li><a href="#dashboard">Dashboard — fleet overview</a></li>
      <li><a href="#host-detail">Host detail — 5 tabs</a></li>
      <li><a href="#job-detail">Job detail — live log</a></li>
      <li><a href="#findings">Findings — gaps in spec.md §6</a></li>
    </ol>
  </nav>

  <!-- ============================================================ -->
  <!-- SCREEN 1 · DASHBOARD -->
  <!-- ============================================================ -->
  <section id="dashboard" class="screen">
    <span class="screen-label">Screen 1 · Dashboard (/)</span>

    <div class="chrome">
      <div class="logo">restic-manager</div>
      <nav>
        <span class="active">Dashboard</span>
        <span>Hosts</span>
        <span>Jobs</span>
        <span>Repos</span>
        <span>Alerts</span>
        <span>Audit</span>
        <span>Settings</span>
      </nav>
      <div class="right">user: alice (admin) · logout</div>
    </div>

    <div class="screen-body annotated">
      <div>
        <!-- Fleet summary strip -->
        <div class="grid-3" style="margin-bottom:24px">
          <div class="box solid">
            <div class="label">Fleet status</div>
            <div class="value strong">10 online · 1 offline · 1 degraded</div>
            <div class="small">Last sync 12s ago</div>
          </div>
          <div class="box solid">
            <div class="label">Storage (sum across repos)</div>
            <div class="value strong">2.4 TB across 12 repos</div>
            <div class="small">+18 GB last 24h</div>
          </div>
          <div class="box solid">
            <div class="label">Open alerts</div>
            <div class="value strong">3 · 1 critical</div>
            <div class="small">2 unacked</div>
          </div>
        </div>

        <!-- Filter / search -->
        <div class="row" style="margin-bottom:16px; align-items:center">
          <div class="box" style="flex:3">[ search hosts · filter by tag · status ]</div>
          <div style="flex:0">
            <span class="btn">+ Add host</span>
          </div>
        </div>

        <h3 style="margin: 24px 0 12px">Hosts</h3>

        <!-- Host card grid -->
        <div class="grid-3">

          <!-- card: healthy -->
          <div class="box solid">
            <div style="display:flex; align-items:center; justify-content:space-between">
              <div class="strong">prod-db-01 <span class="small">linux/amd64</span></div>
              <span class="pill"><span class="dot ok"></span>online</span>
            </div>
            <hr style="border:none; border-top:1px dashed var(--soft); margin:8px 0">
            <div class="label">Last backup</div>
            <div class="value">2h ago · success</div>
            <div class="label" style="margin-top:8px">Repo</div>
            <div class="value">412 GB · 1,284 snapshots</div>
            <div class="label" style="margin-top:8px">Alerts</div>
            <div class="value">—</div>
            <div style="margin-top:12px; display:flex; gap:8px">
              <span class="btn">View</span>
              <span class="btn ghost">Backup now</span>
            </div>
          </div>

          <!-- card: failed last -->
          <div class="box solid">
            <div style="display:flex; align-items:center; justify-content:space-between">
              <div class="strong">staging-app <span class="small">linux/arm64</span></div>
              <span class="pill"><span class="dot degraded"></span>degraded</span>
            </div>
            <hr style="border:none; border-top:1px dashed var(--soft); margin:8px 0">
            <div class="label">Last backup</div>
            <div class="value">9h ago · <span class="strong">failed</span></div>
            <div class="label" style="margin-top:8px">Repo</div>
            <div class="value">88 GB · 412 snapshots</div>
            <div class="label" style="margin-top:8px">Alerts</div>
            <div class="value">2 · 1 critical</div>
            <div style="margin-top:12px; display:flex; gap:8px">
              <span class="btn">View</span>
              <span class="btn ghost">Retry</span>
            </div>
          </div>

          <!-- card: offline -->
          <div class="box solid">
            <div style="display:flex; align-items:center; justify-content:space-between">
              <div class="strong">laptop-bob <span class="small">windows/amd64</span></div>
              <span class="pill"><span class="dot off"></span>offline</span>
            </div>
            <hr style="border:none; border-top:1px dashed var(--soft); margin:8px 0">
            <div class="label">Last seen</div>
            <div class="value">3d ago</div>
            <div class="label" style="margin-top:8px">Repo</div>
            <div class="value">142 GB · 88 snapshots</div>
            <div class="label" style="margin-top:8px">Alerts</div>
            <div class="value">1</div>
            <div style="margin-top:12px; display:flex; gap:8px">
              <span class="btn">View</span>
              <span class="btn ghost" style="opacity:.4">Backup now</span>
            </div>
          </div>

          <!-- placeholder rest -->
          <div class="box placeholder">… more host cards (12 total in target deployment)</div>
        </div>

        <!-- Recent jobs -->
        <h3 style="margin: 32px 0 12px">Recent activity (fleet-wide)</h3>
        <div class="box solid" style="padding:0">
          <table>
            <thead>
              <tr>
                <th>When</th><th>Host</th><th>Kind</th><th>Status</th><th>Duration</th><th></th>
              </tr>
            </thead>
            <tbody>
              <tr><td>2h ago</td><td>prod-db-01</td><td>backup</td><td>succeeded</td><td>00:14:22</td><td><span class="small">view</span></td></tr>
              <tr><td>3h ago</td><td>web-02</td><td>backup</td><td>succeeded</td><td>00:08:11</td><td><span class="small">view</span></td></tr>
              <tr><td>9h ago</td><td>staging-app</td><td>backup</td><td><span class="strong">failed</span></td><td>00:01:03</td><td><span class="small">view</span></td></tr>
              <tr><td>1d ago</td><td>prod-db-01</td><td>check</td><td>succeeded</td><td>00:42:17</td><td><span class="small">view</span></td></tr>
              <tr><td>1d ago</td><td>web-01</td><td>prune</td><td>succeeded</td><td>00:04:55</td><td><span class="small">view</span></td></tr>
            </tbody>
          </table>
        </div>
      </div>

      <!-- annotation lane -->
      <aside class="ann-lane">
        <h4>Data sources</h4>
        <ul>
          <li><strong>Fleet summary strip</strong> — no endpoint in §6.1. Either (a) add <code>GET /api/fleet/summary</code> or (b) compute client-side from <code>GET /api/hosts</code> + <code>GET /api/alerts</code>. <em>Recommend (a)</em>: cheaper than fanout, and Prometheus already needs the rollup (§14.4).</li>
          <li><strong>Host cards</strong> — <code>GET /api/hosts</code> must return: status, last_backup_at, last_backup_status, repo_size_bytes, snapshot_count, open_alert_count, agent_version. Domain model (§5) only has <code>status</code> + <code>last_seen_at</code>, so the list response needs extending.</li>
          <li><strong>"Backup now" button</strong> — <code>POST /api/hosts/:id/jobs</code> with <code>{kind: "backup"}</code>.</li>
          <li><strong>Recent activity</strong> — <code>GET /api/jobs?limit=N&amp;order=desc</code>. The spec doesn't document query params; they need to be added.</li>
          <li><strong>HTMX cadence</strong> — this page polls every ~10s with <code>hx-trigger="every 10s"</code> on the summary + cards. WS push isn't needed here.</li>
</ul>
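        <h4 style="margin-top:16px">Sketch · polling markup</h4>
        <p>A minimal sketch of the 10s cadence noted above, assuming HTMX is loaded on the page. Only <code>hx-trigger="every 10s"</code> comes from the note; the <code>/partials/fleet-summary</code> route and the <code>fleet-summary</code> id are hypothetical placeholders, not part of spec.md.</p>
        <pre class="box" style="white-space:pre-wrap; color:var(--ink); margin:8px 0">&lt;!-- hypothetical partial route; the server returns the rendered strip --&gt;
&lt;div id="fleet-summary"
     hx-get="/partials/fleet-summary"
     hx-trigger="every 10s"
     hx-swap="innerHTML"&gt;
  (server-rendered summary strip)
&lt;/div&gt;</pre>
        <p>With <code>hx-swap="innerHTML"</code> the polling element itself stays in the DOM and only its contents are replaced, so the timer keeps firing between responses.</p>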
      </aside>
    </div>
  </section>


  <!-- ============================================================ -->
  <!-- SCREEN 2 · HOST DETAIL -->
  <!-- ============================================================ -->
  <section id="host-detail" class="screen">
    <span class="screen-label">Screen 2 · Host detail (/hosts/:id)</span>

    <div class="chrome">
      <div class="logo">restic-manager</div>
      <nav>
        <span>Dashboard</span>
        <span class="active">Hosts</span>
        <span>Jobs</span>
        <span>Repos</span>
        <span>Alerts</span>
        <span>Audit</span>
        <span>Settings</span>
      </nav>
      <div class="right">user: alice (admin)</div>
    </div>

    <div class="screen-body annotated">
      <div>
        <!-- Host header -->
        <div class="box solid" style="margin-bottom:24px">
          <div style="display:flex; align-items:flex-start; justify-content:space-between; gap:16px">
            <div>
              <div class="small">« Dashboard / Hosts</div>
              <h2 style="margin:4px 0">prod-db-01</h2>
              <div class="small">linux/amd64 · agent 0.4.2 · restic 0.17.1 · last seen 12s ago</div>
              <div style="margin-top:8px">
                <span class="pill"><span class="dot ok"></span>online</span>
                <span class="pill">tag: prod</span>
                <span class="pill">tag: db</span>
              </div>
            </div>
            <div style="display:flex; flex-direction:column; gap:6px; align-items:flex-end">
              <div class="small">Currently: <span class="strong">idle</span></div>
              <div style="display:flex; gap:8px">
                <span class="btn">Backup now</span>
                <span class="btn ghost">Run check</span>
                <span class="btn ghost">…</span>
              </div>
            </div>
          </div>
        </div>

        <!-- Tabs -->
        <div class="tabs">
          <a href="#" class="active">Snapshots</a>
          <a href="#">Schedules</a>
          <a href="#">Jobs</a>
          <a href="#">Repo</a>
          <a href="#">Settings</a>
        </div>

        <!-- TAB: Snapshots (active) -->
        <div>
          <div class="row" style="margin-bottom:12px">
            <div class="box" style="flex:3">[ filter by tag · path · date range ]</div>
            <div class="box" style="flex:1">[ sort: newest first ]</div>
          </div>
          <div class="box solid" style="padding:0">
            <table>
              <thead>
                <tr>
                  <th>Snapshot</th><th>Time</th><th>Paths</th><th>Tags</th><th>Size</th><th>Files</th><th></th>
                </tr>
              </thead>
              <tbody>
                <tr><td><code>3a8f1e</code></td><td>2h ago</td><td>/var/lib/postgres</td><td>auto, daily</td><td>412 GB</td><td>1.2M</td><td><span class="small">restore · diff</span></td></tr>
                <tr><td><code>8c7b22</code></td><td>1d ago</td><td>/var/lib/postgres</td><td>auto, daily</td><td>411 GB</td><td>1.2M</td><td><span class="small">restore · diff</span></td></tr>
                <tr><td><code>4f0a99</code></td><td>2d ago</td><td>/var/lib/postgres, /etc</td><td>auto, weekly</td><td>411 GB</td><td>1.2M</td><td><span class="small">restore · diff</span></td></tr>
                <tr><td colspan="7" class="small" style="text-align:center; padding:12px">… 1,281 more · load more</td></tr>
              </tbody>
            </table>
          </div>
        </div>

        <!-- Other tabs collapsed previews -->
        <hr style="margin:32px 0; border:none; border-top:1px dashed var(--soft)">
        <div class="small" style="margin-bottom:8px">Other tabs (preview, not navigated):</div>

        <div class="grid-2">

          <!-- TAB: Schedules -->
          <div class="box solid">
            <div class="strong" style="margin-bottom:8px">Tab · Schedules</div>
            <table>
              <thead>
                <tr><th>Kind</th><th>Cron</th><th>Paths</th><th>Retention</th><th>Enabled</th></tr>
              </thead>
              <tbody>
                <tr><td>backup</td><td>0 2 * * *</td><td>/var/lib/postgres</td><td>7d/4w/12m</td><td>[x]</td></tr>
                <tr><td>forget+prune</td><td>0 4 * * 0</td><td>—</td><td>per policy</td><td>[x]</td></tr>
                <tr><td>check</td><td>0 5 1 * *</td><td>—</td><td>—</td><td>[ ]</td></tr>
              </tbody>
            </table>
            <div style="margin-top:12px"><span class="btn">+ New schedule</span></div>
            <details style="margin-top:12px">
              <summary>schedule editor (expanded form)</summary>
              <div class="stack" style="margin-top:8px">
                <div class="box">kind: [backup ▾]</div>
                <div class="box">cron: [ 0 2 * * * ] <span class="small">human: every day at 02:00</span></div>
                <div class="box">paths: [ /var/lib/postgres ] [+ add]</div>
                <div class="box">excludes: [ *.tmp, /tmp ]</div>
                <div class="box">tags: [ auto, daily ]</div>
                <div class="box">retention: keep [7] daily, [4] weekly, [12] monthly · keep-tag [ ]</div>
                <div class="box">bandwidth: upload [ ] KB/s · download [ ] KB/s <span class="small">§14.2</span></div>
                <div class="box">pre-hook: [ pg_dump ... ] <span class="small">§14.3 admin-only</span></div>
                <div class="box">post-hook: [ ... ]</div>
                <div class="box">enabled: [x]</div>
              </div>
            </details>
          </div>

          <!-- TAB: Jobs -->
          <div class="box solid">
            <div class="strong" style="margin-bottom:8px">Tab · Jobs (host-scoped)</div>
            <table>
              <thead>
                <tr><th>Started</th><th>Kind</th><th>Status</th><th>Duration</th><th>By</th></tr>
              </thead>
              <tbody>
                <tr><td>2h ago</td><td>backup</td><td>succeeded</td><td>00:14:22</td><td>schedule</td></tr>
                <tr><td>1d ago</td><td>check</td><td>succeeded</td><td>00:42:17</td><td>schedule</td></tr>
                <tr><td>2d ago</td><td>backup</td><td>cancelled</td><td>00:00:42</td><td>alice</td></tr>
                <tr><td>3d ago</td><td>backup</td><td>failed</td><td>00:01:09</td><td>schedule</td></tr>
              </tbody>
            </table>
          </div>

          <!-- TAB: Repo -->
          <div class="box solid">
            <div class="strong" style="margin-bottom:8px">Tab · Repo</div>
            <div class="grid-2">
              <div><div class="label">URL</div><div>rest:https://restic.lab…/prod-db-01</div></div>
              <div><div class="label">Kind</div><div>rest (append-only)</div></div>
              <div><div class="label">Total size</div><div>412 GB</div></div>
              <div><div class="label">Dedup ratio</div><div>4.2×</div></div>
              <div><div class="label">Snapshots</div><div>1,284</div></div>
              <div><div class="label">Last check</div><div>1d ago · clean</div></div>
              <div><div class="label">Lock state</div><div>unlocked</div></div>
              <div><div class="label">Credential</div><div>append-only · rotated 14d ago</div></div>
            </div>
            <div style="margin-top:12px; display:flex; gap:8px">
              <span class="btn">Run check</span>
              <span class="btn ghost">Unlock</span>
              <span class="btn ghost">Forget+prune (admin)</span>
            </div>
          </div>

          <!-- TAB: Settings -->
          <div class="box solid">
            <div class="strong" style="margin-bottom:8px">Tab · Settings</div>
            <div class="stack">
              <div class="box"><div class="label">Tags</div><div>prod, db [+ add]</div></div>
              <div class="box"><div class="label">Default pre-hook</div><div>(empty)</div></div>
              <div class="box"><div class="label">Default post-hook</div><div>(empty)</div></div>
              <div class="box"><div class="label">Hook shell</div><div>/bin/sh</div></div>
              <div class="box"><div class="label">Default bandwidth caps</div><div>none</div></div>
              <div class="box">
                <div class="label">Enrollment</div>
                <div>enrolled 42d ago · <span class="btn ghost">Regenerate token</span></div>
              </div>
              <div class="box">
                <div class="label">Agent</div>
                <div>0.4.2 · auto-update [x] · <span class="btn ghost">Force update now</span></div>
              </div>
              <div class="box">
                <div class="label danger" style="color:var(--note)">Danger zone</div>
                <div><span class="btn danger">Remove host</span> <span class="small">does not touch repo data</span></div>
              </div>
            </div>
          </div>
        </div>
      </div>

      <!-- annotations -->
      <aside class="ann-lane">
        <h4>Data sources</h4>
        <ul>
          <li><strong>Host header</strong> — <code>GET /api/hosts/:id</code>. <em>Gap:</em> "currently running job" is not in the domain model. Either add a <code>current_job_id</code> to Host, or have the UI poll <code>GET /api/jobs?host_id=X&amp;status=running</code>.</li>
          <li><strong>Snapshots tab</strong> — <code>GET /api/hosts/:id/snapshots</code>. Filtering needs server support: <code>?tag=</code>, <code>?path=</code>, <code>?since=</code>. Tag autocomplete needs a distinct-tag list, either derived client-side or served by a new endpoint.</li>
          <li><strong>Schedules tab</strong> — <code>GET /api/hosts/:id/schedules</code> + <code>POST/PUT/DELETE</code>. Editor exposes §14.2 bandwidth and §14.3 hooks. Both are stored as JSON blobs on Schedule, but the UI needs structured fields. Confirm the <code>retention_policy</code> JSON shape.</li>
          <li><strong>Jobs tab</strong> — <code>GET /api/jobs?host_id=X</code>. <em>Gap:</em> "By" column wants user-or-schedule attribution. AuditLog has it; the Job table doesn't expose <code>actor</code> directly. Either denormalize onto Job or join AuditLog.</li>
          <li><strong>Repo tab</strong> — <code>GET /api/hosts/:id/repo</code>. <em>Gap:</em> spec lists size/last-check/lock state. Add: dedup ratio, snapshot count, credential rotation timestamp, append-only flag. (Some derive from <code>restic stats</code>.)</li>
          <li><strong>Settings tab</strong> — mostly host-row edits. New: <code>POST /api/hosts/:id/agent/update</code> for force-update (§4.2 self-update). <em>Gap:</em> the spec doesn't surface this.</li>
          <li><strong>HTMX cadence</strong> — tab content swaps via <code>?tab=jobs</code> hyperlinks (server renders a partial). Header polls every 10s for currently-running state.</li>
</ul>
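        <h4 style="margin-top:16px">Sketch · tab swap</h4>
        <p>One way the <code>?tab=jobs</code> hyperlink swap could look, assuming HTMX is loaded and the server returns only the tab partial when it sees the <code>HX-Request</code> header. The host id <code>42</code> and the <code>#tab-content</code> target are hypothetical.</p>
        <pre class="box" style="white-space:pre-wrap; color:var(--ink); margin:8px 0">&lt;!-- plain link still works without JS; HTMX upgrades it to a partial swap --&gt;
&lt;!-- host id 42 and #tab-content are placeholders --&gt;
&lt;a href="/hosts/42?tab=jobs"
   hx-get="/hosts/42?tab=jobs"
   hx-target="#tab-content"
   hx-push-url="true"&gt;Jobs&lt;/a&gt;

&lt;div id="tab-content"&gt;(active tab partial)&lt;/div&gt;</pre>
        <p>Keeping the real <code>href</code> means the tab is still reachable as a full page load, and <code>hx-push-url</code> keeps the address bar in step with the visible tab.</p>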
      </aside>
    </div>
  </section>


  <!-- ============================================================ -->
  <!-- SCREEN 3 · JOB DETAIL -->
  <!-- ============================================================ -->
  <section id="job-detail" class="screen">
    <span class="screen-label">Screen 3 · Job detail (/jobs/:id) — running state</span>

    <div class="chrome">
      <div class="logo">restic-manager</div>
      <nav>
        <span>Dashboard</span>
        <span>Hosts</span>
        <span class="active">Jobs</span>
        <span>Repos</span>
        <span>Alerts</span>
        <span>Audit</span>
        <span>Settings</span>
      </nav>
      <div class="right">user: alice (admin)</div>
    </div>

    <div class="screen-body annotated">
      <div>

        <!-- Header -->
        <div class="box solid" style="margin-bottom:16px">
          <div class="small">« prod-db-01 / Jobs</div>
          <div style="display:flex; align-items:flex-start; justify-content:space-between; gap:16px; margin-top:4px">
            <div>
              <h2 style="margin:0">backup · prod-db-01</h2>
              <div class="small">job <code>j_01HJ8K7</code> · started 4m12s ago · triggered by alice</div>
              <div style="margin-top:8px">
                <span class="pill"><span class="dot ok"></span>running</span>
                <span class="pill">schedule: nightly-pg</span>
              </div>
            </div>
            <div style="display:flex; gap:8px">
              <span class="btn danger">Cancel job</span>
            </div>
          </div>
        </div>

        <!-- Progress -->
        <div class="grid-2" style="margin-bottom:16px">
          <div class="box solid">
            <div class="label">Progress</div>
            <div class="value strong" style="margin:4px 0">38% · ~6m remaining</div>
            <div class="progress"><span></span></div>
            <div class="small" style="margin-top:6px">156 GB of 412 GB · 482k of 1.2M files</div>
          </div>
          <div class="box solid">
            <div class="grid-2">
              <div><div class="label">Files new</div><div>2,103</div></div>
              <div><div class="label">Files changed</div><div>418</div></div>
              <div><div class="label">Bytes added</div><div>2.4 GB</div></div>
              <div><div class="label">Throughput</div><div>42 MB/s</div></div>
            </div>
          </div>
        </div>

        <!-- Live log -->
        <div class="label" style="margin-bottom:6px">Live log <span class="small">(streaming via WS)</span></div>
        <div class="log">
          <span class="ts">14:02:11</span> [agent] starting restic backup --json
          <span class="ts">14:02:11</span> [agent] pre_hook: pg_dump | gzip &gt; /tmp/dump.sql.gz
          <span class="ts">14:02:48</span> [pre_hook] dump complete (1.2 GB)
          <span class="ts">14:02:49</span> [restic] open repository
          <span class="ts">14:02:50</span> [restic] lock repository
          <span class="ts">14:02:50</span> [restic] load index files
          <span class="ts">14:02:53</span> [restic] start scan
          <span class="ts">14:02:55</span> [restic] start backup on /var/lib/postgres
          <span class="ts">14:03:01</span> [restic] {"message_type":"status","percent_done":0.04,"total_files":1234567,"files_done":48234,"total_bytes":442000000000,"bytes_done":17600000000}
          <span class="ts">14:04:22</span> [restic] {"message_type":"status","percent_done":0.18,…}
          <span class="ts">14:05:55</span> [restic] {"message_type":"status","percent_done":0.31,…}
          <span class="ts">14:06:23</span> <span class="err">[restic] warning: failed to lstat /var/lib/postgres/pg_wal/.lock</span>
          <span class="ts">14:06:24</span> [restic] {"message_type":"status","percent_done":0.38,…}
          <span style="color:#888">▌</span>
        </div>
        <div class="row" style="margin-top:8px">
          <div><span class="small">[ ] auto-scroll · [ ] show stderr only · download full log</span></div>
        </div>

      </div>

      <aside class="ann-lane">
        <h4>Data sources</h4>
        <ul>
          <li><strong>Header</strong> — <code>GET /api/jobs/:id</code>. Need: kind, host, started_at, actor (user / schedule / system), status, schedule_id, schedule_name. <em>Gap:</em> Job table has <code>scheduled_id</code> but no actor/user_id; need to join AuditLog or denormalize.</li>
          <li><strong>Progress block</strong> — live updates from <code>WS /api/jobs/:id/stream</code>. The WS message <code>job.progress</code> (§6.2) needs a documented JSON shape: <code>{percent_done, files_done, total_files, bytes_done, total_bytes, eta_seconds, throughput_bps}</code>. Spec leaves this vague.</li>
          <li><strong>Stats panel</strong> — on completion, mirrors the <code>restic backup --json</code> summary fields: <code>files_new</code>, <code>files_changed</code>, <code>files_unmodified</code>, <code>data_added</code>, <code>total_bytes_processed</code>, <code>total_duration</code>, <code>snapshot_id</code>. Lives in <code>Job.stats</code> JSON.</li>
          <li><strong>Live log</strong> — <code>WS</code> messages of type <code>log.stream</code> (agent → server) fan out to browsers subscribed to <code>/api/jobs/:id/stream</code>. UI distinguishes <code>stdout</code> / <code>stderr</code> / <code>event</code>; the schema's <code>JobLog.stream</code> enum already covers this.</li>
          <li><strong>Cancel</strong> — <code>POST /api/jobs/:id/cancel</code> → server emits <code>command.cancel</code> WS to agent (§6.2). UI should optimistically show "cancelling…" until WS confirms <code>job.finished</code>.</li>
          <li><strong>HTMX caveat</strong> — this is the one screen where progressive enhancement isn't enough; a genuinely live log depends on WS. Plan: <code>hx-ext="ws"</code> with <code>ws-connect</code>; the server sends innerHTML-fragment patches for the progress + log areas. If the socket can't be established, the page degrades to 2s polling.</li>
</ul>
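        <h4 style="margin-top:16px">Sketch · job.progress payload</h4>
        <p>An illustrative <code>job.progress</code> message built from the field list above and the numbers shown in this wireframe. The <code>type</code> and <code>job_id</code> envelope keys are assumptions, as is reading <code>throughput_bps</code> as bytes per second; §6.2 settles neither.</p>
        <!-- envelope keys "type" and "job_id" are assumptions; values are taken from the mock progress block -->
        <pre class="box" style="white-space:pre-wrap; color:var(--ink); margin:8px 0">{
  "type": "job.progress",
  "job_id": "j_01HJ8K7",
  "percent_done": 0.38,
  "files_done": 482000,
  "total_files": 1200000,
  "bytes_done": 156000000000,
  "total_bytes": 412000000000,
  "eta_seconds": 360,
  "throughput_bps": 42000000
}</pre>
        <p>A possible browser-side wiring, assuming the HTMX WebSocket extension is loaded. The element ids are hypothetical.</p>
        <pre class="box" style="white-space:pre-wrap; color:var(--ink); margin:8px 0">&lt;!-- ids job-progress and job-log are placeholders --&gt;
&lt;div hx-ext="ws" ws-connect="/api/jobs/j_01HJ8K7/stream"&gt;
  &lt;div id="job-progress"&gt;(progress block)&lt;/div&gt;
  &lt;div id="job-log" class="log"&gt;(log lines)&lt;/div&gt;
&lt;/div&gt;</pre>
        <p>The extension swaps each incoming HTML fragment into the element whose id matches, so the server would render the JSON above into a <code>job-progress</code> fragment before pushing it to browsers.</p>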
      </aside>
    </div>
  </section>


  <!-- ============================================================ -->
  <!-- FINDINGS -->
  <!-- ============================================================ -->
  <section id="findings" class="findings">
    <h3>Findings — gaps in spec.md §6 surfaced by Phase 0 wireframing</h3>
    <ol>
      <li>
        <strong>Aggregate fleet endpoint missing.</strong> Dashboard summary strip and Prometheus metrics (§14.4) both need fleet rollups. Add <code>GET /api/fleet/summary</code> returning host counts by status, total repo bytes, open alert counts. Cheaper than client fanout and reused by /metrics.
      </li>
      <li>
        <strong>Host list response is too thin.</strong> Domain model Host (§5) has status + last_seen_at; cards need <code>last_backup_at</code>, <code>last_backup_status</code>, <code>repo_size_bytes</code>, <code>snapshot_count</code>, <code>open_alert_count</code>, <code>current_job_id</code>. Either add columns or compute server-side and include in <code>GET /api/hosts</code>.
      </li>
      <li>
        <strong>Job actor not modelled.</strong> Job table tracks <code>scheduled_id</code> but not <em>who</em> (user vs schedule vs system) triggered a run-now. Dashboard "Recent activity" and Jobs tab both want this. Add <code>Job.actor_kind</code> + <code>Job.actor_id</code> — cheaper than joining AuditLog every time.
      </li>
      <li>
        <strong>WS <code>job.progress</code> JSON shape is undefined.</strong> §6.2 lists the message name only. Lock the shape now: <code>{percent_done: float, files_done: int, total_files: int, bytes_done: int, total_bytes: int, eta_seconds: int, throughput_bps: int}</code>. Keeps client + agent in lockstep before Phase 1 codes against it.
      </li>
      <li>
        <strong>Repo response needs more fields.</strong> §6.1 says size/last-check/lock state. Wireframe also wants: dedup ratio, snapshot count, credential rotation timestamp, append-only flag. Most derive from <code>restic stats</code> + Credential row — expose them through <code>GET /api/hosts/:id/repo</code>.
      </li>
      <li>
        <strong>Snapshot filtering needs server support.</strong> Tag/path/date filters belong on the server (12-host fleets are small but a single host can hold thousands of snapshots). Add query params to <code>GET /api/hosts/:id/snapshots</code>: <code>?tag=</code>, <code>?path=</code>, <code>?since=</code>, <code>?limit=</code>. Distinct-tag list endpoint optional — could be derived client-side at first.
      </li>
      <li>
        <strong>Job listing needs query params.</strong> Recent activity, host-scoped jobs, and the Jobs page all use <code>GET /api/jobs</code>. Lock down: <code>?host_id=</code>, <code>?kind=</code>, <code>?status=</code>, <code>?since=</code>, <code>?limit=</code>, <code>?order=</code>. Pagination too.
|
||||
</li>
|
||||
<li>
|
||||
<strong>Agent self-update endpoint not in §6.1.</strong> §4.2 describes the mechanism but no REST endpoint exists. Settings tab wants a "Force update now" button — add <code>POST /api/hosts/:id/agent/update</code>.
|
||||
</li>
|
||||
<li>
|
||||
<strong>Schedule retention/options JSON shape is undocumented.</strong> §14.2 (bandwidth) and §14.3 (hooks) both extend <code>Schedule</code>. Document the canonical shape now (<code>retention_policy</code>, <code>options.limit_upload_kbps</code>, <code>options.limit_download_kbps</code>, <code>pre_hook</code>, <code>post_hook</code>) so the schedule editor and the agent can both target it.
|
||||
</li>
|
||||
<li>
|
||||
<strong>HTMX-vs-WS responsibility split.</strong> Decision: only the Job detail screen needs WS. Dashboard, Hosts, Snapshots use HTMX polling (10s). This avoids fan-out complexity for v1; revisit if dashboard feels stale.
|
||||
</li>
|
||||
</ol>
|
||||
</section>
|
||||
|
||||
</div>
|
||||
</body>
|
||||
</html>
|
||||
@@ -0,0 +1,455 @@
|
||||
# restic-manager — Specification
|
||||
|
||||
## 1. Overview
|
||||
|
||||
**restic-manager** is a self-hosted, browser-based single pane of glass for managing [restic](https://restic.net) backups across a fleet of Linux and Windows endpoints. It provides visibility, scheduling, ad-hoc operations, restore workflows, and alerting from one UI.
|
||||
|
||||
It is built for small-to-medium fleets (initial target: ~12 endpoints) and is intentionally simple to deploy: one Docker Compose file on the control-plane host, one small agent binary on each endpoint.
|
||||
|
||||
**License:** PolyForm Noncommercial 1.0.0
|
||||
|
||||
## 2. Goals & Non-Goals
|
||||
|
||||
### Goals
|
||||
- Central visibility into backup state for every endpoint
|
||||
- Trigger any restic operation remotely (`backup`, `forget`, `prune`, `check`, `unlock`, `snapshots`, `stats`, `diff`, `restore`)
|
||||
- Manage per-host backup schedules from the UI
|
||||
- Live job progress streamed back to the UI
|
||||
- Restore wizard (browse snapshots, pick paths, restore to original or alternate host)
|
||||
- Repo health surfacing (size, dedup ratio, last check, lock state)
|
||||
- Alerting on failure or staleness
|
||||
- Cross-platform agent (Linux + Windows)
|
||||
- Ransomware-resistant repo access via append-only credentials
|
||||
|
||||
### Non-Goals (initial release)
|
||||
- Replacing restic itself or providing custom repo formats
|
||||
- Managing non-restic backup tools
|
||||
- Multi-tenancy / SaaS deployment
|
||||
- High availability of the control plane (SQLite, single-instance)
|
||||
- Mobile-native apps (responsive web only)
|
||||
|
||||
## 3. Architecture
|
||||
|
||||
### 3.1 Components
|
||||
|
||||
```
|
||||
┌──────────────────────────────────────────────────────────────────┐
|
||||
│ Proxmox cluster │
|
||||
│ ┌────────────────────────────────────────────────────────────┐ │
|
||||
│ │ docker compose: restic-manager │ │
|
||||
│ │ - server (Go binary, REST + WS API, embedded HTMX UI) │ │
|
||||
│ │ - SQLite volume │ │
|
||||
│ └────────────────────────────────────────────────────────────┘ │
|
||||
└────────────────────────▲─────────────────────────────────────────┘
|
||||
│ HTTPS (control plane)
|
||||
│ - agent → server: status, telemetry
|
||||
│ - server → agent: commands, schedules
|
||||
│
|
||||
┌────────────────────────┴─────────────────────────────────────────┐
|
||||
│ Endpoints (Linux + Windows) │
|
||||
│ ┌──────────────────────┐ ┌────────────────────────────────┐ │
|
||||
│ │ restic-manager- │ │ restic CLI │ │
|
||||
│ │ agent (Go binary) │───▶│ invoked by agent │ │
|
||||
│ │ - systemd / svc │ └─────────────┬──────────────────┘ │
|
||||
│ │ - WS to server │ │ HTTPS │
|
||||
│ └──────────────────────┘ │ (data plane) │
|
||||
└─────────────────────────────────────────────┼────────────────────┘
|
||||
│
|
||||
▼
|
||||
┌──────────────────────────────────────────────────────────────────┐
|
||||
│ Unraid │
|
||||
│ ┌────────────────────────────────────────────────────────────┐ │
|
||||
│ │ Docker: restic/rest-server │ │
|
||||
│ │ - per-host append-only credentials │ │
|
||||
│ │ - one repo per host │ │
|
||||
│ │ - storage: Unraid share │ │
|
||||
│ └────────────────────────────────────────────────────────────┘ │
|
||||
└──────────────────────────────────────────────────────────────────┘
|
||||
```
|
||||
|
||||
### 3.2 Data flow
|
||||
|
||||
- **Backup data:** endpoint → restic CLI → restic REST server on Unraid → Unraid share. The control plane *never* touches backup bytes.
|
||||
- **Control plane:** agent maintains an outbound WebSocket to the server. Server pushes commands and schedule changes; agent pushes status, logs, live job progress, host metadata.
|
||||
- **UI:** browser → server (HTTPS, session cookies). Server fans out commands to agents, streams progress back to browser.
|
||||
|
||||
### 3.3 Why agent (not SSH)
|
||||
|
||||
- Agent-initiated (outbound) connection works through NAT/firewalls without inbound rules
|
||||
- Native Windows support without OpenSSH service quirks
|
||||
- Local scheduling survives controller restarts
|
||||
- Self-contained `restic --json` parsing, no remote shell quoting hazards
|
||||
|
||||
### 3.4 Why per-host repos
|
||||
|
||||
- Isolates corruption / lock contention
|
||||
- Per-host append-only credentials mean a compromised endpoint can't delete or overwrite existing backups, and with `--private-repos` it can't reach other hosts' repos at all
|
||||
- Simpler `prune` orchestration (no global lock coordination)
|
||||
- Trivially easy to retire a host (delete its repo + credential)
|
||||
|
||||
## 4. Components in detail
|
||||
|
||||
### 4.1 Server
|
||||
|
||||
- **Language:** Go 1.22+
|
||||
- **Storage:** SQLite (via `modernc.org/sqlite`, no CGo)
|
||||
- **HTTP:** `net/http` + `chi` router
|
||||
- **WebSocket:** `nhooyr.io/websocket`
|
||||
- **UI:** HTMX + Tailwind, server-rendered Go templates, no Node build step
|
||||
- **Distribution:** single static binary, packaged in a Docker image; published `docker-compose.yml`
|
||||
- **Config:** YAML or env vars (`RM_LISTEN`, `RM_DATA_DIR`, `RM_BASE_URL`, `RM_TLS_CERT`, `RM_TLS_KEY`, `RM_SECRET_KEY_FILE`)
|
||||
- **TLS:** terminated in-process; a certificate obtained by a Caddy/Traefik sidecar is acceptable. Agents require HTTPS
|
||||
|
||||
### 4.2 Agent
|
||||
|
||||
- **Language:** Go (cross-compiled for `linux/amd64`, `linux/arm64`, `windows/amd64`)
|
||||
- **Service integration:** systemd unit (Linux), Windows service via `golang.org/x/sys/windows/svc`
|
||||
- **Footprint goal:** ≤ 15 MB binary, ≤ 50 MB RSS idle
|
||||
- **Persistence:** local config file + small state DB (BoltDB or JSON) for queued reports if server is unreachable
|
||||
- **Restic invocation:** spawns `restic` with `--json`, parses streamed output, forwards to server in real time
|
||||
- **Self-update:** server publishes signed agent binary; agent downloads, verifies signature, swaps binary, restarts service
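
The restic invocation bullet above could be sketched roughly as follows. This is a minimal sketch, not the agent's actual parser: `resticEvent` and `parseEvents` are hypothetical names, and only a subset of the fields restic emits in `--json` mode is decoded. Non-JSON lines are skipped here; a real agent would more likely forward them as `stderr` log lines.

```go
package main

import (
	"bufio"
	"encoding/json"
	"fmt"
	"strings"
)

// resticEvent holds a subset of the fields found on one line of
// `restic backup --json` output. The struct name is hypothetical.
type resticEvent struct {
	MessageType string  `json:"message_type"` // e.g. "status" or "summary"
	PercentDone float64 `json:"percent_done"` // fraction in [0, 1]
	FilesDone   int64   `json:"files_done"`
	TotalFiles  int64   `json:"total_files"`
	BytesDone   int64   `json:"bytes_done"`
	TotalBytes  int64   `json:"total_bytes"`
	SnapshotID  string  `json:"snapshot_id"` // only set on "summary"
}

// parseEvents decodes newline-delimited JSON and returns the events.
// Lines that are not valid JSON are skipped.
func parseEvents(output string) []resticEvent {
	var events []resticEvent
	sc := bufio.NewScanner(strings.NewReader(output))
	for sc.Scan() {
		var ev resticEvent
		if err := json.Unmarshal(sc.Bytes(), &ev); err != nil {
			continue
		}
		events = append(events, ev)
	}
	return events
}

func main() {
	sample := `{"message_type":"status","percent_done":0.5,"files_done":10,"total_files":20}
not json
{"message_type":"summary","snapshot_id":"abc123"}`
	for _, ev := range parseEvents(sample) {
		fmt.Printf("%s %.0f%% snapshot=%q\n", ev.MessageType, ev.PercentDone*100, ev.SnapshotID)
	}
}
```

In the agent, the same decoding would run against the child process's stdout pipe instead of a string, so events can be forwarded while the job is still running.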
|
||||
|
||||
### 4.3 Restic REST server (Unraid)
|
||||
|
||||
- Run `restic/rest-server` Docker container
|
||||
- `--append-only` enabled
|
||||
- `--private-repos` enabled (each user only sees their own subpath)
|
||||
- htpasswd file with one user per host
|
||||
- Storage path mapped to Unraid share
|
||||
|
||||
## 5. Domain model
|
||||
|
||||
```
|
||||
Host
|
||||
id, name, os, arch, agent_version, restic_version,
|
||||
enrolled_at, last_seen_at, status (online/offline/degraded),
|
||||
repo_id (FK), tags,
|
||||
current_job_id (FK nullable),
|
||||
last_backup_at, last_backup_status (succeeded|failed|cancelled|null),
|
||||
repo_size_bytes, snapshot_count, open_alert_count
|
||||
# Last six fields are denormalised projections, refreshed on
|
||||
# job.started, job.finished, snapshots.report, repo.stats, and alert state changes.
|
||||
|
||||
Repo
|
||||
id, name, url, kind (rest|s3|local), credential_id (FK),
|
||||
password_secret_id (FK),
|
||||
size_bytes, snapshot_count, dedup_ratio,
|
||||
last_check_at, last_check_status, lock_state (locked|unlocked),
|
||||
append_only (bool), credential_rotated_at
|
||||
# Bottom block is a cached projection from `restic stats` +
|
||||
# Credential row, refreshed by repo.stats agent messages.
|
||||
|
||||
Credential
|
||||
id, kind, username, secret_ref (encrypted),
|
||||
rotated_at
|
||||
|
||||
Schedule
|
||||
id, host_id (FK), kind (backup|forget|prune|check),
|
||||
cron_expr, paths (json), excludes (json), tags (json),
|
||||
retention_policy (json), options (json), pre_hook, post_hook,
|
||||
enabled
|
||||
# retention_policy: {keep_last, keep_hourly, keep_daily, keep_weekly,
|
||||
# keep_monthly, keep_yearly, keep_tag: [...]}
|
||||
# options: {limit_upload_kbps, limit_download_kbps}
|
||||
# pre_hook/post_hook: see §14.3 (encrypted at rest)
|
||||
|
||||
Job
|
||||
id, host_id (FK), kind, status (queued|running|succeeded|failed|cancelled),
|
||||
scheduled_id (FK nullable),
|
||||
actor_kind (user|schedule|system), actor_id (nullable),
|
||||
started_at, finished_at,
|
||||
exit_code, stats (json), error
|
||||
|
||||
JobLog
|
||||
job_id (FK), seq, ts, stream (stdout|stderr|event), payload
|
||||
|
||||
Snapshot (cached projection from `restic snapshots --json`)
|
||||
id (restic id), host_id (FK), repo_id (FK),
|
||||
time, hostname, paths, tags, size_bytes, file_count
|
||||
|
||||
Alert
|
||||
id, host_id (FK nullable), kind, severity, message,
|
||||
created_at, acknowledged_at, resolved_at
|
||||
|
||||
User
|
||||
id, username, password_hash, role (admin|operator|viewer),
|
||||
created_at, last_login_at
|
||||
|
||||
Session
|
||||
id, user_id (FK), created_at, expires_at, ip, ua
|
||||
|
||||
AuditLog
|
||||
id, user_id (FK nullable), actor (user|agent|system),
|
||||
action, target_kind, target_id, ts, payload (json)
|
||||
```
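
As an illustration of how the `retention_policy` JSON on `Schedule` might be consumed, the sketch below maps it onto `restic forget` flags. The `RetentionPolicy` type and `ForgetArgs` method are hypothetical names, and treating a zero count as "not set" is an assumption rather than something the model above specifies.

```go
package main

import (
	"fmt"
	"strconv"
)

// RetentionPolicy mirrors the retention_policy JSON blob on Schedule.
// The Go type name is an assumption for illustration.
type RetentionPolicy struct {
	KeepLast    int      `json:"keep_last"`
	KeepHourly  int      `json:"keep_hourly"`
	KeepDaily   int      `json:"keep_daily"`
	KeepWeekly  int      `json:"keep_weekly"`
	KeepMonthly int      `json:"keep_monthly"`
	KeepYearly  int      `json:"keep_yearly"`
	KeepTag     []string `json:"keep_tag"`
}

// ForgetArgs builds the argument list for `restic forget`.
// Counts of zero or less are treated as unset and omitted.
func (p RetentionPolicy) ForgetArgs() []string {
	args := []string{"forget"}
	add := func(flag string, n int) {
		if n > 0 {
			args = append(args, flag, strconv.Itoa(n))
		}
	}
	add("--keep-last", p.KeepLast)
	add("--keep-hourly", p.KeepHourly)
	add("--keep-daily", p.KeepDaily)
	add("--keep-weekly", p.KeepWeekly)
	add("--keep-monthly", p.KeepMonthly)
	add("--keep-yearly", p.KeepYearly)
	for _, tag := range p.KeepTag {
		args = append(args, "--keep-tag", tag)
	}
	return args
}

func main() {
	p := RetentionPolicy{KeepLast: 3, KeepDaily: 7, KeepTag: []string{"pinned"}}
	fmt.Println(p.ForgetArgs())
}
```

Passing the arguments as a slice (rather than a joined string) keeps tag values out of any shell quoting.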
|
||||
|
||||
## 6. API surface (control plane)
|
||||
|
||||
### 6.1 UI/REST (browser → server)
|
||||
|
||||
```
|
||||
POST /api/auth/login
|
||||
POST /api/auth/logout
|
||||
|
||||
GET /api/fleet/summary (aggregate: host counts by status,
|
||||
total bytes, open alerts; reused by /metrics)
|
||||
|
||||
GET /api/hosts ?tag=&status=&limit=&offset=
|
||||
(returns Host rows incl. denormalised
|
||||
last_backup_*, repo_size_bytes,
|
||||
snapshot_count, open_alert_count,
|
||||
current_job_id)
|
||||
GET /api/hosts/:id
|
||||
DELETE /api/hosts/:id
|
||||
POST /api/hosts/:id/enrollment-token (regenerate)
|
||||
POST /api/hosts/:id/agent/update (force agent self-update; see §4.2)
|
||||
|
||||
GET /api/hosts/:id/snapshots ?tag=&path=&since=&until=&limit=&offset=
|
||||
GET /api/hosts/:id/repo (full Repo projection)
|
||||
POST /api/hosts/:id/jobs (run-now: backup/forget/prune/check/unlock)
|
||||
POST /api/hosts/:id/restore (restore wizard submit)
|
||||
|
||||
GET /api/hosts/:id/schedules
|
||||
POST /api/hosts/:id/schedules
|
||||
PUT /api/schedules/:id
|
||||
DELETE /api/schedules/:id
|
||||
|
||||
GET /api/jobs ?host_id=&kind=&status=&since=&until=
|
||||
&limit=&offset=&order=desc
|
||||
GET /api/jobs/:id
|
||||
GET /api/jobs/:id/logs (paginated: ?after_seq=&limit=)
|
||||
WS /api/jobs/:id/stream (live progress; see §6.2 for shape)
|
||||
POST /api/jobs/:id/cancel
|
||||
|
||||
GET /api/repos
|
||||
GET /api/repos/:id
|
||||
|
||||
GET /api/alerts
|
||||
POST /api/alerts/:id/ack
|
||||
|
||||
GET /api/audit
|
||||
GET /api/users (admin)
|
||||
POST /api/users (admin)
|
||||
```
|
||||
|
||||
**Realtime strategy:** only `/api/jobs/:id/stream` uses WS. All other screens
|
||||
(dashboard, hosts, snapshots) refresh via HTMX polling (~10s cadence). Revisit
|
||||
if dashboard staleness becomes a problem in practice.
|
||||
|
||||
### 6.2 Agent ↔ Server
|
||||
|
||||
Single authenticated WebSocket per agent. Bidirectional JSON-RPC-ish messages.
|
||||
|
||||
**Agent → server:**
|
||||
- `hello` (host metadata, agent version, restic version, OS)
|
||||
- `heartbeat` (every 30s)
|
||||
- `job.started` (job_id, kind, started_at)
|
||||
- `job.progress` (job_id, percent_done, files_done, total_files,
|
||||
bytes_done, total_bytes, eta_seconds, throughput_bps)
|
||||
- `job.finished` (job_id, status, exit_code, stats, error, finished_at)
|
||||
- `snapshots.report` (full list after each successful backup)
|
||||
- `repo.stats` (size_bytes, snapshot_count, dedup_ratio, last_check_at,
|
||||
last_check_status, lock_state)
|
||||
- `log.stream` (live stdout/stderr lines while job running;
|
||||
{job_id, seq, ts, stream: stdout|stderr|event, payload})
|
||||
|
||||
**Server → agent:**
|
||||
- `command.run` (kind, args)
|
||||
- `command.cancel` (job_id)
|
||||
- `schedule.set` (full schedule list, agent reconciles local cron)
|
||||
- `config.update`
|
||||
- `agent.update` (new version available, URL + signature)
|
||||
|
||||
The server fans out `job.progress` and `log.stream` for a given job to all
|
||||
browsers subscribed to `WS /api/jobs/:id/stream` (§6.1) without
|
||||
transformation, so the schema is shared end-to-end.
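
One possible Go rendering of the `job.progress` message is sketched below. The payload fields come from the list above; the surrounding `Envelope` (a `type` plus raw `data`) is purely an assumption, since this section does not fix an envelope shape, and `job_id` is assumed to be a string.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Envelope is a hypothetical wire wrapper: Type names the message,
// Data carries its payload undecoded until the type is known.
type Envelope struct {
	Type string          `json:"type"`
	Data json.RawMessage `json:"data"`
}

// JobProgress carries the job.progress fields listed in this section.
type JobProgress struct {
	JobID         string  `json:"job_id"`
	PercentDone   float64 `json:"percent_done"`
	FilesDone     int64   `json:"files_done"`
	TotalFiles    int64   `json:"total_files"`
	BytesDone     int64   `json:"bytes_done"`
	TotalBytes    int64   `json:"total_bytes"`
	ETASeconds    int64   `json:"eta_seconds"`
	ThroughputBps int64   `json:"throughput_bps"`
}

// Encode wraps a payload in an Envelope and serialises it.
func Encode(msgType string, payload any) ([]byte, error) {
	data, err := json.Marshal(payload)
	if err != nil {
		return nil, err
	}
	return json.Marshal(Envelope{Type: msgType, Data: data})
}

// DecodeProgress unwraps an envelope and rejects other message types.
func DecodeProgress(raw []byte) (JobProgress, error) {
	var env Envelope
	if err := json.Unmarshal(raw, &env); err != nil {
		return JobProgress{}, err
	}
	if env.Type != "job.progress" {
		return JobProgress{}, fmt.Errorf("unexpected message type %q", env.Type)
	}
	var p JobProgress
	err := json.Unmarshal(env.Data, &p)
	return p, err
}

func main() {
	raw, err := Encode("job.progress", JobProgress{JobID: "j1", PercentDone: 0.5})
	if err != nil {
		panic(err)
	}
	fmt.Println(string(raw))
}
```

Keeping `Data` as `json.RawMessage` lets the server relay a message to subscribed browsers without decoding and re-encoding the payload.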
|
||||
|
||||
### 6.3 Enrollment
|
||||
|
||||
1. Operator clicks "Add host" → server generates one-time token (TTL 1h)
|
||||
2. Operator runs install script on endpoint with token
|
||||
3. Agent calls `POST /api/agents/enroll` with token + host metadata
|
||||
4. Server creates the host record and issues a persistent agent credential (bearer token + TLS pin)
|
||||
5. Agent stores credential, opens WS connection
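
Steps 1 and 3 rely on a token that is valid once and only for an hour. A minimal in-memory sketch of that behaviour follows; `TokenStore`, `Issue`, and `Redeem` are hypothetical names, and a real server would presumably persist tokens (ideally hashed) in SQLite rather than in a map.

```go
package main

import (
	"crypto/rand"
	"encoding/hex"
	"errors"
	"fmt"
	"sync"
	"time"
)

// enrollTTL follows the 1h TTL from step 1.
const enrollTTL = time.Hour

// ErrInvalidToken covers unknown, already used, and expired tokens.
var ErrInvalidToken = errors.New("enrollment token invalid or expired")

// TokenStore is a hypothetical in-memory store of pending tokens.
type TokenStore struct {
	mu      sync.Mutex
	expires map[string]time.Time
}

func NewTokenStore() *TokenStore {
	return &TokenStore{expires: make(map[string]time.Time)}
}

// Issue creates a random token and records when it expires.
func (s *TokenStore) Issue() (string, time.Time, error) {
	buf := make([]byte, 32)
	if _, err := rand.Read(buf); err != nil {
		return "", time.Time{}, err
	}
	tok := hex.EncodeToString(buf)
	exp := time.Now().Add(enrollTTL)
	s.mu.Lock()
	defer s.mu.Unlock()
	s.expires[tok] = exp
	return tok, exp, nil
}

// Redeem consumes a token. It is removed on first use, so a second
// attempt fails even if the token has not expired yet.
func (s *TokenStore) Redeem(tok string, now time.Time) error {
	s.mu.Lock()
	defer s.mu.Unlock()
	exp, ok := s.expires[tok]
	if !ok {
		return ErrInvalidToken
	}
	delete(s.expires, tok)
	if now.After(exp) {
		return ErrInvalidToken
	}
	return nil
}

func main() {
	store := NewTokenStore()
	tok, exp, err := store.Issue()
	if err != nil {
		panic(err)
	}
	fmt.Println("expires:", exp.Format(time.RFC3339))
	fmt.Println("first redeem:", store.Redeem(tok, time.Now()))
	fmt.Println("second redeem:", store.Redeem(tok, time.Now()))
}
```

Returning the same error for every failure mode avoids telling a caller whether a guessed token ever existed.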
|
||||
|
||||
## 7. Security
|
||||
|
||||
### 7.1 Authentication
|
||||
- **Phase 1:** username + password (argon2id), HTTP-only secure session cookies, CSRF tokens on state-changing requests
|
||||
- **Phase 2:** OIDC (Authelia, Keycloak, Authentik)
|
||||
- **Agents:** bearer token over TLS; pin server cert fingerprint at enrollment time
|
||||
|
||||
### 7.2 Authorization (Phase 1: simple roles)
|
||||
- **admin:** everything
|
||||
- **operator:** trigger jobs, edit schedules, restore
|
||||
- **viewer:** read-only
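
The three roles could be enforced with a small check like the one below. This is only a sketch: the `Action` categories are an assumed grouping of endpoints (read-only, job/schedule/restore operations, and everything else), not something this section defines.

```go
package main

import "fmt"

// Role values match User.role in the domain model.
type Role string

const (
	RoleAdmin    Role = "admin"
	RoleOperator Role = "operator"
	RoleViewer   Role = "viewer"
)

// Action is a hypothetical coarse grouping of API operations.
type Action int

const (
	ActionRead    Action = iota // GET-style, read-only requests
	ActionOperate               // trigger jobs, edit schedules, restore
	ActionAdmin                 // user management and other admin-only work
)

// Allowed reports whether a role may perform an action.
// Unknown roles are denied everything.
func Allowed(r Role, a Action) bool {
	switch r {
	case RoleAdmin:
		return true
	case RoleOperator:
		return a == ActionRead || a == ActionOperate
	case RoleViewer:
		return a == ActionRead
	default:
		return false
	}
}

func main() {
	fmt.Println("operator may operate:", Allowed(RoleOperator, ActionOperate))
	fmt.Println("viewer may operate:", Allowed(RoleViewer, ActionOperate))
}
```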
|
||||
|
||||
### 7.3 Secret handling
|
||||
- Restic repo passwords and REST-server credentials encrypted at rest in SQLite using a server-side key (loaded from env or file at startup)
|
||||
- Pushed to agents only over the authenticated WS, only when needed for a job
|
||||
- Agent stores them in OS keyring where available (Windows DPAPI, Linux Secret Service / fallback to encrypted file with restricted perms)
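
Encryption at rest with a server-side key could look like the following sketch. It assumes AES-256-GCM as the AEAD and a 32-byte key; neither choice is stated in this section, and `Seal` / `Open` are hypothetical helper names. The random nonce is stored in front of the ciphertext.

```go
package main

import (
	"crypto/aes"
	"crypto/cipher"
	"crypto/rand"
	"errors"
	"fmt"
)

// newGCM builds an AES-GCM AEAD; a 32-byte key selects AES-256.
func newGCM(key []byte) (cipher.AEAD, error) {
	block, err := aes.NewCipher(key)
	if err != nil {
		return nil, err
	}
	return cipher.NewGCM(block)
}

// Seal encrypts plaintext and returns nonce || ciphertext || tag.
func Seal(key, plaintext []byte) ([]byte, error) {
	gcm, err := newGCM(key)
	if err != nil {
		return nil, err
	}
	nonce := make([]byte, gcm.NonceSize())
	if _, err := rand.Read(nonce); err != nil {
		return nil, err
	}
	// Passing nonce as dst appends the ciphertext after the nonce.
	return gcm.Seal(nonce, nonce, plaintext, nil), nil
}

// Open reverses Seal and fails if the data was modified.
func Open(key, sealed []byte) ([]byte, error) {
	gcm, err := newGCM(key)
	if err != nil {
		return nil, err
	}
	if len(sealed) < gcm.NonceSize() {
		return nil, errors.New("ciphertext too short")
	}
	nonce, ct := sealed[:gcm.NonceSize()], sealed[gcm.NonceSize():]
	return gcm.Open(nil, nonce, ct, nil)
}

func main() {
	key := make([]byte, 32) // all-zero demo key; never use in practice
	sealed, err := Seal(key, []byte("repo-password"))
	if err != nil {
		panic(err)
	}
	plain, err := Open(key, sealed)
	fmt.Println(string(plain), err)
}
```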
|
||||
|
||||
### 7.4 Repo protection
|
||||
- Restic REST server runs with `--append-only` for routine backups
|
||||
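- A separate non-append-only credential exists for `forget`/`prune` operations, used only when explicitly invoked from the UI by an admin/operator, and every use is audited. Because rest-server's `--append-only` flag applies to the whole server process rather than to individual users, this presumably requires a second rest-server instance (without the flag) serving the same data path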
|
||||
|
||||
### 7.5 Audit
|
||||
- Every state-changing UI action and every server→agent command logged with user, target, timestamp, and payload
|
||||
|
||||
## 8. UI
|
||||
|
||||
Stack: HTMX + Tailwind + Go `html/template`. No SPA framework. Server-rendered, progressive enhancement.
|
||||
|
||||
**Pages:**
|
||||
- **Login**
|
||||
- **Dashboard:** fleet overview (host cards: status, last backup, repo size, alerts)
|
||||
- **Host detail:** tabs for Snapshots / Schedules / Jobs / Repo / Settings
|
||||
- **Job detail:** live log streaming via WS, cancel button
|
||||
- **Restore wizard:** host → snapshot → paths → target → confirm
|
||||
- **Repos:** aggregate view across hosts
|
||||
- **Alerts:** list, acknowledge
|
||||
- **Settings:** users (admin), notification channels, agent download
|
||||
- **Audit log**
|
||||
|
||||
## 9. Alerting
|
||||
|
||||
- **Triggers:** backup failed, backup hasn't run in N hours past its schedule, repo `check` failed, agent offline > N minutes, repo size growth anomaly
|
||||
- **Channels (Phase 1):** webhook, ntfy, email (SMTP)
|
||||
- **Channels (Phase 2+):** Discord, Slack, Pushover
|
||||
|
||||
## 10. Deployment
|
||||
|
||||
### 10.1 Control plane (Proxmox host or LXC)
|
||||
|
||||
`docker-compose.yml`:
|
||||
```yaml
|
||||
services:
|
||||
restic-manager:
|
||||
image: ghcr.io/<owner>/restic-manager:latest
|
||||
restart: unless-stopped
|
||||
ports:
|
||||
- "8443:8443"
|
||||
volumes:
|
||||
- ./data:/data
|
||||
- ./certs:/certs:ro
|
||||
environment:
|
||||
- RM_DATA_DIR=/data
|
||||
- RM_LISTEN=:8443
|
||||
- RM_BASE_URL=https://restic.lab.example
|
||||
- RM_TLS_CERT=/certs/fullchain.pem
|
||||
- RM_TLS_KEY=/certs/privkey.pem
|
||||
- RM_SECRET_KEY_FILE=/data/secret.key
|
||||
```
|
||||
|
||||
### 10.2 Restic REST server (Unraid)
|
||||
|
||||
Standard `restic/rest-server` container, `--append-only`, `--private-repos`, htpasswd mounted, data path on the share.
|
||||
|
||||
### 10.3 Agent install
|
||||
|
||||
- **Linux:** `curl -fsSL https://restic.lab.example/install.sh | sudo RM_TOKEN=xxx sh`
|
||||
- **Windows:** `iwr https://restic.lab.example/install.ps1 | iex` (with `$env:RM_TOKEN`)
|
||||
- Installer drops binary + service unit, calls enroll endpoint, starts service
|
||||
|
||||
## 11. Testing strategy
|
||||
|
||||
- **Unit tests:** restic JSON parsing, schedule reconciliation, retention policy logic
|
||||
- **Integration tests:** spin up real `restic` + `rest-server` in Docker, exercise full backup/snapshot/restore flows
|
||||
- **End-to-end:** Playwright against a compose-up'd stack with one Linux agent in a sibling container
|
||||
- **Cross-platform agent CI:** build matrix `linux/amd64`, `linux/arm64`, `windows/amd64`; smoke test on Windows runner
|
||||
|
||||
## 12. Repository layout
|
||||
|
||||
```
|
||||
restic-manager/
|
||||
├── cmd/
|
||||
│ ├── server/
|
||||
│ └── agent/
|
||||
├── internal/
|
||||
│ ├── api/ # shared API types
|
||||
│ ├── server/
|
||||
│ │ ├── http/
|
||||
│ │ ├── ws/
|
||||
│ │ └── ui/ # templates, handlers
|
||||
│ ├── agent/
|
||||
│ │ ├── service/ # systemd / windows service glue
|
||||
│ │ ├── runner/ # restic invocation
|
||||
│ │ └── scheduler/
|
||||
│ ├── restic/ # restic CLI wrapper, --json parsing
|
||||
│ ├── store/ # sqlite layer
|
||||
│ ├── crypto/ # secret encryption
|
||||
│ └── auth/
|
||||
├── web/
|
||||
│ ├── templates/
|
||||
│ └── static/
|
||||
├── deploy/
|
||||
│ ├── docker-compose.yml
|
||||
│ ├── Dockerfile.server
|
||||
│ └── install/
|
||||
│ ├── install.sh
|
||||
│ └── install.ps1
|
||||
├── docs/
|
||||
├── LICENSE # PolyForm Noncommercial 1.0.0
|
||||
├── README.md
|
||||
├── spec.md
|
||||
└── tasks.md
|
||||
```
|
||||
|
||||
## 13. Phased delivery
|
||||
|
||||
- **Phase 1 (MVP):** server skeleton, agent skeleton, enrollment, host list, snapshot list, on-demand backup, live job log
|
||||
- **Phase 2:** schedules, retention, run-now for `forget`/`prune`/`check`/`unlock`, repo stats
|
||||
- **Phase 3:** restore wizard, alerts (webhook/ntfy/email), audit log
|
||||
- **Phase 4:** agent self-update, OIDC, multi-user/RBAC polish, repo trends
|
||||
- **Phase 5:** OSS readiness — docs site, contribution guide, screenshot tour
|
||||
|
||||
## 14. Confirmed extensions (in scope)
|
||||
|
||||
These were originally listed as open questions and have been confirmed for inclusion. Each is slotted into a phase below.
|
||||
|
||||
### 14.1 Cross-host restore
|
||||
|
||||
Restore a snapshot taken on host A onto host B (e.g. recover a dead box onto a fresh one, clone a workload onto a sibling host, restore a developer's home dir onto a new laptop).
|
||||
|
||||
- **Credential model:** target host's agent receives a temporary, server-issued read credential for the source host's repo, scoped to a single restore job and revoked immediately after
|
||||
- **Path remapping:** UI allows rewriting source paths to target paths (e.g. `/home/alice` → `/home/alice-new`)
|
||||
- **Permissions:** restore runs as the agent's service user; UI surfaces a warning when source paths require root and target service user is non-root
|
||||
- **Phase:** 3 (with the restore wizard)
|
||||
|
||||
### 14.2 Bandwidth limiting
|
||||
|
||||
Per-host upload/download caps for backup, restore, and prune jobs.
|
||||
|
||||
- Exposed on the schedule editor as optional `--limit-upload` / `--limit-download` values (restic interprets both in KiB/s)
|
||||
- Also overridable on run-now jobs via the UI
|
||||
- Persisted in `Schedule.options` (JSON blob) so the schema stays stable
|
||||
- **Phase:** 2 (with scheduling)
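
A sketch of how `Schedule.options` might turn into restic flags, including the run-now override, is shown below. `Options`, `Effective`, and `ResticFlags` are hypothetical names; treating zero as "no limit" and letting any non-zero run-now value win are assumptions.

```go
package main

import (
	"fmt"
	"strconv"
)

// Options mirrors the Schedule.options JSON blob from the domain model.
type Options struct {
	LimitUploadKbps   int `json:"limit_upload_kbps"`
	LimitDownloadKbps int `json:"limit_download_kbps"`
}

// Effective returns the schedule's options with any non-zero
// run-now values applied on top.
func Effective(schedule, runNow Options) Options {
	out := schedule
	if runNow.LimitUploadKbps > 0 {
		out.LimitUploadKbps = runNow.LimitUploadKbps
	}
	if runNow.LimitDownloadKbps > 0 {
		out.LimitDownloadKbps = runNow.LimitDownloadKbps
	}
	return out
}

// ResticFlags renders the limits as restic command-line flags.
// A value of zero means "no limit" and produces no flag.
func (o Options) ResticFlags() []string {
	var flags []string
	if o.LimitUploadKbps > 0 {
		flags = append(flags, "--limit-upload", strconv.Itoa(o.LimitUploadKbps))
	}
	if o.LimitDownloadKbps > 0 {
		flags = append(flags, "--limit-download", strconv.Itoa(o.LimitDownloadKbps))
	}
	return flags
}

func main() {
	eff := Effective(Options{LimitUploadKbps: 1024}, Options{LimitDownloadKbps: 2048})
	fmt.Println(eff.ResticFlags())
}
```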
|
||||
|
||||
### 14.3 Pre/post backup hooks
|
||||
|
||||
Per-host shell commands run before and after a backup job. Use cases: `mysqldump`/`pg_dump` to a staging path, stop/start Docker containers, quiesce a service, post-backup notifications.
|
||||
|
||||
- **Schema:** `Schedule.pre_hook` and `Schedule.post_hook` (string, optional). Host-level defaults, `Host.pre_hook_default` / `Host.post_hook_default`, apply to every schedule on that host unless the schedule overrides them
|
||||
- **Execution:** agent runs hooks via the host's default shell (`/bin/sh` Linux, `cmd.exe` or PowerShell Windows — host-configurable)
|
||||
- **Failure semantics:** `pre_hook` non-zero exit aborts the backup and marks the job failed. `post_hook` runs on both success and failure (with `RM_JOB_STATUS` env var); its own exit code is recorded but does not change the backup job's final status
|
||||
- **Stdout/stderr:** captured into `JobLog` like restic output, prefixed `pre_hook:` / `post_hook:`
|
||||
- **Security:** hooks are stored encrypted; only admins can edit them; every edit audit-logged
|
||||
- **Phase:** 2 (with scheduling)
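
The failure semantics above can be sketched as a small orchestration function. Everything named here (`Step`, `Result`, `RunWithHooks`) is hypothetical, and the steps are injected as functions so the control flow can be shown without spawning processes. One assumption goes beyond the text: `post_hook` is also run when `pre_hook` aborts the backup.

```go
package main

import "fmt"

// Step runs one stage with extra environment variables and returns
// its exit code. In the agent it would wrap a process invocation.
type Step func(env map[string]string) int

// Result summarises one backup job with hooks.
type Result struct {
	Status       string // "succeeded" or "failed"
	BackupRan    bool
	PostHookExit int // recorded, but never changes Status
}

// RunWithHooks applies the semantics described above:
// a non-zero pre_hook aborts the backup and fails the job;
// post_hook always runs and sees RM_JOB_STATUS.
func RunWithHooks(pre, backup, post Step) Result {
	res := Result{Status: "succeeded"}
	if pre != nil && pre(nil) != 0 {
		res.Status = "failed"
	} else {
		res.BackupRan = true
		if backup(nil) != 0 {
			res.Status = "failed"
		}
	}
	if post != nil {
		res.PostHookExit = post(map[string]string{"RM_JOB_STATUS": res.Status})
	}
	return res
}

func main() {
	res := RunWithHooks(
		func(map[string]string) int { return 0 }, // pre_hook succeeds
		func(map[string]string) int { return 0 }, // backup succeeds
		func(env map[string]string) int {
			fmt.Println("post_hook sees", env["RM_JOB_STATUS"])
			return 2 // recorded only
		},
	)
	fmt.Printf("%+v\n", res)
}
```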
|
||||
|
||||
### 14.4 Prometheus `/metrics` endpoint
|
||||
|
||||
Standard Prometheus exposition on `/metrics`, protected by either bearer token or IP allow-list.
|
||||
|
||||
- **Metrics (per host):**
|
||||
- `restic_manager_last_backup_timestamp_seconds{host=...}`
|
||||
- `restic_manager_last_backup_status{host=...}` (1=success, 0=failure)
|
||||
- `restic_manager_repo_size_bytes{host=...}`
|
||||
- `restic_manager_snapshot_count{host=...}`
|
||||
- `restic_manager_agent_online{host=...}` (1/0)
|
||||
- `restic_manager_job_duration_seconds{kind=...,host=...}` (histogram; exposed as `_bucket`, `_sum`, and `_count` series)
|
||||
- **Server-level:** `restic_manager_jobs_total{kind=...,status=...}`, `restic_manager_alerts_active`, `restic_manager_build_info`
|
||||
- **Phase:** 4 (alongside repo trend charts — both rely on the same time-series data)
|
||||
|
||||
## 15. Future considerations (not yet committed)
|
||||
|
||||
- Read-only share links for snapshot listings (auditor view) — out of scope for personal/lab use; revisit if multi-tenant or org use cases emerge
|
||||
@@ -0,0 +1,148 @@
|
||||
# restic-manager — Tasks
|
||||
|
||||
Tasks are grouped by phase. Each task has an ID for cross-referencing, an estimated size (S/M/L), and acceptance criteria.
|
||||
|
||||
Sizes: **S** = under a day, **M** = 1–3 days, **L** = 3–7 days.
|
||||
|
||||
---
|
||||
|
||||
## Phase 0 — Project bootstrap
|
||||
|
||||
- [ ] **P0-01** (S) Initialize Go module, `cmd/server`, `cmd/agent`, baseline `internal/` packages
|
||||
- [ ] **P0-02** (S) Add LICENSE (PolyForm Noncommercial 1.0.0), README stub, CONTRIBUTING placeholder
|
||||
- [ ] **P0-03** (S) Set up `golangci-lint`, `gofumpt`, `goimports`; pre-commit config
|
||||
- [ ] **P0-04** (S) GitHub Actions: build matrix (linux amd64/arm64, windows amd64), unit tests, lint
|
||||
- [ ] **P0-05** (S) `Dockerfile.server` (multi-stage, distroless), `deploy/docker-compose.yml`
|
||||
- [ ] **P0-06** (S) Makefile / `taskfile.yml` with common targets (`build`, `test`, `run`, `release`)
|
||||
|
||||
---
|
||||
|
||||
## Phase 1 — MVP: enrollment, visibility, on-demand backup
|
||||
|
||||
### Server foundations
|
||||
- [ ] **P1-01** (M) HTTP server scaffolding (`chi`, structured logging via `slog`, graceful shutdown)
|
||||
- [ ] **P1-02** (M) SQLite store layer (`modernc.org/sqlite`) + migrations (`golang-migrate` or hand-rolled)
|
||||
- [ ] **P1-03** (M) Schema for `users`, `sessions`, `hosts`, `repos`, `credentials`, `jobs`, `job_logs`, `snapshots`, `audit_log`
|
||||
- [ ] **P1-04** (M) Auth: argon2id password hashing, login/logout, session cookies, CSRF middleware
|
||||
- [ ] **P1-05** (S) First-run admin bootstrap (printed one-time setup token in server logs)
|
||||
- [ ] **P1-06** (M) Secret encryption helper (AEAD with key from `RM_SECRET_KEY_FILE`)
|
||||
- [ ] **P1-07** (M) Audit log writer + middleware
|
||||
|
||||
### Agent ↔ server protocol
|
||||
- [ ] **P1-08** (M) Define shared API types in `internal/api` (Go structs, JSON tags)
|
||||
- [ ] **P1-09** (L) WebSocket transport (`nhooyr.io/websocket`), framed JSON envelopes, request/response correlation, ping/pong, reconnect with backoff
|
||||
- [ ] **P1-10** (M) Enrollment flow: `POST /api/agents/enroll` with one-time token → returns persistent bearer + cert pin
|
||||
- [ ] **P1-11** (M) Agent registration on connect (`hello` message → upsert host record, mark online)
|
||||
- [ ] **P1-12** (S) Heartbeat handler (mark host offline after 90s without heartbeat)
|
||||
|
||||
### Agent foundations
|
||||
- [ ] **P1-13** (M) Agent config file (`/etc/restic-manager/agent.yaml` / `%PROGRAMDATA%\restic-manager\agent.yaml`)
|
||||
- [ ] **P1-14** (M) Service integration: systemd unit + Windows service entrypoint
|
||||
- [ ] **P1-15** (M) Outbound WS client with reconnect, server cert pinning
|
||||
- [ ] **P1-16** (M) Restic wrapper: locate `restic` binary, run with `--json`, stream parsed events
|
||||
- [ ] **P1-17** (S) Host metadata collection (OS, arch, hostname, restic version, agent version)
|
||||
|
||||
### Run-now backup
|
||||
- [ ] **P1-18** (L) Job lifecycle: queued → running → succeeded/failed/cancelled, persisted with logs
|
||||
- [ ] **P1-19** (M) Server endpoint `POST /api/hosts/:id/jobs` to dispatch a `backup` command
|
||||
- [ ] **P1-20** (M) Agent executes `restic backup`, streams stdout/stderr + parsed JSON events back as `job.progress` / `log.stream`
|
||||
- [ ] **P1-21** (M) Server persists log stream to `job_logs`, exposes `WS /api/jobs/:id/stream` for live tailing
|
||||
- [ ] **P1-22** (S) Snapshot listing: `restic snapshots --json`, cached projection table, refresh after each backup
|
||||
|
||||
### UI (HTMX + Tailwind)
|
||||
- [ ] **P1-23** (M) Base layout, login page, session-aware nav
|
||||
- [ ] **P1-24** (M) Dashboard: host cards (status dot, last backup, repo size)
|
||||
- [ ] **P1-25** (M) Host detail page: snapshots tab + run-now button
|
||||
- [ ] **P1-26** (M) Live job log viewer (WS-driven, auto-scroll, cancel button)
|
||||
- [ ] **P1-27** (S) "Add host" flow: generate token, copy install command snippet
|
||||
- [ ] **P1-28** (S) Tailwind build via `tailwindcss` standalone binary (no Node)
|
||||
|
||||
### Install scripts
|
||||
- [ ] **P1-29** (M) `install.sh` (Linux): detects arch, downloads agent, installs systemd unit, enrolls
|
||||
- [ ] **P1-30** (M) `install.ps1` (Windows): downloads agent, installs as service, enrolls
|
||||
- [ ] **P1-31** (S) Server endpoint to serve agent binaries + install scripts (signed)
|
||||
|
||||
### Phase 1 acceptance
|
||||
- One Linux + one Windows host can enroll, appear in the dashboard, and a backup can be triggered from the UI with live log streaming. Snapshots list updates after success.
|
||||
|
||||
---
|
||||
|
||||
## Phase 2 — Scheduling, retention, repo operations
|
||||
|
||||
- [ ] **P2-01** (M) Schedule schema + CRUD API
|
||||
- [ ] **P2-02** (L) Server-pushed schedule reconciliation (server is source of truth; agent applies)
|
||||
- [ ] **P2-03** (M) Agent local scheduler (`robfig/cron/v3`); persists next-fire times across restarts
|
||||
- [ ] **P2-04** (M) Schedule editor UI (paths, excludes, tags, cron, retention)
|
||||
- [ ] **P2-05** (M) `forget` command with retention policy (keep-last/daily/weekly/monthly/yearly)
|
||||
- [ ] **P2-06** (M) `prune` command (admin-only, uses non-append-only credential)
|
||||
- [ ] **P2-07** (S) `check` command (random subset + `--read-data-subset`)
|
||||
- [ ] **P2-08** (S) `unlock` command
|
||||
- [ ] **P2-09** (M) Repo stats panel: size, dedup ratio, snapshot count, last check time, lock state
|
||||
- [ ] **P2-10** (S) Run-now buttons for forget/prune/check/unlock on host detail page
|
||||
- [ ] **P2-11** (S) Schedule "next run" / "last run" surfaced on host card
|
||||
- [ ] **P2-12** (S) Bandwidth limit fields on schedule editor (`--limit-upload`, `--limit-download`); also overridable on run-now jobs
|
||||
- [ ] **P2-13** (M) Pre/post backup hooks: schema (`Schedule.pre_hook`, `Schedule.post_hook`, `Host.pre_hook_default`, `Host.post_hook_default`), encrypted at rest, admin-only edit, audit-logged
|
||||
- [ ] **P2-14** (M) Agent execution of hooks: configurable shell per host, `pre_hook` failure aborts backup, `post_hook` always runs with `RM_JOB_STATUS` env var, stdout/stderr captured into `JobLog` with prefix
|
||||
- [ ] **P2-15** (S) Hook editor UI on schedule + host pages, with sensible warnings (e.g. "this hook runs as the agent service user")
|
||||
|
||||
### Phase 2 acceptance
|
||||
- Schedules created in UI run on agents on time; retention is applied; admin can prune from UI; repo health visible per host. Pre/post hooks fire correctly (verified with a Docker stop/start example and a `mysqldump` example). Bandwidth limits honoured.
|
||||
|
||||
---
|
||||
|
||||
## Phase 3 — Restore, alerts, audit
|
||||
|
||||
- [ ] **P3-01** (L) Restore wizard backend: snapshot tree browse via `restic ls --json`, path picker, target selection
|
||||
- [ ] **P3-02** (L) Restore wizard UI (multi-step: host → snapshot → paths → target → confirm)
|
||||
- [ ] **P3-03** (M) Restore execution: `restic restore` invocation, progress streaming
|
||||
- [ ] **P3-04** (L) Cross-host restore: target agent receives a temporary scoped read credential for source host's repo (single-job, auto-revoked); UI supports source→target path remapping; warns when source paths need root and target service user is non-root
|
||||
- [ ] **P3-05** (M) Alert engine: rule evaluation loop (failed backup, stale schedule, agent offline, check failed)
|
||||
- [ ] **P3-06** (M) Notification channels: webhook, ntfy, SMTP email
|
||||
- [ ] **P3-07** (S) Alert UI: list, acknowledge, resolve
|
||||
- [ ] **P3-08** (S) Audit log UI with filters (user, action, target, time range)
|
||||
- [ ] **P3-09** (S) `diff` between two snapshots in UI
|
||||
|
||||
### Phase 3 acceptance
|
||||
- A file deleted on a host can be restored from the UI in under 2 minutes. A failed backup raises an alert via the configured channel within 60s.
|
||||
|
||||
---
|
||||
|
||||
## Phase 4 — Self-update, RBAC polish, OIDC
|
||||
|
||||
- [ ] **P4-01** (L) Agent self-update: signed binary published by server, agent downloads, verifies, swaps, restarts
|
||||
- [ ] **P4-02** (M) Agent version reporting on dashboard; "update all" admin action
|
||||
- [ ] **P4-03** (M) RBAC enforcement at API layer (admin / operator / viewer)
|
||||
- [ ] **P4-04** (S) User management UI (create/edit/disable, role assignment, password reset)
|
||||
- [ ] **P4-05** (L) OIDC login (generic provider config, group → role mapping)
|
||||
- [ ] **P4-06** (M) Repo size trend graphs (sparkline on host card, full chart on repo page)
|
||||
- [ ] **P4-07** (S) Per-host tags + dashboard filtering by tag
|
||||
- [ ] **P4-08** (M) Prometheus `/metrics` endpoint: per-host gauges (last backup timestamp, last backup status, repo size, snapshot count, agent online), server gauges (active alerts, build info), job duration histograms; protected by bearer token or IP allow-list
|
||||
- [ ] **P4-09** (S) Document Prometheus integration + sample Grafana dashboard JSON
|
||||
|
||||
### Phase 4 acceptance
|
||||
- Non-admin users see an appropriately limited UI. Agents update themselves with one click. OIDC login works against at least one provider (Authelia or Authentik). Prometheus can scrape `/metrics` and the sample Grafana dashboard renders with live data.
|
||||
|
||||
---
|
||||
|
||||
## Phase 5 — OSS readiness
|
||||
|
||||
- [ ] **P5-01** (M) Documentation site (mdBook or similar) with install, concepts, security model, screenshots
|
||||
- [ ] **P5-02** (S) `CONTRIBUTING.md`, `CODE_OF_CONDUCT.md`, issue + PR templates
|
||||
- [ ] **P5-03** (S) Release automation: `goreleaser` for binaries + Docker image to GHCR
|
||||
- [ ] **P5-04** (S) Demo screenshots / short Loom walkthrough in README
|
||||
- [ ] **P5-05** (S) `SECURITY.md` with disclosure process
|
||||
- [ ] **P5-06** (M) End-to-end test suite in CI (Playwright vs. compose stack with sibling Linux agent)
|
||||
- [ ] **P5-07** (S) Sample `docker-compose.yml` with TLS via Caddy sidecar
|
||||
- [ ] **P5-08** (S) Optional Prometheus `/metrics` endpoint
|
||||
|
||||
### Phase 5 acceptance
|
||||
- A stranger can read the docs and stand up a working install in under 30 minutes.
|
||||
|
||||
---
|
||||
|
||||
## Cross-cutting / ongoing
|
||||
|
||||
- [ ] **X-01** Keep CHANGELOG.md updated (Keep-a-Changelog format)
|
||||
- [ ] **X-02** Track restic version compatibility matrix
|
||||
- [ ] **X-03** Periodic dependency updates (`dependabot` or `renovate`)
|
||||
- [ ] **X-04** Threat-model review at end of each phase
|
||||