## Context
The kanban sync workflow is currently a 280-line markdown document (`.claude/commands/kanban-sync.md`) that the LLM reads and interprets each time sync is triggered. The sync logic is deterministic — it reads OpenSpec state, compares it with Planka board state, and reconciles. No LLM judgment is needed. The overhead is significant: each sync consumes tokens for reasoning through ~10 phases and executing ~15-30 shell commands.
There is also no concurrency protection. If two opsx workflows complete in quick succession, two syncs could run simultaneously and produce conflicting Planka API calls.
## Goals / Non-Goals
**Goals:**
- Replace LLM-interpreted sync with a standalone bash script
- Implement concurrency control so only one sync runs at a time
- Support coalescing: if sync is requested while one runs, queue exactly one follow-up
- Support foreground (default, for humans) and background (for LLM/skill invocation) modes
- Keep the existing sync logic and board structure unchanged

**Non-Goals:**
- Changing the sync algorithm or board layout
- Adding bidirectional sync (Planka → OpenSpec)
- Supporting multiple simultaneous boards
- Adding a daemon/service mode — this remains an on-demand script
## Decisions
### 1. Externally deployed bash script (`kanban-project-sync`)
**Rationale**: The sync logic uses `pcli`, `jq`, `yq`, and `openspec` CLI tools — all shell-native. A bash script is the natural fit. The script is deployed to developer instances via existing tooling and placed on PATH, like other shared scripts. It does not live in the project repo.
**Convention**: The script assumes it is run from the project root directory and uses `pwd` to locate the `openspec/` directory. This is consistent with how `openspec` and `pcli` already work.
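
A minimal sketch of how the script could enforce this convention at startup. The function name `require_project_root` and the error wording are illustrative assumptions, not taken from the real script.

```bash
# Hypothetical guard: refuse to run anywhere but a project root.
require_project_root() {
  local root
  root=$(pwd)
  if [ ! -d "$root/openspec" ]; then
    echo "error: no openspec/ directory in $root; run from the project root" >&2
    return 1
  fi
}
```

Calling `require_project_root || exit 1` before any other work would turn a wrong working directory into an immediate, readable failure.
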
**Alternative considered**: A Go subcommand in pcli (`pcli sync`). Rejected because the sync orchestrates `openspec` (a separate tool) and would require shelling out anyway. The script approach keeps concerns separated.
**Alternative considered**: Placing the script in the project root alongside `build.sh`. Rejected — the script is a shared developer tool, not project-specific. External deployment keeps it consistent with other tooling.
### 2. `flock` for locking with pending file for coalescing
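**Rationale**: `flock` (the util-linux wrapper around the `flock(2)` system call) is the standard advisory locking tool on Linux, though it is not part of POSIX. Acquiring the lock is atomic, the kernel releases it automatically when the holding process exits or crashes, and no manual PID-file management is needed. The pending flag file is a simple touch/rm mechanism that coalesces multiple queued requests into one re-run.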
**Lock file**: `/tmp/kanban-project-sync-<project>-<board>.lock` — scoped per project-board to allow independent syncs for different boards.
**Pending file**: `/tmp/kanban-project-sync-<project>-<board>.pending` — touched by callers that can't acquire the lock.
**Flow**:
```
Caller:
  Try flock (non-blocking)
  ├── Got lock:
  │     loop:
  │       run_sync()
  │       if pending file exists: rm pending, continue loop
  │       else: break
  │     release lock (fd close)
  └── Lock held:
        touch pending file
        exit 0
```
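
The flow above could be sketched in bash roughly as follows. This is an illustration under assumptions, not the actual script: `run_sync` is a placeholder for the real reconciliation, `sync_with_lock` is a hypothetical name, and file descriptor 9 is an arbitrary choice.

```bash
# Placeholder for the real reconciliation work (an assumption for this sketch).
run_sync() {
  echo "sync run for $1/$2"
}

# Hypothetical wrapper implementing the lock-and-coalesce flow.
sync_with_lock() {
  local project=$1 board=$2
  local base="/tmp/kanban-project-sync-${project}-${board}"
  local lock="${base}.lock" pending="${base}.pending"
  local status=0

  # Open the lock file on fd 9; the lock lasts as long as the fd stays open.
  exec 9>"$lock"

  if ! flock -n 9; then
    # Another sync holds the lock: leave a request for it and report success.
    touch "$pending"
    exec 9>&-
    return 0
  fi

  while true; do
    run_sync "$project" "$board" || { status=$?; break; }
    if [ -e "$pending" ]; then
      rm -f "$pending"   # any number of queued requests become one re-run
      continue
    fi
    break
  done

  exec 9>&-              # closing the descriptor releases the lock
  return "$status"
}
```

Using a numbered descriptor rather than `flock <file> <command>` keeps the whole loop inside a single lock acquisition, so the pending check and the re-run happen without ever giving the lock up.
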
**Alternative considered**: PID file with `kill -0` checks. Rejected — racy, doesn't handle crashes cleanly, more complex than `flock`.
### 3. Foreground default, `--background` flag for detached mode
**Rationale**: Humans running the script manually want to see output. The LLM/skill always passes `--background` to avoid blocking. Background mode uses `nohup` + `&` with output redirected to a log file at `/tmp/kanban-project-sync-<project>-<board>.log`.
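
A rough sketch of the detaching step, assuming the script re-invokes a command under `nohup`. The helper names `run_detached` and `log_path` are hypothetical; only the log path pattern comes from this document.

```bash
# Build the per project-board log path described above.
log_path() {
  printf '/tmp/kanban-project-sync-%s-%s.log\n' "$1" "$2"
}

# Hypothetical helper: run a command detached from the caller.
run_detached() {
  local log=$1
  shift
  # ">" truncates the log, so only the latest run's output is kept.
  # stdin comes from /dev/null so the job never waits on a terminal.
  nohup "$@" >"$log" 2>&1 </dev/null &
}
```

A background invocation might then look like `run_detached "$(log_path myproj main)" kanban-project-sync --project myproj --board main`, where `myproj` and `main` are example values; the call returns immediately while the sync keeps running.
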
### 4. `--project` and `--board` are required CLI arguments; the caller reads `project.yaml`
**Rationale**: `--project` and `--board` are required arguments. The script does NOT read `project.yaml` itself — that's the caller's concern. This keeps the script generic. The kanban-sync command/skill reads `project.yaml` and passes the values. A human can pass whatever project/board they want.
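
Argument handling along these lines would match the decision. This is a sketch: the function name `parse_args`, the variable names, the usage text, and returning 1 on bad arguments are all assumptions.

```bash
# Hypothetical parser: sets PROJECT, BOARD, and BACKGROUND, or fails.
parse_args() {
  PROJECT="" BOARD="" BACKGROUND=0
  while [ $# -gt 0 ]; do
    case $1 in
      --project|--board)
        # Guard against a flag given without its value.
        [ $# -ge 2 ] || { echo "error: $1 needs a value" >&2; return 1; }
        if [ "$1" = --project ]; then PROJECT=$2; else BOARD=$2; fi
        shift 2
        ;;
      --background)
        BACKGROUND=1
        shift
        ;;
      *)
        echo "error: unknown argument: $1" >&2
        return 1
        ;;
    esac
  done
  if [ -z "$PROJECT" ] || [ -z "$BOARD" ]; then
    echo "usage: kanban-project-sync --project NAME --board NAME [--background]" >&2
    return 1
  fi
}
```

Because nothing here touches `project.yaml`, the same script serves both the kanban-sync skill, which looks the values up first, and a human who types them directly.
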
### 5. Exit codes
| Code | Meaning |
|------|---------|
| 0 | Success (sync completed) or queued (pending flag set) |
| 1 | Error (sync failed) |
| 2 | Skipped (Planka offline / connectivity check failed) |

When a caller can't acquire the lock and sets the pending flag, it exits 0: from the caller's perspective, the sync will happen.
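
One way a single sync pass could map its outcomes onto these codes is sketched here. `check_planka` and `reconcile` are placeholders standing in for the real connectivity check and sync logic, and the `EXIT_*` names are assumptions.

```bash
EXIT_OK=0
EXIT_ERROR=1
EXIT_SKIPPED=2

# Placeholders; the real script would call its CLI tools here.
check_planka() { return 0; }
reconcile()    { return 0; }

# Hypothetical single pass: skip when Planka is unreachable, fail on any
# sync error, succeed otherwise.
sync_once() {
  check_planka || return "$EXIT_SKIPPED"
  reconcile    || return "$EXIT_ERROR"
  return "$EXIT_OK"
}
```

Collapsing every reconciliation failure into 1 lets a caller tell "Planka is down, try again later" (2) apart from "something is actually broken" (1) without parsing output.
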
## Risks / Trade-offs
**[`flock` availability on macOS]** → macOS doesn't ship `flock` by default. Mitigation: document that `brew install util-linux` is needed. This is a dev tooling script, not a production binary — acceptable friction.
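
A preflight check could make a missing `flock` fail fast with a clear message instead of failing partway through a sync. This is a sketch; `require_tools` is a hypothetical helper, and returning 1 for a missing dependency is an assumption.

```bash
# Hypothetical preflight check; reports every missing tool before failing.
require_tools() {
  local missing=0 tool
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "error: required tool not found on PATH: $tool" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

The script might call `require_tools flock pcli jq yq openspec || exit 1` before acquiring the lock, covering the other CLI dependencies at the same time.
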
**[Background mode log management]** → Log files in `/tmp` accumulate. Mitigation: each run overwrites its log file rather than appending to it, so only the last run's output is kept per project-board pair.
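**[Pending flag race window]** → There is a small window between the holder's final "check pending" and "release lock" in which a caller that failed to acquire the lock can still touch the pending file. Holding the lock does not close this window, because callers set the flag precisely while the lock is held. Mitigation: the lock holder is the only one that reads or clears the pending file, so a late flag is not discarded. It stays on disk and triggers one extra run the next time any sync acquires the lock. The cost is a delayed sync, not a dropped one.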
**[Script drift from sync logic]** → If the sync algorithm changes, the script must be updated manually (no longer auto-follows the markdown doc). Mitigation: The sync logic has been stable. The kanban-sync.md doc will reference the script, making it clear where changes go.