emcli/specifications/SPEC.md
steve e3f8afbc7c Spec: replace read pointer with per-message seen-set model
Reading state is now a per-(account,folder) floor plus an acked set of
UIDs above it, instead of a single monotonic pointer. This makes
acknowledgement per-message and order-independent so concurrent
subagents can process and ack out of order. Internal compaction collapses
contiguous acked runs into the floor to bound storage. Adds stateless
search and ack commands; reads no longer mutate state.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-21 21:01:34 +01:00


emcli — Specification

Status: Draft for implementation
Source: specifications/PRD.md
Date: 2026-06-21

1. Summary

emcli is a single cross-platform Go binary that mediates all email access for an AI agent. The agent never holds email credentials and never connects to IMAP/SMTP directly — every read and send passes through emcli, which enforces user-configured restrictions. The goal is to contain the blast radius of agent hallucination when operating on live email: even with faulty instructions, the agent cannot read mail it is not permitted to see or send mail to recipients it is not permitted to contact.

2. Goals & non-goals

Goals

  • Single static binary, no runtime dependencies, cross-compiled per OS/arch.
  • Email credentials and OAuth tokens are encrypted at rest and never exposed to the agent.
  • Per-account enforcement of read-only/read-write mode, inbound/outbound whitelists, and subject filtering.
  • Machine-readable JSON for agent commands; human-readable/interactive output for admin.
  • Auditable history of agent actions with configurable retention.

Non-goals

  • Not a full email client (no threading UI, no flags/labels management beyond what is needed, no draft management).
  • No graphical UI. TUI is used only for init and (re)configuration.
  • No multi-user or server mode: emcli is a local utility that runs as one short-lived process per command.

3. Technology choices

Concern                 Choice
Language/runtime        Go (no CGO)
IMAP                    github.com/emersion/go-imap
SMTP                    github.com/emersion/go-smtp
MIME parsing/building   github.com/emersion/go-message
SASL / XOAUTH2          github.com/emersion/go-sasl
SQLite                  modernc.org/sqlite (pure Go)
TUI                     github.com/charmbracelet/bubbletea (+ lipgloss)
Crypto                  Go stdlib crypto/aes + crypto/cipher (AES-256-GCM)

Rationale: the emersion email libraries are mature and designed to be used together, including XOAUTH2 for Gmail. Pure-Go SQLite avoids CGO so the binary is genuinely static and trivially cross-compiled.

4. Architecture

The binary is organized into independently testable packages:

  • store — encrypted SQLite config and state. Owns the schema, migrations, and field-level encryption of secret columns.
  • policy — pure enforcement functions: mode (RO/RW), whitelist-in, whitelist-out, subject regex, folder access. No I/O. The single gate every agent action passes through.
  • mail — IMAP read and SMTP send, including SASL XOAUTH2 and password auth.
  • oauth — loopback-redirect consent flow, refresh-token storage, automatic access-token refresh.
  • audit — append-only action log with retention-based purge.
  • cli — command dispatch and the two output surfaces (agent JSON, admin human-readable/TUI).

Trust boundary

  • The agent invokes only the agent commands (Section 7.1).
  • EMCLI_KEY is supplied by the environment/orchestrator that launches emcli, never as an argument the agent constructs. The agent has no command that reveals secret values.
  • All policy decisions happen inside emcli; the agent cannot bypass them because it has no other path to the mail servers.

5. Configuration & secrets

  • Encryption key: EMCLI_KEY env var, a base64-encoded 32-byte key (AES-256). If absent or malformed, every command that touches the DB fails closed with an error envelope; no plaintext fallback.
  • Database path: EMCLI_DB env var; default ~/.config/emcli/emcli.db (%AppData%\emcli\emcli.db on Windows).
  • Field-level encryption: secret columns are stored as AES-256-GCM ciphertext with a fresh random 96-bit nonce per value, prepended to the ciphertext. Non-secret config remains plaintext for debuggability. Decryption with the wrong key fails the GCM authentication-tag check and is surfaced as an error, never silently ignored.

Secret columns (the enc_* columns in Section 6): account password, OAuth client ID, OAuth client secret, OAuth refresh token.
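
The key handling and field encryption described above could look roughly like the following. This is a minimal sketch using only the Go standard library, not the project's actual store code: the function names (parseKey, sealField, openField) are hypothetical, and standard (non-URL) base64 is an assumption, since the spec only says "base64-encoded".

```go
package main

import (
	"crypto/aes"
	"crypto/cipher"
	"crypto/rand"
	"encoding/base64"
	"errors"
	"fmt"
	"os"
)

// parseKey decodes EMCLI_KEY and requires exactly 32 bytes (AES-256).
// Assumption: standard base64 alphabet with padding.
func parseKey(b64 string) ([]byte, error) {
	key, err := base64.StdEncoding.DecodeString(b64)
	if err != nil {
		return nil, fmt.Errorf("EMCLI_KEY is not valid base64: %w", err)
	}
	if len(key) != 32 {
		return nil, errors.New("EMCLI_KEY must decode to exactly 32 bytes")
	}
	return key, nil
}

func newGCM(key []byte) (cipher.AEAD, error) {
	block, err := aes.NewCipher(key)
	if err != nil {
		return nil, err
	}
	return cipher.NewGCM(block)
}

// sealField returns nonce || ciphertext || tag for one secret value.
func sealField(key, plaintext []byte) ([]byte, error) {
	gcm, err := newGCM(key)
	if err != nil {
		return nil, err
	}
	nonce := make([]byte, gcm.NonceSize()) // 12 bytes = 96 bits
	if _, err := rand.Read(nonce); err != nil {
		return nil, err
	}
	// Passing nonce as dst makes Seal append the ciphertext after it.
	return gcm.Seal(nonce, nonce, plaintext, nil), nil
}

// openField splits the stored blob and authenticates it; a wrong key
// or tampered blob yields an error rather than garbage plaintext.
func openField(key, blob []byte) ([]byte, error) {
	gcm, err := newGCM(key)
	if err != nil {
		return nil, err
	}
	if len(blob) < gcm.NonceSize() {
		return nil, errors.New("ciphertext too short")
	}
	nonce, ct := blob[:gcm.NonceSize()], blob[gcm.NonceSize():]
	return gcm.Open(nil, nonce, ct, nil)
}

func main() {
	key, err := parseKey(os.Getenv("EMCLI_KEY"))
	if err != nil {
		fmt.Println("fail closed:", err) // no plaintext fallback
		return
	}
	blob, err := sealField(key, []byte("app-password"))
	if err != nil {
		fmt.Println("encrypt failed:", err)
		return
	}
	plain, err := openField(key, blob)
	fmt.Println(string(plain), err)
}
```

Because the nonce travels with each value, every column can be decrypted independently, at a cost of 12 bytes of nonce plus the 16-byte GCM tag per field.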

6. Data model (SQLite)

accounts
  id                   INTEGER PK
  name                 TEXT UNIQUE          -- agent-facing identifier
  mode                 TEXT                 -- 'RO' | 'RW'
  imap_host            TEXT
  imap_port            INTEGER
  imap_security        TEXT                 -- 'tls' | 'starttls'
  smtp_host            TEXT                 -- nullable for RO accounts
  smtp_port            INTEGER
  smtp_security        TEXT                 -- 'tls' | 'starttls'
  auth_type            TEXT                 -- 'password' | 'oauth2'
  username             TEXT
  enc_password         BLOB                 -- encrypted (password auth)
  enc_oauth_client_id  BLOB                 -- encrypted (oauth2)
  enc_oauth_client_secret BLOB              -- encrypted (oauth2)
  enc_oauth_refresh_token BLOB              -- encrypted (oauth2)
  whitelist_in_enabled  INTEGER             -- 0 | 1
  whitelist_out_enabled INTEGER             -- 0 | 1
  subject_regex        TEXT                 -- nullable; blank/null = no subject filter
  process_backlog      INTEGER              -- 0 | 1; baseline policy for newly-seen folders
                                            --   0 (default) = floor at current max UID
                                            --   1           = floor at 0 (process existing mail)

whitelist_in
  account_id           INTEGER FK
  address              TEXT                 -- exact addr or '@domain.com'

whitelist_out
  account_id           INTEGER FK
  address              TEXT

folder_state                              -- one row per folder ever seen
  account_id           INTEGER FK
  folder               TEXT
  uidvalidity          INTEGER
  floor_uid            INTEGER              -- nothing at or below this is ever "new"
  PRIMARY KEY (account_id, folder)

acked                                     -- individual processed UIDs above the floor
  account_id           INTEGER FK
  folder               TEXT
  uidvalidity          INTEGER
  uid                  INTEGER
  PRIMARY KEY (account_id, folder, uid)

audit_log
  id                   INTEGER PK
  ts                   TEXT                 -- RFC3339 UTC
  account              TEXT
  action               TEXT                 -- 'list' | 'get' | 'send' | 'ack' | 'search'
  target               TEXT                 -- folder, UID(s), or recipient set
  result               TEXT                 -- 'allowed' | 'blocked'
  reason               TEXT                 -- nullable; populated on block

settings
  key                  TEXT PK
  value                TEXT
  -- includes: audit_retention_days, schema_version

Notes:

  • Folders are agent-specified; there is no folder whitelist. Read state is tracked per (account, folder).
  • "New" is a seen-set, not a watermark. A message is "new" when it exists in the folder, its uid > floor_uid, and its uid is not in acked. This makes acknowledgement per-message and order-independent — essential when the agent fans processing out to concurrent subagents that finish out of order.
  • Floor baseline. On first contact with a folder, floor_uid is set from the account's process_backlog policy: the current highest UID (default — existing mail is treated as already handled) or 0 (process the existing backlog).
  • Compaction (internal, invisible to the agent). When acked holds a contiguous run of UIDs immediately above floor_uid, that run is collapsed: floor_uid advances past it and the rows are deleted. This bounds storage without changing what counts as new. A folder processed strictly in order degenerates to a single floor value with an empty acked set (i.e. the watermark case); out-of-order processing leaves short-lived holes above the floor until they fill in.
  • UIDVALIDITY change. If the server reports a different UIDVALIDITY for a folder than the stored value, UIDs are no longer comparable: folder_state and acked for that folder are reset and the floor re-baselined per process_backlog.
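
The seen-set rule, compaction, and UIDVALIDITY reset from the notes above can be modelled in memory as follows. This is an illustrative sketch, not the store package's implementation: the type and method names are hypothetical, and it reads "contiguous run" as numerically consecutive UIDs, which is an assumption (a real implementation might instead compact over the UIDs that actually exist on the server).

```go
package main

import "fmt"

// FolderState mirrors one folder_state row plus its acked rows.
type FolderState struct {
	UIDValidity uint32
	Floor       uint32
	Acked       map[uint32]struct{}
}

// Baseline builds the state for a newly-seen folder from process_backlog.
func Baseline(uidValidity, maxUID uint32, processBacklog bool) *FolderState {
	floor := maxUID // default: existing mail counts as already handled
	if processBacklog {
		floor = 0 // process the existing backlog
	}
	return &FolderState{UIDValidity: uidValidity, Floor: floor, Acked: map[uint32]struct{}{}}
}

// IsNew applies the seen-set rule to a UID known to exist in the folder.
func (s *FolderState) IsNew(uid uint32) bool {
	if uid <= s.Floor {
		return false
	}
	_, acked := s.Acked[uid]
	return !acked
}

// Ack records UIDs in any order, then compacts. Re-acking is a no-op.
func (s *FolderState) Ack(uids ...uint32) {
	for _, uid := range uids {
		if uid > s.Floor {
			s.Acked[uid] = struct{}{}
		}
	}
	s.compact()
}

// compact folds the run of acked UIDs directly above the floor into it.
func (s *FolderState) compact() {
	for {
		next := s.Floor + 1
		if _, ok := s.Acked[next]; !ok {
			return
		}
		delete(s.Acked, next)
		s.Floor = next
	}
}

// Sync resets and re-baselines the folder if UIDVALIDITY has changed.
func (s *FolderState) Sync(uidValidity, maxUID uint32, processBacklog bool) {
	if uidValidity != s.UIDValidity {
		*s = *Baseline(uidValidity, maxUID, processBacklog)
	}
}

func main() {
	s := Baseline(1, 100, false)
	s.Ack(103)
	s.Ack(102)                                      // out of order: 101 is still a hole
	fmt.Println(s.Floor, len(s.Acked), s.IsNew(101)) // 100 2 true
	s.Ack(101)                                      // hole fills, run collapses
	fmt.Println(s.Floor, len(s.Acked))               // 103 0
}
```

In the demo, acking 103 and 102 leaves two rows above the floor; acking 101 closes the gap and the state collapses back to a single floor value, which is the watermark case.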

7. Command surface

7.1 Agent commands (JSON output only)

All agent commands emit a single JSON object (Section 8) and nothing else on stdout.

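All read commands leave read state untouched: they never advance the floor or add to the acked set. Establishing a folder's baseline on first contact, or resetting it after a UIDVALIDITY change (Section 6), is bookkeeping rather than acknowledgement. The only command that advances read state is ack.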

emcli list --account <name> --folder <folder> [--new] [--before <uid>] [--since <uid>] [--limit N]

  • Returns message headers only: uid, from, to, subject, date, message_id, has_attachments. Newest-first.
  • --new filters to messages that are new per the seen-set rule (uid > floor_uid and not in acked). It does not advance any state.
  • --before <uid> / --since <uid> page through history by UID cursor (e.g. page older by passing the lowest UID from the previous page as --before).
  • --limit caps results (default applied if omitted; see 7.3).
  • Whitelist-in and subject-regex filtering are applied before results are returned (Section 9); filtered messages are invisible in every mode.
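
As a rough illustration of the cursor paging above, the sketch below selects one page from a set of already-filtered UIDs. The ListOptions type and selectPage function are hypothetical. Treating --before as exclusive follows from the paging example in the text; treating --since as exclusive too, and reading "newest-first" as highest UID first, are assumptions.

```go
package main

import (
	"fmt"
	"sort"
)

// ListOptions holds the parsed flags; zero means "flag not given".
type ListOptions struct {
	Before uint32 // keep only uid < Before
	Since  uint32 // keep only uid > Since (assumed exclusive)
	Limit  int
}

// selectPage applies the cursor bounds, orders newest-first, then caps.
func selectPage(uids []uint32, opt ListOptions) []uint32 {
	page := make([]uint32, 0, len(uids))
	for _, uid := range uids {
		if opt.Before != 0 && uid >= opt.Before {
			continue
		}
		if opt.Since != 0 && uid <= opt.Since {
			continue
		}
		page = append(page, uid)
	}
	sort.Slice(page, func(i, j int) bool { return page[i] > page[j] })
	if opt.Limit > 0 && len(page) > opt.Limit {
		page = page[:opt.Limit]
	}
	return page
}

func main() {
	uids := []uint32{10, 11, 12, 13, 14, 15}
	first := selectPage(uids, ListOptions{Limit: 2})
	fmt.Println(first) // [15 14]
	// Page older: pass the lowest UID of the previous page as Before.
	next := selectPage(uids, ListOptions{Before: first[len(first)-1], Limit: 2})
	fmt.Println(next) // [13 12]
}
```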

emcli get --account <name> --folder <folder> --uid <uid>

  • Returns full message: headers, decoded plain-text body, and attachments as {name, size, mime, content_b64}.
  • Does not ack the message — fetching to inspect is distinct from consuming.
  • If the message is filtered by whitelist-in or subject-regex, returns an error envelope (not-found) — the agent cannot retrieve filtered mail.

emcli search --account <name> --folder <folder> [--from <addr>] [--subject-contains <s>] [--text <s>] [--since <date>] [--before <date>] [--limit N]

  • Server-side IMAP SEARCH across the whole folder, regardless of floor/ack state. Returns the same headers-only shape as list. Useful for finding specific historical mail.
  • Subject to the same inbound filtering as list/get: filtered messages never appear.
  • Stateless.

emcli ack --account <name> --folder <folder> --uid <uid>…

  • Marks one or more UIDs as processed (adds them to the acked set; triggers compaction).
  • Idempotent, batchable, and order-independent — safe to call from concurrent subagents.
  • After ack, those UIDs no longer appear under list --new.
  • A filtered/invisible UID cannot be acked (returns not-found), preventing the agent from manipulating state for mail it isn't allowed to see.

emcli send --account <name> --to <addr>… [--cc <addr>…] [--bcc <addr>…] --subject <s> --body <text> [--attach <path>]… [--reply-to <uid>]

  • Sends a plain-text message via the account's SMTP endpoint.
  • --reply-to <uid> fetches the source message's Message-ID and References and sets In-Reply-To/References headers so the reply threads correctly. The referenced UID is read from the same account (subject to inbound filtering — a filtered source UID cannot be replied to).
  • Enforcement: RO accounts are rejected; whitelist-out (if enabled) must pass for every recipient across to/cc/bcc or the entire send is blocked (Section 9).
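
The threading behaviour of --reply-to can be sketched as a pure function over the source message's headers. The helper name replyHeaders is hypothetical; the rule it applies (In-Reply-To is the parent's Message-ID, References is the parent's References followed by the parent's Message-ID) is the usual RFC 5322 convention, and the spec does not say whether emcli handles further edge cases such as a parent with In-Reply-To but no References.

```go
package main

import (
	"fmt"
	"strings"
)

// replyHeaders derives the reply's threading headers from the source
// message's Message-ID and References header values.
func replyHeaders(srcMessageID, srcReferences string) (inReplyTo, references string) {
	inReplyTo = strings.TrimSpace(srcMessageID)
	refs := strings.Fields(srcReferences) // also collapses folded whitespace
	if inReplyTo != "" {
		refs = append(refs, inReplyTo)
	}
	return inReplyTo, strings.Join(refs, " ")
}

func main() {
	irt, refs := replyHeaders("<c@example.org>", "<a@example.org> <b@example.org>")
	fmt.Println("In-Reply-To:", irt)
	fmt.Println("References:", refs)
}
```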

7.2 Admin commands (human-readable / TUI)

  • emcli init — TUI flow: creates the DB (generating schema), adds the first account, and runs OAuth consent if the account is OAuth2.
  • emcli account add | edit | remove | list — interactive add/edit; list prints a table (never secrets). account add accepts --process-backlog (default off) which sets the account's baseline policy: off ⇒ newly-seen folders floor at their current max UID (existing mail treated as handled); on ⇒ floor at 0 (existing mail is processed).
  • emcli whitelist in|out add|remove|list --account <name> — manage whitelist entries.
  • emcli config set|get — global settings (e.g. audit_retention_days).
  • emcli audit list [--account <name>] [--limit N] — view recent audit entries.
  • emcli doctor — verifies EMCLI_KEY is present and valid, the DB opens, and each account's IMAP/SMTP connectivity and auth succeed. Human-readable diagnostics.

7.3 Defaults & limits

  • list --limit default: 50; maximum: 500.
  • Attachment handling in get: full base64 contents are returned. (No size cap in v1; the caller is responsible for limits. Revisit if payloads prove unwieldy.)

8. JSON output envelope

Every agent command prints exactly one object:

{
  "error": false,
  "error_detail": {},
  "data": {}
}

  • error: boolean.
  • error_detail: object; empty {} on success, otherwise { "code": "...", "message": "..." }. Never contains secret values.
  • data: command-specific payload; {} or [] when not applicable.
  • Process exit code mirrors error (0 on success, non-zero on error) for scripting, but the JSON is authoritative.
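
One way to model the envelope in Go is shown below. It is a sketch under assumptions: the type and constructor names are hypothetical, and the codes "not_found" and "internal" are placeholders, since the spec only requires that codes be stable.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// ErrorDetail is the non-empty form of error_detail.
type ErrorDetail struct {
	Code    string `json:"code"`
	Message string `json:"message"`
}

// Envelope is the single object every agent command prints.
type Envelope struct {
	Error       bool `json:"error"`
	ErrorDetail any  `json:"error_detail"`
	Data        any  `json:"data"`
}

// Success wraps a payload; nil becomes {} so the key is always an object.
func Success(data any) Envelope {
	if data == nil {
		data = struct{}{}
	}
	return Envelope{Error: false, ErrorDetail: struct{}{}, Data: data}
}

// Failure builds an error envelope with an empty data payload.
func Failure(code, message string) Envelope {
	return Envelope{
		Error:       true,
		ErrorDetail: ErrorDetail{Code: code, Message: message},
		Data:        struct{}{},
	}
}

// Encode never fails outward: an unencodable payload degrades to a
// fixed error envelope so stdout still carries exactly one JSON object.
func (e Envelope) Encode() []byte {
	b, err := json.Marshal(e)
	if err != nil {
		return []byte(`{"error":true,"error_detail":{"code":"internal","message":"failed to encode response"},"data":{}}`)
	}
	return b
}

// exitCode mirrors the error flag for scripting.
func exitCode(e Envelope) int {
	if e.Error {
		return 1
	}
	return 0
}

func main() {
	ok := Success(map[string]int{"acked": 2})
	fmt.Println(string(ok.Encode()), exitCode(ok))
	bad := Failure("not_found", "message not found")
	fmt.Println(string(bad.Encode()), exitCode(bad))
}
```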

9. Enforcement semantics

Enforcement lives entirely in the policy package and is exercised on every agent action.

Inbound (read: list, get, search, ack)

  • If whitelist_in_enabled, the message sender must match a whitelist_in entry.
  • If subject_regex is set (non-empty), the subject must match the regex.
  • A message that fails either check is invisible: it is excluded from list and search results, and get and ack on its UID return not-found. The agent has no way to learn that the message exists or to alter read state for it.
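
The inbound rules above might be expressed as a single visibility predicate that list/search use to drop rows and get/ack use to answer not-found. This is a sketch only: the types are hypothetical, the whitelist lookup is passed in as a callback to keep the example short, and using an unanchored regexp match for "the subject must match" is an assumption.

```go
package main

import (
	"errors"
	"fmt"
	"regexp"
)

// ErrNotFound is returned for missing and filtered messages alike.
var ErrNotFound = errors.New("not-found")

// Header is a pared-down message header (illustrative type).
type Header struct {
	UID     uint32
	From    string
	Subject string
}

// InboundPolicy holds one account's inbound settings.
type InboundPolicy struct {
	whitelistEnabled bool
	senderAllowed    func(addr string) bool // stand-in for the whitelist_in lookup
	subject          *regexp.Regexp         // nil means no subject filter
}

// NewInboundPolicy compiles subjectRegex; a blank pattern disables it.
func NewInboundPolicy(whitelistEnabled bool, senderAllowed func(string) bool, subjectRegex string) (InboundPolicy, error) {
	p := InboundPolicy{whitelistEnabled: whitelistEnabled, senderAllowed: senderAllowed}
	if subjectRegex != "" {
		re, err := regexp.Compile(subjectRegex)
		if err != nil {
			return InboundPolicy{}, fmt.Errorf("invalid subject_regex: %w", err)
		}
		p.subject = re
	}
	return p, nil
}

// Visible reports whether a message passes both inbound checks.
func (p InboundPolicy) Visible(h Header) bool {
	if p.whitelistEnabled && (p.senderAllowed == nil || !p.senderAllowed(h.From)) {
		return false
	}
	if p.subject != nil && !p.subject.MatchString(h.Subject) {
		return false
	}
	return true
}

// FilterList models list/search: invisible messages are simply absent.
func (p InboundPolicy) FilterList(hs []Header) []Header {
	out := make([]Header, 0, len(hs))
	for _, h := range hs {
		if p.Visible(h) {
			out = append(out, h)
		}
	}
	return out
}

// Lookup models get/ack: a filtered UID looks exactly like a missing one.
func (p InboundPolicy) Lookup(hs []Header, uid uint32) (Header, error) {
	for _, h := range hs {
		if h.UID == uid && p.Visible(h) {
			return h, nil
		}
	}
	return Header{}, ErrNotFound
}

func main() {
	p, err := NewInboundPolicy(true, func(a string) bool { return a == "boss@example.com" }, `^\[ticket\]`)
	if err != nil {
		fmt.Println(err)
		return
	}
	inbox := []Header{
		{1, "boss@example.com", "[ticket] printer"},
		{2, "stranger@example.net", "[ticket] invoice"},
		{3, "boss@example.com", "lunch?"},
	}
	fmt.Println(len(p.FilterList(inbox))) // 1
	_, err = p.Lookup(inbox, 2)
	fmt.Println(err) // not-found
}
```

Returning the same error for "filtered" and "does not exist" is what keeps the agent from probing for hidden mail by UID.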

Outbound (send)

  • If account mode is RO, send is rejected.
  • If whitelist_out_enabled, every recipient (to + cc + bcc) must match a whitelist_out entry. If any recipient fails, the entire send is blocked — no partial send.

Address matching (both directions)

  • Case-insensitive.
  • An entry of the form @domain.com matches any address at that domain.
  • Any other entry matches a full address exactly.
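
The matching rules and the all-or-nothing outbound check can be captured in a few pure functions. The names below are hypothetical, the block reasons ro_mode and whitelist_out are the examples given in the Audit notes, and the sketch assumes recipients are bare addresses (no display names).

```go
package main

import (
	"fmt"
	"strings"
)

// matchEntry compares one whitelist entry to an address, ignoring case.
// "@domain.com" matches any address at that domain; anything else must
// equal the full address.
func matchEntry(entry, addr string) bool {
	entry = strings.ToLower(strings.TrimSpace(entry))
	addr = strings.ToLower(strings.TrimSpace(addr))
	if strings.HasPrefix(entry, "@") {
		// The suffix includes '@', so subdomains do not match.
		return len(addr) > len(entry) && strings.HasSuffix(addr, entry)
	}
	return entry == addr
}

// whitelisted reports whether any entry matches the address.
func whitelisted(entries []string, addr string) bool {
	for _, e := range entries {
		if matchEntry(e, addr) {
			return true
		}
	}
	return false
}

// checkSend returns "" when the send is allowed, otherwise a block
// reason. recipients is the union of to, cc, and bcc.
func checkSend(mode string, whitelistOutEnabled bool, whitelistOut, recipients []string) string {
	if mode == "RO" {
		return "ro_mode"
	}
	if whitelistOutEnabled {
		for _, r := range recipients {
			if !whitelisted(whitelistOut, r) {
				return "whitelist_out" // one failure blocks the whole send
			}
		}
	}
	return ""
}

func main() {
	wl := []string{"@example.com", "partner@vendor.test"}
	fmt.Println(matchEntry("@example.com", "Alice@Example.COM")) // true
	fmt.Println(checkSend("RW", true, wl, []string{"alice@example.com", "eve@evil.test"})) // whitelist_out
}
```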

Audit

  • Every action (list, get, search, ack, send), allowed or blocked, writes one audit_log row.
  • Blocked actions record a reason (e.g. ro_mode, whitelist_out, filtered).
  • On each run that opens the DB, audit rows older than audit_retention_days are purged.
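
Because ts is stored as RFC3339 UTC text, the retention purge can be reduced to a string comparison against a cutoff. The sketch below computes that cutoff; the function names and the SQL statement are hypothetical, and it assumes every ts value is written in the same format (UTC, "Z" suffix, whole seconds) so that lexical order equals chronological order.

```go
package main

import (
	"fmt"
	"time"
)

// purgeSQL is a hypothetical statement; bind the cutoff as its parameter.
const purgeSQL = `DELETE FROM audit_log WHERE ts < ?`

// exampleNow is a fixed clock so the demo output is reproducible.
var exampleNow = time.Date(2026, 6, 21, 21, 1, 34, 0, time.UTC)

// purgeCutoff returns the oldest timestamp that is still retained.
func purgeCutoff(now time.Time, retentionDays int) string {
	return now.UTC().AddDate(0, 0, -retentionDays).Format(time.RFC3339)
}

// expired mirrors the SQL predicate for uniformly formatted timestamps.
func expired(ts, cutoff string) bool {
	return ts < cutoff
}

func main() {
	cutoff := purgeCutoff(exampleNow, 30)
	fmt.Println(purgeSQL)
	fmt.Println(cutoff)                                  // 2026-05-22T21:01:34Z
	fmt.Println(expired("2026-05-01T08:00:00Z", cutoff)) // true
}
```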

10. OAuth2 (Gmail and compatible)

  • The user supplies their own OAuth client ID/secret (registered in their Google Cloud project) during account configuration.
  • Consent uses the loopback redirect flow: emcli starts a temporary listener on 127.0.0.1:<ephemeral-port>, opens the consent URL, captures the authorization code on redirect, exchanges it for tokens, and stores the refresh token (encrypted).
  • Access tokens are obtained/refreshed automatically before IMAP/SMTP use and held only in memory.
  • IMAP and SMTP authenticate via SASL XOAUTH2 using the current access token.
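
For reference, the XOAUTH2 initial client response has a simple fixed layout: the username and a Bearer token separated by Ctrl-A (0x01) bytes. The sketch below builds it with the standard library only. In emcli this would presumably be produced through go-sasl; the function names here are hypothetical and the token is a placeholder.

```go
package main

import (
	"encoding/base64"
	"fmt"
)

// xoauth2InitialResponse builds the raw SASL XOAUTH2 payload:
//   user=<username> ^A auth=Bearer <token> ^A ^A
func xoauth2InitialResponse(username, accessToken string) []byte {
	return []byte("user=" + username + "\x01auth=Bearer " + accessToken + "\x01\x01")
}

// xoauth2Wire base64-encodes the payload, as it is carried in the
// IMAP AUTHENTICATE and SMTP AUTH exchanges.
func xoauth2Wire(username, accessToken string) string {
	return base64.StdEncoding.EncodeToString(xoauth2InitialResponse(username, accessToken))
}

func main() {
	raw := xoauth2InitialResponse("agent@example.com", "placeholder-access-token")
	fmt.Printf("%q\n", raw)
	fmt.Println(xoauth2Wire("agent@example.com", "placeholder-access-token"))
}
```

Only the short-lived access token appears in this payload, which is consistent with keeping the refresh token encrypted at rest and never sending it to the mail server.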

11. Error handling

  • All agent-command failures return the JSON error envelope; they never crash with an uncaught panic or emit partial non-JSON output on stdout.
  • Categories: configuration/key errors, DB errors, network/connection errors, auth errors, policy blocks, and not-found. Each maps to a stable error_detail.code.
  • Secrets never appear in output, error details, or the audit log.

12. Testing

  • policy — table-driven unit tests covering the matrix of mode × whitelist-in × whitelist-out × subject-regex, including domain-match and case-insensitivity, and the "any recipient fails ⇒ whole send blocked" rule.
  • store — encryption round-trip; decryption with the wrong key fails closed; schema migration; floor baseline per process_backlog; seen-set membership (uid > floor and not acked); out-of-order ack correctness; compaction collapses contiguous runs into the floor; folder_state/acked reset on UIDVALIDITY change.
  • mail — integration tests against a containerized IMAP/SMTP server (e.g. GreenMail or Dovecot) for list/get/send and threading headers.
  • oauth — token exchange/refresh against a mocked authorization server; loopback capture logic unit-tested.
  • CLI — golden-file tests of the JSON envelope for representative success and error cases; assert no secret ever appears in output.

13. Open items / future work

  • Optional attachment size cap on get if payloads prove unwieldy.
  • Additional auth mechanisms (e.g. OAuth for non-Google providers) follow the same model.
  • Whitelist semantics are currently per-account only; global defaults with overrides are explicitly out of scope for v1.
  • No unack / re-process command in v1; the agent acks deliberately and acks are final (short of an admin reset). Add if a re-processing workflow proves necessary.