5 Commits

Author SHA1 Message Date
steve daa1542672 Add input device selection to interactive config TUI
Enumerates available audio input devices via cpal and presents them
in a dropdown, with "System default" as the first option.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 09:21:44 +01:00
steve 3fa4d102df v0.2.1: Fix cancel key (Escape) being swallowed globally
The cancel key was consumed by rdev::grab at all times, not just during
recording/transcribing. This made the Escape key unusable system-wide
while Mouth was running. Now the cancel key only gets swallowed when
Mouth is actively recording or transcribing.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-12 16:37:50 +01:00
steve b637432dce Fix warn macro import for Linux build, add cargo PATH to release script
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 22:06:58 +01:00
steve 0cea6a4b28 v0.2.0: System tray, IPC status, VAD, hotkey grab, and polish
- Add system tray icon with Exit menu (tray-icon/muda)
- Add IPC daemon status via named pipe (Windows) / Unix socket (Linux)
- Add `mouth status` command to query running daemon
- Add daemon lock to prevent multiple instances
- Hide Windows console window when running as daemon
- Wire up Silero VAD model download and speech filtering
- Switch hotkey listener from rdev::listen to rdev::grab to consume hotkeys
- Add hotkey capture mode in interactive config (press keys instead of typing)
- Add all missing key names (brackets, punctuation, numpad, etc.)
- Fix ONNX tensor type mismatches (encoder wants i64, decoder wants i32)
- Add 300ms lead-in silence to compensate for mic startup latency
- Add 300ms trailing recording after stop for speech not to be clipped
- Add 50ms silence before audio feedback blips for device warmup
- Reduce overlay size (150x18, was 200x36)
- Add PolyForm Noncommercial 1.0.0 license
- Flesh out user-focused README
- Update release script with Gitea/GitHub forge support

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 22:04:39 +01:00
steve 9b0bf7d9e3 Implement core speech-to-text pipeline
All major components: hotkey listener (rdev), audio capture (cpal),
resampling (rubato), VAD (Silero ONNX), Parakeet v3 TDT transcription
(ort), overlay window (winit+softbuffer), paste simulation (enigo+arboard),
audio feedback (rodio), YAML config, CLI with clap, HuggingFace model
download. ~2400 lines of Rust across 16 source files.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 16:47:46 +01:00