Changelog

0.2.2
Dec 12, 2025

Push-to-Dictation Mode, Hands-Free Mode & Hybrid Shortcut Manager

Added

Push-to-Dictation Mode: Press-to-Record, Release-to-Transcribe: Implemented push-to-dictation mode that starts recording when key is pressed and transcribes the complete audio when released. Full Audio Transcription: Performs complete transcription of the entire recorded segment when released, ensuring semantic completeness and context preservation. Automatic Text Selection: Automatically captures and inserts selected text on key release, enabling seamless text replacement workflows. Minimum Press Duration: Configurable minimum press duration (default 150ms) to prevent accidental activations. Audio Padding for Short Recordings: Automatically pads very short audio segments to ensure accurate transcription, similar to professional dictation tools.

Hands-Free Mode: Continuous Listening: Enables continuous audio monitoring with automatic segmentation and real-time transcription. Intelligent Auto-Segmentation: Automatically detects 500ms silence periods to segment speech into meaningful chunks for transcription. Real-Time Streaming Transcription: Provides immediate transcription results as each segment completes, enabling live conversation capture. Modifier-Key Activation: Supports modifier-only key combinations (e.g., ⌘ + ⌥) for quick hands-free mode activation without additional keys.

Hybrid Shortcut Manager: Fn Key Support: Added support for using Fn key as push-to-dictation shortcut through native CGEventTap monitoring. Modifier-Only Combinations: Enabled modifier-only key combinations (e.g., ⌘ + ⌥) for hands-free mode activation without requiring additional keys.

Modal Dialog State Management: Improved state management for settings modal dialogs, ensuring consistent behavior and better user experience across all configuration interfaces.

0.2.1
Dec 5, 2025

New Application Icon, Splash Page & Dictionary Feature

Added

New Application Icon: Updated Visual Identity: Implemented new application icon design with improved visual consistency. Enhanced Brand Recognition: Updated icon across all system locations including Dock, menu bar, and system preferences.

Splash Page & Onboarding Tour: First-Time User Experience: Added splash page and comprehensive onboarding tour for new users. Guided Setup Process: Step-by-step introduction to key features and functionality. Improved User Onboarding: Enhanced initial user experience with interactive tutorials.

Dictionary Feature in Sidebar: Unified Dictionary Access: Added Dictionary entry in sidebar that consolidates Vocabulary and Snippets management. Tabbed Interface: Implemented tabbed interface within Dictionary view for easy switching between Vocabulary and Snippets. Streamlined Navigation: Simplified sidebar navigation by grouping related features together.

Enhanced

Floating Action Button (FAB) UI: UI Transition: Switched to FAB-based user interface for improved accessibility and workflow. Enhanced Interaction: Improved user interaction patterns with FAB design.

Advanced Keyboard Shortcut Support: Fn Key for Push-to-Dictation: Added support for using Fn key as push-to-dictation shortcut. Modifier Key Combinations for Hands-Free Mode: Implemented support for modifier-only key combinations (e.g., ⌘ + ⌥) for hands-free mode activation. Flexible Shortcut Configuration: Enhanced shortcut system to support both single modifier keys and complex key combinations.

0.2.0
Nov 28, 2025

Live Transcription, OOTB Voice Model & Performance Optimization

Added

Live Transcription Support: Added live transcription display in HUD panel showing real-time transcription results as they are generated. Implemented streaming transcription updates that display incremental transcription results during recording.

Out-of-the-Box (OOTB) Voice Model: Implemented default voice model selection system that automatically uses a pre-configured model on first launch. Users can start using the application immediately without waiting for model downloads, providing a seamless first-time experience.

Enhanced

Microphone and VAD Performance Optimization: Improved native audio capture performance with better resource management and reduced latency. Minimized audio processing latency for faster transcription response times.

Sidebar & Header UI Optimization: Improved sidebar design with better visual hierarchy and smoother collapse/expand animations. Enhanced page header with integrated microphone device selector and theme switcher for quick access. Refined UI components with better spacing, typography, and visual consistency across the application.

0.1.8
Nov 21, 2025

Audio Model Testing, HUD Enhancements & LLM Streaming

Added

Audio Model Testing: Added support to test audio models directly within the application. Users can now verify model performance and accuracy before using in production workflows.

HUD Panel Enhancements: Added live audio waveform display in the HUD panel for visual feedback during recording. Implemented one-click copy functionality to quickly copy transcription content from the HUD panel. Enhanced HUD panel to show real-time transcription results directly in the panel interface.

LLM Streaming Support: Added support to display LLM streaming responses in real-time. Users can now see LLM responses as they are generated, improving interaction feedback.

Manual Update Check: Added manual update check functionality accessible from the sidebar. Users can now manually trigger update checks without waiting for automatic notifications.

Enhanced

Audio Device Detection Performance: Improved performance and responsiveness of audio device detection. Reduced latency when scanning and listing available audio input devices. Optimized device detection to minimize system resource usage.

Microphone Settings Page Refactoring: Refactored microphone settings page with better organization and user experience. Streamlined interface for selecting and configuring microphone devices. Improved visual design and information architecture for easier navigation.

0.1.7
Nov 14, 2025

Google Sign-In & Always-On-Top HUD Panel

Added

Google Account Sign-In Support: Implemented Google OAuth authentication flow with secure code exchange for seamless account integration.

Always-On-Top HUD Panel: Introduced a fully draggable HUD strip that can be repositioned anywhere on screen with elegant semi-transparent design that blends seamlessly with desktop content. The collapsible panel design features smooth expand/collapse animations, adjustable window sizes (Small, Medium, Large) for different use cases, and real-time opacity adjustment slider for customizing panel transparency. The non-activating design ensures the panel does not steal focus from other applications, maintaining workflow continuity.

HUD Light/Dark Theme Readability: Comprehensive Light/Dark theme support for all HUD components ensuring optimal visibility and readability across different system themes.

Deprecated

Floating Action Button (FAB): The Floating Action Button has been deprecated in favor of the new HUD panel. All FAB functionality has been integrated into the HUD panel with improved accessibility and features. The HUD panel provides a more native macOS experience with always-on-top capability and better visibility.

0.1.6
Oct 31, 2025

Support Vocabulary & Recording Metadata Upgrades

Vocabulary & Misspelling Toolkit: Customize domain-specific terminology lists and casing rules for precise transcriptions. Define common misspellings with automatic correction to reduce manual cleanup.

Recording History Metadata: Capture and display sessionId, requestId, and the associated preset for faster troubleshooting. Include the new metadata fields in exports to support downstream analytics.

0.1.5
Oct 31, 2025

User Profiles, Diagnostics & Header Experience

Added

User Profile Management: Introduced full profile display on the Profile page with name, gender, birth year, and profession details plus an editable form under Settings > Account. User Avatar Enhancement: Refined the avatar dropdown to highlight the full name with improved typography for clearer identity cues. Profile Data Synchronization: Added real-time loading and saving with consistent loading indicators and robust error handling.

Enhanced

Signed App Permission Validation: Hardened notarized build entitlement checks with actionable error messaging and telemetry for signature failures. Panic Hook Diagnostics: Expanded the panic hook to capture structured stack traces and thread metadata while surfacing crash summaries and auto-restarting background workers. Contextual Metadata Collection: Gathered richer runtime context—including foreground app, OS build, and hardware model—to improve crash and feedback payload quality. Header Layout Improvements: Added a microphone selector and integrated theme switcher directly into the header for faster access.

Updated

Settings Page Refinements: Added a birth-year dropdown, richer profession options, and unified loading indicators across Profile and Settings.

0.1.4
Oct 24, 2025

Search Functionality & Data Synchronization Improvements

Fix Search Functionality Issues

New Record Searchability: Fixed issue where newly added voice records were not searchable due to missing user_id filtering in search queries. User ID Synchronization: Enhanced user_id parsing and storage from JWT access tokens to ensure proper record association. Search Query Optimization: Improved search logic to correctly handle records with NULL user_id values while maintaining backward compatibility.

Enhance Data Synchronization & Cleanup

Orphaned Record Cleanup: Enhanced resync and reindex logic to automatically detect and remove orphaned database records that no longer have corresponding files. Cross-User Cleanup: Improved orphan cleanup to handle both authenticated and anonymous user records during synchronization. User ID Recovery: Added automatic user_id recovery for records that were incorrectly stored with NULL values by analyzing file system paths. Path Management Refactoring: Separated directory path retrieval from directory creation to prevent unintended directory creation during deletion operations.

0.1.3
Oct 17, 2025

Enhanced User Experience & Advanced Search Capabilities

Mode User Experience Improvements

Processing Flow Visualization: Added visual processing flow display when using modes, showing the complete pipeline from speech input to LLM processing or text output. Enhanced Mode Editing: Implemented comprehensive mode detail editing with click-to-edit functionality, allowing users to modify preset settings, voice models, LLM configurations, and advanced options.

Recording History Search & Filtering Enhancements

Full-Text Search: Implemented comprehensive full-text search across recording history, supporting search in transcriptions, titles, and LLM-generated content. Performance Optimization: Enhanced search performance with optimized query execution. Advanced Filtering: Added status-based filtering (completed, processing, error) with filter application. History Re-indexing: Added manual re-indexing functionality to rebuild search indexes and sync file system records with database. Infinite Scroll: Implemented seamless infinite scrolling for recording history with automatic pagination, reducing initial load time and improving user experience for large datasets.

New Application Icons

Dock Icon: Updated macOS Dock icon with new design and improved visual consistency. System Tray Icon: Enhanced system tray icon with better visibility and template support. High-Resolution Assets: Included @2x and @3x variants for Retina displays and various screen densities.

0.1.2
Oct 10, 2025

Enhanced Infrastructure & System-wide Integration

Enhanced Voice Model Download Infrastructure

Mirror Support: Added alternative download mirrors for improved reliability and speed. Faster Downloads: Optimized download performance with multiple mirror sources. Better Availability: Reduced download failures with redundant mirror support.

Expanded CyberWhisper Cloud LLM Model Support

Complete Model Catalog: Full access to all CyberWhisper Cloud models through REST API. Dynamic Model Loading: Real-time model list fetching from CyberWhisper Cloud API. Enhanced Model Selection: Support for 6+ models including CyberWhisper Fast (ultra-fast response model for real-time conversations), CyberWhisper Flash (lightning-fast model for simple tasks), GPT-5 Nano (OpenAI's latest lightweight model with balanced performance), GPT-4o Mini (efficient OpenAI model for daily tasks), DeepSeek V3.1 (advanced reasoning capabilities with latest DeepSeek technology), and Gemini 2.5 Flash Lite (Google's ultra-fast lightweight model for real-time applications).

System Tray Integration

Mode Selection: Quick mode switching directly from system tray. Microphone Management: Easy microphone device selection from tray menu. System-wide Access: Control CyberWhisper from anywhere in your system. Quick Actions: Essential functions accessible without opening the main window.

0.1.1
Oct 3, 2025

Command Palette & Global Shortcuts

Command Palette for Mode Management

Smart Mode Search: Search and activate modes using intelligent keyword matching. Multi-criteria Filtering: Find modes by preset names, voice models, LLM models, or descriptions. Detailed Mode Information: Display comprehensive mode details including preset, voice model, LLM model, input/output languages, and feature settings. Global Shortcut Access: Open Command Palette from anywhere with ⌘ + ⇧ + K. Fuzzy Search: Intelligent search that matches partial keywords and related terms. Real-time Filtering: Instant results as you type with live mode filtering. Visual Mode Status: Clear indication of active vs inactive modes with status badges. Keyboard Navigation: Full keyboard support with arrow keys and Enter to activate.

Global Keyboard Shortcuts

Added support for customizable global keyboard shortcuts. Start Dictation: Toggle recording with customizable shortcut (default: ⌥ + N). Cancel Recording: Cancel ongoing recording with Esc key. Change Mode: Quick mode switching with global shortcut (default: ⌘ + ⇧ + K). Shortcuts work system-wide, even when the app is not in focus. Fallback shortcut registration for better compatibility across different systems.

Enhanced Floating Action Button

Position Optimization: Moved FAB closer to macOS Dock for better accessibility. Visual Refinements: Enhanced button appearance with improved hover effects. Circular Hover Area: Precise circular hover detection using backend mouse tracking. Smooth Animations: Improved transition effects and visual feedback. Size Optimization: Adjusted button dimensions for better visual balance. Border Enhancement: Added subtle border for better visibility across different backgrounds.

Technical Improvements

Backend Mouse Tracking: Implemented precise circular hover detection using Rust backend. Coordinate Synchronization: Ensured backend coordinates match frontend CSS perfectly. Hover Area Fix: Resolved hover detection issues with accurate circular boundary. Performance: Optimized mouse tracking for better responsiveness. Command Palette Architecture: Built with cmdk library for robust command palette functionality. Advanced Search Engine: Implemented fuzzy search with multi-criteria filtering for mode discovery. Shortcut Validation: Added comprehensive validation to prevent single-key shortcuts that could interfere with system behavior. Backend Integration: Seamless integration between frontend Command Palette and Rust backend for mode activation. Enhanced UI Components: Rich mode information display with status badges.

0.1
Sep 26, 2025

Download Base Speech Models, Modes & Presets, and History Viewer

Download Base Speech Models

Support for downloading and running basic on-device speech-to-text models. This feature enables offline transcription capabilities and improved privacy for users who prefer to keep their audio data local.

Modes & Presets

Introduced Presets for Voice to Text and Message workflows, making it easier to configure transcription and usage modes. Users can now quickly switch between different processing modes without manual configuration.

History Viewer

Access and review your past transcriptions and interactions directly within the app. The History Viewer provides a comprehensive timeline of all your speech-to-text activities, making it easy to find and reference previous work.

    CyberWhisper