Voice Enhancer
Sound like a pro on every call.
A real-time voice processor for macOS that turns your microphone into a broadcast-quality virtual mic.
Zoom, Teams, Meet, Discord, OBS — just pick "Voice Enhancer" and talk.
Why Voice Enhancer?
Most people sound flat, muddy, or harsh on calls — not because of bad mics, but because raw audio isn't processed. Voice Enhancer fixes that in real time, with zero setup:
[Comparison figure: Without Voice Enhancer | With Voice Enhancer]
Features
| | Feature | Description |
|:---:|---|---|
| 4 | Voice Presets | Natural, Broadcast, Clarity, Warm: one click to sound great |
| 4 | Voice Modes | Man, Woman, Child, Old: real-time pitch transformation |
| ~10ms | Latency | Real-time processing with no perceptible delay |
| 0% | CPU Impact | 512 frames processed in under 50μs on Apple Silicon |
| 100% | Local | No cloud, no network calls, no telemetry, no AI; your voice stays on your Mac |
Live Tuning — Adjustable compression and de-essing sliders. Dial in exactly what works for your voice, mid-call.
Voice Preview — Record a 3-second clip, tweak the sliders, hear the result on your own voice instantly.
Real-Time Meters — Input/output levels plus compressor and de-esser gain reduction, so you can see exactly what the DSP is doing.
Virtual Microphone — Shows up as "Voice Enhancer" in any app's mic picker. No plugins, no extensions, no per-app config.
Presets
| Preset | Character | Best for |
|---|---|---|
| Natural | Gentle cleanup, transparent | Good mics, quiet rooms |
| Broadcast | Radio DJ voice, bold and clear | Podcasts, presentations |
| Clarity | Presence boost, cuts through | Quiet or distant mics |
| Warm | Adds body, softens harshness | Thin or harsh-sounding mics |
Voice Modes
Transform your voice in real time. Voice modes shift pitch independently of the quality preset, so you can combine a Broadcast preset with a Woman voice mode, for example.
All shifts stay within ±8 semitones so the WSOLA duration compensation adds no more than ~10% tempo change — unnoticeable in live calls.
| Mode | Pitch shift | Character | Effect |
|---|:---:|---|---|
| Man | −3 st | Deeper, authoritative | Drops into the typical male fundamental range (85–180 Hz) |
| Woman | +4 st | Brighter, open | Shifts into the typical female fundamental range (165–300 Hz) |
| Child | +8 st | Smaller, brighter | Pushes into the child fundamental range (250–600 Hz) |
| Old | −2 st | Warmer, softer | Slightly lower with reduced high-frequency harshness |
Note: v1 voice modes control only the pitch shift parameter. A future update will add formant scaling to shift vocal tract resonances independently of the fundamental, enabling more convincing gender/age transformations.
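To make the semitone values concrete, here is a minimal sketch of the standard equal-temperament conversion from semitones to a frequency ratio (12 semitones = one octave = 2x). The struct and function names are hypothetical and are not taken from the AudioEngine sources; only the four semitone values come from the table above.

```cpp
#include <cassert>
#include <cmath>
#include <cstdio>

// Hypothetical mode table; only the semitone values come from this README.
struct VoiceModeShift {
    const char* name;
    double semitones;
};

constexpr VoiceModeShift kVoiceModes[] = {
    {"Man",   -3.0},
    {"Woman", +4.0},
    {"Child", +8.0},
    {"Old",   -2.0},
};

// Equal-temperament conversion: ratio = 2^(semitones / 12).
inline double semitonesToRatio(double semitones) {
    assert(std::isfinite(semitones));
    return std::pow(2.0, semitones / 12.0);
}

// Where a given fundamental frequency lands after the shift.
inline double shiftedFundamentalHz(double f0Hz, double semitones) {
    return f0Hz * semitonesToRatio(semitones);
}

// Prints one line per mode with its frequency ratio.
inline void printVoiceModeRatios() {
    for (const auto& mode : kVoiceModes) {
        std::printf("%-5s %+.0f st -> x%.3f\n",
                    mode.name, mode.semitones, semitonesToRatio(mode.semitones));
    }
}
```

For example, a 120 Hz fundamental shifted by +4 st lands near 151 Hz, and the +8 st Child mode multiplies the fundamental by roughly 1.59.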
Hear the Difference
Same voice, same mic — just a different preset. Listen for yourself:
Listen on SoundCloud — Original (unprocessed) vs. Natural, Broadcast, Clarity, and Warm.
Quick Start
Install via Homebrew
brew tap aheadly-tech/tap
brew install --cask voice-enhancer
Use it
- Open Voice Enhancer
- Grant microphone permission when prompted
- Pick a preset
- In your meeting app, select "Voice Enhancer" as the microphone
- That's it — you sound better now
Uninstall
brew uninstall --cask voice-enhancer
Build from source
Prerequisites: Xcode 15+, CMake 3.20+, XcodeGen
brew install cmake xcodegen
./scripts/build.sh
sudo ./scripts/install.sh
open VoiceEnhancerApp/build/Build/Products/Release/Voice\ Enhancer.app
To uninstall a source build: sudo ./scripts/uninstall.sh
How It Works
Your voice goes in raw, comes out polished. The DSP chain runs in this order:
Mic Input → HPF → Compressor → 4-Band Parametric EQ → De-Esser → Pitch Shift → Limiter → Virtual Mic Output
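As a rough illustration of the stage order above, the sketch below runs a block of samples through a list of stages in sequence. It is not the AudioEngine implementation: the `Stage` type and `makeChain` are hypothetical, every stage except the limiter is a pass-through placeholder, the limiter is a plain hard clip, and `std::function` is used only for brevity (a real-time engine would likely avoid it).

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// One processing stage: transforms a block of mono samples in place.
struct Stage {
    std::string name;
    std::function<void(float*, std::size_t)> process;
};

// Builds the chain in the documented order. All stages are placeholders
// except the limiter, which clips samples to [-1, 1].
inline std::vector<Stage> makeChain() {
    auto passThrough = [](float*, std::size_t) {};
    auto hardClip = [](float* x, std::size_t n) {
        for (std::size_t i = 0; i < n; ++i) {
            if (x[i] > 1.0f) x[i] = 1.0f;
            if (x[i] < -1.0f) x[i] = -1.0f;
        }
    };
    return {
        {"HPF", passThrough},
        {"Compressor", passThrough},
        {"4-Band Parametric EQ", passThrough},
        {"De-Esser", passThrough},
        {"Pitch Shift", passThrough},
        {"Limiter", hardClip},
    };
}

// Applies every stage, in order, to the same buffer.
inline void processBlock(const std::vector<Stage>& chain,
                         float* samples, std::size_t frames) {
    assert(samples != nullptr || frames == 0);
    for (const auto& stage : chain) {
        stage.process(samples, frames);
    }
}
```

Keeping the limiter last means that whatever the earlier stages do to the level, the signal handed to the virtual mic stays within full scale.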
The Pitch Shift stage uses WSOLA (Waveform Similarity Overlap-Add) to shift pitch without changing duration — essential for real-time calls where natural timing matters.
The app and virtual driver communicate through a lock-free shared-memory ring buffer — no XPC, no kernel extensions, no IPC frameworks. Just fast, direct memory.
Architecture
Three components, cleanly separated:
| Component | Language | Role |
|:---|:---:|:---|
| AudioEngine | C++17 | DSP library: HPF, compressor, parametric EQ, de-esser, pitch shift (WSOLA), limiter. Real-time safe, unit-tested. |
| VirtualDriver | C++ | Core Audio HAL plugin. Registers as a system microphone. |
| VoiceEnhancerApp | Swift / SwiftUI | User interface, mic capture, parameter control. Talks to engine via C ABI. |
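For readers unfamiliar with the "C ABI" pattern mentioned in the table, here is a minimal sketch of how a C++ class can be exposed to Swift through an opaque handle and `extern "C"` functions. Everything here is hypothetical: `DemoEngine` and the `demo_engine_*` functions are illustrative stand-ins and are not the names used in `AudioEngine/bridge/`.

```cpp
#include <cassert>
#include <cstddef>

// C++ side: a stand-in engine that only applies a gain.
class DemoEngine {
public:
    void setGain(float gain) { gain_ = gain; }
    void process(float* samples, std::size_t frames) {
        for (std::size_t i = 0; i < frames; ++i) samples[i] *= gain_;
    }
private:
    float gain_ = 1.0f;
};

// C side: an opaque handle plus free functions with C linkage, which is
// the kind of interface Swift can import through a bridging header.
extern "C" {

typedef struct DemoEngineHandle DemoEngineHandle;

DemoEngineHandle* demo_engine_create(void) {
    return reinterpret_cast<DemoEngineHandle*>(new DemoEngine());
}

void demo_engine_destroy(DemoEngineHandle* handle) {
    delete reinterpret_cast<DemoEngine*>(handle);
}

void demo_engine_set_gain(DemoEngineHandle* handle, float gain) {
    assert(handle != nullptr);
    reinterpret_cast<DemoEngine*>(handle)->setGain(gain);
}

void demo_engine_process(DemoEngineHandle* handle, float* samples,
                         std::size_t frames) {
    assert(handle != nullptr);
    reinterpret_cast<DemoEngine*>(handle)->process(samples, frames);
}

}  // extern "C"
```

The opaque handle keeps C++ types out of the header the app sees, so the UI layer only depends on plain C functions and pointers.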
IPC: App and driver share audio through a POSIX shared-memory lock-free SPSC ring buffer (shared/RingBuffer.h).
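The general idea of a lock-free single-producer/single-consumer (SPSC) ring buffer can be sketched as follows. This is an in-process illustration only, not the contents of `shared/RingBuffer.h`: the class name `SpscRing` is hypothetical, storage is a `std::vector` allocated once at construction rather than POSIX shared memory, and the capacity is assumed to be a power of two.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <vector>

class SpscRing {
public:
    explicit SpscRing(std::size_t capacityPow2)
        : buf_(capacityPow2), mask_(capacityPow2 - 1) {
        // Power-of-two capacity lets us wrap indices with a bit mask.
        assert(capacityPow2 >= 2 && (capacityPow2 & mask_) == 0);
    }

    // Producer thread only. Returns how many samples were actually written.
    std::size_t write(const float* src, std::size_t n) {
        const std::size_t head = head_.load(std::memory_order_relaxed);
        const std::size_t tail = tail_.load(std::memory_order_acquire);
        const std::size_t space = buf_.size() - (head - tail);
        if (n > space) n = space;
        for (std::size_t i = 0; i < n; ++i) buf_[(head + i) & mask_] = src[i];
        // Release: the samples become visible before the new head does.
        head_.store(head + n, std::memory_order_release);
        return n;
    }

    // Consumer thread only. Returns how many samples were actually read.
    std::size_t read(float* dst, std::size_t n) {
        const std::size_t tail = tail_.load(std::memory_order_relaxed);
        const std::size_t head = head_.load(std::memory_order_acquire);
        const std::size_t available = head - tail;
        if (n > available) n = available;
        for (std::size_t i = 0; i < n; ++i) dst[i] = buf_[(tail + i) & mask_];
        // Release: the slots are free for the producer only after this store.
        tail_.store(tail + n, std::memory_order_release);
        return n;
    }

private:
    std::vector<float> buf_;
    std::size_t mask_;
    std::atomic<std::size_t> head_{0};  // total samples written
    std::atomic<std::size_t> tail_{0};  // total samples read
};
```

Because each index is written by exactly one side, neither `write` nor `read` needs a lock; a short write or read simply reports how much it could transfer.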
Repository layout
├── AudioEngine/              C++ DSP library (CMake)
│   ├── include/              Public headers
│   ├── src/                  Implementation
│   ├── bridge/               C ABI for Swift interop
│   └── tests/                Unit tests
├── VirtualDriver/            Core Audio HAL plugin
│   └── src/                  Plugin, Device, Stream implementations
├── VoiceEnhancerApp/         SwiftUI macOS application
│   └── Sources/
│       ├── App/              App entry point
│       ├── Audio/            Mic capture, engine bridge, voice preview
│       ├── Models/           Preset, VoiceMode, AudioDevice
│       ├── ViewModels/       AudioViewModel
│       ├── Views/            ContentView, Settings, Meters, Presets, VoiceModePicker
│       └── Resources/        Assets, entitlements, Info.plist
├── shared/                   Ring buffer (shared by app + driver)
├── scripts/                  Build, install, notarize, uninstall
└── docs/                     Architecture, build guide, contributing
Performance
The audio callback is designed to be real-time safe:
- Fast math: IEEE 754 bit-trick approximations replace per-sample transcendental functions (~3.8x faster than `std::log10`/`std::exp`)
- Zero allocations: no `malloc`, no locks, no Swift runtime calls on the audio thread
- Engine-managed conversion: AVAudioEngine handles sample rate conversion in its own RT-optimized graph
- Large ring buffer — 16,384 frames (~341ms at 48 kHz) absorbs macOS scheduling jitter without underruns
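To give a flavor of what an IEEE 754 bit-trick approximation looks like, here is one well-known variant for `log2`, extended to a linear-to-decibel conversion. This is an assumption for illustration and not necessarily the approximation AudioEngine uses; the function names are hypothetical, and the accuracy (within about 0.06 in log2, or roughly 0.35 dB) is much coarser than `std::log10`.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstring>

// Approximates log2(x) for positive, normal floats by reading the IEEE 754
// bit pattern as an integer: the exponent bits supply the integer part and
// the mantissa bits supply a linear estimate of the fractional part.
inline float fastLog2(float x) {
    assert(x > 0.0f && std::isnormal(x));
    std::uint32_t bits;
    std::memcpy(&bits, &x, sizeof bits);  // well-defined way to read the bits
    const float scaled = static_cast<float>(bits) * (1.0f / 8388608.0f);  // divide by 2^23
    // Subtract the exponent bias (127), keeping a small constant correction
    // for the linear mantissa estimate.
    return scaled - 126.94269504f;
}

// Amplitude to decibels: 20 * log10(a) = 20 * log10(2) * log2(a).
inline float fastLinearToDb(float amplitude) {
    return 6.0205999f * fastLog2(amplitude);
}
```

An error of a few tenths of a dB is often acceptable for driving a compressor or meter, which is why this kind of trade-off shows up in per-sample DSP code.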
FAQ
"Voice Enhancer" doesn't show up as a microphone
The HAL driver needs to be installed:
sudo ./scripts/install.sh
This copies the driver to /Library/Audio/Plug-Ins/HAL/ and restarts coreaudiod. You may hear a brief audio interruption — that's normal.
Can I use this during a live call?
Yes. The app processes audio continuously. You can change presets and adjust sliders mid-call. The Voice Preview feature outputs to your speakers/headphones, not the virtual mic, so it won't leak into the meeting.
Does this work with AirPods / Bluetooth mics?
Yes. Voice Enhancer reads from whatever input device you select (or the system default). If macOS sees it as a microphone, Voice Enhancer can process it.
What's the CPU usage?
Negligible. The DSP chain processes 512 frames (~10ms) in under 50μs on Apple Silicon. You won't see it in Activity Monitor.
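The figures quoted in this README can be checked with a little arithmetic. The sketch below assumes a 48 kHz sample rate (the rate stated for the ring buffer) and uses hypothetical helper names; it converts frame counts to milliseconds and expresses the 50 μs processing time as a share of the per-block time budget.

```cpp
#include <cassert>

// Duration of a block of frames, in milliseconds.
inline double framesToMs(double frames, double sampleRateHz) {
    assert(sampleRateHz > 0.0);
    return frames / sampleRateHz * 1000.0;
}

// Share of the real-time budget used when a block of `frames` samples
// takes `processingUs` microseconds to process.
inline double budgetUsedPercent(double processingUs, double frames,
                                double sampleRateHz) {
    const double blockUs = framesToMs(frames, sampleRateHz) * 1000.0;
    return processingUs / blockUs * 100.0;
}

// Assumed sample rate: 48 kHz.
// framesToMs(512, 48000)            ~ 10.67 ms  (one processing block)
// framesToMs(16384, 48000)          ~ 341.33 ms (ring buffer depth)
// budgetUsedPercent(50, 512, 48000) ~ 0.47 %    (upper bound on DSP load)
```

In other words, 50 μs of work per 10.67 ms block is under half a percent of one core's time, which is consistent with the DSP load being hard to spot in Activity Monitor.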
Contributing
Issues and pull requests are welcome. Please read docs/CONTRIBUTING.md and docs/ARCHITECTURE.md first.
Privacy
Privacy Policy — Voice Enhancer runs 100% offline. No data collection, no telemetry, no network calls.
License
Apache 2.0 — use it, fork it, ship it. Just give credit and note your changes.