



Voice Enhancer

Sound like a pro on every call.

License: Apache 2.0 · macOS 13+ · Apple Silicon + Intel


A real-time voice processor for macOS that turns your microphone into a broadcast-quality virtual mic.
Zoom, Teams, Meet, Discord, OBS — just pick "Voice Enhancer" and talk.


Voice Enhancer demo



Why Voice Enhancer?

Most people sound flat, muddy, or harsh on calls — not because of bad mics, but because raw audio isn't processed. Voice Enhancer fixes that in real time, with zero setup:

Without Voice Enhancer

  • Thin, distant-sounding voice
  • Background hum and rumble
  • Harsh sibilance ("s" sounds that bite)
  • Volume jumps when you lean in or away

With Voice Enhancer

  • Full, present, broadcast-quality voice
  • Clean low-end, no rumble
  • Smooth, de-essed highs
  • Consistent volume, always audible

Features

| | Feature | Description |
|:---:|---|---|
| 4 | Voice Presets | Natural, Broadcast, Clarity, Warm: one click to sound great |
| 4 | Voice Modes | Man, Woman, Child, Old: real-time pitch transformation |
| ~10 ms | Latency | Real-time processing with no perceptible delay |
| 0% | CPU Impact | 512 frames processed in under 50 μs on Apple Silicon |
| 100% | Local | No cloud, no network calls, no telemetry, no AI; your voice stays on your Mac |


Live Tuning — Adjustable compression and de-essing sliders. Dial in exactly what works for your voice, mid-call.

Voice Preview — Record a 3-second clip, tweak the sliders, hear the result on your own voice instantly.

Real-Time Meters — Input/output levels plus compressor and de-esser gain reduction, so you can see exactly what the DSP is doing.

Virtual Microphone — Shows up as "Voice Enhancer" in any app's mic picker. No plugins, no extensions, no per-app config.


Presets


| Preset | Sound | Best for |
|:---|:---|:---|
| Natural | Gentle cleanup, transparent | Good mics, quiet rooms |
| Broadcast | Radio DJ voice, bold and clear | Podcasts, presentations |
| Clarity | Presence boost, cuts through | Quiet or distant mics |
| Warm | Adds body, softens harshness | Thin or harsh-sounding mics |


Voice Modes

Transform your voice in real time. Voice modes shift pitch independently of the quality preset; for example, you can combine the Broadcast preset with the Woman voice mode.

All shifts stay within ±8 semitones so the WSOLA duration compensation adds no more than ~10% tempo change — unnoticeable in live calls.


| Mode | Pitch shift | Effect |
|:---|:---|:---|
| Man | −3 st (deeper, authoritative) | Drops into the typical male fundamental range (85–180 Hz) |
| Woman | +4 st (brighter, open) | Shifts into the typical female fundamental range (165–300 Hz) |
| Child | +8 st (smaller, brighter) | Pushes into the child fundamental range (250–600 Hz) |
| Old | −2 st (warmer, softer) | Slightly lower with reduced high-frequency harshness |

Note: v1 voice modes control only the pitch shift parameter. A future update will add formant scaling to shift vocal tract resonances independently of the fundamental, enabling more convincing gender/age transformations.
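
To make the semitone offsets listed above concrete, here is a minimal sketch of the standard equal-temperament conversion from semitones to a frequency ratio. The helper names `semitonesToRatio` and `shiftFundamental` are hypothetical and not taken from the AudioEngine sources. The sketch also assumes the Man and Old modes shift downward (−3 st and −2 st), as their descriptions suggest.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper, not part of the project's public API.
// Converts an equal-tempered semitone offset into a frequency ratio:
// one octave (12 semitones) doubles the frequency.
inline double semitonesToRatio(double semitones) {
    return std::pow(2.0, semitones / 12.0);
}

// Applies a semitone shift to a fundamental frequency given in Hz.
inline double shiftFundamental(double f0Hz, double semitones) {
    assert(f0Hz > 0.0);
    return f0Hz * semitonesToRatio(semitones);
}
```

Under these assumptions, +4 st is a ratio of about 1.26, so a 120 Hz fundamental lands near 151 Hz. A −3 st shift is a ratio of about 0.84, which moves the same voice to roughly 101 Hz.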


Hear the Difference

Same voice, same mic — just a different preset. Listen for yourself:

Listen on SoundCloud — Original (unprocessed) vs. Natural, Broadcast, Clarity, and Warm.




Quick Start

Install via Homebrew

brew tap aheadly-tech/tap
brew install --cask voice-enhancer

Use it

  1. Open Voice Enhancer
  2. Grant microphone permission when prompted
  3. Pick a preset
  4. In your meeting app, select "Voice Enhancer" as the microphone
  5. That's it — you sound better now

Uninstall

brew uninstall --cask voice-enhancer

Build from source

Prerequisites: Xcode 15+, CMake 3.20+, XcodeGen

brew install cmake xcodegen
./scripts/build.sh
sudo ./scripts/install.sh
open VoiceEnhancerApp/build/Build/Products/Release/Voice\ Enhancer.app

To uninstall a source build: sudo ./scripts/uninstall.sh




How It Works

Voice Enhancer architecture diagram

Your voice goes in raw, comes out polished. The DSP chain runs in this order:

Mic Input → HPF → Compressor → 4-Band Parametric EQ → De-Esser → Pitch Shift → Limiter → Virtual Mic Output

The Pitch Shift stage uses WSOLA (Waveform Similarity Overlap-Add) to shift pitch without changing duration — essential for real-time calls where natural timing matters.
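
The stage order above can be sketched as a fixed array of function pointers applied to each block in sequence. This is only an illustration of the ordering: the `Stage` type, the stage function names, and `processBlock` are hypothetical. Every stage here is a placeholder no-op except the limiter, which is reduced to a plain hard clip rather than a real limiter.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stage signature: process a mono block in place.
using Stage = void (*)(float* samples, std::size_t frames);

// Placeholder stages standing in for the real DSP (all no-ops here).
inline void highPass(float*, std::size_t) {}
inline void compressor(float*, std::size_t) {}
inline void parametricEq(float*, std::size_t) {}
inline void deEsser(float*, std::size_t) {}
inline void pitchShift(float*, std::size_t) {}

// Stand-in for the limiter: a hard clip to [-1, 1], for illustration only.
inline void limiter(float* samples, std::size_t frames) {
    for (std::size_t i = 0; i < frames; ++i) {
        if (samples[i] > 1.0f) samples[i] = 1.0f;
        else if (samples[i] < -1.0f) samples[i] = -1.0f;
    }
}

// Stage order mirrors the chain described in the text.
static const Stage kChain[] = {highPass, compressor, parametricEq,
                               deEsser,  pitchShift, limiter};

// Runs every stage over the block, in order, with no allocation.
inline void processBlock(float* samples, std::size_t frames) {
    assert(samples != nullptr);
    for (Stage stage : kChain) stage(samples, frames);
}
```

Keeping the limiter last means it catches any peaks introduced by the earlier stages, pitch shifting included.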

The app and virtual driver communicate through a lock-free shared-memory ring buffer — no XPC, no kernel extensions, no IPC frameworks. Just fast, direct memory.


Architecture

Three components, cleanly separated:

| Component | Language | Role |
|:---|:---:|:---|
| AudioEngine | C++17 | DSP library: HPF, compressor, parametric EQ, de-esser, pitch shift (WSOLA), limiter. Real-time safe, unit-tested. |
| VirtualDriver | C++ | Core Audio HAL plugin. Registers as a system microphone. |
| VoiceEnhancerApp | Swift / SwiftUI | User interface, mic capture, parameter control. Talks to engine via C ABI. |

IPC: App and driver share audio through a POSIX shared-memory lock-free SPSC ring buffer (shared/RingBuffer.h).
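
For readers unfamiliar with the pattern, a single-producer/single-consumer (SPSC) lock-free ring buffer can be sketched roughly as follows. This is an illustrative in-process version, not the contents of shared/RingBuffer.h. The class name `SpscRing` and its interface are assumptions, and the shared-memory mapping itself is omitted.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>

// Illustrative only; the project's real ring buffer may differ.
// Capacity must be a power of two so indices can wrap with a bit mask.
template <std::size_t Capacity>
class SpscRing {
    static_assert(Capacity != 0 && (Capacity & (Capacity - 1)) == 0,
                  "Capacity must be a power of two");

public:
    // Producer side: copies in up to `count` samples, returns how many fit.
    std::size_t write(const float* src, std::size_t count) {
        assert(src != nullptr);
        const std::size_t head = head_.load(std::memory_order_relaxed);
        const std::size_t tail = tail_.load(std::memory_order_acquire);
        const std::size_t space = Capacity - (head - tail);
        if (count > space) count = space;
        for (std::size_t i = 0; i < count; ++i)
            data_[(head + i) & (Capacity - 1)] = src[i];
        // Release: publish the samples before the new head becomes visible.
        head_.store(head + count, std::memory_order_release);
        return count;
    }

    // Consumer side: copies out up to `count` samples, returns how many.
    std::size_t read(float* dst, std::size_t count) {
        assert(dst != nullptr);
        const std::size_t tail = tail_.load(std::memory_order_relaxed);
        const std::size_t head = head_.load(std::memory_order_acquire);
        const std::size_t available = head - tail;
        if (count > available) count = available;
        for (std::size_t i = 0; i < count; ++i)
            dst[i] = data_[(tail + i) & (Capacity - 1)];
        // Release: free the slots only after the samples were copied out.
        tail_.store(tail + count, std::memory_order_release);
        return count;
    }

private:
    float data_[Capacity] = {};
    std::atomic<std::size_t> head_{0};  // total samples ever written
    std::atomic<std::size_t> tail_{0};  // total samples ever read
};
```

Neither side ever blocks: a full buffer makes `write` return a short count, and an empty one makes `read` do the same. Placing such a structure in POSIX shared memory would additionally require the atomics to be lock-free and address-free on the target platform, which this sketch does not check.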

Repository layout

├── AudioEngine/             C++ DSP library (CMake)
│   ├── include/             Public headers
│   ├── src/                 Implementation
│   ├── bridge/              C ABI for Swift interop
│   └── tests/               Unit tests
├── VirtualDriver/           Core Audio HAL plugin
│   └── src/                 Plugin, Device, Stream implementations
├── VoiceEnhancerApp/        SwiftUI macOS application
│   └── Sources/
│       ├── App/             App entry point
│       ├── Audio/           Mic capture, engine bridge, voice preview
│       ├── Models/          Preset, VoiceMode, AudioDevice
│       ├── ViewModels/      AudioViewModel
│       ├── Views/           ContentView, Settings, Meters, Presets, VoiceModePicker
│       └── Resources/       Assets, entitlements, Info.plist
├── shared/                  Ring buffer (shared by app + driver)
├── scripts/                 Build, install, notarize, uninstall
└── docs/                    Architecture, build guide, contributing

Performance

The audio callback is engineered for absolute real-time safety:

  • Fast math — IEEE 754 bit-trick approximations replace per-sample transcendental functions (~3.8x faster than std::log10/std::exp)
  • Zero allocations — No malloc, no locks, no Swift runtime calls on the audio thread
  • Engine-managed conversion — AVAudioEngine handles sample rate conversion in its own RT-optimized graph
  • Large ring buffer — 16,384 frames (~341ms at 48 kHz) absorbs macOS scheduling jitter without underruns
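
As an illustration of the bit-trick idea in the first bullet, the sketch below approximates `log2` by reinterpreting a float's IEEE 754 bits. The exponent field gives the integer part and the mantissa gives a linear estimate of the fraction. The function names `fastLog2` and `fastGainToDb` are hypothetical, and the engine's actual approximations may use different formulas and constants. The bias value is one commonly used with this trick.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstring>

// Hypothetical approximation, valid for positive normal floats only.
// Treating the raw bits as an integer and scaling by 2^-23 yields
// (biased exponent + mantissa fraction), a piecewise-linear log2 estimate.
inline float fastLog2(float x) {
    assert(x > 0.0f);
    std::uint32_t bits;
    std::memcpy(&bits, &x, sizeof bits);  // well-defined type punning
    return static_cast<float>(bits) * (1.0f / 8388608.0f) - 126.94269504f;
}

// Linear gain to decibels: 20*log10(g) = 20*log10(2) * log2(g).
inline float fastGainToDb(float gain) {
    return 6.02059991f * fastLog2(gain);
}
```

The estimate stays within roughly 0.06 of the true `log2`, or about 0.35 dB after conversion. A sketch this coarse suits level detection better than anything needing precise values.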



FAQ

"Voice Enhancer" doesn't show up as a microphone

The HAL driver needs to be installed:

sudo ./scripts/install.sh

This copies the driver to /Library/Audio/Plug-Ins/HAL/ and restarts coreaudiod. You may hear a brief audio interruption — that's normal.

Can I use this during a live call?

Yes. The app processes audio continuously. You can change presets and adjust sliders mid-call. The Voice Preview feature outputs to your speakers/headphones, not the virtual mic, so it won't leak into the meeting.

Does this work with AirPods / Bluetooth mics?

Yes. Voice Enhancer reads from whatever input device you select (or the system default). If macOS sees it as a microphone, Voice Enhancer can process it.

What's the CPU usage?

Negligible. The DSP chain processes 512 frames (~10.7 ms of audio at 48 kHz) in under 50 μs on Apple Silicon, which is less than 1% of the time available per block. You won't see it in Activity Monitor.
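
The arithmetic behind that answer can be checked with a short sketch. It assumes a 48 kHz sample rate (the rate quoted in the Performance section) and treats 50 μs as the per-block processing time. The helper names are hypothetical.

```cpp
#include <cassert>
#include <cmath>

// Duration of a block of audio in milliseconds.
inline double blockDurationMs(double frames, double sampleRateHz) {
    assert(sampleRateHz > 0.0);
    return 1000.0 * frames / sampleRateHz;
}

// Share of the block's real-time budget spent processing, in percent.
inline double cpuLoadPercent(double processMicros, double blockMs) {
    assert(blockMs > 0.0);
    return 100.0 * (processMicros / 1000.0) / blockMs;
}
```

With these assumptions, `blockDurationMs(512, 48000)` is about 10.67 ms, and `cpuLoadPercent(50, 10.67)` comes out near 0.47% of one core. The same helper gives about 341 ms for the 16,384-frame ring buffer.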




Contributing

Issues and pull requests are welcome. Please read docs/CONTRIBUTING.md and docs/ARCHITECTURE.md first.


Privacy

Privacy Policy — Voice Enhancer runs 100% offline. No data collection, no telemetry, no network calls.

License

Apache 2.0 — use it, fork it, ship it. Just give credit and note your changes.




Built by Aheadly Tech

Stop sounding like a webcam mic. Start sounding like a studio.