Voice Enhancer

Sound like a pro on every call.

A real-time voice processor for macOS that turns your microphone into a broadcast-quality virtual mic.
Zoom, Teams, Meet, Discord, OBS — just pick "Voice Enhancer" and talk.

Why Voice Enhancer?

Most people sound flat, muddy, or harsh on calls — not because of bad mics, but because raw audio isn't processed. Voice Enhancer fixes that in real time, with zero setup:

| Without Voice Enhancer | With Voice Enhancer |
|---|---|
| Thin, distant-sounding voice | Full, present, broadcast-quality voice |
| Background hum and rumble | Clean low-end, no rumble |
| Harsh sibilance ("s" sounds that bite) | Smooth, de-essed highs |
| Volume jumps when you lean in or away | Consistent volume, always audible |

Features

| | Feature | Description |
|:---:|---|---|
| 4 | Voice Presets | Natural, Broadcast, Clarity, Warm: one click to sound great |
| 4 | Voice Modes | Man, Woman, Child, Old: real-time pitch transformation |
| ~10 ms | Latency | Real-time processing with no perceptible delay |
| 0% | CPU Impact | 512 frames processed in under 50 μs on Apple Silicon |
| 100% | Local | No cloud, no network calls, no telemetry, no AI; your voice stays on your Mac |

Live Tuning — Adjustable compression and de-essing sliders. Dial in exactly what works for your voice, mid-call.

Voice Preview — Record a 3-second clip, tweak the sliders, hear the result on your own voice instantly.

Real-Time Meters — Input/output levels plus compressor and de-esser gain reduction, so you can see exactly what the DSP is doing.

Virtual Microphone — Shows up as "Voice Enhancer" in any app's mic picker. No plugins, no extensions, no per-app config.

Presets

| Preset | Character | Best for |
|---|---|---|
| Natural | Gentle cleanup, transparent | Good mics, quiet rooms |
| Broadcast | Radio DJ voice, bold and clear | Podcasts, presentations |
| Clarity | Presence boost, cuts through | Quiet or distant mics |
| Warm | Adds body, softens harshness | Thin or harsh-sounding mics |

Voice Modes

Transform your voice in real time. Voice modes shift pitch independently of the quality preset, so you can combine, for example, the Broadcast preset with the Woman voice mode.

All shifts stay within ±8 semitones so the WSOLA duration compensation adds no more than ~10% tempo change — unnoticeable in live calls.

| Mode | Pitch shift (semitones) | Character | Effect |
|---|:---:|---|---|
| Man | −3 | Deeper, authoritative | Drops into the typical male fundamental range (85–180 Hz) |
| Woman | +4 | Brighter, open | Shifts into the typical female fundamental range (165–300 Hz) |
| Child | +8 | Smaller, brighter | Pushes into the child fundamental range (250–600 Hz) |
| Old | −2 | Warmer, softer | Slightly lower with reduced high-frequency harshness |

Note: v1 voice modes control only the pitch shift parameter. A future update will add formant scaling to shift vocal tract resonances independently of the fundamental, enabling more convincing gender/age transformations.
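
To make the semitone values above concrete, here is a minimal sketch of how a shift in semitones maps to a frequency ratio in 12-tone equal temperament. The `VoiceModeShift` struct and the function names are hypothetical and are not taken from the AudioEngine sources; only the four shift values come from this README.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical mirror of the four voice modes; only the semitone values
// come from the README, the type and names are illustrative.
struct VoiceModeShift {
    const char* name;
    double semitones;
};

constexpr VoiceModeShift kVoiceModes[] = {
    {"Man", -3.0}, {"Woman", 4.0}, {"Child", 8.0}, {"Old", -2.0},
};

// Equal-temperament frequency ratio: +12 semitones doubles the frequency.
inline double semitonesToRatio(double semitones) {
    return std::pow(2.0, semitones / 12.0);
}

// Fundamental frequency (Hz) after shifting an input pitch by `semitones`.
inline double shiftedFundamentalHz(double inputHz, double semitones) {
    assert(inputHz > 0.0);
    return inputHz * semitonesToRatio(semitones);
}
```

Under these assumptions, `semitonesToRatio(-3.0)` is about 0.841 and `semitonesToRatio(8.0)` is about 1.587. Where the shifted fundamental lands depends on the speaker's own pitch: `shiftedFundamentalHz(120.0, 4.0)` gives roughly 151 Hz.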

Hear the Difference

Same voice, same mic — just a different preset. Listen for yourself:

Listen on SoundCloud — Original (unprocessed) vs. Natural, Broadcast, Clarity, and Warm.

Quick Start

Install via Homebrew

brew tap aheadly-tech/tap
brew install --cask voice-enhancer

Use it

1. Open Voice Enhancer
2. Grant microphone permission when prompted
3. Pick a preset
4. In your meeting app, select "Voice Enhancer" as the microphone
5. That's it: you sound better now

Uninstall

brew uninstall --cask voice-enhancer

Build from source

Prerequisites: Xcode 15+, CMake 3.20+, XcodeGen

brew install cmake xcodegen
./scripts/build.sh
sudo ./scripts/install.sh
open VoiceEnhancerApp/build/Build/Products/Release/Voice\ Enhancer.app

To uninstall a source build: sudo ./scripts/uninstall.sh

How It Works

Your voice goes in raw, comes out polished. The DSP chain runs in this order:

Mic Input → HPF → Compressor → 4-Band Parametric EQ → De-Esser → Pitch Shift → Limiter → Virtual Mic Output

The Pitch Shift stage uses WSOLA (Waveform Similarity Overlap-Add) to shift pitch without changing duration — essential for real-time calls where natural timing matters.
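
The fixed stage order can be pictured as an array of in-place processing functions run front to back. The sketch below is illustrative only: `StageFn`, `runChain`, and both stand-in stages are hypothetical names, and the real HPF, compressor, EQ, de-esser, and pitch shifter are not shown. It does show why the limiter sits last: whatever the earlier stages do to the level, the final stage bounds the output.

```cpp
#include <cassert>
#include <cstddef>

// A stage processes a block of samples in place. Hypothetical signature.
using StageFn = void (*)(float* samples, std::size_t frames);

// Stand-in for the earlier stages: simply doubles the level.
inline void makeupGainStage(float* samples, std::size_t frames) {
    for (std::size_t i = 0; i < frames; ++i) samples[i] *= 2.0f;
}

// Stand-in limiter: hard clip to [-1, 1]. A real limiter is smoother.
inline void limiterStage(float* samples, std::size_t frames) {
    for (std::size_t i = 0; i < frames; ++i) {
        if (samples[i] > 1.0f) samples[i] = 1.0f;
        else if (samples[i] < -1.0f) samples[i] = -1.0f;
    }
}

// Runs the stages in array order, in place, without allocating.
inline void runChain(const StageFn* stages, std::size_t stageCount,
                     float* samples, std::size_t frames) {
    assert(stages != nullptr && samples != nullptr);
    for (std::size_t i = 0; i < stageCount; ++i) stages[i](samples, frames);
}
```

With the chain `{makeupGainStage, limiterStage}`, an input sample of 0.75 is raised to 1.5 and then clipped to 1.0, so the output never exceeds full scale.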

The app and virtual driver communicate through a lock-free shared-memory ring buffer — no XPC, no kernel extensions, no IPC frameworks. Just fast, direct memory.

Architecture

Three components, cleanly separated:

| Component | Language | Role |
|:---|:---:|:---|
| AudioEngine | C++17 | DSP library: HPF, compressor, parametric EQ, de-esser, pitch shift (WSOLA), limiter. Real-time safe, unit-tested. |
| VirtualDriver | C++ | Core Audio HAL plugin. Registers as a system microphone. |
| VoiceEnhancerApp | Swift / SwiftUI | User interface, mic capture, parameter control. Talks to engine via C ABI. |
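
As a rough illustration of what "talks to engine via C ABI" can look like, the sketch below wraps a C++ class behind an opaque handle and `extern "C"` functions that Swift could import. Every name here (`sketch::Engine`, `ve_engine`, `ve_engine_create`, and so on) is hypothetical; the actual interface lives in `AudioEngine/bridge/` and may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

namespace sketch {
// Stand-in for the C++ engine: applies a single output gain.
class Engine {
public:
    void setOutputGain(float gain) noexcept { gain_ = gain; }
    void process(float* samples, std::size_t frames) noexcept {
        for (std::size_t i = 0; i < frames; ++i) samples[i] *= gain_;
    }
private:
    float gain_ = 1.0f;
};
}  // namespace sketch

extern "C" {

// Opaque handle: callers never see the C++ type behind it.
typedef struct ve_engine ve_engine;

// Returns nullptr if allocation fails. Call off the audio thread.
ve_engine* ve_engine_create(void) {
    return reinterpret_cast<ve_engine*>(new (std::nothrow) sketch::Engine());
}

void ve_engine_destroy(ve_engine* handle) {
    delete reinterpret_cast<sketch::Engine*>(handle);
}

void ve_engine_set_output_gain(ve_engine* handle, float gain) {
    assert(handle != nullptr);
    reinterpret_cast<sketch::Engine*>(handle)->setOutputGain(gain);
}

// Processes `frames` samples in place.
void ve_engine_process(ve_engine* handle, float* samples, std::size_t frames) {
    assert(handle != nullptr && samples != nullptr);
    reinterpret_cast<sketch::Engine*>(handle)->process(samples, frames);
}

}  // extern "C"
```

Keeping the boundary to plain C types and an opaque pointer means the Swift side needs no knowledge of C++ classes, templates, or exceptions.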

IPC: App and driver share audio through a POSIX shared-memory lock-free SPSC ring buffer (shared/RingBuffer.h).
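
For readers unfamiliar with the pattern, here is a minimal single-producer/single-consumer ring buffer sketch. It is not the code in `shared/RingBuffer.h`: the class name `SpscRing`, the power-of-two capacity, and the monotonically increasing counters are assumptions chosen for illustration. A cross-process version would additionally place this structure in a POSIX shared-memory mapping and rely on the atomics being lock-free.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>

template <std::size_t Capacity>
class SpscRing {
    static_assert(Capacity > 0 && (Capacity & (Capacity - 1)) == 0,
                  "Capacity must be a power of two");
public:
    // Producer side: copies in up to `count` samples, returns how many fit.
    std::size_t write(const float* src, std::size_t count) noexcept {
        assert(src != nullptr);
        const std::size_t head = head_.load(std::memory_order_relaxed);
        const std::size_t tail = tail_.load(std::memory_order_acquire);
        const std::size_t space = Capacity - (head - tail);
        if (count > space) count = space;
        for (std::size_t i = 0; i < count; ++i)
            data_[(head + i) & (Capacity - 1)] = src[i];
        head_.store(head + count, std::memory_order_release);  // publish
        return count;
    }

    // Consumer side: copies out up to `count` samples, returns how many.
    std::size_t read(float* dst, std::size_t count) noexcept {
        assert(dst != nullptr);
        const std::size_t tail = tail_.load(std::memory_order_relaxed);
        const std::size_t head = head_.load(std::memory_order_acquire);
        const std::size_t available = head - tail;
        if (count > available) count = available;
        for (std::size_t i = 0; i < count; ++i)
            dst[i] = data_[(tail + i) & (Capacity - 1)];
        tail_.store(tail + count, std::memory_order_release);  // free space
        return count;
    }

private:
    float data_[Capacity] = {};
    std::atomic<std::size_t> head_{0};  // total samples written so far
    std::atomic<std::size_t> tail_{0};  // total samples read so far
};
```

Because exactly one thread writes `head_` and exactly one writes `tail_`, acquire/release ordering on those two counters is enough; neither side ever takes a lock or blocks.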

Repository layout

├── AudioEngine/             C++ DSP library (CMake)
│   ├── include/             Public headers
│   ├── src/                 Implementation
│   ├── bridge/              C ABI for Swift interop
│   └── tests/               Unit tests
├── VirtualDriver/           Core Audio HAL plugin
│   └── src/                 Plugin, Device, Stream implementations
├── VoiceEnhancerApp/        SwiftUI macOS application
│   └── Sources/
│       ├── App/             App entry point
│       ├── Audio/           Mic capture, engine bridge, voice preview
│       ├── Models/          Preset, VoiceMode, AudioDevice
│       ├── ViewModels/      AudioViewModel
│       ├── Views/           ContentView, Settings, Meters, Presets, VoiceModePicker
│       └── Resources/       Assets, entitlements, Info.plist
├── shared/                  Ring buffer (shared by app + driver)
├── scripts/                 Build, install, notarize, uninstall
└── docs/                    Architecture, build guide, contributing

Performance

The audio callback is engineered for absolute real-time safety:

- Fast math: IEEE 754 bit-trick approximations replace per-sample transcendental functions (~3.8x faster than std::log10/std::exp)
- Zero allocations: no malloc, no locks, no Swift runtime calls on the audio thread
- Engine-managed conversion: AVAudioEngine handles sample rate conversion in its own RT-optimized graph
- Large ring buffer: 16,384 frames (~341 ms at 48 kHz) absorbs macOS scheduling jitter without underruns
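
To give a flavor of the bit-trick idea in the list above, the sketch below approximates `log2` by reinterpreting a float's IEEE 754 bit pattern as an integer. This is a well-known textbook approximation, not necessarily the one the engine uses; the function names are hypothetical, and without a correction term the error reaches about 0.086 in log2 units (roughly 0.5 dB), which a production version would likely refine.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Approximates log2(x) for positive, normal floats. The exponent field
// supplies the integer part; the mantissa bits, read linearly, stand in
// for the fractional part. Exact at powers of two.
inline float fastLog2(float x) {
    assert(x > 0.0f);
    std::uint32_t bits;
    std::memcpy(&bits, &x, sizeof bits);  // well-defined type pun
    // Divide by 2^23 to align the exponent, then remove the bias of 127.
    return static_cast<float>(bits) * (1.0f / 8388608.0f) - 127.0f;
}

// Linear gain to decibels: 20 * log10(x) = 20 * log10(2) * log2(x).
inline float fastGainToDb(float linear) {
    return 6.0206f * fastLog2(linear);
}
```

For example, `fastLog2(8.0f)` is exactly 3, while `fastLog2(3.0f)` gives 1.5 against a true value near 1.585. Whether that accuracy is acceptable depends on the stage; gain computers for compressors and de-essers often tolerate small dB errors.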

FAQ

"Voice Enhancer" doesn't show up as a microphone

The HAL driver needs to be installed:

sudo ./scripts/install.sh

This copies the driver to /Library/Audio/Plug-Ins/HAL/ and restarts coreaudiod. You may hear a brief audio interruption — that's normal.

Can I use this during a live call?

Yes. The app processes audio continuously. You can change presets and adjust sliders mid-call. The Voice Preview feature outputs to your speakers/headphones, not the virtual mic, so it won't leak into the meeting.

Does this work with AirPods / Bluetooth mics?

Yes. Voice Enhancer reads from whatever input device you select (or the system default). If macOS sees it as a microphone, Voice Enhancer can process it.

What's the CPU usage?

Negligible. The DSP chain processes 512 frames (~10ms) in under 50μs on Apple Silicon. You won't see it in Activity Monitor.
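
The figures quoted here and in the Performance section can be checked with a little arithmetic. The helper names below are hypothetical, and the 48 kHz sample rate is an assumption carried over from the ring-buffer figure.

```cpp
#include <cassert>

// Duration of `frames` audio frames in milliseconds.
inline double framesToMs(double frames, double sampleRateHz) {
    assert(sampleRateHz > 0.0);
    return frames / sampleRateHz * 1000.0;
}

// Share of the callback period spent processing, in percent.
inline double callbackLoadPercent(double processingUs, double frames,
                                  double sampleRateHz) {
    const double periodUs = framesToMs(frames, sampleRateHz) * 1000.0;
    return processingUs / periodUs * 100.0;
}
```

At 48 kHz, 512 frames last about 10.7 ms and 16,384 frames about 341 ms. Spending 50 μs on a 512-frame block uses roughly 0.47% of the time available before the next block is due.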

Contributing

Issues and pull requests are welcome. Please read docs/CONTRIBUTING.md and docs/ARCHITECTURE.md first.

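Privacy

All audio is processed locally on your Mac: no cloud, no network calls, no telemetry. See PRIVACY.md for the full policy.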

License

Apache 2.0 — use it, fork it, ship it. Just give credit and note your changes.

Built by Aheadly Tech

Stop sounding like a webcam mic. Start sounding like a studio.
