Render voice-mode presets (Dry/Man/Woman/Child/Old) against a sample and emit an HTML A/B audition page. ffmpeg-only.


hermes af7193f1d5 Initial MVP commit		2026-05-11 03:04:08 -04:00
src/preset_audition	Initial MVP commit	2026-05-11 03:04:08 -04:00
tests	Initial MVP commit	2026-05-11 03:04:08 -04:00
.gitignore	Initial MVP commit	2026-05-11 03:04:08 -04:00
LICENSE	Initial MVP commit	2026-05-11 03:04:08 -04:00
pyproject.toml	Initial MVP commit	2026-05-11 03:04:08 -04:00
README.md	Initial MVP commit	2026-05-11 03:04:08 -04:00

README.md

preset-audition

Render the voice-mode-preview presets (Dry / Man −3 st / Woman +4 st / Child +8 st / Old −2 st) against any sample audio and emit a self-contained HTML A/B comparison page with embedded <audio> players.

Built for the Catacolabs/voice-enhancer real-time macOS voice DSP project — closes the long-standing "presets un-auditioned" blocker by giving you a one-shot way to actually hear them on your own voice before locking the v1.1.0 chain.

Why this project

Six consecutive Daily MVP Builder cron summaries (May 3, 5, 6, 7, 9, 10) flagged the same dormant item: the four voice-mode presets shipped in voice-mode-preview on May 3 have been un-auditioned for 8+ days. The May 10 cron (blackhole-doctor) closed the routing-detection half; this one closes the audition half. Record a 10-second voice clip, run preset-audition sample.wav, open audition/index.html, pick the preset that doesn't sound silly, and wire it into the C++ chain. No more guesswork.

Install

pip install -e .
# requires ffmpeg on PATH:
brew install ffmpeg

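Pure Python stdlib + ffmpeg. No numpy, no Pillow, no librubberband: pitch shift uses the classic asetrate+aresample+atempo combo, which ships in every Homebrew ffmpeg build. asetrate shifts pitch and speed together, aresample restores the sample rate, and atempo restores the original duration.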
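The asetrate+aresample+atempo combo can be sketched as a small helper that turns a semitone offset into ffmpeg filter strings. This is an illustration only, not the package's actual code: the function name pitch_shift_filters and the 48000 Hz default are assumptions, and a real tool would more likely probe the input file's sample rate.

```python
def pitch_shift_filters(semitones: float, sample_rate: int = 48000) -> list[str]:
    """Return ffmpeg audio filters that shift pitch but keep duration.

    Hypothetical helper; sample_rate is assumed, not probed from the file.
    """
    if semitones == 0:
        return []  # nothing to do for the dry preset
    # Equal-tempered frequency ratio: +12 semitones doubles the frequency.
    ratio = 2.0 ** (semitones / 12.0)
    return [
        # Reinterpret the samples at a scaled rate: pitch and speed both change.
        f"asetrate={round(sample_rate * ratio)}",
        # Resample back to the original rate so downstream filters see it.
        f"aresample={sample_rate}",
        # Undo the speed change so the clip keeps its original length.
        f"atempo={1.0 / ratio:.6f}",
    ]


if __name__ == "__main__":
    print(",".join(pitch_shift_filters(-3)))
```

Because asetrate takes an integer rate, the rounded value and the atempo factor are not exact inverses; the resulting duration error is a tiny fraction of a sample period per second and is inaudible for an audition clip.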

Usage

# Render all 5 presets against sample.wav
preset-audition sample.wav

# Pick a subset
preset-audition sample.wav --presets man,woman,child

# Custom output dir
preset-audition sample.wav -o ~/Desktop/audition

# List available presets
preset-audition --list

The output dir contains one <stem>__<preset>.wav per preset plus an index.html. On macOS, view the page with open audition/index.html.
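As a rough sketch of that output layout, the following shows how the <stem>__<preset>.wav names and a minimal index.html could be produced with the standard library alone. The helper names output_path and render_index are hypothetical, and the page the real tool emits may look quite different (for instance, it might embed the audio as data URIs rather than reference sibling files).

```python
from html import escape
from pathlib import Path


def output_path(sample: Path, preset: str, out_dir: Path) -> Path:
    """Build <out_dir>/<stem>__<preset>.wav for one rendered preset."""
    return out_dir / f"{sample.stem}__{preset}.wav"


def render_index(sample: Path, presets: list[str], out_dir: Path) -> str:
    """Return a bare-bones HTML page with one <audio> player per preset.

    Illustrative only: assumes the WAV files sit next to index.html.
    """
    players = []
    for name in presets:
        wav = output_path(sample, name, out_dir)
        players.append(
            f"<h2>{escape(name)}</h2>\n"
            f'<audio controls src="{escape(wav.name)}"></audio>'
        )
    body = "\n".join(players)
    return (
        "<!doctype html>\n<html><body>\n"
        f"<h1>{escape(sample.name)}</h1>\n{body}\n"
        "</body></html>\n"
    )


if __name__ == "__main__":
    page = render_index(Path("sample.wav"), ["dry", "man"], Path("audition"))
    print(page)
```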

Presets

Name	Label	Pitch	Highpass	Lowpass	Gain
dry	Dry (no DSP)	0 st	—	—	0 dB
man	Man	−3 st	80 Hz	8000 Hz	0 dB
woman	Woman	+4 st	120 Hz	9000 Hz	−1 dB
child	Child	+8 st	150 Hz	9500 Hz	−2 dB
old	Old	−2 st	90 Hz	6500 Hz	0 dB

Order of operations mirrors the voice-enhancer chain: HPF → PitchShift → LPF → gain. (Compressor / DeEsser / Limiter not included — they'd require offline plugin hosts; goal here is to audition tonal character, not full mastering.)
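To make the HPF → PitchShift → LPF → gain ordering concrete, here is a minimal sketch that maps the preset table above onto a single ffmpeg -af filter string. The Preset dataclass, filter_chain, ffmpeg_argv, and the 48000 Hz sample rate are assumptions made for illustration; the values come from the table, but the real package may structure this differently.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Preset:
    semitones: float
    highpass_hz: Optional[int]
    lowpass_hz: Optional[int]
    gain_db: float


# Values copied from the Presets table; None means the filter is skipped.
PRESETS = {
    "dry": Preset(0, None, None, 0),
    "man": Preset(-3, 80, 8000, 0),
    "woman": Preset(4, 120, 9000, -1),
    "child": Preset(8, 150, 9500, -2),
    "old": Preset(-2, 90, 6500, 0),
}


def filter_chain(p: Preset, sample_rate: int = 48000) -> str:
    """Compose filters in the order HPF -> pitch shift -> LPF -> gain."""
    parts = []
    if p.highpass_hz:
        parts.append(f"highpass=f={p.highpass_hz}")
    if p.semitones:
        ratio = 2.0 ** (p.semitones / 12.0)
        parts += [
            f"asetrate={round(sample_rate * ratio)}",
            f"aresample={sample_rate}",
            f"atempo={1.0 / ratio:.6f}",
        ]
    if p.lowpass_hz:
        parts.append(f"lowpass=f={p.lowpass_hz}")
    if p.gain_db:
        parts.append(f"volume={p.gain_db:g}dB")
    # anull is ffmpeg's pass-through audio filter, used here for "dry".
    return ",".join(parts) or "anull"


def ffmpeg_argv(src: str, dst: str, p: Preset) -> list[str]:
    """Build (but do not run) an ffmpeg command line for one preset."""
    return ["ffmpeg", "-y", "-i", src, "-af", filter_chain(p), dst]


if __name__ == "__main__":
    for name, preset in PRESETS.items():
        print(name, filter_chain(preset))
```

The returned list can be passed to subprocess.run(..., check=True). Keeping command construction separate from execution makes the chain easy to unit-test without ffmpeg installed.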

Tests

# quote the extras so zsh (the macOS default shell) does not glob the brackets
pip install -e '.[dev]'
pytest -q

License

MIT.
