golden-voice
Clone your voice, speak anything, pay nothing
What it does
Record 15-30 seconds of yourself talking. golden-voice clones your voice locally using XTTS v2 and gives you a one-command TTS that sounds like you. No cloud, no API keys, no monthly bill.
golden-voice setup my-voice-sample.wav
golden-voice speak "The deployment looks clean. Ship it."
cat summary.md | golden-voice speak --stdin
Why it exists
We were paying ElevenLabs $100/month for voice narration in our dev tools. The quality was good, but quota burned fast with multiple sessions running, the cost didn't scale, and we wanted something we owned.
So we built it. One afternoon, zero budget. Now every developer on the team has their own voice clone running on their laptop.
How it works
- You record 15-30 seconds of yourself talking naturally
- XTTS v2 (open source, from Coqui) analyzes your voice: pitch, cadence, timbre
- Any text you pipe in gets synthesized in your voice, locally, on CPU
- Optional sox effects (reverb, chorus, echo) add flavor
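The steps above can be sketched in Python. This is a minimal sketch, not golden-voice's actual code: it assumes the Coqui `TTS` package is installed and `sox` is on your PATH. The function names (`synthesize`, `sox_effect_args`, `sox_command`, `apply_effects`) and the effect values are made up for illustration.

```python
import subprocess

# Coqui's published model id for XTTS v2.
XTTS_MODEL = "tts_models/multilingual/multi-dataset/xtts_v2"


def synthesize(text, speaker_wav, out_path, language="en"):
    """Speak `text` in the voice heard in `speaker_wav` and write a WAV file."""
    # Imported lazily so the sox helpers below work without the TTS package.
    from TTS.api import TTS

    tts = TTS(XTTS_MODEL)  # the first call downloads the model; CPU by default
    tts.tts_to_file(text=text, speaker_wav=speaker_wav,
                    language=language, file_path=out_path)
    return out_path


def sox_effect_args(reverb=False, chorus=False, echo=False):
    """Build sox effect arguments. The numeric values are illustrative only."""
    args = []
    if reverb:
        args += ["reverb", "30"]  # reverberance, in percent
    if chorus:
        # gain-in, gain-out, delay (ms), decay, speed (Hz), depth (ms), modulation
        args += ["chorus", "0.7", "0.9", "55", "0.4", "0.25", "2", "-t"]
    if echo:
        # gain-in, gain-out, delay (ms), decay
        args += ["echo", "0.8", "0.9", "120", "0.3"]
    return args


def sox_command(in_path, out_path, effect_args):
    """Return the sox argv that applies `effect_args` to `in_path`."""
    return ["sox", in_path, out_path, *effect_args]


def apply_effects(in_path, out_path, effect_args):
    """Run sox; raises CalledProcessError if sox exits with an error."""
    subprocess.run(sox_command(in_path, out_path, effect_args), check=True)
    return out_path
```

For example, `synthesize("Ship it.", "my-voice-sample.wav", "raw.wav")` followed by `apply_effects("raw.wav", "speech.wav", sox_effect_args(reverb=True))` would produce a reverb-flavored clip. The command builders are kept separate from the calls that run them so they can be checked without the model or sox present.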
Generation takes ~15-20 seconds on Apple Silicon. For instant feedback, use --quick, which falls back to the macOS say command with warm voices.
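A quick path like this could be as small as a wrapper around the macOS `say` command. The sketch below is an assumption about the shape of such a fallback, not the tool's real code: `say_command` and `speak_quick` are hypothetical names, and "Samantha" is only a placeholder for whichever voice golden-voice actually picks.

```python
import subprocess


def say_command(text, voice="Samantha", out_path=None):
    """Build the argv for macOS `say`. The default voice is a placeholder."""
    cmd = ["say", "-v", voice]
    if out_path is not None:
        cmd += ["-o", out_path]  # write audio to a file instead of playing it
    cmd.append(text)
    return cmd


def speak_quick(text, voice="Samantha"):
    """Speak immediately through the system voice (macOS only)."""
    subprocess.run(say_command(text, voice), check=True)
```

This skips the cloned voice entirely, which is the trade: no model load, so playback starts right away.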
Install
git clone https://github.com/goldenfocus/golden-cloud.git
cd golden-cloud/blocks/golden-voice
bash install.sh
Quality vs sample length
Requirements
- macOS (Apple Silicon or Intel)
- Python 3.11 (the installer sets it up via brew)
- ~2GB disk for the XTTS model (downloaded once)
- A 15-30 second recording of your voice
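To check that a recording falls in the 15-30 second range before running setup, something like the following works for uncompressed PCM WAV files. It uses only Python's standard `wave` module. The names `wav_duration_seconds` and `check_sample` are hypothetical and not part of golden-voice.

```python
import wave

MIN_SECONDS = 15.0
MAX_SECONDS = 30.0


def wav_duration_seconds(path):
    """Return the length of a PCM WAV file in seconds."""
    with wave.open(path, "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def check_sample(path):
    """Return "ok" or a short hint about the recording length."""
    duration = wav_duration_seconds(path)
    if duration < MIN_SECONDS:
        return f"too short: {duration:.1f}s, aim for 15-30s"
    if duration > MAX_SECONDS:
        return f"longer than needed: {duration:.1f}s, aim for 15-30s"
    return "ok"
```

Calling `check_sample("my-voice-sample.wav")` returns `"ok"` for a recording inside the range. Compressed or non-PCM files make `wave.open` raise `wave.Error`, so convert those to plain WAV first.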
A GoldenFocus block
Part of the GoldenFocus ecosystem: tools that fix the world piece by piece. Free, open, built to share.