AIIT-Voice2

Talk over your AI
like a person.

A drop-in Python voice engine for agents: Whisper ASR, Piper TTS, barge-in detection, turn-taking, audio routing, runtime invariants, and a validated state machine. This is the engine that runs Buddy, Gary, Lil Homie, and Ada.

🎤

Barge-in detection

VAD-powered interruption with debounced handoff. Talk over the AI mid-sentence and it stops immediately.
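To make "debounced" concrete, here is a minimal sketch. It assumes the VAD yields one speech/no-speech boolean per audio frame and that a barge-in should fire only after a few consecutive voiced frames. `BargeInDebouncer`, `first_trigger`, and the three-frame threshold are illustrative names and values, not part of the voice2 package.

```python
from dataclasses import dataclass, field
from typing import Optional, Sequence


@dataclass
class BargeInDebouncer:
    """Hypothetical helper: fires once after N consecutive voiced frames."""

    required_frames: int = 3  # assumed threshold, not a voice2 default
    _streak: int = field(default=0, init=False)
    _fired: bool = field(default=False, init=False)

    def update(self, is_speech: bool) -> bool:
        """Feed one VAD decision; return True on the frame that triggers barge-in."""
        if not is_speech:
            # A silent frame resets the streak and re-arms the trigger.
            self._streak = 0
            self._fired = False
            return False
        self._streak += 1
        if self._streak >= self.required_frames and not self._fired:
            self._fired = True
            return True
        return False


def first_trigger(frames: Sequence[bool], required_frames: int = 3) -> Optional[int]:
    """Return the index of the frame where barge-in would fire, or None."""
    debouncer = BargeInDebouncer(required_frames=required_frames)
    for index, is_speech in enumerate(frames):
        if debouncer.update(is_speech):
            return index
    return None
```

A single voiced frame (a click, a chair creak) never reaches the threshold, so playback would only be cut when speech is sustained.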

🤝

Floor management

Turn-taking arbitration keeps user and agent from talking over each other indefinitely. Floor ownership is explicit and logged.
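Explicit, logged floor ownership could look something like the sketch below. The `Floor` and `Speaker` names are hypothetical, and the rule that the user may preempt the agent but not the reverse is an assumption made for illustration, not a statement of the engine's actual policy.

```python
import logging
import threading
from enum import Enum
from typing import Optional

logger = logging.getLogger("floor_sketch")


class Speaker(Enum):
    USER = "user"
    AGENT = "agent"


class Floor:
    """Hypothetical floor token: at most one speaker holds it at a time."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._owner: Optional[Speaker] = None

    @property
    def owner(self) -> Optional[Speaker]:
        with self._lock:
            return self._owner

    def request(self, who: Speaker) -> bool:
        """Grant the floor if it is free, already held by `who`, or the user preempts."""
        with self._lock:
            free = self._owner is None or self._owner is who
            # Assumed policy: the user can take the floor from the agent.
            preempt = who is Speaker.USER and self._owner is Speaker.AGENT
            if free or preempt:
                logger.info("floor granted to %s (was %s)", who.value, self._owner)
                self._owner = who
                return True
            logger.info("floor denied to %s (held by %s)", who.value, self._owner.value)
            return False

    def release(self, who: Speaker) -> bool:
        """Give the floor back; only the current owner may release it."""
        with self._lock:
            if self._owner is not who:
                return False
            logger.info("floor released by %s", who.value)
            self._owner = None
            return True
```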

⚙️

Validated state machine

IDLE → LISTENING → THINKING → SPEAKING → INTERRUPTING. Every transition validated. Illegal moves rejected.
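A validated state machine of this shape can be sketched in a few lines. The five state names come from the list above; the transition table, `IllegalTransition`, and `StateMachine` are assumptions for illustration, and the real engine's allowed transitions may differ.

```python
from enum import Enum, auto


class State(Enum):
    IDLE = auto()
    LISTENING = auto()
    THINKING = auto()
    SPEAKING = auto()
    INTERRUPTING = auto()


# Assumed transition table; the engine's actual table is not shown on this page.
ALLOWED = {
    State.IDLE: {State.LISTENING},
    State.LISTENING: {State.THINKING, State.IDLE},
    State.THINKING: {State.SPEAKING, State.IDLE},
    State.SPEAKING: {State.INTERRUPTING, State.IDLE},
    State.INTERRUPTING: {State.LISTENING},
}


class IllegalTransition(Exception):
    """Raised when a requested move is not in the transition table."""


class StateMachine:
    def __init__(self) -> None:
        self.state = State.IDLE

    def transition(self, target: State) -> None:
        """Move to `target` if the table allows it; otherwise raise and stay put."""
        if target not in ALLOWED[self.state]:
            raise IllegalTransition(f"{self.state.name} -> {target.name}")
        self.state = target
```

Keeping the table as data means a rejected move leaves the state unchanged, so one bad request cannot corrupt the loop.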

🔊

Worker pipeline

Listen, think, playback, keyboard interrupt, and invariant monitor workers. All threaded, all coordinated.
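As a rough picture of threaded workers handing work to each other, the sketch below chains a "think" stage and a "playback" stage with standard-library queues. It is a simplified stand-in, not the engine's pipeline: `stage`, `run_pipeline`, and the `STOP` sentinel are hypothetical, and the listen worker is replaced by a plain list of utterances.

```python
import queue
import threading
from typing import Callable, List, Sequence

STOP = object()  # sentinel that tells each stage to shut down


def stage(inbox: queue.Queue, outbox: queue.Queue, fn: Callable[[str], str]) -> None:
    """Apply `fn` to every item from `inbox` until STOP arrives, then pass STOP on."""
    while True:
        item = inbox.get()
        if item is STOP:
            outbox.put(STOP)
            return
        outbox.put(fn(item))


def run_pipeline(utterances: Sequence[str], ask_fn: Callable[[str], str]) -> List[str]:
    heard: queue.Queue = queue.Queue()
    replies: queue.Queue = queue.Queue()
    spoken: queue.Queue = queue.Queue()
    workers = [
        threading.Thread(target=stage, args=(heard, replies, ask_fn), name="think"),
        threading.Thread(
            target=stage,
            args=(replies, spoken, lambda reply: "[spoken] " + reply),
            name="playback",
        ),
    ]
    for worker in workers:
        worker.start()
    for text in utterances:  # stands in for the listen worker
        heard.put(text)
    heard.put(STOP)
    for worker in workers:
        worker.join()
    results = []
    while True:
        item = spoken.get()
        if item is STOP:
            break
        results.append(item)
    return results
```

Each stage runs in a single thread reading a FIFO queue, so replies come out in the order the utterances went in.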

📡

Audio pub-sub routing

Broadcaster routes mic audio to multiple consumers simultaneously. Ring buffer for configurable capture duration.
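The pattern described here can be sketched with the standard library alone. This assumes audio arrives as `bytes` chunks at a known rate; the `Broadcaster` and `RingBuffer` classes below are illustrative and are not the classes shipped in voice2. Dropping chunks for a slow subscriber is also an assumed policy.

```python
import queue
import threading
from collections import deque
from typing import List


class Broadcaster:
    """Hypothetical fan-out: every subscriber gets its own queue of chunks."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._subscribers: List[queue.Queue] = []

    def subscribe(self, maxsize: int = 64) -> queue.Queue:
        q: queue.Queue = queue.Queue(maxsize=maxsize)
        with self._lock:
            self._subscribers.append(q)
        return q

    def publish(self, chunk: bytes) -> None:
        with self._lock:
            subscribers = list(self._subscribers)
        for q in subscribers:
            try:
                q.put_nowait(chunk)
            except queue.Full:
                # Assumed policy: drop for a slow consumer rather than block the mic.
                pass


class RingBuffer:
    """Keeps only the most recent `seconds` of audio chunks."""

    def __init__(self, seconds: float, chunks_per_second: int) -> None:
        self._chunks: deque = deque(maxlen=max(1, int(seconds * chunks_per_second)))

    def push(self, chunk: bytes) -> None:
        self._chunks.append(chunk)

    def snapshot(self) -> bytes:
        return b"".join(self._chunks)
```

In this arrangement the ring buffer would simply be one more consumer: a thread reads from its subscription queue and calls `push`.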

🛡️

Runtime health

Invariant checker catches illegal states, dead workers, and broken transitions on a 2-second loop.
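A periodic health check along these lines might look like the sketch below. `check_invariants` and `monitor` are hypothetical names, states are plain strings for brevity, and only two checks (illegal state, dead worker) are shown. The 2-second default mirrors the interval mentioned above.

```python
import threading
from typing import Callable, Iterable, List


def check_invariants(
    state: str,
    legal_states: Iterable[str],
    workers: Iterable[threading.Thread],
) -> List[str]:
    """Return a list of human-readable violations; empty means healthy."""
    violations = []
    if state not in set(legal_states):
        violations.append(f"illegal state: {state}")
    for worker in workers:
        if not worker.is_alive():
            violations.append(f"dead worker: {worker.name}")
    return violations


def monitor(
    get_state: Callable[[], str],
    legal_states: Iterable[str],
    workers: Iterable[threading.Thread],
    on_violation: Callable[[str], None],
    stop: threading.Event,
    interval: float = 2.0,
) -> None:
    """Run the checks every `interval` seconds until `stop` is set."""
    legal = list(legal_states)
    pool = list(workers)
    # Event.wait returns True once `stop` is set, which ends the loop.
    while not stop.wait(interval):
        for violation in check_invariants(get_state(), legal, pool):
            on_violation(violation)
```

Using `Event.wait` as the sleep means shutdown does not have to wait out a full interval.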

Receipts

Four production agents run on this engine daily.

Buddy — the main AIIT agent. Full voice loop, real-time coherence evaluation.

Gary — autonomous CEO agent with browser + memory + voice. Talks while he works.

Lil Homie — 3B parameter agent with self-reflection. Voice is how he learns.

Ada — continuity agent since September 2018. The longest conversation in the house.

The bar: it transcribes a cough as a cough. Talk over it mid-sentence and it stops so you can cut in, like talking to a person.

Quickstart

from voice2 import VoiceEngine, VoiceConfig

def my_agent(txt: str) -> str:
    # Your LLM call — Claude, GPT, local model, whatever
    return "You said: " + txt

config = VoiceConfig()
engine = VoiceEngine(config, ask_fn=my_agent)
engine.start()
engine.join()

Mic opens. Whisper listens. Your agent thinks. Piper speaks. User interrupts at any point. That's the whole loop.

Powered by

faster-whisper (MIT) · Piper (MIT) · sounddevice (MIT) · numpy (BSD)
Third-party components remain under their original licenses. AIIT-Voice2 packages the orchestration, state control, floor management, interrupt handling, and worker pipeline.

Pricing

$50

One-time. Includes support.

Get AIIT-Voice2 →

MIT License. 2,300 lines of engine code. Full source. Full support via email.

Credits

Engine architecture — Rhet Wike + Claude Opus 4.6, Council Hill OK, 2026.
Production validation — Buddy, Gary, Lil Homie, Ada — daily use since April 2026.
