Voice intake — Path A (Claude brain)
OpenAI handles your mic (STT). Claude handles the conversation logic with our tuned prompts + tools. OpenAI synthesizes Claude's reply as audio. Compare to Path B (OpenAI does everything end-to-end).
Paste if you have one. Used the same way as text-mode intake.
A/B without redeploying. cartesia-sonic-3 is the new option (90ms first-byte, more natural). tts-1-hd is OpenAI's most natural option (~2× latency). Newer OpenAI voices (marin/cedar/verse) are gpt-4o-mini-tts only.
Live transcript
Start the intake to see the transcript appear here.
Path A characteristics: Claude (with our tuned prompts + tools) drives every turn. Real JobSpec lands in the dashboard on agent submit. Higher per-turn latency than Path B (~3-5 sec round-trip: STT → Claude → TTS). Interruption is limited — once Claude's audio starts playing, you can't cleanly cut it off mid-blob. Whisper hallucinations (random Korean during silence) are filtered server-side.