Streaming TTS Explained

Why your AI co-hosts sometimes respond instantly and sometimes take a few seconds. Learn how streaming TTS creates natural conversations.

Total time · 5 min
Difficulty · Beginner
HostFabric Control access
01 · Understand the two TTS methods
1 min

HostFabric uses two different audio playback systems depending on your text-to-speech (TTS) provider. Streaming TTS (ElevenLabs) starts playing as soon as the first audio arrives, while traditional TTS (Google, OpenAI) waits until the complete audio has been generated.

  • Check your TTS provider in Settings → Voices
  • Look for the "streaming" indicator in debug mode
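
The difference between the two playback systems can be sketched in a few lines. This is only an illustration of the ordering of events, not HostFabric's actual code: `synthesize_chunks`, `play_streaming`, and `play_buffered` are hypothetical names, and text fragments stand in for audio chunks.

```python
from typing import Iterator, List


def synthesize_chunks(text: str, chunk_words: int = 3) -> Iterator[str]:
    """Hypothetical stand-in for a TTS engine: yields one 'audio chunk' per few words."""
    words = text.split()
    for i in range(0, len(words), chunk_words):
        yield " ".join(words[i:i + chunk_words])


def play_streaming(text: str) -> List[str]:
    """Streaming: every chunk is played as soon as it has been generated."""
    events = []
    for chunk in synthesize_chunks(text):
        events.append(f"generated: {chunk}")
        events.append(f"played: {chunk}")
    return events


def play_buffered(text: str) -> List[str]:
    """Traditional: all chunks are generated first, then playback starts."""
    events = []
    chunks = []
    for chunk in synthesize_chunks(text):
        events.append(f"generated: {chunk}")
        chunks.append(chunk)
    for chunk in chunks:
        events.append(f"played: {chunk}")
    return events
```

In the streaming event log the first "played" entry follows the first "generated" entry directly; in the buffered log it only appears after every chunk has been generated, which is where the extra wait comes from.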
02 · Experience streaming TTS
2 min

With ElevenLabs, responses start playing within 300-600 ms. This creates a natural conversational flow in which the AI can interrupt or respond quickly to live chat.

  • Switch to ElevenLabs TTS if not already using it
  • Send a short prompt and notice the fast response
  • Try interrupting with another prompt mid-response
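
Interrupting mid-response is possible with streaming because playback advances one chunk at a time, so there is a natural point to stop. The sketch below assumes a simple stop flag checked between chunks; `play_until_interrupted` and `demo_interruption` are hypothetical names, and HostFabric may handle interruption differently.

```python
import threading
from typing import Iterable, List


def play_until_interrupted(chunks: Iterable[str], stop: threading.Event) -> List[str]:
    """Play chunks in order, stopping as soon as the stop flag is set."""
    played = []
    for chunk in chunks:
        if stop.is_set():
            break  # a new prompt arrived: drop the rest of this response
        played.append(chunk)
    return played


def demo_interruption() -> List[str]:
    """Simulate a new prompt arriving while the third chunk is being generated."""
    stop = threading.Event()

    def chunks():
        for i, chunk in enumerate(["Hello", "chat,", "today", "we", "cover"]):
            if i == 2:
                stop.set()  # stands in for the user sending another prompt
            yield chunk

    return play_until_interrupted(chunks(), stop)
```

With buffered playback there is no equivalent checkpoint before audio starts, since nothing is played until the whole clip exists.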
03 · Compare with traditional TTS
1 min

Google TTS and OpenAI TTS use the traditional method: the full audio is generated before playback starts, which adds a 2-4 second delay. That is fine for pre-recorded content but slower for live interaction.

  • Switch to Google TTS to see the difference
  • Notice the longer wait before audio starts
  • Notice how the timing of responses feels more "robotic"
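
To put a number on the wait rather than judging it by ear, you can time how long it takes until the first audio chunk is available. The sketch below is a generic measurement helper, not a HostFabric feature: the two `fake_*` providers are hypothetical stand-ins that use short sleeps instead of real network calls, and their delays are scaled down so the example runs quickly.

```python
import time
from typing import Callable, Iterable, Iterator, List


def time_to_first_audio(start_tts: Callable[[], Iterable[bytes]]) -> float:
    """Seconds from the request until the first audio chunk is available."""
    started = time.perf_counter()
    for _chunk in start_tts():
        return time.perf_counter() - started
    raise ValueError("provider produced no audio")


def fake_streaming_tts() -> Iterator[bytes]:
    """Hypothetical streaming provider: a short delay before each chunk."""
    for _ in range(5):
        time.sleep(0.01)
        yield b"\x00" * 320


def fake_buffered_tts() -> List[bytes]:
    """Hypothetical traditional provider: one long delay, then all chunks at once."""
    time.sleep(0.2)
    return [b"\x00" * 320] * 5


if __name__ == "__main__":
    print(f"streaming: {time_to_first_audio(fake_streaming_tts):.3f}s")
    print(f"buffered:  {time_to_first_audio(fake_buffered_tts):.3f}s")
```

Swapping the fake providers for real calls would give your own time-to-first-audio figures to compare against the 300-600 ms and 2-4 second ranges quoted in this guide.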
04 · Optimize for your streaming style
1 min

Choose ElevenLabs for live, interactive shows. Use Google/OpenAI TTS for tutorials, presentations, or when perfect lip-sync is critical.

  • Pick TTS based on your content type
  • Test both methods during rehearsal
  • Consider configuring a backup TTS provider
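
The advice above (pick a provider by content type, keep a backup ready) could be expressed as a small lookup with fallback. Everything here is an assumption made for illustration: the provider keys, the order of the backups, and the `speak_with_backup` helper are not HostFabric settings or APIs.

```python
from typing import Callable, Dict, List

# Hypothetical preference lists: the first entry is the primary provider,
# later entries are backups tried in order.
PROVIDER_BY_CONTENT: Dict[str, List[str]] = {
    "live": ["elevenlabs", "openai"],
    "tutorial": ["google", "openai"],
    "presentation": ["openai", "google"],
}

Synth = Callable[[str], bytes]


def speak_with_backup(text: str, content_type: str, engines: Dict[str, Synth]) -> bytes:
    """Try the preferred provider for this content type, then its backups."""
    errors = []
    for name in PROVIDER_BY_CONTENT[content_type]:
        try:
            return engines[name](text)
        except Exception as exc:  # any provider failure triggers the fallback
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all TTS providers failed: " + "; ".join(errors))
```

Trying providers in a fixed order keeps the behavior predictable during a live show: if the primary goes down, the next response simply comes from the backup, with the latency characteristics of that provider.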
Pro tip: Streaming TTS works best with ElevenLabs. If you're doing live conversations, it's worth the small premium for that natural, responsive feel.