Together AI expands audio capabilities with real-time TTS and STT
AI Impact Summary
Together AI has expanded its audio offering with real-time streaming over WebSocket APIs for both Text-to-Speech (TTS) and Speech-to-Text (STT). New models include Orpheus 3B and Kokoro 82M for TTS and the Mistral AI Voxtral model for STT, and speaker diarization is also available. These features mark a shift toward interactive, low-latency audio applications.
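To illustrate what a streaming TTS exchange over a WebSocket might look like on the client side, here is a minimal Python sketch. It makes no network calls and does not reflect Together AI's actual wire format: the frame types (`tts.request`, `audio.chunk`, `done`), the field names, and the helper functions `build_tts_request` and `collect_audio` are all hypothetical, chosen only to show the general pattern of sending one request frame and assembling audio from incremental chunks.

```python
import base64
import json
from typing import Iterable


def build_tts_request(text: str, model: str, voice: str) -> str:
    """Serialize one request frame as JSON (hypothetical schema, not the real API)."""
    return json.dumps(
        {"type": "tts.request", "model": model, "voice": voice, "text": text}
    )


def collect_audio(frames: Iterable[str]) -> bytes:
    """Concatenate base64-encoded audio chunks until a 'done' frame arrives.

    The frame layout is an assumption made for illustration only.
    """
    audio = bytearray()
    for raw in frames:
        msg = json.loads(raw)
        if msg["type"] == "audio.chunk":
            # Each chunk can be played or buffered as soon as it arrives,
            # which is what keeps perceived latency low.
            audio.extend(base64.b64decode(msg["data"]))
        elif msg["type"] == "done":
            break
    return bytes(audio)
```

In a real client, `frames` would be the messages received from the WebSocket connection, and each decoded chunk would typically be handed to an audio player immediately instead of being collected into one buffer.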
Affected Systems
- Date: not specified
- Change type: capability
- Severity: info