OpenAI releases GPT-4o audio models for transcription and text-to-speech
Action Required
No migration is described; developers can now use these new audio models to build real-time voice applications with improved accuracy and multilingual support, expanding the use cases for voice-based interaction.
AI Impact Summary
OpenAI has released new audio models: gpt-4o-transcribe and gpt-4o-mini-transcribe for speech-to-text, alongside a new text-to-speech model, accessible through the /audio and /realtime APIs. These models offer improved accuracy, robustness, and multilingual support, particularly in real-time scenarios. The release expands Azure OpenAI's audio processing capabilities and is valuable for applications such as customer support and virtual meetings.
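As a minimal sketch of how the /audio path is typically called, the snippet below builds (but does not send) a speech-to-text request for gpt-4o-transcribe, assuming the standard OpenAI REST shape: a multipart upload to the /v1/audio/transcriptions endpoint. The API key and audio bytes are placeholders, and the field names reflect the documented transcription API rather than anything stated in this announcement.

```python
import io
import urllib.request
import uuid

API_URL = "https://api.openai.com/v1/audio/transcriptions"

def build_transcription_request(api_key: str, audio_bytes: bytes,
                                filename: str = "audio.wav",
                                model: str = "gpt-4o-transcribe") -> urllib.request.Request:
    """Build (but do not send) a multipart speech-to-text request."""
    boundary = uuid.uuid4().hex
    buf = io.BytesIO()
    # "model" form field selects the new transcription model
    buf.write(f"--{boundary}\r\n".encode())
    buf.write(b'Content-Disposition: form-data; name="model"\r\n\r\n')
    buf.write(model.encode() + b"\r\n")
    # "file" form field carries the raw audio
    buf.write(f"--{boundary}\r\n".encode())
    buf.write(
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'.encode()
    )
    buf.write(b"Content-Type: audio/wav\r\n\r\n")
    buf.write(audio_bytes + b"\r\n")
    buf.write(f"--{boundary}--\r\n".encode())
    return urllib.request.Request(
        API_URL,
        data=buf.getvalue(),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
        method="POST",
    )

# Placeholder key and dummy bytes; a real call would pass a credential and
# file contents, then send the request with urllib.request.urlopen.
req = build_transcription_request("sk-placeholder", b"\x00" * 16)
print(req.full_url)
print(req.get_method())
```

Real-time use would instead go through the /realtime API over a streaming connection; the one-shot endpoint above is the simpler batch path.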
Affected Systems
- Date: not specified
- Change type: capability
- Severity: high