OpenAI Releases New Speech-to-Text and Text-to-Speech Models (GPT-4o, GPT-Realtime-1.5, GPT-Audio-1.5)
Action Required
Developers using OpenAI's speech-to-text and text-to-speech models must migrate to the new versions to get the improved accuracy, performance, and new features, and to keep voice-based applications working.
AI Impact Summary
OpenAI has released several updates to its speech-to-text and text-to-speech models, covering GPT-Realtime-1.5, GPT-Audio-1.5, GPT-image-1.5, and GPT-4o models. The updates improve transcription accuracy, multilingual performance, and the naturalness of synthesized speech, with advances in real-time audio processing, diarization, and new voice models. Developers should migrate to these models to take advantage of the enhanced capabilities and lower word error rates.
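Because the summary cites lower word error rates as a reason to migrate, it can help to measure that on your own audio before switching. The following is a minimal sketch, not an official evaluation method: it computes word error rate as word-level edit distance divided by the number of reference words. The function name `word_error_rate` and the sample transcripts are illustrative assumptions, and the lowercasing and whitespace tokenization are simplifications that ignore punctuation.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return (substitutions + deletions + insertions) / reference word count.

    Illustrative helper, not part of any OpenAI SDK. Tokenization is a plain
    lowercase whitespace split, so punctuation counts as part of a word.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # turning i reference words into an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion: reference word missing
                curr[j - 1] + 1,     # insertion: extra hypothesis word
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # Hypothetical transcripts of the same clip from an old and a new model.
    truth = "please schedule the meeting for nine thirty"
    old_output = "please schedule a meeting for nine"
    new_output = "please schedule the meeting for nine thirty"
    print(word_error_rate(truth, old_output))
    print(word_error_rate(truth, new_output))
```

Running the same reference transcripts through the current and the new model and comparing the two scores gives a rough, application-specific check of the accuracy claim before committing to the migration.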
Affected Systems
- Date: not specified
- Change type: capability
- Severity: high