High | Capability

OpenAI Releases New Speech-to-Text and Text-to-Speech Models (GPT-4o, GPT-Realtime-1.5, GPT-Audio-1.5)

Action Required

To benefit from the improved accuracy, performance, and new features, developers using OpenAI's speech-to-text and text-to-speech models must migrate to the new versions. Migrating also ensures continued functionality of voice-based applications.

AI Impact Summary

OpenAI has released significant updates across several models, including GPT-Realtime-1.5, GPT-Audio-1.5, GPT-image-1.5, and GPT-4o. The speech-to-text and text-to-speech updates focus on improved transcription accuracy, multilingual performance, and more natural speech synthesis, with key advancements in real-time audio processing, speaker diarization, and new voice models. Developers should migrate to the new models to take advantage of these capabilities and the lower word error rates.

Affected Systems

GPT-4o

Date: not specified
Change type: capability
Severity: high
