OpenAI releases GPT-4o audio models for transcription and text-to-speech
Action Required
No migration is described; developers can now use these new audio models to build real-time voice applications with improved accuracy and multilingual support, expanding the use cases for voice-based interaction.
AI Impact Summary
OpenAI has released new audio models: gpt-4o-transcribe and gpt-4o-mini-transcribe for speech-to-text, alongside a new text-to-speech model, accessible through the /audio and /realtime APIs. These models offer improved accuracy, robustness, and multilingual support, particularly in real-time scenarios. The release expands Azure OpenAI's audio processing capabilities and is valuable for applications such as customer support and virtual meetings.
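As a minimal sketch of how the /audio path is typically called, the snippet below builds (but does not send) a speech-to-text request for gpt-4o-transcribe, assuming the standard OpenAI REST shape: a multipart upload to the /v1/audio/transcriptions endpoint. The API key and audio bytes are placeholders, and the field names reflect the documented transcription API rather than anything stated in this announcement.

```python
import io
import urllib.request
import uuid

API_URL = "https://api.openai.com/v1/audio/transcriptions"

def build_transcription_request(api_key: str, audio_bytes: bytes,
                                filename: str = "audio.wav",
                                model: str = "gpt-4o-transcribe") -> urllib.request.Request:
    """Build (but do not send) a multipart speech-to-text request."""
    boundary = uuid.uuid4().hex
    buf = io.BytesIO()
    # "model" form field selects the new transcription model
    buf.write(f"--{boundary}\r\n".encode())
    buf.write(b'Content-Disposition: form-data; name="model"\r\n\r\n')
    buf.write(model.encode() + b"\r\n")
    # "file" form field carries the raw audio
    buf.write(f"--{boundary}\r\n".encode())
    buf.write(
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'.encode()
    )
    buf.write(b"Content-Type: audio/wav\r\n\r\n")
    buf.write(audio_bytes + b"\r\n")
    buf.write(f"--{boundary}--\r\n".encode())
    return urllib.request.Request(
        API_URL,
        data=buf.getvalue(),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
        method="POST",
    )

# Placeholder key and dummy bytes; a real call would pass a credential and
# file contents, then send the request with urllib.request.urlopen.
req = build_transcription_request("sk-placeholder", b"\x00" * 16)
print(req.full_url)
print(req.get_method())
```

Real-time use would instead go through the /realtime API over a streaming connection; the one-shot endpoint above is the simpler batch path.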
Affected Systems
- Date: not specified
- Change type: capability
- Severity: high