OpenAI's GPT-5 Launch Dominates: Week of 28 July 2025

28 Jul 2025 – 4 Aug 2025 · 5 min read

OpenAI has unleashed its GPT-5 model family this week, fundamentally shifting the competitive landscape with four new variants now available across Azure platforms. Meanwhile, Perplexity's sudden deprecation of its R1-1776 model serves as a stark reminder that even established models can vanish overnight, forcing immediate migration decisions.

The Big Moves

OpenAI's GPT-5 Family Arrives with Azure Integration

The launch of GPT-5, GPT-5-mini, GPT-5-nano, and GPT-5-chat represents OpenAI's most significant model release since GPT-4. What makes this particularly compelling isn't just the performance improvements, but the immediate availability through Azure AI Foundry's Model Router. This means existing Azure customers can access GPT-5 capabilities through their current Completions API without code changes, eliminating the typical integration friction that accompanies major model updates.

The strategic implications are considerable. GPT-5-nano and GPT-5-mini offer cost-optimised alternatives for applications that don't require the full capabilities of the flagship model, whilst GPT-5-chat appears positioned for conversational applications. This tiered approach suggests OpenAI is responding to enterprise demands for more granular cost control and performance optimisation.

For organisations currently running GPT-4 workloads on Azure, the migration path is refreshingly straightforward. The Model Router handles the complexity, allowing teams to evaluate GPT-5 performance against their existing implementations without architectural changes. However, pricing details remain unclear, and early adopters should monitor costs carefully as they transition from established GPT-4 pricing models.
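One way to run that evaluation is a small side-by-side harness that sends the same prompts to each deployment and records outputs and latency. The sketch below is illustrative only: `call_model` is a hypothetical callable you would wrap around whatever client your application already uses, and the deployment names are placeholders, not identifiers confirmed by Azure or OpenAI.

```python
import time
from statistics import mean
from typing import Callable, Dict, List, Sequence


def compare_models(
    prompts: Sequence[str],
    models: Sequence[str],
    call_model: Callable[[str, str], str],
    clock: Callable[[], float] = time.perf_counter,
) -> Dict[str, dict]:
    """Send every prompt to every model; collect outputs and mean latency."""
    report: Dict[str, dict] = {}
    for model in models:
        latencies: List[float] = []
        outputs: List[str] = []
        for prompt in prompts:
            start = clock()
            outputs.append(call_model(model, prompt))
            latencies.append(clock() - start)
        report[model] = {
            # Guard against an empty prompt list, where mean() would raise.
            "mean_latency_s": mean(latencies) if latencies else 0.0,
            "outputs": outputs,
        }
    return report


def fake_call(model: str, prompt: str) -> str:
    """Stand-in for a real client call; replace with your own wrapper."""
    return f"[{model}] {prompt}"


if __name__ == "__main__":
    # Hypothetical deployment names, for illustration only.
    result = compare_models(
        ["Summarise this ticket."],
        ["gpt-4-deployment", "gpt-5-deployment"],
        fake_call,
    )
    for name, stats in result.items():
        print(name, round(stats["mean_latency_s"], 6), stats["outputs"])
```

Injecting `call_model` keeps the harness independent of any particular SDK, so the same loop can be reused when pricing or quality checks are added alongside latency.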

Azure's Real-Time Audio and Video Capabilities Reach GA

Azure AI Foundry's general availability of GPT RealTime and Audio models marks a significant maturation of real-time AI capabilities. The inclusion of GPT-Realtime-1.5, GPT-Audio-1.5, and GPT-image-1.5 alongside SIP support and improved transcription accuracy addresses enterprise concerns about production readiness for real-time applications.

The SIP integration is particularly noteworthy for organisations building telephony applications. Previously, integrating real-time AI with existing phone systems required complex middleware solutions. This native support could accelerate adoption in customer service, sales, and support applications where voice interaction quality directly impacts business outcomes.

Sora's new image-to-video generation capability, available in Sweden Central and East US 2, adds another dimension to Azure's creative AI offerings. The ability to specify frame positioning provides the control that professional video creators demand, though the regional limitations suggest this remains a capacity-constrained feature.

Perplexity's R1-1776 Model Deprecation Forces Immediate Action

Perplexity's permanent removal of the R1-1776 model effective 1 August represents the week's most disruptive change. Unlike typical deprecations that offer extended transition periods, this removal is immediate and final. Applications relying on R1-1776 will simply stop working, forcing emergency migrations to Sonar Pro Reasoning or alternative models.

This deprecation highlights a broader trend towards model consolidation as providers focus resources on their most capable offerings. R1-1776's lack of recent improvements and missing support for newer features made it a maintenance burden rather than a competitive advantage. However, the abrupt timeline suggests either technical issues or strategic decisions that prioritised resource allocation over customer convenience.

Organisations affected by this change should treat it as a wake-up call about dependency management. Relying on a single model from a single provider without fallback options creates unnecessary business risk, particularly as the AI landscape continues to evolve rapidly.
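A fallback chain is one simple way to reduce that risk: try the preferred model first and move to the next candidate when a call fails. The following is a minimal sketch under stated assumptions. `call_model` is a hypothetical callable wrapping your own provider clients, and the model identifier strings are placeholders rather than confirmed API names.

```python
from typing import Callable, Sequence, Tuple


class AllModelsFailed(RuntimeError):
    """Raised when every model in the fallback chain has failed."""


def complete_with_fallback(
    prompt: str,
    models: Sequence[str],
    call_model: Callable[[str, str], str],
) -> Tuple[str, str]:
    """Try each model in order; return (model_used, response_text)."""
    errors = []
    for model in models:
        try:
            return model, call_model(model, prompt)
        except Exception as exc:  # a real system would catch narrower errors
            errors.append(f"{model}: {exc}")
    raise AllModelsFailed("; ".join(errors))


def fake_call(model: str, prompt: str) -> str:
    """Stand-in client that simulates a removed model."""
    if model == "r1-1776":  # placeholder identifier for the retired model
        raise LookupError("model not found")
    return f"[{model}] {prompt}"


if __name__ == "__main__":
    # Placeholder identifiers; check your provider's documentation for real ones.
    used, text = complete_with_fallback(
        "Hello", ["r1-1776", "sonar-reasoning-pro"], fake_call
    )
    print(used, text)
```

Listing candidates from different providers in `models` extends the same pattern to cross-provider failover, at the cost of maintaining a client wrapper for each.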

Worth Watching

OpenSearch Serverless Gains Disaster Recovery

Amazon's introduction of automatic hourly snapshots for OpenSearch Serverless addresses a critical gap in enterprise data protection. The elimination of manual configuration removes a common source of backup failures, whilst point-in-time recovery capabilities provide the granular control that compliance requirements often demand. For organisations running critical search infrastructure, this represents essential operational maturity rather than optional enhancement.

Cohere Enters Multimodal Territory

Cohere's Command A Vision launch marks the company's entry into multimodal processing, combining text and vision capabilities within their existing API framework. The seamless integration with the current Command API reduces adoption friction, though the competitive landscape already includes established players like GPT-4V and Google's multimodal offerings. Cohere's enterprise focus could differentiate this offering, particularly for organisations seeking alternatives to the dominant providers.

Together AI Expands Safety and Evaluation Capabilities

Together AI's integration of VirtueGuard AI security and the launch of its Evaluations Framework demonstrate a growing enterprise focus on AI governance. VirtueGuard's 8 ms response time addresses performance concerns that have limited real-time safety monitoring adoption, whilst the benchmarking framework provides systematic model comparison capabilities. These tools become increasingly valuable as organisations deploy AI in production environments where safety and performance directly impact business outcomes.

Elastic's Major Platform Updates

Elastic's simultaneous release of versions 9.1 and 8.19 introduces BBQ (Better Binary Quantization) by default, general availability of ES|QL with cross-cluster search (CCS), and Azure AI Foundry integration. The BBQ performance improvements and simplified data models address long-standing scalability concerns, whilst the Azure integration positions Elastic as a monitoring solution for AI model deployments. These updates provide immediate value for existing users whilst expanding Elastic's relevance in AI infrastructure monitoring.

Quick Hits

Google's Veo 3 models reach general availability on Vertex AI, adding video generation capabilities with a cost-effective Veo 3.1 Lite option. Mistral's Codestral 2508 offers updated code generation through the codestral-2508 endpoint. Replicate enhances platform usability with improved search, enterprise features, and playground filtering. Elasticsearch 8.19.0 and 9.1.0 deliver standard feature releases with performance improvements.

The Week Ahead

The 1 August deadline for Perplexity's R1-1776 deprecation will test how quickly affected organisations can execute emergency migrations. Watch for performance comparisons between GPT-5 variants as early adopters share initial benchmarks. Azure's expanding real-time capabilities warrant monitoring for pricing announcements and regional availability updates.

Google's Veo 3 GA status suggests increased competition in video generation, potentially pressuring other providers to accelerate their own video AI roadmaps. The convergence of safety monitoring tools like VirtueGuard with evaluation frameworks indicates growing enterprise demand for comprehensive AI governance solutions.

Expect continued model consolidation across providers as they focus resources on their most competitive offerings. The R1-1776 deprecation won't be the last sudden model retirement as the industry matures and providers optimise their portfolios for profitability rather than breadth.