OpenAI Consistency Diffusion Language Models (CDLM) - Up to 14.5x Faster Inference
Action Required
Evaluate CDLM for text generation workloads: it can significantly reduce the cost and latency of generation, enabling faster application development and improved user experiences.
AI Impact Summary
This release introduces Consistency Diffusion Language Models (CDLM), a new training recipe that dramatically improves inference speed for diffusion language models by up to 14.5x without sacrificing quality. CDLM achieves this through block-wise KV caching and trajectory-consistent step reduction, enabling parallel generation and significantly reducing the number of refinement steps required. This is a capability update that will benefit applications requiring fast and efficient text generation, particularly those dealing with complex tasks like reasoning and math.
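To make the two mechanisms concrete, below is a minimal toy sketch of block-wise generation with few refinement steps per block. This is an illustration only, not the CDLM implementation: the function and parameter names (`generate_blockwise`, `denoise_step`, `steps_per_block`, the `MASK` token) are hypothetical, freezing completed blocks stands in for block-wise KV caching, and the small fixed step budget stands in for trajectory-consistent step reduction.

```python
import random

MASK = "<mask>"

def denoise_step(block, vocab, rng):
    # Toy "denoiser": fill every still-masked position in parallel.
    # A real diffusion LM would predict tokens from model logits instead.
    return [tok if tok != MASK else rng.choice(vocab) for tok in block]

def generate_blockwise(num_blocks, block_size, steps_per_block, vocab, seed=0):
    """Generate tokens block by block, left to right.

    Completed blocks are frozen and never revisited -- standing in for
    block-wise KV caching, where their attention states are computed once
    and reused. Within each block, a small fixed number of refinement
    steps stands in for trajectory-consistent step reduction (a few
    steps instead of many).
    """
    rng = random.Random(seed)
    output = []  # frozen ("KV-cached") blocks accumulate here
    for _ in range(num_blocks):
        block = [MASK] * block_size
        for _ in range(steps_per_block):
            block = denoise_step(block, vocab, rng)
            if MASK not in block:
                break  # block fully denoised; stop refining early
        output.extend(block)
    return output

tokens = generate_blockwise(num_blocks=3, block_size=4, steps_per_block=2,
                            vocab=["a", "b", "c"])
print(tokens)
```

Because each block is refined in only a few parallel steps and earlier blocks are never recomputed, total work scales with blocks rather than with a long per-token refinement trajectory, which is the intuition behind the reported speedup.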
Affected Systems
- Date: not specified
- Change type: capability
- Severity: high