Together AI delivers industry-leading inference speeds for DeepSeek-R1-0528 on NVIDIA Blackwell
Action Required
Organizations running DeepSeek-R1-0528 should evaluate Together AI’s new Blackwell-based inference platform, which can reduce LLM inference costs and improve application response times.
AI Impact Summary
Together AI reports industry-leading inference speeds for DeepSeek-R1-0528 on NVIDIA Blackwell GPUs, using a new inference engine and optimized kernels. The result is a 2.3x to 2.8x speedup over H200 GPUs. This matters most for applications that depend on fast, efficient large language model inference, where it translates into shorter response times and lower operating costs.
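To put the quoted speedup range in concrete terms, the short sketch below converts a speedup factor into the fraction of generation time saved. It assumes the 2.3x to 2.8x figure applies uniformly to end-to-end generation time, which the summary does not confirm. The 10-second baseline and the `time_reduction` helper are hypothetical, chosen only for illustration.

```python
def time_reduction(speedup: float) -> float:
    """Fraction of time saved when the same work runs `speedup` times faster."""
    if speedup <= 0:
        raise ValueError("speedup must be positive")
    return 1.0 - 1.0 / speedup


# Speedup range quoted in the summary (Blackwell vs. H200).
SPEEDUP_LOW, SPEEDUP_HIGH = 2.3, 2.8

# Hypothetical baseline: a response that takes 10 s to generate on H200.
BASELINE_SECONDS = 10.0

for s in (SPEEDUP_LOW, SPEEDUP_HIGH):
    new_time = BASELINE_SECONDS / s
    print(f"{s}x speedup: {new_time:.2f} s, {time_reduction(s):.1%} less time")
```

Under these assumptions, the same response would take about 4.35 s to 3.57 s, a time reduction of roughly 56.5% to 64.3%. Cost savings depend on per-GPU pricing as well as speed, so they need not track these percentages.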
Affected Systems
- Date: not specified
- Change type: capability
- Severity: high