OpenVINO acceleration for Stable Diffusion on Sapphire Rapids CPUs using OVStableDiffusionPipeline
AI Impact Summary
The post demonstrates hardware-accelerated Stable Diffusion inference on Sapphire Rapids CPUs using OpenVINO via OVStableDiffusionPipeline and Optimum Intel, complemented by memory and threading optimizations (jemalloc, libiomp, numactl) and optional IPEX BF16. It reports concrete per-image latency figures: ~32.3s baseline, ~16.7s with OpenVINO, and ~4.7s after fixing input shapes, with up to a ~10x speedup on Sapphire Rapids versus Ice Lake. For teams running CPU-only inference workloads, these techniques offer a clear migration path toward near-real-time or higher-throughput image generation; the recommended steps are to export the model with OVStableDiffusionPipeline and, if needed, enable IPEX and BF16 optimizations.
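The export and static-shape steps above can be sketched as follows. This is a minimal illustration of the Optimum Intel API, not code from the post: the model checkpoint, image size, and prompt are assumptions, and the `speedup` helper is a hypothetical convenience for relating the reported latencies.

```python
def speedup(baseline_s: float, optimized_s: float) -> float:
    """Latency speedup factor, e.g. ~32.3s baseline vs ~4.7s optimized."""
    return baseline_s / optimized_s

if __name__ == "__main__":
    # Import deferred so the sketch can be read without the dependency installed.
    from optimum.intel import OVStableDiffusionPipeline

    model_id = "runwayml/stable-diffusion-v1-5"  # assumed checkpoint
    # export=True converts the PyTorch checkpoint to OpenVINO IR on the fly.
    pipe = OVStableDiffusionPipeline.from_pretrained(model_id, export=True)

    # Fixing input shapes lets OpenVINO specialize the compiled graph,
    # which the post credits for the drop from ~16.7s to ~4.7s per image.
    pipe.reshape(batch_size=1, height=512, width=512, num_images_per_prompt=1)
    pipe.compile()

    image = pipe("a photo of an astronaut riding a horse").images[0]
    image.save("astronaut.png")
```

Reshaping to static dimensions trades flexibility for speed: the compiled pipeline will only accept the fixed resolution, so it suits fixed-format serving rather than ad hoc generation.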
Affected Systems
- Date: not specified
- Change type: capability
- Severity: info