Intel Introduces AutoRound: Advanced Quantization for LLMs and VLMs
Action Required
Organizations can deploy larger, more complex LLMs and VLMs on Intel hardware with improved performance and a smaller resource footprint, enabling new use cases and lowering operational costs.
AI Impact Summary
Intel is introducing AutoRound, a new quantization tool designed to improve the efficiency of large language models (LLMs) and vision-language models (VLMs) by enabling low-bit quantization (INT2–INT8) with minimal accuracy loss. This capability is particularly valuable for deploying these models on resource-constrained devices. AutoRound also offers significant speed improvements over existing methods, quantizing a 72B model in just 37 minutes on a single A100 GPU. This release represents a significant advancement in model optimization for Intel's hardware ecosystem.
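The low-bit idea can be illustrated with a minimal round-to-nearest sketch. Note this is an illustrative toy, not AutoRound's actual algorithm (AutoRound additionally tunes the rounding to reduce accuracy loss); the `quantize_dequantize` helper below is a hypothetical name introduced here for the example.

```python
# Minimal sketch of symmetric low-bit weight quantization (round-to-nearest).
# Illustrates the INT2-INT8 trade-off only; AutoRound's real method is more
# sophisticated and preserves accuracy far better at the same bit width.

def quantize_dequantize(weights, bits):
    """Quantize floats to signed `bits`-wide integers, dequantize back,
    and return the reconstructed values plus the max absolute error."""
    qmax = 2 ** (bits - 1) - 1                      # e.g. 7 for INT4
    scale = max(abs(w) for w in weights) / qmax or 1.0
    q = [max(-qmax - 1, min(qmax, round(w / scale))) for w in weights]
    deq = [qi * scale for qi in q]
    err = max(abs(w - d) for w, d in zip(weights, deq))
    return deq, err

weights = [0.31, -0.94, 0.07, 0.55, -0.21]
for bits in (2, 4, 8):
    _, err = quantize_dequantize(weights, bits)
    print(f"INT{bits}: max reconstruction error = {err:.4f}")
```

Running this shows the reconstruction error shrinking as the bit width grows, which is why naive INT2 quantization degrades models badly and why tools like AutoRound, which optimize the rounding itself, matter at very low bit widths.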
Affected Systems
- Date: not specified
- Change type: capability
- Severity: high