Optimum + ONNX Runtime Training accelerates Hugging Face model training (up to 130% with DeepSpeed)
AI Impact Summary
Hugging Face Optimum now integrates ONNX Runtime Training to accelerate fine-tuning of large language, speech, and vision models, delivering speedups of 35% or more, and up to 130% when combined with DeepSpeed ZeRO Stage 1. The gains come from memory and compute optimizations (memory planning, kernel optimizations, multi-tensor apply for the Adam optimizer, FP16 mixed precision, and graph fusions), exposed through the ORTTrainer/ORTTrainingArguments API, which composes with DeepSpeed and eases hardware utilization across NVIDIA and AMD GPUs. For teams, this means faster training cycles with a clear migration path (Trainer to ORTTrainer, TrainingArguments to ORTTrainingArguments) and straightforward export to ONNX after training.
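The migration path above amounts to swapping two class names in an existing fine-tuning script. A minimal sketch, assuming the `ORTTrainer`/`ORTTrainingArguments` names from `optimum.onnxruntime` as described in the summary; the model, dataset, batch size, and DeepSpeed config path are illustrative placeholders, not from the source:

```python
# Sketch of migrating a transformers fine-tuning script to ONNX Runtime
# Training via Optimum. Only the two trainer classes change; the rest of
# the script stays the same. Placeholders: output_dir, batch size, and the
# DeepSpeed config filename are hypothetical examples.

def fine_tune(model, train_dataset, use_ort: bool = True):
    """Build and run a trainer; only the two class names differ."""
    if use_ort:
        # ONNX Runtime-backed drop-in replacements from Optimum
        from optimum.onnxruntime import ORTTrainer as Trainer
        from optimum.onnxruntime import ORTTrainingArguments as TrainingArguments
    else:
        from transformers import Trainer, TrainingArguments

    args = TrainingArguments(
        output_dir="./out",                  # placeholder path
        fp16=True,                           # mixed precision, one of the cited optimizations
        per_device_train_batch_size=8,       # placeholder value
        deepspeed="ds_config_zero1.json",    # hypothetical ZeRO-1 config file
    )
    trainer = Trainer(model=model, args=args, train_dataset=train_dataset)
    trainer.train()
    return trainer
```

Because the imports happen inside the function, the same script can toggle between the stock `Trainer` and the ONNX Runtime-backed one for A/B comparison.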
Affected Systems
- Date: not specified
- Change type: capability
- Severity: info