v1.9.0: Trackio support, faster Diffusers loading, FP8/FSDP improvements, batch-size back-off change
AI Impact Summary
Trackio integration adds a Gradio dashboard for experiment tracking that can run locally or be hosted on Spaces, improving observability and reproducibility for ML experiments run with accelerate. Diffusers model loading is now 4–5x faster, thanks to changes in set_module_tensor_to_device: transfers use non_blocking copies with a single synchronization before control returns to the user, and a new clear_device option lets you skip clearing the device cache on each tensor assignment to shave startup time. FSDP, DeepSpeed, and FP8 improvements, plus e5m2 support and hybrid defaults, strengthen distributed training stability and performance on multi-GPU setups. One breaking change affects batch-size back-off: after an out-of-memory (OOM) error, find_executable_batch_size() now multiplies the batch size by 0.9 instead of halving it. Retries therefore settle on a larger batch size but may need more attempts to find one that fits, which can require hyperparameter retuning and resource planning.
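To make the back-off change concrete, here is a minimal pure-Python sketch comparing the two schedules. It does not call accelerate; the helper `backoff_schedule`, the starting batch size of 128, the assumed memory limit of 100, and the truncation to an integer are all illustrative assumptions, not the library's actual implementation.

```python
def backoff_schedule(start, factor, fits_at):
    """Return the batch sizes tried until one is <= fits_at.

    Hypothetical helper: `fits_at` stands in for the largest batch size
    that does not trigger an OOM on the (imaginary) device.
    """
    sizes = []
    batch_size = start
    while batch_size > 0:
        sizes.append(batch_size)
        if batch_size <= fits_at:
            break  # this attempt "fits", so the search stops
        # Assumption: the new size is truncated to an integer.
        batch_size = int(batch_size * factor)
    return sizes


halving = backoff_schedule(128, 0.5, fits_at=100)  # old behaviour
gentler = backoff_schedule(128, 0.9, fits_at=100)  # behaviour described for v1.9.0

print(halving)  # [128, 64]
print(gentler)  # [128, 115, 103, 92]
```

Under these assumptions, halving lands on 64 after one retry, while the 0.9 factor lands on 92 after three. Code that relied on the resulting batch size being a power of two, or on a small number of retries, is where retuning is most likely to be needed.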
Affected Systems
- Date: not specified
- Change type: capability
- Severity: info