
Liger GRPO training with Qwen2.5-0.5B-Instruct encounters shape mismatch

AI Impact Summary

A shape mismatch occurs during Liger GRPO training of Qwen/Qwen2.5-0.5B-Instruct with DeepSpeed ZeRO-3 and bf16 precision. The tensor shapes fail to align during the forward pass inside the `compute_liger_loss` function, which suggests a problem in the data or model configuration, or in how LigerFusedLinearGRPOFunction is invoked. Resolving the incompatibility requires checking the data preprocessing, the model architecture, and the training loop itself.
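One way to narrow down a mismatch like this is to compare the shapes going into the fused linear loss before the call is made. The sketch below is a minimal, hypothetical helper (`check_fused_linear_shapes` is not part of TRL, Liger Kernel, or DeepSpeed) that works on plain shape tuples, for example `tuple(tensor.shape)`. It assumes the loss receives hidden states of shape (..., hidden_size) and an output-projection weight of shape (vocab_size, hidden_size). It also assumes, as an unconfirmed hypothesis for this report, that a weight partitioned under ZeRO-3 can show up with a placeholder shape such as (0,) when it has not been gathered.

```python
def check_fused_linear_shapes(hidden_shape, weight_shape, bias_shape=None):
    """Return a list of problems; an empty list means the shapes are compatible.

    Hypothetical diagnostic helper, not a library API.
    hidden_shape: shape of the last hidden states, e.g. (batch, seq_len, hidden_size)
    weight_shape: shape of the output projection, expected (vocab_size, hidden_size)
    bias_shape:   optional bias shape, expected (vocab_size,)
    """
    problems = []
    hidden_shape = tuple(hidden_shape)
    weight_shape = tuple(weight_shape)

    if len(hidden_shape) < 2:
        problems.append(
            f"hidden states have shape {hidden_shape}, expected at least 2 dims"
        )

    if len(weight_shape) != 2:
        # Assumption: a 1-D or empty shape here may mean the parameter is
        # partitioned across ranks and was not gathered before the loss call.
        problems.append(
            f"weight has shape {weight_shape}, expected (vocab_size, hidden_size)"
        )
    elif len(hidden_shape) >= 2 and hidden_shape[-1] != weight_shape[1]:
        problems.append(
            f"hidden size {hidden_shape[-1]} does not match weight dim {weight_shape[1]}"
        )

    if bias_shape is not None and len(weight_shape) == 2:
        if tuple(bias_shape) != (weight_shape[0],):
            problems.append(
                f"bias has shape {tuple(bias_shape)}, expected ({weight_shape[0]},)"
            )

    return problems


if __name__ == "__main__":
    # Illustrative shapes only; these are not taken from the report.
    for issue in check_fused_linear_shapes((2, 5, 8), (0,)):
        print(issue)
```

Logging the result on every rank just before the loss is computed would show whether the weight or the hidden states carry the unexpected shape, which in turn indicates whether to look at the model wrapping or at the data pipeline.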

Affected Systems

Qwen/Qwen2.5-0.5B-Instruct
deepspeed

Date: not specified
Change type: capability
Severity: info
