Hugging Face: TRL: Introducing MPO, GRPO, and GSPO for Vision Language Model Alignment | SignalBreak