Hugging Face: Fine-tune Stable Diffusion with DDPO via TRL using DDPOTrainer | SignalBreak