Hugging Face: TRL + PEFT enable RLHF fine-tuning of 20B LLMs on 24GB GPUs | SignalBreak