Hugging Face: Build video generation datasets with Florence-2 and yt-dlp | SignalBreak