Hugging Face: OpenAI releases Gaia2 benchmark and ARE framework for agent evaluation | SignalBreak