
OpenAI launches FutureBench: AI agent benchmark for forecasting real-world events

Action Required

Organizations that rely on AI-driven forecasting and decision-making should evaluate FutureBench, and consider adopting it, to assess and improve their agents' predictive capabilities.

AI Impact Summary

OpenAI is introducing FutureBench, a benchmark designed to evaluate how well AI agents predict real-world events such as market movements and geopolitical developments. This marks a shift from traditional benchmarks, which rely on static datasets or pattern matching: FutureBench emphasizes genuine reasoning and forecasting. Its use of live news and prediction markets provides a verifiable, time-stamped measure of model performance, which addresses limitations of current evaluation methods.

Models affected

Date: not specified
Change type: capability
Severity: high

