SWE-bench Verified discontinued — migrate to SWE-bench Pro
Action Required
Users who rely on SWE-bench Verified to evaluate coding performance must migrate to SWE-bench Pro, because results from SWE-bench Verified are no longer considered reliable.
AI Impact Summary
We are discontinuing use of SWE-bench Verified because it no longer measures coding progress reliably. The benchmark has been found to contain flawed tests and to exhibit training leakage, both of which produce misleading results. Users should adopt SWE-bench Pro instead, which offers a more robust and trustworthy evaluation.
Affected Systems
- Date: not specified
- Change type: other
- Severity: high