Structured CodeAgents: JSON-guided code actions improve tool use on OpenAI function calling API and capable models (Claude, Qwen)

AI Impact Summary

Researchers show that forcing LLMs to emit a JSON object containing thoughts and executable Python code improves tool orchestration and accuracy on the GAIA, MATH, SimpleQA, and Frames benchmarks compared with standard CodeAgent and function-calling approaches. Using smolagents to parse a strict thoughts/code JSON object eliminates markdown and code-block parsing errors and enforces planning before execution, which reduces cascading failures. The benefits appear strongest with capable models (32B+ parameters or frontier models); smaller models (e.g., mistralai/Mistral-7B-Instruct-v0.3) may pay a 'structure tax' that degrades performance. Migration requires updating the agent’s output format to put thoughts and code in a single JSON blob and adjusting tool-invocation pipelines to parse that blob and execute the code safely.
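The parsing step described above can be sketched with the Python standard library alone. This is a minimal illustration, not the smolagents implementation: the key names "thoughts" and "code" follow the summary, while StructuredAction and parse_structured_action are hypothetical names introduced here. The sketch only validates the reply; actually running the code is assumed to happen elsewhere, in a sandboxed executor.

```python
import json
from dataclasses import dataclass


@dataclass
class StructuredAction:
    """One agent step: the model's reasoning plus the code it wants to run."""
    thoughts: str
    code: str


def parse_structured_action(raw: str) -> StructuredAction:
    """Parse a model reply expected to be a JSON object with 'thoughts' and 'code'.

    Hypothetical helper: raises ValueError on any reply that does not match
    the expected shape, so the caller can retry instead of executing garbage.
    """
    try:
        payload = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"reply is not valid JSON: {exc}") from exc

    if not isinstance(payload, dict):
        raise ValueError("reply must be a JSON object")

    for key in ("thoughts", "code"):
        value = payload.get(key)
        if not isinstance(value, str) or not value.strip():
            raise ValueError(f"missing or empty string field: {key!r}")

    # Check that the code at least compiles before passing it on.
    # compile() does not execute anything; execution belongs in a sandbox.
    try:
        compile(payload["code"], "<agent-action>", "exec")
    except SyntaxError as exc:
        raise ValueError(f"code field is not valid Python: {exc}") from exc

    return StructuredAction(thoughts=payload["thoughts"], code=payload["code"])


if __name__ == "__main__":
    reply = '{"thoughts": "Add the two numbers.", "code": "result = 2 + 3"}'
    action = parse_structured_action(reply)
    print(action.thoughts)
    print(action.code)
```

Rejecting a malformed reply before any code runs is one way to keep a single bad step from cascading; a real pipeline would likely feed the error message back to the model and ask it to retry.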

Affected Systems

CodeAgent

Date: not specified
Change type: capability
Severity: info
