Groq adds automatic prompt caching for openai/gpt-oss-120b
AI Impact Summary
Groq has introduced automatic prompt caching for the openai/gpt-oss-120b model, cutting the cost of cached tokens by 50% and improving performance. The feature detects shared prefixes between requests and skips reprocessing them, which reduces latency and raises effective rate limits. It is enabled automatically, so users benefit without any configuration changes.
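Because caching keys on the longest shared prefix between requests, placing stable content (system prompt, few-shot examples) before variable content maximizes cache hits. A minimal sketch of the idea; the helper and prompt strings below are illustrative, not part of Groq's API:

```python
def shared_prefix_len(a: str, b: str) -> int:
    """Length of the common leading substring of two prompts."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n

# Hypothetical stable system prompt reused across requests.
SYSTEM = "You are a support assistant. Follow the policy below.\n"

# Good: stable system prompt first, user-specific content last.
good_a = SYSTEM + "User question: how do I reset my password?"
good_b = SYSTEM + "User question: how do I close my account?"

# Bad: user-specific content first breaks the shared prefix immediately.
bad_a = "User: alice\n" + SYSTEM
bad_b = "User: bob\n" + SYSTEM

print(shared_prefix_len(good_a, good_b))  # large: the whole system prompt is cacheable
print(shared_prefix_len(bad_a, bad_b))    # small: the prefix diverges on the first line
```

The same principle applies regardless of provider: any per-user or per-request data placed early in the prompt defeats prefix-based caching for everything that follows it.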
Affected Systems
Business Impact
Users of the openai/gpt-oss-120b model will see lower costs and improved performance from automatic prompt caching, with no changes required on their side.
- Date: not specified
- Change type: capability
- Severity: medium