Groq
inference_host
82 signals tracked
Chat Completions: New Citation Options Parameter Released
Chat completions now support optional citation_options parameter, allowing users to enable/disable citations when model retrieves information from documents or web searches.
Date not specified
MediumCapabilityGroqCloud Status provides real-time system health monitoring
Date not specified
InfoCapabilityGroqCloud Status Page Launched
Date not specified
InfoCapabilityGroqCloud Status provides real-time system health monitoring
Date not specified
InfoCapabilityNew Chat Completion Configuration Options Added
New parameters introduced including citation_options, compound_custom, tool_choice, documents, and include_reasoning, expanding configuration flexibility for chat interactions.
Date not specified
MediumCapabilityGroqCloud Status: Real-time System Health Monitoring
Date not specified
InfoCapabilityGroq adds automatic prompt caching for openai/gpt-oss-120b
Groq has introduced automatic prompt caching for the openai/gpt-oss-120b model, providing 50% cost savings on cached input tokens, lower latency, and higher effective rate limits. The feature requires zero setup and automatically activates when requests share common prefixes with recent requests. Users will automatically benefit from this optimization without any required action.
1 Dec 2025
MediumCapabilityGroq SDK updates: Prompt caching and citation support added
Python SDK updated to v0.33.0 and TypeScript SDK to v0.34.0 with enhanced prompt caching and new annotation/citation features for chat completions. These are additive improvements that don't require immediate action but may benefit applications using chat functionality.
1 Dec 2025
MediumCapabilityGroq adds Google Workspace integration via MCP Connectors
Groq has introduced pre-built MCP Connectors that provide zero-configuration integration with Google Workspace applications including Gmail, Google Calendar, and Google Drive. These connectors use OAuth 2.0 authentication and are compatible with existing OpenAI Responses API workflows. No action required - this is a new optional feature that simplifies integration without needing custom MCP server development.
1 Dec 2025
InfoCapabilityGroq releases GPT-OSS-Safeguard 20B - open weight reasoning model
Groq launched GPT-OSS-Safeguard 20B, their first open weight reasoning model designed for Trust & Safety content moderation tasks. The model features 131K token context window, prompt caching for cost savings, and supports customizable policy-based classification. This enables organizations to implement bring-your-own-policy content moderation with structured reasoning capabilities.
1 Dec 2025
MediumCapabilityGroq adds Google Workspace integration via MCP Connectors
Groq introduced pre-built MCP Connectors that provide zero-configuration integration with Google Workspace services including Gmail, Google Calendar, and Google Drive. The connectors use OAuth 2.0 authentication and are compatible with existing OpenAI Responses API workflows. This is a new feature addition that enhances integration capabilities without requiring any changes to existing implementations.
1 Dec 2025
MediumCapabilityGroq releases GPT-OSS-Safeguard 20B - new open weight reasoning model
Groq launched GPT-OSS-Safeguard 20B, their first open weight reasoning model designed for Trust & Safety content moderation tasks. The model features a 131K token context window, custom policy support, and structured reasoning capabilities for automated content classification. This is a new feature release that expands Groq's model offerings for safety applications.
29 Oct 2025
MediumCapabilityGroq SDK updates: Prompt caching and annotation support added
Python SDK updated to v0.33.0 and TypeScript SDK to v0.34.0 with enhanced prompt caching capabilities and new annotation/citation support for chat completion messages. These are additive features that improve functionality without requiring immediate action from existing users.
21 Oct 2025
MediumCapabilityGroq introduces automatic prompt caching for gpt-oss-120b
Groq has introduced automatic prompt caching for the openai/gpt-oss-120b model, providing 50% cost reduction on cached input tokens and improved performance through lower latency and higher effective rate limits. The feature requires zero setup and automatically activates when requests share common prefixes with recent requests. This is a beneficial enhancement that users can immediately leverage without any configuration changes.
21 Oct 2025
MediumCapabilityGroq adds automatic prompt caching to openai/gpt-oss-20b
Groq has introduced automatic prompt caching for the openai/gpt-oss-20b model, providing 50% cost reduction on cached input tokens and improved response times through computation reuse. The feature requires zero setup and automatically activates when requests share common prefixes with recent requests. Users will automatically benefit from this optimization without any required action.
25 Sept 2025
MediumCapabilityGroq adds Remote Model Context Protocol (MCP) server integration in Beta
Groq has launched Beta support for Remote MCP server integration on GroqCloud, enabling AI models to connect to thousands of external tools through Anthropic's open MCP standard. The implementation is fully compatible with OpenAI's APIs, allowing developers to migrate from OpenAI to Groq without code changes while benefiting from faster execution and lower costs. This is a new feature addition that enhances existing capabilities without requiring immediate action.
23 Sept 2025
InfoCapabilityGroq adds Kimi K2-0905 model with 256K context window
Groq has launched Moonshot AI's Kimi K2-0905 model on GroqCloud, featuring the largest context window available (256K tokens) and prompt caching capabilities for up to 50% cost savings. The model offers enhanced agentic coding capabilities and improved frontend development performance at competitive pricing of $1.50/M tokens blended. This is a new model addition that expands available capabilities without requiring any changes to existing implementations.
5 Sept 2025
MediumCapabilityGroq SDK updated with OpenAI compatibility and Compound tools
Python SDK updated to v0.31.1 and TypeScript SDK to v0.32.0 with better OpenAI message type compatibility and bug fixes. Added support for new Groq Compound tools including Wolfram Alpha, Browser Automation, and Visit Website functionality. These are enhancement updates that improve functionality without requiring immediate action.
4 Sept 2025
MediumCapabilityGroq launches Compound model with enhanced accuracy and agentic tools
Groq released their new Compound model built on GPT-OSS-120B and Llama, delivering significantly improved performance with built-in server-side tools including web search, code execution, and browser automation. The model achieves general availability with increased rate limits and outperforms competing systems on key benchmarks. No immediate action required - this is a new capability enhancement for users.
4 Sept 2025
MediumCapabilityGroq launches automatic prompt caching for Kimi K2 model
Groq has introduced automatic prompt caching that reuses computation from recent requests sharing common prefixes, reducing latency and token costs by 50% for cached portions. The feature works automatically on all API requests with no code changes required and no additional fees, with cached data expiring within hours for privacy. Additional model support is planned for future rollout.
20 Aug 2025
MediumCapability
Get alerts for Groq
Never miss a breaking change. SignalBreak monitors Groq and dozens of other AI providers in real time.
Sign up free — no credit card required