Vertex AI Model Garden adds Deploy from Hugging Face to deploy open LLMs to Vertex AI or GKE
AI Impact Summary
Google Cloud launches Deploy on Google Cloud integrated with the Hugging Face Hub to deploy thousands of foundation models as API Endpoints on Vertex AI or GKE. The rollout relies on Vertex AI Model Garden and Text Generation Inference to prefill deployment configs and enable one-click endpoints, starting with Zephyr Gemma. This dramatically reduces the ops overhead for productionizing open LLMs within Google Cloud, shortening the path from model discovery to live service. Enterprises should plan for model governance, access for gated models, and cost/performance management given the scale of open-model deployments.
Affected Systems
- Date
- Date not specified
- Change type
- capability
- Severity
- info