Deploy on Google Cloud enables deployment of Hugging Face Hub LLMs to Vertex AI Model Garden and GKE
AI Impact Summary
Google Cloud is launching Deploy on Google Cloud, a Hugging Face Hub integration that lets you deploy thousands of open foundation models to Vertex AI or GKE. It creates production-ready endpoints directly from Hugging Face model cards or through Vertex AI Model Garden, which reduces the operational overhead of hosting and serving open models. The change enables rapid production deployments for generative AI apps, but it requires access governance for gated models (for example, token management) and careful security configuration in Vertex AI and GKE environments.
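The token-management concern for gated models can be illustrated with a minimal sketch. It is not the integration's actual deployment code: the helper names `build_container_env` and `redact` are hypothetical, and the variable names `MODEL_ID` and `HF_TOKEN` are assumptions about what a serving container might read. The idea is only that the token comes from the environment rather than from source code, is required only for gated models, and is masked before anything is logged.

```python
import os


def build_container_env(model_id, gated, environ=None):
    """Build environment variables for a hypothetical serving container.

    MODEL_ID and HF_TOKEN are assumed names, not confirmed by the source.
    """
    environ = os.environ if environ is None else environ
    env = {"MODEL_ID": model_id}
    if gated:
        # Read the token from the environment; never hard-code it.
        token = environ.get("HF_TOKEN")
        if not token:
            raise RuntimeError("gated model requires HF_TOKEN to be set")
        env["HF_TOKEN"] = token
    return env


def redact(env):
    """Return a copy of env that is safe to log, with the token masked."""
    return {k: ("***" if k == "HF_TOKEN" else v) for k, v in env.items()}
```

In practice the token would usually come from a secret manager or a Kubernetes Secret rather than a plain environment variable on a developer machine; the sketch only shows the separation between configuration and credentials.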
Affected Systems
- Date: not specified
- Change type: capability
- Severity: info