Google Cloud TPUs now supported on Hugging Face Inference Endpoints and Spaces (TPU v5e)
AI Impact Summary
Hugging Face now supports Google Cloud TPU v5e on Inference Endpoints and Spaces, enabling TPU-accelerated deployment of LLMs via Optimum TPU and Text Generation Inference. The offering provides three pod configurations (v5litepod-1/4/8) in the us-west1 region, with explicit pricing and memory guidance to help teams stay within the memory budget of each configuration when serving larger models. Deployable models include Gemma, Llama, and Mistral, giving teams a concrete path to lower latency and better cost efficiency for large-model inference on Hugging Face platforms.
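The memory guidance boils down to matching model size to pod HBM. A minimal sketch of that sizing check, assuming 16 GB HBM per v5e chip (so roughly 16/64/128 GB across v5litepod-1/4/8) and bf16 weights at 2 bytes per parameter; the headroom fraction is an illustrative assumption, since real serving also needs memory for the KV cache and activations:

```python
# Rough memory-fit check for TPU v5e pod configurations.
# Assumption: 16 GB HBM per v5e chip; bf16 weights (2 bytes/param).
POD_HBM_GB = {"v5litepod-1": 16, "v5litepod-4": 64, "v5litepod-8": 128}

def weights_gb(params_billion, bytes_per_param=2):
    """Approximate weight memory in GB (1B params in bf16 ~ 2 GB)."""
    return params_billion * bytes_per_param

def smallest_fitting_pod(params_billion, headroom=0.8):
    """Return the smallest pod whose HBM holds the weights while
    keeping a fraction of memory free for KV cache and activations."""
    need = weights_gb(params_billion)
    for pod, hbm in POD_HBM_GB.items():  # ordered smallest to largest
        if need <= hbm * headroom:
            return pod
    return None  # model too large for a single listed pod slice

for size in (2, 7, 70):
    print(f"{size}B params -> {smallest_fitting_pod(size)}")
```

Under these assumptions a 2B model (e.g. Gemma 2B) fits on v5litepod-1, a 7B model (Llama or Mistral class) needs v5litepod-4, and a 70B model exceeds all three listed configurations.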
Affected Systems
- Date: not specified
- Change type: capability
- Severity: info