Google releases PaliGemma 2 – new vision language models

AI Impact Summary

Google has released PaliGemma 2, a new family of vision language models built on the Gemma 2 architecture. The release comes in three model sizes (3B, 10B, and 28B parameters) and supports three input resolutions (224x224, 448x448, and 896x896 pixels). The models use the SigLIP image encoder and are designed for easy fine-tuning on downstream tasks. They build on the original PaliGemma model, which was widely adopted for its versatility.
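The size and resolution options above can be read as a three-by-three grid of possible variants. The sketch below simply enumerates that grid. The `checkpoint_name` helper and the label pattern it produces are hypothetical, chosen for illustration only; the summary does not state how the released checkpoints are named or whether every combination is published.

```python
from itertools import product

# Values taken from the summary above.
SIZES = ("3b", "10b", "28b")        # parameter counts
RESOLUTIONS = (224, 448, 896)       # square input resolution in pixels


def checkpoint_name(size: str, resolution: int) -> str:
    """Build an illustrative label for one size/resolution pair.

    The naming pattern is an assumption, not a confirmed identifier.
    """
    return f"paligemma2-{size}-{resolution}"


# Every size paired with every resolution, in a stable order.
variants = [checkpoint_name(s, r) for s, r in product(SIZES, RESOLUTIONS)]

for name in variants:
    print(name)
```

Listing the combinations this way makes the trade-off explicit: a larger size or higher resolution generally costs more memory and compute, so a fine-tuning project would typically start from the smallest variant that meets its accuracy needs.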

Affected Systems

Gemma 2, SigLIP

Date: not specified
Change type: capability
Severity: info
