Gemma4 v5.5.2 patch: fix use_cache inference, remove shared weights, update VLMS mappings
AI Impact Summary
The Gemma4 v5.5.2 patch fixes an inference issue that occurred with use_cache=False, caused by kv-state sharing across layers. It removes all shared weights to prevent cross-layer leakage, silently skipping those entries during checkpoint loading, and corrects the VLMS conversion mappings so that weight names serialize consistently. The underlying PRs also indicate MoE support in the Gemma4 TP plan, signaling architectural expansion. Overall, deployments using Gemma4 and VLMS should see more robust loading and inference with this patch.
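The shared-weight removal described above can be sketched as a filtering pass over the checkpoint: entries for previously shared weights are dropped (silently skipped) before the remaining tensors are loaded, so no layer reuses another layer's parameters. This is a minimal illustration; the function and key names below are hypothetical, not the actual Gemma4 loading code.

```python
def filter_shared_weights(state_dict, shared_keys):
    """Split a checkpoint into loadable entries and silently skipped
    shared-weight entries. `shared_keys` names the weights that were
    previously shared across layers (hypothetical key names)."""
    loadable = {k: v for k, v in state_dict.items() if k not in shared_keys}
    skipped = sorted(k for k in state_dict if k in shared_keys)
    return loadable, skipped

# Illustrative checkpoint: layer 1's k_proj was formerly shared with layer 0.
checkpoint = {
    "layers.0.attn.k_proj.weight": [[1.0]],
    "layers.1.attn.k_proj.weight": [[1.0]],
}
shared = {"layers.1.attn.k_proj.weight"}  # hypothetical shared-weight key
loadable, skipped = filter_shared_weights(checkpoint, shared)
# skipped -> ["layers.1.attn.k_proj.weight"]
```

After filtering, the remaining entries would be loaded non-strictly so the dropped keys raise no error, matching the "silently skips" behavior the summary describes.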
Affected Systems
- Date: not specified
- Change type: capability
- Severity: info
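The VLMS conversion-mapping fix mentioned in the summary can be sketched as an explicit key-renaming table applied when converting a checkpoint, so that weight names serialize consistently across versions. The patterns below are illustrative assumptions, not the actual VLMS mappings.

```python
import re

# Hypothetical rename rules: (source-name pattern, canonical replacement).
CONVERSION_MAP = [
    (r"^vision_tower\.", "vision_model."),
    (r"\.attn\.qkv\.", ".self_attn.qkv_proj."),
]

def convert_key(key):
    """Apply each rename rule in order to produce the canonical key."""
    for pattern, replacement in CONVERSION_MAP:
        key = re.sub(pattern, replacement, key)
    return key

def convert_state_dict(state_dict):
    """Rename every checkpoint key so serialized names stay consistent."""
    return {convert_key(k): v for k, v in state_dict.items()}

# convert_key("vision_tower.attn.qkv.weight")
# -> "vision_model.self_attn.qkv_proj.weight"
```

Keeping the mapping as an ordered, explicit table makes such fixes auditable: a mapping bug is corrected by editing one rule rather than scattered string manipulation.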