Gemma4 on MLX (v0.20.8-rc0) — MoE + SWA performance optimizations
AI Impact Summary
The v0.20.8-rc0 release brings Gemma 4 to the MLX platform, combining a Mixture-of-Experts (MoE) architecture with sliding-window attention (SWA) prefill for improved performance. Key optimizations include memoizing the sliding-window prefill mask and applying softmax only to the selected experts in the Router.Forward pass, a targeted effort to improve Gemma 4's efficiency on the text-only runtime.
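The two optimizations can be sketched as follows. This is an illustrative NumPy sketch, not the release's actual code: the function names `sliding_window_mask` and `route` are hypothetical, and the real implementation lives in the runtime's Router.Forward pass. The ideas are (1) cache the sliding-window prefill mask so it is built once per (sequence length, window) pair, and (2) run softmax over only the top-k selected expert logits rather than the full expert dimension.

```python
import functools
import numpy as np

@functools.lru_cache(maxsize=None)
def sliding_window_mask(seq_len: int, window: int):
    # Memoized causal sliding-window mask: token i may attend to
    # tokens j with i - window < j <= i. Repeated prefill calls with
    # the same shape reuse the cached mask instead of rebuilding it.
    i = np.arange(seq_len)[:, None]
    j = np.arange(seq_len)[None, :]
    return (j <= i) & (j > i - window)

def route(logits: np.ndarray, k: int):
    # Pick the top-k experts per token, then normalize with softmax
    # over ONLY those k logits (not all experts), saving work when
    # the expert count is much larger than k.
    topk_idx = np.argpartition(logits, -k, axis=-1)[..., -k:]
    topk_logits = np.take_along_axis(logits, topk_idx, axis=-1)
    z = topk_logits - topk_logits.max(axis=-1, keepdims=True)
    weights = np.exp(z) / np.exp(z).sum(axis=-1, keepdims=True)
    return topk_idx, weights
```

For example, with four experts and k=2, `route` returns the two winning expert indices per token and mixing weights that sum to 1 over just those two experts.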
Affected Systems
- Date: not specified
- Change type: capability
- Severity: info