SmolLM3: New 3B Multilingual Long-Context Reasoner Released
Action Required
Organizations evaluating language models can consider SmolLM3 as a compact, efficient, multilingual option for a wide range of applications. Its small size may reduce infrastructure costs and development time.
AI Impact Summary
SmolLM3 is a new 3B-parameter multilingual model that offers competitive performance against larger 4B models such as Qwen3 and Gemma3. Its key differentiators are efficiency, support for 6 languages, and a long context window of 128k tokens, achieved through techniques such as grouped-query attention (GQA), NoPE, and YaRN. The release includes a detailed engineering blueprint covering the model's training data mixture, architecture, and training configuration, which gives developers and researchers a concrete reference to build on.
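To make the efficiency claim more concrete, the sketch below estimates how grouped-query attention (GQA) shrinks the key/value cache at a 128k-token context. It is a rough back-of-the-envelope calculation, not the model's actual accounting: the layer count, head counts, head dimension, and 16-bit cache precision are hypothetical values chosen for illustration, and `kv_cache_bytes` is a helper name introduced here.

```python
def kv_cache_bytes(num_layers, num_kv_heads, head_dim, seq_len, bytes_per_value=2):
    """Approximate KV-cache size for one sequence.

    The leading factor of 2 accounts for storing both keys and values
    in every layer; bytes_per_value=2 assumes a 16-bit cache.
    """
    return 2 * num_layers * num_kv_heads * head_dim * seq_len * bytes_per_value


# Hypothetical configuration, for illustration only
# (these are NOT taken from SmolLM3's published configuration).
NUM_LAYERS = 32
NUM_QUERY_HEADS = 16
NUM_KV_HEADS = 4        # GQA: groups of query heads share one key/value head
HEAD_DIM = 128
SEQ_LEN = 128 * 1024    # the 128k-token context window mentioned above

# Standard multi-head attention keeps one KV head per query head.
mha = kv_cache_bytes(NUM_LAYERS, NUM_QUERY_HEADS, HEAD_DIM, SEQ_LEN)
# GQA keeps only NUM_KV_HEADS key/value heads.
gqa = kv_cache_bytes(NUM_LAYERS, NUM_KV_HEADS, HEAD_DIM, SEQ_LEN)

print(f"MHA cache: {mha / 2**30:.1f} GiB")
print(f"GQA cache: {gqa / 2**30:.1f} GiB")
print(f"Reduction: {mha // gqa}x")
```

Under these assumed values the cache shrinks in proportion to the ratio of query heads to key/value heads, which is why GQA matters most at long context lengths, where the cache dominates memory use.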
Affected Systems
- Date: not specified
- Change type: capability
- Severity: medium