Falcon Mamba: 7B Attention-Free Model Released by TII
AI Impact Summary
Technology Innovation Institute (TII) has released Falcon Mamba, a 7B-parameter language model built on a state space model (SSM) architecture with no attention layers. The model is competitive with similarly sized transformer models such as Llama and Mistral, and is particularly strong on long sequences because its compute scales linearly with sequence length. Falcon Mamba's key advantage is that it can process arbitrarily long sequences with a fixed-size recurrent state, so per-token memory and compute stay constant during generation instead of growing the way an attention key-value cache does; additional RMS normalization layers in each block keep training stable at this scale. Together, these design choices avoid the quadratic cost of attention while maintaining competitive quality. A sketch of the constant-state recurrence follows.
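To make the efficiency claim concrete, here is a minimal NumPy sketch of a linear state space recurrence. This is not Falcon Mamba's actual selective-scan parameterization; the matrices, sizes, and function names are illustrative assumptions. It shows the property the summary describes: a fixed-size hidden state is carried between tokens, so memory per token is constant and total compute is linear in sequence length, unlike an attention cache that grows with every token.

```python
import numpy as np

# Hypothetical sizes for illustration only; Falcon Mamba's real
# dimensions and parameterization differ.
d_state, d_model = 16, 8
rng = np.random.default_rng(0)
A = rng.standard_normal((d_state, d_state)) * 0.1  # state transition (scaled for stability)
B = rng.standard_normal((d_state, d_model))        # input projection
C = rng.standard_normal((d_model, d_state))        # output projection

def ssm_generate(tokens):
    """Process a sequence one token at a time with O(1) state memory."""
    h = np.zeros(d_state)            # fixed-size recurrent state
    outputs = []
    for x in tokens:                 # one pass: linear in sequence length
        h = A @ h + B @ x            # state update; h never grows
        outputs.append(C @ h)        # per-token output from current state
    return np.stack(outputs)

seq = rng.standard_normal((100_000, d_model))  # arbitrarily long input
y = ssm_generate(seq)  # memory for h is constant regardless of length
```

The contrast with attention is the loop body: each step touches only the fixed-size state `h`, whereas a transformer decoding the same sequence would consult a key-value cache holding all 100,000 previous tokens.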
Affected Systems
- Date: Not specified
- Change type: Capability
- Severity: Info