RWKV: RNN-Transformer hybrid now supported in Hugging Face Transformers
AI Impact Summary
RWKV is an attention-free architecture that combines RNN-style inference with transformer-style parallel training, and it is now integrated into Hugging Face Transformers, enabling deployment through familiar APIs and the Hugging Face Hub. It supports long context lengths (up to 8192 tokens, ctx8192), scales from 170M to 14B parameters, and offers a chat-optimized RWKV-4 Raven variant fine-tuned on Alpaca, CodeAlpaca, Guanaco, GPT4All, and ShareGPT, among others, with performance aided by training tricks such as TokenShift and SmallInitEmb. Technical teams should evaluate RWKV for long-context chat applications, plan to load models through the Transformers integration (potentially from source or the main branch until a release includes it), and account for tokenization and quantization considerations as well as existing fine-tuning pipelines.
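A minimal sketch of loading an RWKV checkpoint through the standard Transformers auto classes, assuming the installed version of Transformers already ships the RWKV integration (otherwise install from the main branch, as noted above). The checkpoint id and generation settings below are illustrative assumptions, not taken from the summary.

```python
# Minimal sketch: load an RWKV model via the standard Transformers API.
# Assumptions: transformers is new enough to include RWKV support, and the
# Hub id "RWKV/rwkv-4-169m-pile" is a valid checkpoint (swap in a Raven chat
# checkpoint for chat use cases).
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "RWKV/rwkv-4-169m-pile"  # assumed Hub id for illustration

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Standard causal-LM generation; RWKV's recurrent state is managed internally,
# which is what allows the long-context behavior described above.
inputs = tokenizer("RWKV is an attention-free architecture that", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

For the Raven chat variants, prompts typically need instruction-style formatting; check the model card of the specific checkpoint for the expected template before wiring it into a chat pipeline.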
Affected Systems
- Date: not specified
- Change type: capability
- Severity: info