rinna releases Japanese Stable Diffusion with Japanese prompts and 2-stage training

AI Impact Summary

rinna has released Japanese Stable Diffusion, a Japanese-specific fine-tune of Stable Diffusion that accepts native Japanese prompts and generates culturally aligned imagery. The model is trained in two stages: the English text encoder is first replaced with a Japanese text encoder, and the text encoder and latent diffusion model are then fine-tuned jointly. A Japanese tokenizer is used to avoid CLIP tokenization issues with Japanese text. Training used roughly 100M images with Japanese captions, including the Japanese subset of LAION-5B, filtered with japanese-cloob-vit-b-16. The model is hosted on Hugging Face and GitHub and is used through the Diffusers ecosystem. The release broadens the multilingual capabilities of image generation tooling and reduces the need to translate prompts or post-edit outputs to obtain authentic Japanese visuals.
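
The two-stage approach can be pictured as a schedule of which components receive weight updates in each stage. The following is a minimal, dependency-free sketch and not rinna's training code: the component names, the stage names, the scalar "weights", and the learning rate are all hypothetical, and it assumes the diffusion model is kept frozen while the Japanese text encoder is trained in the first stage.

```python
from dataclasses import dataclass

# Component names are illustrative, not taken from rinna's code.
COMPONENTS = ("text_encoder", "latent_diffusion_model")


@dataclass(frozen=True)
class Stage:
    name: str
    trainable: frozenset  # components whose weights are updated in this stage


# Assumption: stage 1 trains only the new Japanese text encoder while the
# diffusion model stays frozen; stage 2 updates both components jointly.
STAGES = (
    Stage("japanese_text_encoder", frozenset({"text_encoder"})),
    Stage("joint_fine_tuning", frozenset(COMPONENTS)),
)


def apply_step(weights, grads, stage, lr=0.1):
    """Return new weights after one gradient step, leaving frozen components unchanged."""
    return {
        name: (w - lr * grads[name]) if name in stage.trainable else w
        for name, w in weights.items()
    }


def run_schedule(weights, grads):
    """Apply one toy update per stage and record the weights after each stage."""
    history = []
    for stage in STAGES:
        weights = apply_step(weights, grads, stage)
        history.append((stage.name, dict(weights)))
    return history


if __name__ == "__main__":
    start = {name: 1.0 for name in COMPONENTS}
    grads = {name: 1.0 for name in COMPONENTS}
    for stage_name, snapshot in run_schedule(start, grads):
        print(stage_name, snapshot)
```

In a real framework the same idea is usually expressed by disabling gradients on the frozen module during the first stage and re-enabling them for the joint stage.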

Affected Systems

Stable Diffusion
Japanese Stable Diffusion (rinna)

Date: not specified
Change type: capability
Severity: info
