Nvidia Nemotron OCR v2: Fast Multilingual OCR with Synthetic Data

Action Required

Teams that extract text from multilingual documents should evaluate Nemotron OCR v2. Nvidia reports markedly better accuracy and speed than v1, achieved by training on synthetic data rather than on expensive manually annotated data.

AI Impact Summary

Nvidia has released Nemotron OCR v2, a multilingual OCR model trained with a synthetic data pipeline. The pipeline generated 12 million synthetic training images across six languages, which substantially improves accuracy over v1; the earlier model struggled with non-English text because of its limited character set and scarce training data. A shared detection backbone also raises throughput to 34.7 pages per second on an A100 GPU. The pipeline is generic enough to extend to any language for which fonts and source text are available.
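To put the throughput figure in practical terms, the sketch below converts the reported 34.7 pages per second into an estimated processing time for a batch of pages. It assumes a single A100 GPU running at a constant rate with no I/O or preprocessing overhead, which real workloads will not match exactly. The function names `estimated_seconds` and `format_duration` are illustrative and not part of any Nvidia API.

```python
# Reported throughput for Nemotron OCR v2 on one A100 GPU (from the summary).
PAGES_PER_SECOND = 34.7


def estimated_seconds(num_pages: int, pages_per_second: float = PAGES_PER_SECOND) -> float:
    """Estimate wall-clock seconds to OCR num_pages.

    Assumption: throughput is constant and nothing else is a bottleneck.
    """
    if num_pages < 0:
        raise ValueError("num_pages must be non-negative")
    if pages_per_second <= 0:
        raise ValueError("pages_per_second must be positive")
    return num_pages / pages_per_second


def format_duration(seconds: float) -> str:
    """Format a duration in seconds as 'Hh MMm SSs', rounded to the nearest second."""
    total = round(seconds)
    hours, remainder = divmod(total, 3600)
    minutes, secs = divmod(remainder, 60)
    return f"{hours}h {minutes:02d}m {secs:02d}s"


if __name__ == "__main__":
    # One million pages at the reported rate.
    print(format_duration(estimated_seconds(1_000_000)))  # prints "8h 00m 18s"
```

Under these assumptions, one million pages would take roughly eight hours on a single GPU; treat the result as a best-case estimate rather than a capacity plan.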

Affected Systems

GPT-4o-mini

Date: 17 Apr 2026
Change type: capability
Severity: high
