NVIDIA Llama Nemotron Nano VL 8B VLM released on Hugging Face Hub for intelligent document processing
AI Impact Summary
NVIDIA has released Llama Nemotron Nano VL on Hugging Face Hub, an 8B Vision-Language Model tailored for intelligent document processing. It combines Llama-3.1-8B-Instruct with the C-RADIOv2-VLM-H ViT backbone to deliver high-accuracy text recognition, table extraction, and grounding with bounding boxes across diverse documents. It can be post-trained with NVIDIA NeMo Retriever Parse, and it demonstrates strong performance on OCRBench v2, which makes it a candidate for finance, legal, and government workflows. Teams should evaluate it on their own domain-specific documents and plan for GPU-backed inference.
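One way to act on the advice to evaluate on domain-specific documents is to score the model's extracted text against hand-checked transcriptions. The sketch below uses character error rate (edit distance divided by reference length), which is a metric chosen here for illustration and is not necessarily what OCRBench v2 or NVIDIA reports. The function names and the sample strings are hypothetical, and the step that actually runs the model on a document is left out.

```python
from typing import Iterable, Tuple


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance, computed row by row to keep memory small."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def character_error_rate(predicted: str, reference: str) -> float:
    """Edit distance normalized by the reference length (lower is better)."""
    if not reference:
        return 0.0 if not predicted else 1.0
    return edit_distance(predicted, reference) / len(reference)


def mean_cer(pairs: Iterable[Tuple[str, str]]) -> float:
    """Average CER over (predicted, reference) pairs from an evaluation set."""
    pairs = list(pairs)
    if not pairs:
        raise ValueError("need at least one (predicted, reference) pair")
    return sum(character_error_rate(p, r) for p, r in pairs) / len(pairs)


if __name__ == "__main__":
    # Hypothetical model outputs paired with hand-checked transcriptions.
    samples = [
        ("Invoice No. 10482", "Invoice No. 10482"),
        ("Total due: $1,2S0.00", "Total due: $1,250.00"),
    ]
    print(f"mean CER: {mean_cer(samples):.4f}")
```

A per-document breakdown (for example, invoices versus contracts) is usually more informative than a single average, since errors tend to cluster in particular layouts.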
Affected Systems
- Date: not specified
- Change type: capability
- Severity: info