Bringing VLA models to NXP i.MX 95: asynchronous inference, per-block optimization, and dataset best practices
AI Impact Summary
This report outlines how to bring Vision-Language-Action (VLA) capabilities to constrained embedded robotics. It argues that real-time control is achievable through asynchronous inference and a latency-aware, multi-block deployment (vision encoder, LLM backbone, action expert) on the NXP i.MX 95, with per-block optimization and quantization to balance latency against accuracy. It also provides concrete data-collection playbooks (rigid mounts, fixed lighting, a gripper-mounted camera) and fine-tuning guidance for ACT and SmolVLA, charting a practical path for deploying VLA models on edge hardware with eIQ Neutron NPU support.
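To make the asynchronous pattern concrete, here is a minimal sketch that decouples a fast control loop from the slow three-block model: an inference thread refills a queue of action chunks while the control loop drains it at a fixed rate. All names (run_vision, run_backbone, run_action_expert, get_observation, send_command) are illustrative placeholders, not APIs from the report.

```python
import queue
import threading
import time

# Placeholder block functions; on-device these would call the three
# separately compiled blocks (e.g., via the eIQ / NPU runtimes).
def run_vision(image):
    return image  # vision-encoder features

def run_backbone(features, task):
    return (features, task)  # fused vision-language context

def run_action_expert(context, state):
    return [[0.0] * 6 for _ in range(8)]  # chunk of 8 six-DoF actions

action_queue = queue.Queue(maxsize=16)

def inference_worker(get_observation):
    """Slow path: runs the full three-block stack, refilling the queue."""
    while True:
        obs = get_observation()
        features = run_vision(obs["image"])
        context = run_backbone(features, obs["task"])
        for action in run_action_expert(context, obs["state"]):
            try:
                action_queue.put(action, timeout=0.05)
            except queue.Full:
                break  # a fresher chunk is coming; drop stale actions

def control_loop(send_command, rate_hz=30.0):
    """Fast path: emits commands at a fixed rate regardless of model latency."""
    last = None
    while True:
        try:
            last = action_queue.get_nowait()
        except queue.Empty:
            pass  # no new action yet; hold the previous one
        if last is not None:
            send_command(last)
        time.sleep(1.0 / rate_hz)

# Usage: threading.Thread(target=inference_worker, args=(read_sensors,),
#                         daemon=True).start(); control_loop(robot.send)
```

The point of the split is that control-loop jitter is bounded by the queue, not by end-to-end model latency.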
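Per-block quantization can be sketched similarly with ONNX Runtime's dynamic quantizer, assuming each block is exported as a separate ONNX file. The file names and the choice to leave the action expert unquantized are illustrative assumptions, not the report's measured configuration.

```python
from onnxruntime.quantization import QuantType, quantize_dynamic

# Illustrative file names: one ONNX export per deployed block.
QUANTIZED_BLOCKS = ["vision_encoder.onnx", "llm_backbone.onnx"]
# The action expert is deliberately left at full precision here: it is
# the smallest block, so skipping it costs little latency while
# protecting action-output accuracy.

for path in QUANTIZED_BLOCKS:
    quantize_dynamic(
        model_input=path,
        model_output=path.replace(".onnx", ".int8.onnx"),
        weight_type=QuantType.QInt8,  # int8 weights for the heavy blocks
    )
```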
Affected Systems
- Date: not specified
- Change type: capability
- Severity: info