LeRobotDataset v3.0: Multi-episode format and streaming for lerobot
AI Impact Summary
LeRobotDataset v3.0 changes the storage model from one file per episode to files that bundle multiple episodes, with relational metadata that maps each episode to its location inside those files. This keeps indexing scalable and preserves episode-level access even when many episodes are merged into a single large file. The new StreamingLeRobotDataset processes data on the fly directly from the Hugging Face Hub, which reduces the need to download entire datasets and can speed up training pipelines for large robotics datasets. Existing datasets can be migrated with a one-line converter. The format is integrated with lerobot v0.4.0+, which is installed via a PyPI URL. The new layout (meta/info.json, meta/stats.json, meta/episodes/, data/, videos/) is designed to support episode-level querying across the consolidated files.
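The idea of relational metadata over bundled files can be illustrated with a minimal, standard-library-only sketch. It does not use the lerobot API: the `EpisodeRecord` fields, the `data/file-000.parquet` naming, and the helper functions are hypothetical names chosen for illustration, and in-memory lists stand in for real data files. It only shows how a per-episode record of file path and row range is enough to recover one episode from a file that holds several.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EpisodeRecord:
    """One metadata row: where a single episode lives (hypothetical schema)."""
    episode_index: int
    file_path: str
    start_row: int  # inclusive
    end_row: int    # exclusive


def build_episode_index(episode_lengths, episodes_per_file):
    """Pack episodes into bundled files and record each episode's row range."""
    records = []
    offset = 0
    for ep, length in enumerate(episode_lengths):
        if ep % episodes_per_file == 0:
            offset = 0  # a new bundled file starts, so row numbering restarts
        # Hypothetical file naming, for illustration only.
        path = f"data/file-{ep // episodes_per_file:03d}.parquet"
        records.append(EpisodeRecord(ep, path, offset, offset + length))
        offset += length
    return records


def load_episode(files, record):
    """Slice one episode's frames out of the bundled file that contains it."""
    return files[record.file_path][record.start_row:record.end_row]


# Three episodes of 3, 2, and 4 frames, bundled two episodes per file.
episode_lengths = [3, 2, 4]
records = build_episode_index(episode_lengths, episodes_per_file=2)

# In-memory stand-in for the bundled files: each frame is (episode, timestep).
files = {}
for rec, length in zip(records, episode_lengths):
    frames = [(rec.episode_index, t) for t in range(length)]
    files.setdefault(rec.file_path, []).extend(frames)

episode_1 = load_episode(files, records[1])
print(episode_1)  # [(1, 0), (1, 1)]
```

Because each lookup touches only one metadata record and one slice of one file, the cost of reaching an episode in this sketch does not grow with the number of episodes merged into the same file.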
Affected Systems
- Date: not specified
- Change type: capability
- Severity: info