Weaviate now supports the Sphere Dataset
AI Impact Summary
Weaviate has made Sphere, Meta's open-source collection of 134 million documents, available for import, offering a large-scale knowledge base for applications such as question answering and fact-checking. Two import methods are provided: the Python client, which offers a simpler, 75-line solution, and the Spark connector, which suits large-scale data processing of the kind used when training LLMs. The release simplifies access to a powerful dataset, but the dataset's 899 million lines present a significant technical challenge for local development and deployment.
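As a rough illustration of the Python import path, the sketch below streams JSON Lines records and groups them into fixed-size batches, so that a file of this size never has to fit in memory. It is a minimal sketch under stated assumptions, not the import script from the release: the record layout (a "vector" field alongside the other properties), the helper names, and the send_batch callback are all hypothetical, and the callback stands in for whatever batch call the Weaviate client provides.

```python
import json
from itertools import islice
from typing import Callable, Iterable, Iterator, List, Tuple

# Assumed record shape (hypothetical): one JSON object per line, with a
# "vector" field holding the embedding and the remaining keys as properties.
Record = Tuple[dict, list]


def parse_records(lines: Iterable[str]) -> Iterator[Record]:
    """Yield (properties, vector) pairs, skipping blank lines."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        vector = record.pop("vector")  # assumption: every record carries a vector
        yield record, vector


def batched(items: Iterable[Record], size: int) -> Iterator[List[Record]]:
    """Group an iterable into lists of at most `size` items, lazily."""
    iterator = iter(items)
    while True:
        chunk = list(islice(iterator, size))
        if not chunk:
            return
        yield chunk


def import_lines(
    lines: Iterable[str],
    send_batch: Callable[[List[Record]], None],
    batch_size: int = 100,
) -> int:
    """Stream records to `send_batch` in batches; return the number sent."""
    total = 0
    for chunk in batched(parse_records(lines), batch_size):
        send_batch(chunk)  # placeholder for the real client's batch upload
        total += len(chunk)
    return total
```

Because a file handle is itself an iterable of lines, calling import_lines(open(path), send_batch) would process the file one batch at a time; the batch size of 100 is an arbitrary default rather than a recommended setting.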
Affected Systems
- Date: not specified
- Change type: capability
- Severity: info