Text2SQL with DuckDB-NSQL-7B via Hugging Face Dataset Viewer and MotherDuck
AI Impact Summary
The proposal demonstrates deploying DuckDB-NSQL-7B, fine-tuned for DuckDB SQL, to translate natural language prompts into valid DuckDB queries. It wires Hugging Face's Dataset Viewer API to expose dataset schemas (parquet-backed) and uses a local LLM inference path (llama.cpp with GGUF) to generate SQL, enabling non-developers to interrogate datasets via natural language. This lowers the barrier to data exploration and accelerates prototyping in BI and analytics workflows, especially for datasets like world-cities-geo. Operators should plan for model hosting, latency, and validation of generated SQL to avoid incorrect queries or data leakage.
Affected Systems
- Date
- Date not specified
- Change type
- capability
- Severity
- info