Open Leaderboard for Hebrew LLMs launched: evaluation via Hugging Face Inference Endpoints and lighteval
AI Impact Summary
Open Leaderboard for Hebrew LLMs introduces an open benchmarking platform tailored to Hebrew NLP. The evaluation covers four tasks (Hebrew Question Answering from the HeQ test subset, Hebrew sentiment analysis, Winograd Schema pronoun resolution, and English-to-Hebrew translation), each scored with few-shot prompts. Models are deployed via Hugging Face Inference Endpoints and evaluated with the lighteval library. This setup accounts for Hebrew's morphological complexity and tokenization quirks, enabling apples-to-apples comparisons of models and informing deployment readiness and fine-tuning priorities.
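To illustrate the few-shot prompting style described above, here is a minimal sketch of assembling a few-shot prompt for a Hebrew sentiment task. The example texts, labels, and template are hypothetical placeholders, not the leaderboard's actual task configuration or few-shot examples:

```python
# Hypothetical few-shot examples for a Hebrew sentiment task.
# These are illustrative placeholders, not the leaderboard's data.
FEW_SHOT_EXAMPLES = [
    {"text": "אהבתי את הסרט הזה", "label": "positive"},   # "I loved this movie"
    {"text": "השירות היה איטי ומאכזב", "label": "negative"},  # "The service was slow and disappointing"
]


def build_prompt(examples: list[dict], query: str) -> str:
    """Join few-shot demonstrations and the query into a single prompt.

    Each demonstration shows the model the input/output format; the final
    block ends with an empty label slot for the model to complete.
    """
    parts = [f"Text: {ex['text']}\nSentiment: {ex['label']}" for ex in examples]
    parts.append(f"Text: {query}\nSentiment:")
    return "\n\n".join(parts)


prompt = build_prompt(FEW_SHOT_EXAMPLES, "המוצר הגיע בזמן ועבד מצוין")
```

In a setup like the one described, a harness such as lighteval would send prompts of this shape to the deployed endpoint and score the completion against the gold label.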
Affected Systems
- Date: not specified
- Change type: capability
- Severity: info