Research · arXiv cs.CL · 7 d ago

ChiKhaPo: A Large-Scale Multilingual Benchmark for Evaluating Lexical Comprehension and Generation in Large Language Models

ChiKhaPo is a multilingual benchmark for assessing lexical comprehension and generation in large language models (LLMs). It covers more than 2,700 languages across 8 subtasks of varying difficulty. Six state-of-the-art models show significant performance limitations on the benchmark, with challenges linked to language family, resource availability, and task complexity. The benchmark is intended to support comprehensive evaluation and improvement of LLMs' lexical capabilities in low-resource languages.

llm · benchmark · lexical-comprehension
