ai-digest.dev
last updated 13 h ago
Research · arXiv cs.AI · 7 d ago

AfriSUD: A Dependency Treebank Collection for Evaluating Models on African Languages

AfriSUD is introduced as the first large-scale collection of syntactically annotated treebanks for nine African languages, built on the Surface-Syntactic Universal Dependencies (SUD) framework. The resource, verified by native speakers, highlights key syntactic features such as agglutination and tone. It is used to evaluate several classes of models (non-transformer baselines, multilingual pretrained encoders, and LLMs) on part-of-speech tagging and dependency parsing. The findings indicate a significant syntax gap: current models struggle to represent the structural diversity of African languages, a limitation that matters to practitioners building NLP systems for these languages.
