Research
IndicContextEval: A Benchmark for Evaluating Context Utilisation in Audio Large Language Models Across 8 Indic Languages
IndicContextEval is a benchmark for assessing how well audio large language models (AudioLLMs) use context across eight Indic languages. It comprises 56 hours of natural multilingual speech from 555 speakers and applies a 7-level prompting framework to systematically evaluate how well models leverage contextual information. An evaluation of five models shows substantial variation in their ability to use context, which underscores the value of explicit context evaluation for practitioners working with AudioLLMs.
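One way a practitioner might quantify context use from a levelled evaluation like this is to compare each prompt level against a no-context baseline. The sketch below is only illustrative and rests on assumptions not stated in the summary: that level 0 means "no context", that the metric is an error rate where lower is better, and that the function name `context_gain` and the sample numbers are hypothetical placeholders rather than anything from the benchmark.

```python
from typing import Dict


def context_gain(scores_by_level: Dict[int, float]) -> Dict[int, float]:
    """Relative error reduction of each prompt level versus level 0.

    Assumes level 0 is the no-context baseline and that scores are
    error rates (lower is better). Both are assumptions for this sketch.
    """
    baseline = scores_by_level[0]
    if baseline <= 0:
        raise ValueError("baseline error must be positive")
    return {
        level: (baseline - score) / baseline
        for level, score in sorted(scores_by_level.items())
        if level != 0
    }


# Placeholder numbers for illustration only; NOT results from the benchmark.
example_scores = {0: 0.40, 1: 0.38, 2: 0.30}
gains = context_gain(example_scores)
```

A positive value means the added context lowered the error relative to the baseline; a value near zero suggests the model is largely ignoring the extra context at that level.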
benchmark, context, audio