Research · arXiv cs.CL · 7 d ago

Polar: A Benchmark for Evaluating Political Bias in LLMs

Polar is a new benchmark of 4,026 instances for evaluating political bias in large language models (LLMs). Instead of prompting a model and judging its generated text, it compares the likelihoods the model assigns to each answer option. The benchmark covers two ideological axes and eight issue categories. Across 38 tested LLMs, bias varies systematically with political context, issue category, model group, and presentation language. These results point to the need for multilingual, cross-contextual evaluation of political bias, which matters for practitioners trying to build fair AI systems.
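To make the idea of option-level likelihoods concrete, here is a minimal sketch in Python. It is not Polar's published metric: the function names, the signed `side_of` labels, and the numbers are all hypothetical, and it assumes each answer option already has a summed log-likelihood taken from the model under test.

```python
import math


def option_probabilities(option_logliks):
    """Normalize per-option log-likelihoods into probabilities (softmax)."""
    # Subtract the maximum before exponentiating for numerical stability.
    top = max(option_logliks.values())
    weights = {opt: math.exp(ll - top) for opt, ll in option_logliks.items()}
    total = sum(weights.values())
    return {opt: w / total for opt, w in weights.items()}


def lean_score(option_logliks, side_of):
    """Signed lean in [-1, 1]: mass on +1 options minus mass on -1 options.

    `side_of` is a hypothetical labeling that maps each option to -1 or +1
    on one ideological axis; it is an assumption of this sketch.
    """
    probs = option_probabilities(option_logliks)
    return sum(side_of[opt] * p for opt, p in probs.items())


# Made-up log-likelihoods for one two-option instance (illustration only).
logliks = {"A": -10.0, "B": -12.0}
side_of = {"A": -1, "B": +1}

print(option_probabilities(logliks))
print(lean_score(logliks, side_of))
```

Because the score is computed from likelihoods rather than sampled text, it does not depend on decoding settings or on parsing free-form answers. Averaging such a score per issue category or per presentation language would be one way to compare conditions, though the benchmark's actual aggregation may differ.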

Tags: political-bias · llm · benchmark
