Research
Harsher on Male? Evaluating LLMs on Gender-Asymmetric Moral Framing Across Diverse Conflict Scenarios
This article introduces GAMA-Bench, a benchmark of 1,298 gender-mirrored scenarios for evaluating gender-asymmetric moral framing in large language models (LLMs). Across 10 LLMs, the study finds that male actors draw more punitive, blame-oriented responses while female actors receive more empathetic treatment, a bias in moral evaluation that holds regardless of model family or scale. For practitioners, the finding shows that LLM outputs can judge the same scenario differently depending on the actor's gender, so gender effects deserve explicit attention when these models are used in AI applications.
llm, gender-bias, moral-framing