MoReBench: Evaluating Procedural and Pluralistic Moral Reasoning in Language Models, More than Outcomes
MoReBench is a benchmark of 1,000 moral scenarios, with over 23,000 expert-defined criteria, for evaluating procedural and pluralistic moral reasoning in language models. It includes MoReBench-Theory, which tests model reasoning against five normative ethical frameworks. The results indicate that scaling laws and existing math and code benchmarks do not reliably predict moral reasoning capability. The work argues for evaluating how AI systems reach decisions, not only the outcomes they produce, with the aim of improving the transparency and safety of AI moral reasoning.
moral-reasoning, llm, evaluation