Research
Limited Marginal Benefit of Reasoning-Heavy LLM Deployment in ESG Narrative Scoring: A 4-Model Consensus Study on Japanese Listed Firms
This study evaluates whether reasoning-heavy large language models (LLMs) add value when scoring the ESG narratives of ten Japanese listed firms, comparing one reasoning-on model against three reasoning-off models. The results show negligible benefit: the mean absolute deviation is only 0.38 on a 5-point scale, while the reasoning-on model incurs costs approximately 5.6 times higher than those of the ensemble of reasoning-off models. For ESG narrative scoring, practitioners may therefore achieve similar outcomes at substantially lower cost by using reasoning-off models, which underscores the need to weigh model choice carefully in cost-sensitive applications.
esg, llm, scoring, model comparison
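The two headline quantities in the abstract, a mean absolute deviation on the 5-point scale and a cost ratio, can be illustrated with a minimal Python sketch. The abstract does not spell out the exact computation, so this assumes the deviation is taken per firm between the reasoning-on score and the mean of the three reasoning-off scores, and that the cost ratio divides the reasoning-on cost by the summed cost of the ensemble. The function names and all numbers below are hypothetical placeholders, not the study's data.

```python
from statistics import mean


def consensus(scores_by_model):
    """Per-firm mean across models.

    scores_by_model: list of equal-length score lists, one list per model.
    Assumption: the ensemble consensus is a simple unweighted mean.
    """
    return [mean(firm_scores) for firm_scores in zip(*scores_by_model)]


def mean_absolute_deviation(model_scores, reference_scores):
    """Mean absolute gap between one model's scores and reference scores."""
    if len(model_scores) != len(reference_scores):
        raise ValueError("both score lists must cover the same firms")
    return mean(abs(a - b) for a, b in zip(model_scores, reference_scores))


def cost_ratio(reasoning_on_cost, reasoning_off_costs):
    """Cost of the reasoning-on model relative to the whole ensemble."""
    return reasoning_on_cost / sum(reasoning_off_costs)


# Hypothetical placeholder scores for three firms (NOT the study's data).
reasoning_off = [
    [3, 4, 2],  # reasoning-off model A
    [4, 4, 3],  # reasoning-off model B
    [5, 4, 4],  # reasoning-off model C
]
reasoning_on = [4.5, 4, 2.5]

ensemble = consensus(reasoning_off)
print("ensemble consensus:", ensemble)
print("mean absolute deviation:", mean_absolute_deviation(reasoning_on, ensemble))

# Hypothetical costs in arbitrary units (NOT the study's figures).
print("cost ratio:", cost_ratio(12, [1, 1, 2]))
```

Under these assumptions, a small deviation combined with a large cost ratio is the pattern the abstract reports: the reasoning-on model lands close to the ensemble consensus while costing several times more.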