Safety · arXiv cs.CL · 8 d ago

Language Shapes Mental Health Evaluations in Large Language Models

The study examines how prompt language affects mental health evaluations by multilingual large language models (LLMs), specifically GPT-4o and Qwen3-32B, comparing English and Chinese. It finds that Chinese prompts lead to higher stigma scores and more conservative depression severity classifications than English prompts, indicating that language can influence both evaluative orientation and decision-making in mental health assessments. The results underscore the need for practitioners to rigorously test multilingual LLMs for consistent performance and evaluative standards in sensitive applications.

mental health · llm · evaluation
