Safety
Mental-R1: Aligning LLM Reasoning for Mental Health Assessment
The article introduces Mental-R1, a model for mental health assessment trained with a new reinforcement learning framework called Cognitive Relative Policy Optimization (CRPO). CRPO combines stage-dependent uncertainty modeling with a stage-wise entropy regularization mechanism so that the model's reasoning mimics human cognitive processes. Across eight mental health datasets, the article reports a 10.4 percentage point improvement in weighted F1-score over existing baselines. For practitioners, this means more reliable LLM output in critical mental health evaluations, which could lead to better intervention outcomes.
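For readers unfamiliar with the headline metric, here is a minimal sketch of how a weighted F1-score is generally computed: the per-class F1 scores are averaged, each weighted by that class's share of the true labels. The function name `weighted_f1` and the toy labels are illustrative assumptions and are not taken from the article or its datasets.

```python
from collections import Counter


def weighted_f1(y_true, y_pred):
    """Support-weighted mean of per-class F1 scores (illustrative helper)."""
    support = Counter(y_true)  # number of true examples per class
    total = len(y_true)
    score = 0.0
    for label, n in support.items():
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = n - tp
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        score += (n / total) * f1  # weight by class frequency
    return score


# Toy labels, made up for illustration only.
y_true = ["depression", "depression", "depression", "none"]
y_pred = ["depression", "depression", "none", "none"]
print(round(weighted_f1(y_true, y_pred), 4))
```

A percentage point improvement is an absolute difference on this scale: a purely hypothetical baseline of 0.600 would rise to 0.704 with a 10.4 point gain.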
mental-health llm reinforcement-learning