Training
Learning Fair Pareto-Optimal Policies in Multi-Objective Reinforcement Learning
The article formalizes the fair optimization problem for multi-objective reinforcement learning (MORL) in a multi-policy setting, where the goal is to learn a diverse set of Pareto-optimal policies that remain fair across varying user preferences. Its main contributions are three algorithms that integrate the generalized Gini welfare function (GGF) with multi-policy multi-objective Q-Learning (MOQL), support non-stationary policies, and enable stochastic policy learning. In the reported empirical evaluations, these methods outperform state-of-the-art MORL baselines, giving practitioners tools for addressing fairness in dynamic decision-making environments.
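For readers unfamiliar with the GGF, the sketch below shows its standard definition: the objective values are sorted from worst to best and combined with non-increasing weights, so the worst-off objective counts the most. This is a minimal illustration, not the article's algorithms. The names `ggf` and `fair_greedy_action` are hypothetical, and choosing an action by scoring each Q-vector with the GGF is a simplifying assumption made here only to show how the welfare function could rank vector-valued estimates.

```python
def ggf(values, weights):
    """Generalized Gini welfare of a vector of objective values.

    Sorts the values in ascending order (worst objective first) and takes
    the weighted sum with non-increasing weights.
    """
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    if any(b > a for a, b in zip(weights, weights[1:])):
        raise ValueError("weights must be non-increasing")
    ordered = sorted(values)  # ascending: the worst-off objective comes first
    return sum(w * v for w, v in zip(weights, ordered))


def fair_greedy_action(q_vectors, weights):
    """Hypothetical helper: index of the action whose Q-vector has the
    highest GGF score (an assumption for illustration only)."""
    scores = [ggf(q, weights) for q in q_vectors]
    return max(range(len(scores)), key=scores.__getitem__)


if __name__ == "__main__":
    weights = [0.75, 0.25]  # assumed example weights, largest on the worst objective
    print(ggf([3.0, 1.0], weights))  # unbalanced outcome
    print(ggf([2.0, 2.0], weights))  # balanced outcome with the same total
    print(fair_greedy_action([[4.0, 0.0], [2.0, 2.0]], weights))
```

With these assumed weights, the balanced vector scores higher than the unbalanced one even though both have the same total, which is the sense in which the GGF encodes fairness across objectives.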
reinforcement-learning fairness multi-objective