Safety
Gender Bias in LLM Hiring Decisions: Evidence from a Japanese Context and Evaluation of Mitigation Strategies
This study investigates gender bias in large language models (LLMs) used for hiring decisions in Japan, evaluating 60 rirekisho-format resumes across five LLMs (Claude Sonnet 4.6, GPT-4o, DeepSeek-V3, Gemini 2.5 Flash, Llama 3.3 70B) in 43,200 API calls. The results show a significant pro-female bias, consistent with Western studies, and identify candidate names as the primary driver of that bias. Mitigation strategies such as prompt-level gender-neutrality instructions proved ineffective. In addition, an incompatibility between GPT-4o's privacy filter and its content safety filter produced a 42% refusal rate, underscoring the difficulty of implementing name anonymization in LLM-driven recruitment systems.
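As a rough illustration of the name anonymization step mentioned above, the following is a minimal sketch, not the study's actual pipeline. The resume schema (`name`, `furigana`, `summary`), the `anonymize_resume` helper, and the placeholder string are all hypothetical and chosen only for illustration.

```python
# Hypothetical field names; the study's actual resume schema is not given here.
NAME_FIELDS = ("name", "furigana")


def anonymize_resume(resume: dict, placeholder: str = "Candidate A") -> dict:
    """Return a copy of the resume with name fields and in-text name mentions masked."""
    masked = dict(resume)  # shallow copy so the caller's resume is left unchanged
    # Collect the original name strings so free-text mentions can be masked too.
    names = [resume[f] for f in NAME_FIELDS if resume.get(f)]

    # Overwrite the dedicated name fields.
    for field in NAME_FIELDS:
        if field in masked:
            masked[field] = placeholder

    # Replace exact occurrences of the name inside other text fields.
    for key, value in masked.items():
        if key in NAME_FIELDS or not isinstance(value, str):
            continue
        for name in names:
            value = value.replace(name, placeholder)
        masked[key] = value
    return masked


resume = {
    "name": "佐藤 花子",
    "furigana": "さとう はなこ",
    "summary": "佐藤 花子 has five years of sales experience.",
}
print(anonymize_resume(resume))
```

Exact string replacement of this kind only catches verbatim mentions of the name. Partial mentions such as a family name alone, and other gendered cues in the resume, would pass through unchanged.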
gender-bias, llm, hiring, mitigation