Training
Can Post-Training Turn LLMs into Good Medical Coders? An Empirical Study of Generative ICD Coding
This study investigates how effective post-training techniques are for generative large language models (LLMs) on automated International Classification of Diseases (ICD) coding, a critical task in medical billing and clinical support. It compares several methods, including supervised fine-tuning (SFT) and reinforcement learning (RL), and finds that SFT significantly enhances performance, while a newly introduced diagnostic curriculum (PHI) further improves code prediction. The findings indicate that adaptation and optimization strategies are crucial for maximizing LLMs' coding capabilities. The authors have made their code, data splits, and model checkpoints publicly available for further research.
ICD coding, post-training, large language models