Research
Layer-Resolved Optimal Transport for Hallucination Detection in NMT and Abstractive Summarization
The paper introduces a layer-resolved optimal transport (OT) method for detecting hallucinations in neural machine translation (NMT) and evaluates how well it transfers to abstractive summarization. It analyzes the Fairseq DE-EN model across its six decoder layers and finds that layers L1 to L4 are the most effective for detection, while L5 is anti-predictive. On summarization, the unsupervised OT detector reaches a balanced accuracy of 57.2% on CNN and 57.6% on XSum. This exposes a limitation: the detector struggles to flag unfaithful summaries that still attend correctly to source tokens. The results also offer insight into the interpretability of attention mechanisms in LLMs.
hallucination nmt optimal-transport
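The two quantities named in the summary, an OT distance over attention and balanced accuracy, can be illustrated with a minimal sketch. This is not the paper's implementation: the summary does not state the cost function, the reference distribution, or any threshold. The uniform reference and the function names `wasserstein_1d`, `layer_scores`, and `balanced_accuracy` are assumptions made for illustration only.

```python
from itertools import accumulate


def wasserstein_1d(p, q):
    """Wasserstein-1 distance between two distributions on positions 0..n-1.

    With unit spacing between positions, W1 equals the sum of absolute
    differences between the two cumulative distributions.
    """
    if len(p) != len(q):
        raise ValueError("distributions must have the same length")
    total_p, total_q = sum(p), sum(q)
    # Normalize so each input sums to 1 before accumulating.
    cdf_p = accumulate(x / total_p for x in p)
    cdf_q = accumulate(x / total_q for x in q)
    return sum(abs(a - b) for a, b in zip(cdf_p, cdf_q))


def layer_scores(attention_by_layer):
    """One score per layer: distance of source-attention mass from uniform.

    ASSUMPTION: a uniform reference is used here purely for illustration;
    the paper's reference distribution is not given in the summary.
    """
    scores = []
    for attn in attention_by_layer:
        uniform = [1.0 / len(attn)] * len(attn)
        scores.append(wasserstein_1d(attn, uniform))
    return scores


def balanced_accuracy(y_true, y_pred):
    """Mean of the recall on the positive class and on the negative class."""
    pos = sum(1 for t in y_true if t)
    neg = len(y_true) - pos
    if pos == 0 or neg == 0:
        raise ValueError("both classes must be present in y_true")
    tp = sum(1 for t, p in zip(y_true, y_pred) if t and p)
    tn = sum(1 for t, p in zip(y_true, y_pred) if not t and not p)
    return 0.5 * (tp / pos + tn / neg)


if __name__ == "__main__":
    # Hypothetical attention mass over four source tokens for two layers.
    toy_attention = [[0.25, 0.25, 0.25, 0.25], [1.0, 0.0, 0.0, 0.0]]
    print(layer_scores(toy_attention))
    # Hypothetical labels: 1 = hallucination, 0 = faithful.
    print(balanced_accuracy([1, 1, 0, 0], [1, 0, 0, 0]))
```

Balanced accuracy averages the per-class recalls, so a detector that guesses at random scores about 50%. That baseline is the context for reading the 57.2% and 57.6% figures above.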