Research
Quickest Detection of Hallucination Onset: Delay Bounds and Learned CUSUM Statistics
This article frames the detection of hallucination onset in language models as a quickest change detection problem and uses a first-order Markov model to establish a theoretical lower bound on detection delay. The proposed causal recurrent labeler detects hallucinations in 11 to 13 tokens, compared with 31 tokens for a linear baseline at a matched false-alarm rate. For practitioners, the result shows that temporal dynamics matter in detection mechanisms, and it offers guidance for real-time monitoring of model outputs.
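For readers unfamiliar with the quickest change detection framing, the following is a minimal sketch of the classical (Page) CUSUM rule applied to a stream of per-token scores. It is not the article's learned statistic or its recurrent labeler. The Gaussian score model, the function names `gaussian_llr` and `cusum_alarm`, and all numeric values are illustrative assumptions.

```python
def gaussian_llr(x, mu0, mu1, sigma):
    """Log-likelihood ratio log f1(x)/f0(x) for two equal-variance Gaussians.

    Assumption: pre-change scores ~ N(mu0, sigma^2), post-change ~ N(mu1, sigma^2).
    """
    return ((mu1 - mu0) / sigma**2) * (x - (mu0 + mu1) / 2.0)


def cusum_alarm(llrs, threshold):
    """Return the first index at which the CUSUM statistic reaches threshold.

    The statistic follows the recursion W_t = max(0, W_{t-1} + llr_t), W_0 = 0.
    Returns None if no alarm is raised.
    """
    stat = 0.0
    for t, llr in enumerate(llrs):
        stat = max(0.0, stat + llr)  # reset at zero, accumulate evidence of change
        if stat >= threshold:
            return t
    return None


# Hypothetical, noise-free score stream: the change happens at index 20.
onset = 20
scores = [0.0] * onset + [1.0] * 20
llrs = [gaussian_llr(x, mu0=0.0, mu1=1.0, sigma=1.0) for x in scores]

alarm = cusum_alarm(llrs, threshold=3.0)
print(alarm, alarm - onset)  # alarm index and its offset from the onset
```

In this toy stream each post-change token adds 0.5 to the statistic, so the alarm fires at index 25, five tokens after the onset. Raising the threshold lowers the false-alarm rate at the cost of a longer delay, which is the trade-off the article's matched false-alarm comparison controls for.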
hallucination detection cusum change-point llm