Research
Code-Switching Reveals Language Anchoring in Multilingual LLMs
The paper studies the performance degradation of Multilingual Large Language Models (MLLMs) on Code-Switched (CS) inputs and introduces Anchor Bias, a measure of language anchoring in hidden states. It proposes CANVAS (Contextual Anchor-based Neural Vector Alignment Steering), an intervention that aligns target-language hidden states with source-language representations and improves Question Answering (QA) performance across various MLLMs. The work argues that understanding internal anchoring mechanisms in MLLMs is key to improving how they process mixed-language inputs.
multilingual, LLM, code-switching
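The summary describes CANVAS only at a high level, so the following is not the paper's procedure. It is a minimal sketch of a generic mean-difference steering step, assuming that "aligning target-language hidden states with source-language representations" can be approximated by shifting a hidden state along the direction from the target-language mean to the source-language mean. The function names, the `alpha` strength parameter, and the toy 3-dimensional vectors are all hypothetical.

```python
from statistics import fmean


def mean_vector(states):
    """Element-wise mean of a list of equal-length hidden-state vectors."""
    return [fmean(column) for column in zip(*states)]


def steering_vector(source_states, target_states):
    """Direction from the target-language mean to the source-language mean.

    Assumption: a plain difference of means stands in for whatever
    alignment objective the paper actually uses.
    """
    src_mean = mean_vector(source_states)
    tgt_mean = mean_vector(target_states)
    return [s - t for s, t in zip(src_mean, tgt_mean)]


def steer(hidden, direction, alpha=1.0):
    """Shift one hidden state along `direction`, scaled by `alpha` (hypothetical)."""
    return [h + alpha * d for h, d in zip(hidden, direction)]


# Toy hidden states standing in for activations collected at one layer.
source_states = [[1.0, 0.0, 2.0], [3.0, 0.0, 2.0]]
target_states = [[0.0, 1.0, 2.0], [0.0, 3.0, 2.0]]

direction = steering_vector(source_states, target_states)
steered = steer(target_states[0], direction, alpha=0.5)
```

In a real model the vectors would come from a chosen layer's activations and the shift would be applied during the forward pass. Which layer, which tokens, and what strength to use are design choices this summary does not specify.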