From Observation to Intervention: A Causal Audit of Expert Importance in Mixture-of-Experts Models
This paper presents a causal audit of expert importance in Mixture-of-Experts (MoE) models, examining three architectures: OLMoE-1B-7B-0924, Qwen1.5-MoE-A2.7B, and DeepSeek-V2-Lite. It finds that traditional observational metrics fail to predict causal expert importance: effect sizes stay below Cohen's $d = 0.17$ across 60 combinations. This result challenges the validity of current pruning methods that rely on population-level summaries. The work argues that interpretability practice needs rigorous interventional audits, and its findings may inform more effective expert pruning strategies for MoE architectures.
mixture-of-experts, interpretability, causal inference
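For readers unfamiliar with the effect-size measure cited in the summary, the following is a minimal sketch of Cohen's $d$ using the standard pooled-standard-deviation form. The summary does not say which estimator the paper uses, so the pooled form is an assumption. The function name `cohens_d` and the two sample lists are hypothetical placeholders; they are not data or code from the paper.

```python
import math
from statistics import mean, variance


def cohens_d(group_a, group_b):
    """Cohen's d with a pooled standard deviation (assumed estimator)."""
    n_a, n_b = len(group_a), len(group_b)
    if n_a < 2 or n_b < 2:
        raise ValueError("each group needs at least two observations")

    # Pooled variance weights each sample variance by its degrees of freedom.
    pooled_var = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    if pooled_var == 0:
        raise ValueError("pooled variance is zero; d is undefined")

    return (mean(group_a) - mean(group_b)) / math.sqrt(pooled_var)


# Hypothetical illustration only: causal effect of ablating experts that an
# observational metric ranks high versus experts it ranks low.
high_ranked = [2.0, 4.0, 6.0]
low_ranked = [1.0, 3.0, 5.0]
d = cohens_d(high_ranked, low_ranked)
```

By the usual convention a value of $d$ below roughly 0.2 counts as a small effect, so a ceiling of 0.17 means that groups separated by an observational metric barely differ in their measured causal effect.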