On Pitfalls of $\textit{RemOve-And-Retrain}$: Data Processing Inequality Perspective
The paper critiques the RemOve-And-Retrain (ROAR) benchmark for evaluating feature attribution methods. It shows that post-processing transformations of attribution maps can artificially inflate ROAR scores without increasing the information the maps convey about model decisions. In experiments on CIFAR-10, SVHN, and CUB-200, the authors find that spatially blurrier masks correlate with higher ROAR performance, a bias that undermines the benchmark's validity. The findings underline the need for more rigorous benchmarking approaches that faithfully reflect a mechanistic understanding of neural networks.
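The data processing inequality named in the title can be illustrated with a small numerical sketch. The joint distribution below is a hypothetical toy example, not data from the paper, and the helper names `mutual_information` and `post_process` are illustrative assumptions. It only shows the general principle that a deterministic post-processing step (here, merging neighbouring states, loosely analogous to blurring a mask) cannot increase the mutual information between an attribution and the quantity it is meant to explain.

```python
import math
from collections import defaultdict


def mutual_information(joint):
    """Mutual information in bits for a joint distribution {(y, a): probability}."""
    p_y = defaultdict(float)
    p_a = defaultdict(float)
    for (y, a), p in joint.items():
        p_y[y] += p
        p_a[a] += p
    return sum(
        p * math.log2(p / (p_y[y] * p_a[a]))
        for (y, a), p in joint.items()
        if p > 0
    )


def post_process(joint, g):
    """Push the joint of (Y, A) through a deterministic map A -> g(A)."""
    out = defaultdict(float)
    for (y, a), p in joint.items():
        out[(y, g(a))] += p
    return dict(out)


# Hypothetical toy joint: label Y in {0, 1}, attribution summary A in {0, 1, 2, 3}.
joint = {
    (0, 0): 0.30, (0, 1): 0.10, (0, 2): 0.05, (0, 3): 0.05,
    (1, 0): 0.05, (1, 1): 0.05, (1, 2): 0.10, (1, 3): 0.30,
}

# Assumed stand-in for blurring: merge neighbouring attribution states.
coarse = post_process(joint, lambda a: a // 2)

mi_raw = mutual_information(joint)
mi_coarse = mutual_information(coarse)
print(f"I(Y; A)    = {mi_raw:.4f} bits")
print(f"I(Y; g(A)) = {mi_coarse:.4f} bits")
```

Under this principle, a benchmark score that rises after such a post-processing step is rewarding something other than added information, which is the kind of bias the summary describes for ROAR.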
feature-attribution, benchmark, information-theory