Research
Influcoder: Distilling Decoders' Gradient Influence Rankings into an Encoder for Data Attribution
The article introduces Influcoder, a method for Data Attribution (DA) that distills decoders' gradient influence rankings into a more efficient encoder. Existing influence function methods are slow and storage-heavy when applied to large datasets. Influcoder aims to make influence-based DA fast and inexpensive, which matters for practitioners who want to refine training datasets and mitigate issues such as toxic behavior in LLM outputs.
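The summary does not say how the rankings are distilled, so the following is only a minimal sketch of one plausible setup, not the article's method. It assumes a listwise (ListNet-style) loss that pushes an encoder's cosine-similarity ranking of training examples toward a teacher ranking given by precomputed decoder influence scores. The function names, the temperature parameter, and the toy inputs are all hypothetical.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over a 1-D array.
    e = np.exp(x - np.max(x))
    return e / e.sum()

def cosine_scores(query_emb, cand_embs):
    # Cosine similarity between one query embedding and each candidate embedding.
    q = query_emb / np.linalg.norm(query_emb)
    c = cand_embs / np.linalg.norm(cand_embs, axis=1, keepdims=True)
    return c @ q

def listwise_distillation_loss(query_emb, cand_embs, teacher_influence, temperature=1.0):
    # ASSUMPTION: teacher_influence holds precomputed gradient influence scores
    # from a decoder, one per candidate training example. The loss is the
    # cross-entropy between the teacher's and the student's ranking distributions.
    p_teacher = softmax(teacher_influence / temperature)
    p_student = softmax(cosine_scores(query_emb, cand_embs) / temperature)
    return float(-np.sum(p_teacher * np.log(p_student + 1e-12)))

def rank_by_similarity(query_emb, cand_embs):
    # At attribution time only encoder embeddings are needed:
    # candidates are sorted from most to least similar to the query.
    return np.argsort(-cosine_scores(query_emb, cand_embs))
```

Under these assumptions, the loss is lower when the encoder's similarity ordering agrees with the teacher's influence ordering, and attribution for a new query reduces to an embedding similarity search rather than a per-example gradient computation.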
data attribution, llm, influence functions