Research
Towards Engineering Scaling Laws with Pretraining Data Composition
The paper examines how scaling laws in neural networks can be engineered by manipulating the composition of pretraining data, using the classification of hadronic jets from high-energy particle collisions as its test case. Unlike most other domains, particle physics has high-fidelity simulators, which makes it possible to focus on data diversity and alignment as levers for improving model performance. The paper suggests that increasing dataset size can be more beneficial than simply scaling model parameters. For practitioners, this provides a framework for optimizing model training through data engineering rather than relying solely on larger models.
scaling laws, pretraining, large language models