Research
Want to build a custom model
The author plans to build a small custom model of roughly 25 million parameters, aimed at next-token prediction rather than full chat responses. The post stresses the importance of data, estimating that around 100 million tokens are needed for effective training, and considers several specialized training datasets, including comedy and technical domains. The project is a practical learning exercise for practitioners interested in model development and in the challenges of acquiring datasets for specific tasks.
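To make the two figures above concrete, here is a minimal sketch that estimates the parameter count of a small decoder-only transformer and compares it with a 100-million-token budget. The architecture, the function name `transformer_param_count`, and the configuration values (vocabulary size, width, depth) are all assumptions chosen for illustration, not details from the article; the formula ignores biases and normalization weights.

```python
def transformer_param_count(vocab_size, d_model, n_layers, d_ff=None,
                            tie_embeddings=True):
    """Rough parameter count for a decoder-only transformer.

    Biases and layer-norm weights are ignored, so this is an estimate.
    """
    if d_ff is None:
        d_ff = 4 * d_model                  # common default, assumed here
    embedding = vocab_size * d_model        # token embedding table
    attention = 4 * d_model * d_model       # Q, K, V and output projections
    mlp = 2 * d_model * d_ff                # up and down projections
    total = embedding + n_layers * (attention + mlp)
    if not tie_embeddings:
        total += vocab_size * d_model       # separate output projection
    return total


# Hypothetical configuration picked to land in the mid-20M range.
params = transformer_param_count(vocab_size=16_000, d_model=384, n_layers=10)

# Token budget taken from the summary above.
tokens = 100_000_000
tokens_per_param = tokens / params

print(f"parameters: {params / 1e6:.1f}M")
print(f"tokens per parameter: {tokens_per_param:.2f}")
```

With these assumed settings the estimate comes out a little under 25 million parameters, so the 100 million token target works out to roughly four training tokens per parameter. Changing the vocabulary size has a large effect at this scale, since the embedding table is a sizeable share of the total.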
model, custom, architecture