Training
I pretrained and post-trained a 500M-parameter LLM and a 330M-parameter image generator from scratch
HobbyLM, a new 500M-parameter LLM, and a 330M-parameter image generator were both pretrained and post-trained from scratch. The LLM was trained on 40B tokens, and its post-training focused on extending the context window. The image generator's architecture is based on ByteDance's Dreamlite, and it uses a mixture of datasets. The models and the code for training and inference are open source, so practitioners can experiment with them and build on them for further research in LLM and image generation.
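For a sense of scale, the figures quoted above can be turned into a rough back-of-the-envelope estimate. The sketch below is not from the project's code: the function names are hypothetical, the 40B tokens are assumed to be the pretraining token count, and the FLOPs figure uses the common 6 × parameters × tokens rule of thumb for dense transformers, which is only an approximation.

```python
# Rough scale estimate for the LLM, using only the numbers quoted in the summary.
# Assumption: the 40B tokens refer to pretraining; helper names are hypothetical.

N_PARAMS = 500e6   # 500M parameters
N_TOKENS = 40e9    # 40B training tokens


def tokens_per_parameter(n_tokens: float, n_params: float) -> float:
    """Ratio of training tokens to model parameters."""
    return n_tokens / n_params


def approx_training_flops(n_tokens: float, n_params: float) -> float:
    """Approximate training compute via the 6 * N * D rule of thumb.

    This is a widely used heuristic for dense transformers, not a measurement.
    """
    return 6.0 * n_params * n_tokens


ratio = tokens_per_parameter(N_TOKENS, N_PARAMS)
flops = approx_training_flops(N_TOKENS, N_PARAMS)

if __name__ == "__main__":
    print(f"tokens per parameter: {ratio:.0f}")
    print(f"approx. training FLOPs: {flops:.2e}")
```

Under these assumptions the model sees about 80 tokens per parameter, and the heuristic puts training compute on the order of 1.2e20 FLOPs. Actual cost depends on the architecture, sequence length, and hardware utilization, none of which are stated here.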
pretraining post_training llm