Open Source
Get in here: Community model build thread
The article describes a community-driven initiative to build a mixture-of-experts (MoE) model through a crowdsourced training approach called "Branch-Train-Stitch." Participants each receive a copy of a prototype model, potentially sized at 2B or 7B parameters, train it independently on their own hardware, and submit the resulting narrow-domain submodel for integration into a larger MoE, which may reach between 500B and 1T parameters. Because each contributor only has to train a small model, the approach broadens participation and sidesteps the logistics of training a model of that size in one place. It is relevant to practitioners interested in collaborative model development and efficient use of distributed compute.
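To give a feel for the scale involved, the sizes quoted above can be turned into a rough count of how many submodels would need to be submitted. This is a back-of-the-envelope sketch only: it assumes the stitched MoE's parameter count is simply the sum of the submodels' parameters, which ignores any shared layers and the router, and the helper name `experts_needed` is made up for illustration.

```python
# Rough estimate: how many narrow-domain submodels would it take to
# reach the target MoE size, assuming parameters simply add up?
# (Assumption: no shared layers, router size ignored.)

BILLION = 10**9


def experts_needed(target_params: int, expert_params: int) -> int:
    """Smallest number of experts whose combined size reaches the target."""
    # Integer ceiling division, avoids floating-point rounding.
    return -(-target_params // expert_params)


if __name__ == "__main__":
    for expert_b in (2, 7):            # candidate prototype sizes, in billions
        for target_b in (500, 1000):   # candidate MoE sizes, in billions
            n = experts_needed(target_b * BILLION, expert_b * BILLION)
            print(f"{expert_b}B experts -> {target_b}B MoE: about {n} submodels")
```

Under that assumption, the number of contributions needed ranges from dozens (7B submodels, 500B target) to several hundred (2B submodels, 1T target).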
community-model crowdsourced training