Models
Surpassing Scale by Efficiency: A Compact 135M Parameter Foundational LLM Natively Adapted for the Bangla Language
The article introduces bangla-smollm-135m, a 135-million-parameter, decoder-only foundational model optimized for the Bangla language and designed for efficient deployment on low-resource systems. It uses a deterministic intersect-and-append token merging strategy to address subword script fragmentation while maintaining parameter stability. In zero-shot multi-task evaluations, the model matches or exceeds larger models and achieves results comparable to those of 1-billion-parameter models, making it a viable option for practitioners working with low-resource languages.
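The summary names the intersect-and-append strategy without describing its steps, so the following is only a minimal sketch of one plausible reading, not the authors' implementation. It assumes that "intersect" means finding the new Bangla tokens already present in the base vocabulary, and that "append" means giving the remaining tokens fresh IDs after the last existing one, in sorted order so the result is deterministic. The function name `intersect_and_append` and the toy vocabularies are hypothetical.

```python
def intersect_and_append(base_vocab, new_tokens):
    """Merge new_tokens into base_vocab without changing any existing ID.

    Hypothetical helper: an assumed reading of "intersect-and-append",
    not code taken from the article.
    """
    merged = dict(base_vocab)  # copy, so every original token keeps its ID
    next_id = max(base_vocab.values(), default=-1) + 1
    shared, appended = [], []
    # Sort and de-duplicate so the result does not depend on input order.
    for token in sorted(set(new_tokens)):
        if token in merged:
            shared.append(token)      # intersect: reuse the existing ID
        else:
            merged[token] = next_id   # append: new ID after the old range
            appended.append(token)
            next_id += 1
    return merged, shared, appended


# Toy vocabularies, made up for illustration only.
base_vocab = {"<pad>": 0, "the": 1, "ing": 2, "ব": 3}
bangla_tokens = ["বাংলা", "ব", "ভাষা"]

merged, shared, appended = intersect_and_append(base_vocab, bangla_tokens)
print(len(merged), shared, appended)
```

Under this reading, existing token IDs never move, so the rows of a pretrained embedding matrix would stay aligned with their tokens and only the appended tokens would need new rows. Whether this is what the article means by parameter stability is an assumption.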
llm, bangla, efficient