Models
Nemotron 3 Ultra: Open, Efficient Mixture-of-Experts Hybrid Mamba-Transformer Model for Agentic Reasoning
Nemotron 3 Ultra has been released with 550 billion total parameters, of which 55 billion are active, built on a Mixture-of-Experts hybrid Mamba-Attention architecture. It was pre-trained on 20 trillion tokens and supports context lengths of up to 1 million tokens. The model is reported to achieve approximately 6x higher inference throughput than current leading LLMs while maintaining comparable accuracy. Its use of LatentMoE and multi-environment reinforcement learning targets long-duration autonomous reasoning tasks, and the model is openly available on Hugging Face.
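To put the reported sizes in perspective, the short sketch below computes the share of parameters active at a time and the number of pre-training tokens per total parameter. It uses only the figures quoted in the summary; the function and constant names are hypothetical, and the ratio is plain arithmetic on the headline numbers, not a description of how the model routes tokens internally.

```python
# Back-of-the-envelope arithmetic on the figures quoted in the summary.
# All names here are illustrative, not part of any released API.

TOTAL_PARAMS = 550e9      # reported total parameters
ACTIVE_PARAMS = 55e9      # reported active parameters
PRETRAIN_TOKENS = 20e12   # reported pre-training tokens


def active_fraction(total_params: float, active_params: float) -> float:
    """Return the share of parameters that are active."""
    if total_params <= 0:
        raise ValueError("total_params must be positive")
    return active_params / total_params


fraction = active_fraction(TOTAL_PARAMS, ACTIVE_PARAMS)
tokens_per_total_param = PRETRAIN_TOKENS / TOTAL_PARAMS

print(f"Active fraction: {fraction:.0%}")
print(f"Pre-training tokens per total parameter: {tokens_per_total_param:.1f}")
```

By these numbers, about one tenth of the parameters are active, which is the usual reason a Mixture-of-Experts model can be cheaper to run than a dense model of the same total size.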
mixture-of-experts, language model, agentic reasoning