Nvidia announced that its latest AI server can improve the performance of mixture-of-experts AI models by up to ten times, including high-profile models from China such as Moonshot AI’s Kimi K2 Thinking and DeepSeek’s open-source models. Mixture-of-experts models divide tasks among specialized components within a model, an approach that improves efficiency and gained popularity in 2025.
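The routing idea behind mixture-of-experts can be illustrated with a toy sketch. This is a minimal, hypothetical example of top-k expert routing in NumPy, not the actual architecture of Kimi K2 Thinking or DeepSeek’s models; all names and sizes are illustrative:

```python
import numpy as np

def moe_forward(x, gate_w, expert_ws, k=2):
    """Route one token through the top-k experts of a toy MoE layer.

    x: (d,) token embedding; gate_w: (d, n_experts) router weights;
    expert_ws: list of (d, d) per-expert weight matrices.
    All names here are illustrative, not drawn from any real model.
    """
    logits = x @ gate_w                        # one router score per expert
    topk = np.argsort(logits)[-k:]             # indices of the k best experts
    weights = np.exp(logits[topk])
    weights /= weights.sum()                   # softmax over chosen experts only
    # Only the selected experts run -- the source of MoE's efficiency:
    # most of the model's parameters sit idle for any given token.
    return sum(w * (x @ expert_ws[i]) for w, i in zip(weights, topk))

rng = np.random.default_rng(0)
d, n_experts = 8, 4
out = moe_forward(rng.normal(size=d),
                  rng.normal(size=(d, n_experts)),
                  [rng.normal(size=(d, d)) for _ in range(n_experts)])
print(out.shape)  # (8,)
```

Because only k of the n experts execute per token, compute cost scales with k rather than with total parameter count, which is why the approach is attractive for large models.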
Nvidia’s server integrates 72 of its top GPUs with high-speed links, giving it an advantage in serving AI models to end users, even as such models may need less Nvidia hardware for training. The company emphasizes that the performance gains stem both from the number of chips and from the fast interconnects between them, areas where Nvidia maintains an edge over competitors.
While Nvidia dominates the AI training market, it now faces growing competition in inference and deployment from firms like Advanced Micro Devices (AMD) and Cerebras. AMD is developing a multi-chip AI server expected to launch next year. Nvidia’s new data highlights its continued focus on scaling AI infrastructure to handle increasingly complex models efficiently.