NVIDIA and Google Cloud have expanded their long-standing partnership with a new set of AI infrastructure and platform updates unveiled at the Google Cloud Next conference in Las Vegas. The announcements focus on scaling “AI factories” and enabling enterprise deployment of agentic and physical AI systems.
The collaboration introduces new infrastructure, including A5X bare-metal instances powered by NVIDIA’s next-generation Vera Rubin architecture, alongside expanded support for Gemini models running on NVIDIA Blackwell GPUs. The companies aim to provide a fully integrated stack, from chips and networking to software and cloud services, designed for high-performance AI workloads.
The updates reflect growing demand for infrastructure capable of supporting advanced AI systems that can operate autonomously and interact with real-world environments.
Next-Generation AI Infrastructure
At the core of the announcement is the A5X platform, built on NVIDIA’s Vera Rubin NVL72 systems. Google said the new infrastructure delivers up to a tenfold reduction in inference cost per token and up to 10 times higher throughput compared with previous generations.
The system is designed to scale to massive clusters, supporting up to 80,000 GPUs in a single site and nearly one million GPUs across multiple locations. This enables enterprises to train and deploy large-scale AI models, including multimodal and reasoning systems.
Google Cloud’s broader Blackwell portfolio also includes a range of virtual machine configurations, allowing customers to scale from fractional GPU usage to full rack-scale deployments depending on workload requirements.
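For context on what such provisioning looks like in practice, GPU-backed instances on Google Cloud are typically created with the `gcloud` CLI. The sketch below uses an existing, generally available machine type and accelerator as placeholders; the actual machine types and accelerator names for Blackwell or A5X deployments are not specified in the announcement and would need to be checked against current offerings (e.g. via `gcloud compute accelerator-types list`).

```shell
# Sketch: provisioning a GPU-attached VM with the gcloud CLI.
# Machine type and accelerator names here are illustrative stand-ins,
# not the Blackwell/A5X SKUs described in the announcement.
gcloud compute instances create my-ai-instance \
  --zone=us-central1-a \
  --machine-type=n1-standard-4 \
  --accelerator=type=nvidia-tesla-t4,count=1 \
  --maintenance-policy=TERMINATE \
  --image-family=debian-12 \
  --image-project=debian-cloud
```

GPU VMs require `--maintenance-policy=TERMINATE` because accelerator-attached instances cannot live-migrate during host maintenance; driver installation is a separate step after the instance boots.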
Secure and Distributed AI Deployment
The partnership also emphasizes security and flexibility. Gemini models can now run on Google Distributed Cloud with NVIDIA Blackwell GPUs, allowing organizations to deploy AI closer to sensitive data environments.
Confidential computing capabilities keep prompts, training data, and model outputs encrypted, including while in use, shielding them even from infrastructure operators. This is particularly relevant for regulated industries such as finance, healthcare, and government.
New confidential virtual machines extend these protections to public cloud environments, offering secure access to high-performance AI resources without compromising data privacy.
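As a rough sketch of how this is exposed today, Google Cloud's existing Confidential VM support is enabled with a single flag at instance creation. The flag and machine family below are part of the current `gcloud` CLI; extending this model to the new GPU-backed confidential VMs described in the announcement is an assumption, and the actual machine types for those offerings may differ.

```shell
# Sketch: creating a Confidential VM with gcloud.
# --confidential-compute enables hardware memory encryption on
# supported machine families (e.g. N2D with AMD SEV). Pairing this
# with the new GPU confidential VMs is assumed, not shown here.
gcloud compute instances create my-confidential-vm \
  --zone=us-central1-a \
  --machine-type=n2d-standard-4 \
  --confidential-compute \
  --maintenance-policy=TERMINATE \
  --image-family=ubuntu-2204-lts \
  --image-project=ubuntu-os-cloud
```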
Advancing Agentic and Physical AI
NVIDIA and Google Cloud are also targeting the next wave of AI applications, including autonomous agents and physical systems such as robots and digital twins. The platform supports a wide range of models, from Google’s Gemini family to NVIDIA’s open Nemotron models, enabling developers to build systems that can reason, plan, and act.
Integration with tools like NVIDIA Omniverse and Isaac Sim allows developers to simulate real-world environments and train robotics systems before deployment. This opens up use cases in manufacturing, logistics, and industrial automation.
Companies including OpenAI, Salesforce, and Snap are already using the infrastructure for tasks ranging from large-scale inference to data processing and simulation.
From Experimentation to Production
The expanded platform is designed to help organizations move AI projects from experimentation to production. Startups and enterprises are using the combined infrastructure to build applications in areas such as software development, drug discovery, and real-time analytics.
With more than 90,000 developers already participating in the joint ecosystem, the partnership highlights the scale at which AI infrastructure is evolving. As demand for compute and advanced models continues to grow, collaborations like this are shaping the foundation for the next generation of AI systems.