NVIDIA Vera Rubin platform
Those who expected NVIDIA CEO Jensen Huang to hold off on an update about the company's next big AI chip until the upcoming GTC conference in March were surprised last night. The Vera Rubin processor was first mentioned last March at the company's GTC conference in San Jose, and Huang released details about it at CES in Las Vegas, saying the new chip is in “full production” and will be available in the second half of this year.
Among the NVIDIA hallmarks that differ from tech company behavior of the past is shipping new products on time or ahead of schedule while pursuing a roadmap free of the fear of “cannibalism,” the concern that new products will eat into potential revenue from existing products still on the market. While NVIDIA may, indeed, not have squeezed every dollar out of Vera Rubin’s predecessors, the company’s red-hot product cadence has put enormous pressure on its competitors while also delivering massive volumes of chips to a market sector with constant demand for the latest-and-greatest chips regardless of how quickly they’re rolled out: the hyperscalers and AI cloud companies.
Huang positioned Vera Rubin last night as a blowout performer, delivering 5x the AI compute of the current Grace Blackwell flagship chip.
NVIDIA said the Rubin platform uses extreme codesign across six chips (the NVIDIA Vera CPU, NVIDIA Rubin GPU, NVIDIA NVLink 6 Switch, NVIDIA ConnectX-9 SuperNIC, NVIDIA BlueField-4 DPU and NVIDIA Spectrum-6 Ethernet Switch) that together cut training time and inference token costs.
“Rubin arrives at exactly the right moment, as AI computing demand for both training and inference is going through the roof,” said Huang. “With our annual cadence of delivering a new generation of AI supercomputers, and extreme codesign across six new chips, Rubin takes a giant leap toward the next frontier of AI.”
Named for astronomer Vera Florence Cooper Rubin, the platform features the NVIDIA Vera Rubin NVL72 rack-scale solution and the NVIDIA HGX Rubin NVL8 system.
NVIDIA said the platform introduces five innovations, including the latest generations of NVIDIA NVLink interconnect technology, Transformer Engine, Confidential Computing and RAS Engine, as well as the NVIDIA Vera CPU.
“These breakthroughs will accelerate agentic AI, advanced reasoning and massive-scale mixture-of-experts (MoE) model inference at up to 10x lower cost per token of the NVIDIA Blackwell platform,” the company said in its announcement. “Compared with its predecessor, the NVIDIA Rubin platform trains MoE models with 4x fewer GPUs to accelerate AI adoption.”
Vera Rubin is designed to address the growing adoption of agentic AI and reasoning models, which are pushing the limits of computation. Multistep problem-solving requires models to process, reason and act across long sequences of tokens. The Rubin platform’s five technologies include:
- Sixth-Generation NVIDIA NVLink: Delivers the GPU-to-GPU communication required for MoE models. Each GPU offers 3.6TB/s of bandwidth, while the Vera Rubin NVL72 rack provides 260TB/s, which NVIDIA said is more bandwidth than the entire internet. With built-in, in-network compute for collective operations, as well as new features for serviceability and resiliency, the NVLink 6 switch is built for AI training and inference at scale.
- Vera CPU: Designed for agentic reasoning, Vera is the most power-efficient CPU for large-scale AI factories, NVIDIA said. It’s built with 88 NVIDIA custom Olympus cores, Armv9.2 compatibility and ultrafast NVLink-C2C connectivity.
- Rubin GPU: Featuring a third-generation Transformer Engine with hardware-accelerated adaptive compression, the Rubin GPU delivers 50 petaflops of NVFP4 compute for AI inference.
- Third-Generation NVIDIA Confidential Computing: The company said Vera Rubin NVL72 is the first rack-scale platform to deliver NVIDIA Confidential Computing, which maintains data security across CPU, GPU and NVLink domains.
- Second-Generation RAS Engine: The Rubin platform features health checks, fault tolerance and proactive maintenance. The rack’s modular, cable-free tray design enables up to 18x faster assembly and servicing than Blackwell.
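The per-GPU and per-rack figures quoted above can be cross-checked with simple arithmetic. The sketch below assumes that rack totals are plain sums over the 72 GPUs in a Vera Rubin NVL72 rack; that is an illustrative assumption, not something NVIDIA states, and the constant and function names are hypothetical.

```python
# Figures quoted in the article.
GPUS_PER_RACK = 72            # GPUs in a Vera Rubin NVL72 rack
NVLINK_TBPS_PER_GPU = 3.6     # NVLink 6 bandwidth per GPU, TB/s
NVFP4_PFLOPS_PER_GPU = 50     # NVFP4 inference compute per GPU, petaflops


def rack_total(per_gpu: float, gpus: int = GPUS_PER_RACK) -> float:
    """Sum a per-GPU figure over the rack.

    Assumption (for illustration only): the rack figure is a plain sum.
    """
    return per_gpu * gpus


rack_bandwidth_tbps = rack_total(NVLINK_TBPS_PER_GPU)
rack_nvfp4_pflops = rack_total(NVFP4_PFLOPS_PER_GPU)

if __name__ == "__main__":
    print(f"Rack NVLink bandwidth: {rack_bandwidth_tbps:.1f} TB/s")
    print(f"Rack NVFP4 compute:    {rack_nvfp4_pflops:.0f} petaflops")
```

The summed bandwidth comes to 259.2 TB/s, consistent with the rounded 260TB/s rack figure. The 3,600-petaflop total is a derived number under the same assumption, not one quoted in the announcement.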
NVIDIA Rubin introduces the NVIDIA Inference Context Memory Storage Platform, which the company said is a new class of AI-native storage infrastructure designed to scale inference context at gigascale.
Powered by NVIDIA BlueField-4, the platform enables sharing and reuse of key-value cache data across AI infrastructure, which NVIDIA said is designed to improve responsiveness and throughput.
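To illustrate what sharing and reusing key-value cache data means in general terms, here is a minimal sketch of a prefix-keyed cache shared across requests. It is a generic illustration in plain Python, not NVIDIA's design or API: the `SharedKVCache` class, the `compute_kv` stand-in and the block size are all hypothetical.

```python
import hashlib
from typing import Dict, List, Tuple

BLOCK = 4  # tokens per cache block (hypothetical value)


def compute_kv(tokens: Tuple[int, ...]) -> str:
    """Stand-in for the expensive attention key/value computation."""
    return "kv:" + hashlib.sha256(repr(tokens).encode()).hexdigest()[:8]


class SharedKVCache:
    """Toy cache keyed by token prefix and shared by all sessions."""

    def __init__(self) -> None:
        self.store: Dict[Tuple[int, ...], str] = {}
        self.hits = 0
        self.misses = 0

    def get_blocks(self, tokens: List[int]) -> List[str]:
        """Return one KV entry per full block, reusing cached prefixes."""
        out = []
        for end in range(BLOCK, len(tokens) + 1, BLOCK):
            # Key on the whole prefix: in a causal model, the KV state
            # at a position depends on every earlier token.
            prefix = tuple(tokens[:end])
            if prefix in self.store:
                self.hits += 1
            else:
                self.misses += 1
                self.store[prefix] = compute_kv(prefix)
            out.append(self.store[prefix])
        return out


cache = SharedKVCache()
system_prompt = [1, 2, 3, 4, 5, 6, 7, 8]
cache.get_blocks(system_prompt + [9, 10, 11, 12])    # first session: all misses
cache.get_blocks(system_prompt + [20, 21, 22, 23])   # second session reuses the shared prefix
```

In this toy run, the second session recomputes only its final block; the two blocks covering the shared system prompt are served from the cache. That reuse is the kind of saving a shared context store aims for, here reduced to a single in-process dictionary.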
As AI factories increasingly adopt bare-metal and multi-tenant deployment models, maintaining strong infrastructure control and isolation becomes essential. BlueField-4 also introduces Advanced Secure Trusted Resource Architecture, or ASTRA, a system-level architecture that gives AI infrastructure builders a single control point to provision, isolate and operate large-scale AI environments without compromising performance.
With AI applications evolving toward multi-turn agentic reasoning, AI-native organizations manage and share larger volumes of inference context across users, sessions and services. NVIDIA Vera Rubin NVL72 is designed to offer a unified system that combines 72 NVIDIA Rubin GPUs, 36 NVIDIA Vera CPUs, NVIDIA NVLink 6, NVIDIA ConnectX-9 SuperNICs and NVIDIA BlueField-4 DPUs.
NVIDIA said it will also offer the NVIDIA HGX Rubin NVL8 platform, a server board that links eight Rubin GPUs through NVLink to support x86-based generative AI platforms. The HGX Rubin NVL8 platform accelerates training, inference and scientific computing for AI and high-performance computing workloads.
NVIDIA DGX SuperPOD serves as a reference for deploying Rubin-based systems at scale, integrating either NVIDIA DGX Vera Rubin NVL72 or DGX Rubin NVL8 systems with NVIDIA BlueField-4 DPUs, NVIDIA ConnectX-9 SuperNICs, NVIDIA InfiniBand networking and NVIDIA Mission Control software.
