X-trader NEWS
Open your market's potential
NVIDIA unveils its next-generation Rubin platform, with inference costs up to 10 times lower than Blackwell's. The platform has entered full production and is scheduled to ship in the second half of the year.

# Source: Wall Street Insights, by Li Dan
The training performance of the Rubin platform is 3.5 times that of Blackwell, its AI software runs 5 times faster, and training Mixture-of-Experts (MoE) models requires 4 times fewer GPUs. Jensen Huang stated that all six Rubin chips have passed key tests demonstrating they can be deployed as scheduled. NVIDIA announced that the platform has entered full-scale production, with cloud service providers such as Amazon AWS, Google Cloud, Microsoft, and Oracle Cloud set to be the first to deploy it.
NVIDIA launched its next-generation Rubin AI platform at CES, marking its annual update cycle in the artificial intelligence (AI) chip sector. With a design featuring six new chips, the platform achieves substantial leaps in inference cost reduction and training efficiency, and will deliver its first batch to customers in the second half of 2026.
On Monday, January 5th, Eastern Time, NVIDIA CEO Jensen Huang said in Las Vegas that all six Rubin chips have come back from manufacturing partners, passed key tests, and are progressing as planned. He noted that "the AI race has begun, and everyone is striving to reach the next level." NVIDIA emphasized that systems based on Rubin will have lower operating costs than their Blackwell counterparts, as they can achieve the same results with fewer components.
Microsoft and other major cloud computing providers will be among the first customers to deploy the new hardware in the second half of the year. Microsoft’s next-generation Fairwater AI supercluster will be equipped with NVIDIA’s Vera Rubin NVL72 rack-scale systems, which can be scaled up to hundreds of thousands of NVIDIA Vera Rubin superchips. CoreWeave will also be one of the first suppliers to offer Rubin-based systems.
The platform’s launch comes at a time when some on Wall Street are concerned about NVIDIA facing intensifying competition and questioning whether spending in the AI sector can maintain its current pace. However, NVIDIA maintains a long-term bullish forecast, believing the total market size could reach trillions of US dollars.
## Performance Upgrades Target Next-Generation AI Demands
According to NVIDIA’s announcement, the Rubin platform’s training performance is 3.5 times that of its predecessor, Blackwell, and its AI software runs 5 times faster. Compared with Blackwell, Rubin can cut the cost of generating inference tokens by up to a factor of 10, and reduce the number of GPUs required to train Mixture-of-Experts (MoE) models by a factor of 4.
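To put those multiples in concrete terms, the toy calculation below projects Rubin-equivalent figures from a hypothetical Blackwell deployment. The factors (3.5x training, up to 10x cheaper inference tokens, 4x fewer MoE-training GPUs) are NVIDIA's stated claims; the baseline cluster size and token price are invented placeholders, not real benchmarks.

```python
# NVIDIA's stated Rubin-vs-Blackwell multiples (from the announcement).
TRAINING_SPEEDUP = 3.5      # training performance vs Blackwell
TOKEN_COST_FACTOR = 10.0    # up to 10x lower inference token cost
MOE_GPU_REDUCTION = 4.0     # 4x fewer GPUs for MoE model training

def rubin_projection(blackwell_gpus: int, blackwell_cost_per_mtok: float) -> dict:
    """Project Rubin-equivalent figures from hypothetical Blackwell numbers."""
    return {
        "moe_gpus_needed": blackwell_gpus / MOE_GPU_REDUCTION,
        "cost_per_mtok": blackwell_cost_per_mtok / TOKEN_COST_FACTOR,
    }

# Hypothetical baseline: an 8,192-GPU Blackwell cluster at $2.00 per
# million inference tokens (both figures are illustrative assumptions).
proj = rubin_projection(8192, 2.00)
print(proj)  # {'moe_gpus_needed': 2048.0, 'cost_per_mtok': 0.2}
```

Under those assumptions, the same MoE training job would need 2,048 GPUs instead of 8,192, and a $2.00-per-million-token workload would cost $0.20, taking the "up to 10x" figure at its best case.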
The new platform’s Vera CPU has 88 cores and doubles the performance of its predecessor. Designed specifically for agentic inference, it is, by NVIDIA’s account, the most energy-efficient processor for large-scale AI factories, featuring 88 custom Olympus cores, full Armv9.2 compatibility, and ultra-fast NVLink-C2C connectivity.
The Rubin GPU is equipped with a third-generation Transformer Engine and hardware-accelerated adaptive compression, delivering 50 petaflops of NVFP4 computing power for AI inference. Each GPU provides 3.6TB/s of bandwidth, while the Vera Rubin NVL72 rack offers 260TB/s of bandwidth.
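The per-GPU and per-rack bandwidth figures can be cross-checked: the NVL72 name implies 72 GPUs per rack, so multiplying the per-GPU figure by 72 should roughly match the quoted rack total, assuming both numbers refer to the same bandwidth metric. A minimal sanity check:

```python
# Sanity check on the published bandwidth figures. The 72-GPU count is
# inferred from the "NVL72" product name; the bandwidth values are
# NVIDIA's quoted numbers, assumed to refer to the same metric.
GPUS_PER_RACK = 72
PER_GPU_TBPS = 3.6     # TB/s per Rubin GPU
RACK_TBPS = 260.0      # quoted Vera Rubin NVL72 rack bandwidth

aggregate = GPUS_PER_RACK * PER_GPU_TBPS
print(aggregate)  # 259.2 -- consistent with the ~260 TB/s rack figure
```

The product (259.2 TB/s) rounds to the quoted 260 TB/s, so the two published figures are mutually consistent.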
## Smooth Progress in Chip Testing
Jensen Huang revealed that all six Rubin chips have been returned from manufacturing partners and passed key tests that prove they can be deployed on schedule. This statement reaffirms NVIDIA’s leading position as a manufacturer of AI accelerators.
The platform incorporates five major innovative technologies: the sixth-generation NVLink interconnect technology, the Transformer Engine, confidential computing, the RAS Engine, and the Vera CPU. Among them, the third-generation confidential computing technology makes the Vera Rubin NVL72 the first rack-scale platform to provide data security protection across CPU, GPU, and NVLink domains.
The second-generation RAS Engine spans GPUs, CPUs, and NVLink, featuring real-time health monitoring, fault tolerance, and proactive maintenance functions to maximize system productivity. The rack adopts a modular, cable-free tray design, enabling assembly and maintenance 18 times faster than Blackwell systems.
## Extensive Ecosystem Support
NVIDIA stated that cloud service providers including Amazon AWS, Google Cloud, Microsoft, and Oracle Cloud will be the first to deploy Vera Rubin-based instances in 2026, followed by cloud partners such as CoreWeave, Lambda, Nebius, and Nscale.
Sam Altman, CEO of OpenAI, commented: "Intelligence scales with computing. As we add more computing power, models become more powerful, capable of solving harder problems and delivering greater impact for people. NVIDIA’s Rubin platform helps us continue scaling this progress."
Dario Amodei, Co-founder and CEO of Anthropic, said that the efficiency gains from NVIDIA’s "Rubin platform represent an infrastructure advancement that enables longer memory, better reasoning, and more reliable outputs."
Mark Zuckerberg, CEO of Meta, noted that NVIDIA’s "Rubin platform is expected to deliver step-change improvements in performance and efficiency—exactly what’s needed to deploy state-of-the-art models to billions of people."
NVIDIA also mentioned that Cisco, Dell, Hewlett Packard Enterprise, Lenovo, and Supermicro are expected to launch a variety of servers based on Rubin products. AI labs including Anthropic, Cohere, Meta, Mistral AI, OpenAI, and xAI are looking forward to leveraging the Rubin platform to train larger and more powerful models.
## Early Disclosure of Product Details
Analysts commented that NVIDIA disclosed details of its new products earlier than in previous years, as part of its strategy to keep the industry reliant on its hardware. Typically, NVIDIA provides in-depth product details at its annual GTC event held in San Jose, California, in the spring.
For Jensen Huang, CES is just another stop on his marathon of event appearances. He announces products, partnerships, and investments at various events, all aimed at boosting momentum for AI system deployment.
The new hardware unveiled by NVIDIA also includes networking and connectivity components, which will be part of the DGX SuperPod supercomputers and can also be used as standalone products for customers to deploy in a more modular manner. This performance upgrade is essential because AI has shifted toward more specialized model networks that not only process massive volumes of input but also solve specific problems through multi-stage workflows.
NVIDIA is promoting AI applications across the entire economy, including robotics, healthcare, and heavy industry. As part of this effort, NVIDIA announced a suite of tools designed to accelerate the development of autonomous vehicles and robots. Currently, most of the computing spending based on NVIDIA’s technology comes from the capital expenditure budgets of a handful of customers, including Microsoft, Alphabet’s Google Cloud, and Amazon’s AWS.
## Risk Warning and Disclaimer
The market is risky and investment requires caution. This document does not constitute personal investment advice, nor does it take into account the specific investment objectives, financial situations, or needs of individual users. Users should assess whether any opinions, views, or conclusions contained herein are appropriate for their specific circumstances. Any investment made based on this document shall be at the user’s own risk.
Contact: Sarah
Phone: +1 6269975768
Email: xttrader777@gmail.com
Add: 250 Consumers Rd, Toronto, ON M2J 4V6, Canada