Jensen Huang (NVIDIA CEO): AI Factory, Agent Revolution, and the Trillion-Level Computing Power Future

Author: Techub News Compilation

At GTC 2026, NVIDIA's annual technology conference, founder and CEO Jensen Huang delivered a 140-minute keynote. The talk both summarized a year of explosive AI growth and set out his view of the computing paradigms, infrastructure, and business models of the next decade. With global demand for AI computing power rising exponentially, Huang described how NVIDIA is combining vertical integration with horizontal openness to build a full-stack platform spanning chips, systems, libraries, and applications, in preparation for what he calls the "trillion-dollar" era of AI infrastructure.

Twenty Years of CUDA: The Victory of Accelerating Flywheels and Vertical Integration

The speech opened with a milestone: the 20th anniversary of the CUDA architecture. Huang emphasized that CUDA is far more than a programming model; it is the cornerstone of NVIDIA's entire strategy. Over the past twenty years, NVIDIA has built an installed base of "hundreds of millions" of GPUs running CUDA, and that base is what keeps its ecosystem flywheel accelerating.

The flywheel works as follows: a large installed base attracts developers worldwide; developers create new algorithms (such as deep learning); algorithmic breakthroughs give rise to entirely new markets; and those markets draw more companies into the ecosystem, which expands the installed base further. Huang noted that downloads of NVIDIA libraries are growing at an "unprecedented speed," allowing CUDA-based platforms to support a vast range of applications and to keep enabling new breakthroughs.

More importantly, this broad applicability has greatly extended the "useful life" of NVIDIA's infrastructure. Huang cited the Ampere architecture, released six years ago, whose cloud rental prices are rising rather than falling because the CUDA ecosystem keeps widening the range of applications it can run. He summarized: "The large installed base, the flywheel operating at high speed, the broad reach of developers, combined with our continuous software updates, leads to declining computing costs." The performance leaps delivered by accelerated computing, together with this continuous cost reduction, are the core dynamic behind the expansion of NVIDIA's architecture.

Huang described NVIDIA as the "world's first vertically integrated but horizontally open company." He explained that accelerated computing is fundamentally "application acceleration." As Moore's Law slows, continued large gains in performance and cost require acceleration targeted at specific domains. NVIDIA must therefore understand the applications and algorithms of each industry in depth and optimize for every deployment scenario, from data centers and clouds to the edge and robotic systems. That is the "vertical integration." The "horizontal openness" lies in NVIDIA integrating its technology and software libraries into the platforms of any partner worldwide, bringing accelerated computing to everyone.

The Inflection Point of Inference Has Arrived: AI Factories and "Token Economics"

The pivotal claim of the speech was an assessment of where AI stands today: the "inference inflection point" has arrived. Huang reviewed the three waves that have transformed the industry over the past two years. ChatGPT opened the era of generative AI. Models like OpenAI's o1 then brought "reasoning" capabilities, allowing AI to reflect, plan, and give credible, fact-based answers. Finally, agentic systems like Claude Code enabled AI to use tools and carry out tasks, moving from "thinking" to "doing."

"AI must now think, act, read, and reason, and every one of those steps requires inference," Huang said. This has fundamentally changed the demand for computing. He offered a staggering estimate: over the past two years, the computing power required by AI workloads has grown about "ten thousand times," while usage has grown about "one hundred times." Multiplying the two (10,000 × 100) gives his sense that overall computing demand has increased "one million times."

This inflection point fundamentally changes the value proposition of data centers. Huang introduced two key concepts: "AI factories" and "tokens are the commodities of the new era." The data center of the future will no longer be a warehouse for storing files but a factory that produces intelligent tokens. Because data centers are limited by power, their core metrics become "the number of tokens generated per watt of power" and "inference speed" (the rate at which tokens are generated). The former determines the factory's total output; the latter directly affects how intelligent the AI can be, since faster generation allows longer contexts and deeper reasoning.
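The relationship between a power cap and total token output can be illustrated with a minimal sketch. The function names and every number below (a 1 GW budget, the efficiency figures, the per-user speeds) are hypothetical assumptions chosen for illustration, not figures from the keynote. The sketch also assumes efficiency is independent of per-user speed, which real systems do not guarantee.

```python
def factory_token_rate(power_budget_watts: float, tokens_per_joule: float) -> float:
    """Tokens per second that a power-capped facility can sustain.

    "Tokens per second per watt" is the same unit as tokens per joule.
    """
    if power_budget_watts < 0 or tokens_per_joule < 0:
        raise ValueError("inputs must be non-negative")
    return power_budget_watts * tokens_per_joule


def concurrent_users(factory_rate: float, tokens_per_second_per_user: float) -> float:
    """Number of users served when each one receives a fixed generation speed."""
    if tokens_per_second_per_user <= 0:
        raise ValueError("per-user speed must be positive")
    return factory_rate / tokens_per_second_per_user


POWER_BUDGET_W = 1e9  # hypothetical 1 GW facility

# Hypothetical efficiencies: the power budget is fixed, so output scales
# only with tokens per joule.
baseline_rate = factory_token_rate(POWER_BUDGET_W, 0.5)
improved_rate = factory_token_rate(POWER_BUDGET_W, 1.0)
print("output ratio:", improved_rate / baseline_rate)

# Hypothetical per-user speeds: faster responses per user mean fewer
# users served from the same factory output.
print("users at 50 tok/s each:", concurrent_users(improved_rate, 50.0))
print("users at 500 tok/s each:", concurrent_users(improved_rate, 500.0))
```

Under these assumptions, doubling tokens per joule doubles what the same facility can sell, and raising per-user speed tenfold cuts the number of users served tenfold.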

Huang presented what he called "the most important chart of future AI factories," with the horizontal axis showing token speed (level of intelligence) and the vertical axis showing throughput (production). He mapped the chart onto commercial pricing tiers: slow speed with high throughput can serve a free tier; medium speed with medium throughput corresponds to moderate pricing; and fast, highly intelligent models that handle contexts of millions of tokens command premium pricing. As an example, a research team consuming 50 million tokens a day at $150 per million tokens would pay $7,500 per day in token costs.
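The research-team example can be worked through in a few lines. This is a minimal sketch: the token volume and price come from the example above, while the function name and the 365-day annualization are assumptions added for illustration.

```python
def daily_token_cost(tokens_per_day: int, price_per_million_usd: float) -> float:
    """Daily spend in USD for a given token volume and per-million price."""
    if tokens_per_day < 0 or price_per_million_usd < 0:
        raise ValueError("inputs must be non-negative")
    return tokens_per_day / 1_000_000 * price_per_million_usd


cost_per_day = daily_token_cost(50_000_000, 150.0)
print(f"${cost_per_day:,.0f} per day")  # $7,500 per day

# Assumption: the team keeps the same usage every day of the year.
cost_per_year = cost_per_day * 365
print(f"${cost_per_year:,.0f} per year")  # $2,737,500 per year
```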

He announced that the NVLink 72 system based on the Grace Blackwell architecture delivers 35 times the throughput of the previous Hopper architecture at key high-value service tiers, which he said brings token costs to the lowest in the world. He quoted a third-party analysis: "Even if a competitor's architecture is free, it is not cheap enough." The reasoning is that the fixed cost of building a gigawatt-class data center can reach about $40 billion, so the facility must be filled with the most efficient computing systems to achieve the best cost per token. Huang said NVIDIA, with its "extreme co-design" and full-stack optimization, is currently "unmatched" in this respect.
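The "free is not cheap enough" argument can be sketched as simple amortization. Only the $40 billion facility figure comes from the text; the system price and lifetime token counts below are hypothetical, as are the function names. The sketch assumes the $40 billion excludes the computing systems and ignores power and operating costs.

```python
def cost_per_token(facility_cost_usd: float, system_cost_usd: float,
                   lifetime_tokens: float) -> float:
    """Capital cost per token: (facility + systems) spread over lifetime output."""
    if lifetime_tokens <= 0:
        raise ValueError("lifetime_tokens must be positive")
    return (facility_cost_usd + system_cost_usd) / lifetime_tokens


def break_even_output_ratio(facility_cost_usd: float, system_cost_usd: float) -> float:
    """Fraction of the paid system's output that a free system must reach
    to match its cost per token."""
    return facility_cost_usd / (facility_cost_usd + system_cost_usd)


FACILITY_COST = 40e9        # from the text: about $40 billion per gigawatt-class site
PAID_SYSTEM_COST = 30e9     # hypothetical price of the efficient systems
PAID_LIFETIME_TOKENS = 10e15  # hypothetical lifetime output of the efficient systems
FREE_LIFETIME_TOKENS = 3e15   # hypothetical lifetime output of a free but slower system

paid = cost_per_token(FACILITY_COST, PAID_SYSTEM_COST, PAID_LIFETIME_TOKENS)
free = cost_per_token(FACILITY_COST, 0.0, FREE_LIFETIME_TOKENS)
print("paid system is cheaper per token:", paid < free)
print("break-even output ratio:", break_even_output_ratio(FACILITY_COST, PAID_SYSTEM_COST))
```

With these assumed numbers, the free system would need to produce roughly 57% of the paid system's output just to tie on cost per token, because the facility cost is paid either way.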

The Vera Rubin Platform and the Agent Operating System Revolution

To meet the computing needs of the agent era, Huang launched the next-generation platform, Vera Rubin. It is not a standalone chip but a vertically integrated system comprising seven types of chips and five rack-scale computers, designed specifically for agentic AI.

Key breakthroughs of the Vera Rubin platform include sixth-generation NVLink 72 interconnect technology with 130 TB/s of all-to-all bandwidth; a 100% liquid-cooled design that uses 45 °C hot water, significantly reducing the energy spent on data center cooling; tight integration with Groq's LPU (Language Processing Unit) racks as token accelerators; and the world's first Spectrum-X switches built with co-packaged optics (CPO). Huang stated that the platform raises inference throughput at the critical tiers by a further factor of 10.

If Vera Rubin provides the "hardware foundation" for the agent era, OpenClaw brings the "software revolution." Huang enthusiastically introduced what he called the "fastest-growing open-source project in human history." He likened OpenClaw to an "operating system for agent computers," comparable to Windows for PCs, providing core functions such as resource management, tool invocation, task decomposition, sub-agent scheduling, and multimodal I/O.

"Now, every CEO in the world needs to consider: what is your OpenClaw strategy?" Huang asserted. Just as companies once needed a Linux strategy, an HTML strategy, and a Kubernetes strategy, he argued, an agent strategy is now unavoidable. He predicted that every SaaS (Software as a Service) company will become a GaaS (Agent as a Service) company.

However, enterprise agents face serious security and privacy challenges. To address them, NVIDIA partnered with the founders of OpenClaw to launch the enterprise security solution "Open Shell," which has been integrated into the reference design "Nemo Claw." The solution includes a policy engine, guardrails, and a privacy router so that agents can operate securely inside the enterprise. NVIDIA also announced the "Open Model Plan," releasing six frontier model families, including Nemotron (language), Cosmos (physical world), GR00T (robots), BioNeMo (biology), and Earth-2 (climate), and formed the "Nemotron Alliance" with prominent companies including Black Forest Labs, Cursor, LangChain, Mistral AI, and Perplexity to advance the open-source model ecosystem.

The Rise of Physical AI: From Autonomous Driving to Disney Robots

Agents exist not only in the digital world; they are also entering the physical world in the form of robots. Huang declared that the "ChatGPT moment" for autonomous driving has arrived: NVIDIA's autonomous driving platform, Drive Alpamayo, now has reasoning capabilities. He announced that new partners including BYD, Hyundai, Nissan, and Geely, with a combined annual output of 18 million vehicles, have joined its "robotaxi-ready platform." NVIDIA has also partnered with Uber to deploy autonomous driving fleets in multiple cities.

In the broader robotics field, Huang stated that nearly all major robotics companies globally are collaborating with NVIDIA. Over 110 robots were showcased on-site. He emphasized the Isaac Lab platform for robot training and simulation, along with the Warp and Newton libraries for differentiable physics simulation. The most striking demonstration came from Disney: its research team trained character robots, including Olaf from "Frozen," using NVIDIA's Newton physics simulator and Isaac Lab, enabling them to adapt to the physical world and interact independently, heralding a new experience for future theme parks.

Huang closed the speech with a rap video that recapped the key announcements and brought the event to its high point. He reiterated that we are at the beginning of a massive platform shift in which AI factories, the agent revolution, and physical AI will together shape the future. With demand for AI infrastructure projected to reach "at least one trillion dollars" by 2027, NVIDIA intends to supply the full stack, from the chips at the bottom to the agents at the top, through its strategy of vertical integration and horizontal openness.

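Disclaimer: This article represents only the personal views of the author and does not represent the position or views of this platform. It is provided for information sharing only and does not constitute investment advice to anyone. Any dispute between users and the author is unrelated to this platform. If any article or image published on this page involves infringement, please email the relevant proof of rights and proof of identity to support@aicoin.com, and the platform's staff will investigate.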
