Jensen Huang (NVIDIA CEO): AI Factory, Inference Turning Point, and Trillion Dollar Demand

Author: Techub News Compilation

During a keynote of more than two hours at NVIDIA GTC 2026, founder and CEO Jensen Huang laid out a sweeping vision of the future of AI. He reviewed the ecosystem flywheel built up over the 20 years since the birth of CUDA, predicted that demand for AI computing will soar to trillions of dollars in the coming years, and announced two paradigm shifts: from "data centers" to "AI factories," and from "retrieval-based computing" to "generative computing." The speech was dense with information, covering the full value chain from chip architecture, system design, and the software ecosystem to industry applications, and it offers a key window into NVIDIA's strategy and the direction of the AI industry.

From CUDA Flywheel to AI Factory: A Fundamental Shift in Computing Paradigms

Huang opened by emphasizing the power of the "installed base" that the CUDA platform has accumulated over 20 years. He argued that the hundreds of millions of CUDA GPUs deployed worldwide form NVIDIA's deepest moat: they attract developers, who produce breakthrough algorithms such as deep learning, which in turn open up new markets and keep today's "flywheel" spinning. The flywheel lets NVIDIA's computing platform support a vast array of applications and, more importantly, gives the infrastructure an unusually long useful life. Because CUDA supports every stage of the AI lifecycle, from data processing to scientific computing, GPUs remain useful across a very broad range of workloads, which raises their long-term return on investment.

However, the real transformation lies in the shift in computing purpose. Huang put forth a core idea: the future data center is no longer a "place for storing and processing data," but a "factory for producing intelligence." Its product is "Tokens" (inference results), and its revenue is directly tied to the quantity of Tokens produced. He cited a report from industry analysis firm SemiAnalysis, indicating that the key metrics for measuring the efficiency of an AI factory are "Tokens per watt" and "intelligence per inference." The former determines the "capacity" of the factory within a fixed electricity budget, while the latter relates to the "level of intelligence" of the AI.
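
To make the arithmetic behind the "Tokens per watt" metric concrete, here is a minimal Python sketch. It is not something shown in the keynote. It assumes a facility with a fixed power budget and treats "Tokens per watt" as tokens per joule (that is, tokens per second per watt). The function names, the 100 MW budget, the efficiency values, and the price per million tokens are all hypothetical placeholders.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600


def tokens_per_year(power_watts: float, tokens_per_joule: float,
                    utilization: float = 1.0) -> float:
    """Tokens produced in one year under a fixed power budget.

    watts * (tokens / joule) = tokens / second, scaled by utilization.
    """
    if not 0.0 <= utilization <= 1.0:
        raise ValueError("utilization must be between 0 and 1")
    return power_watts * tokens_per_joule * utilization * SECONDS_PER_YEAR


def annual_revenue_usd(power_watts: float, tokens_per_joule: float,
                       usd_per_million_tokens: float,
                       utilization: float = 1.0) -> float:
    """Revenue if every token is sold at a flat (hypothetical) price."""
    tokens = tokens_per_year(power_watts, tokens_per_joule, utilization)
    return tokens / 1_000_000 * usd_per_million_tokens


# Hypothetical 100 MW factory; only the efficiency differs between the two runs.
POWER_WATTS = 100e6
baseline = annual_revenue_usd(POWER_WATTS, tokens_per_joule=1.0,
                              usd_per_million_tokens=1.0)
improved = annual_revenue_usd(POWER_WATTS, tokens_per_joule=2.0,
                              usd_per_million_tokens=1.0)
print(f"baseline revenue: ${baseline:,.0f}")
print(f"improved revenue: ${improved:,.0f}")
print(f"revenue ratio:    {improved / baseline:.1f}x")
```

Because the power budget is fixed, the only levers in this toy model are efficiency, utilization, and price, so doubling tokens per joule doubles output and revenue. That is the sense in which efficiency sets the factory's "capacity."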

"This is a completely new perspective," Huang explained, "Every CEO will need to view their business from this perspective in the future. Because your AI factory (data center) is physically limited by land, power, and buildings. Once built, its power limit is fixed. You must ensure that within this power limit, your inference architecture is highly optimized, because Tokens are your new commodity, and computing power directly equates to your revenue."

He further pointed out that the AI field has undergone a fundamental shift over the past two years. ChatGPT ushered in the generative AI era, OpenAI's o1 and o3 models demonstrated "reasoning" capabilities, and Claude Code and other "agentic AI" systems can now actually "execute tasks." AI has thus moved from "perception and understanding" to "generation," then to "reasoning," and finally to being a productivity tool that can "get work done." Each step has been accompanied by exponential growth in computing demand. "The inference inflection point has arrived," Huang asserted. "AI now needs to think, to act, to read, and all of this requires inference, requires generating Tokens. This far exceeds the training phase."

He offered a striking figure: with the computing required by inference workloads up ten-thousand-fold and actual usage up a hundredfold, AI's total demand for computing has grown a million times in the past two years. On that basis, he revealed that the order demand NVIDIA sees for its Blackwell and Rubin platforms has risen from the $500 billion for 2026 that it forecast at last year's GTC to at least $1 trillion for 2027. "I am sure that the actual demand for computing will be much higher than this," he added.

Vertical Integration and Horizontal Openness: NVIDIA's Full Stack Strategy

Faced with such enormous demand and such complex application scenarios, Huang described NVIDIA's unique strategic positioning: "The world's first vertically integrated but horizontally open company." He argued that accelerated computing is at heart not a chip or system problem but an "application acceleration" problem. With Moore's Law no longer delivering its historical gains, the only way to keep achieving large performance improvements and substantial cost reductions is to accelerate specific application domains.

"That is why NVIDIA must deeply engage one field after another, one vertical industry after another," he said, "We are a vertically integrated computing company. We must understand applications, understand fields, fundamentally understand algorithms, and figure out how to deploy these algorithms in any scenario (data centers, cloud, edge, robotic systems)." To this end, NVIDIA has built a complete technology stack covering chips, systems, libraries, frameworks, and application models.

However, vertical integration does not mean being closed off. Huang emphasized that NVIDIA's success also relies on "horizontal openness": integrating its technology into the platforms of its partners worldwide. "We provide software and libraries to integrate with your technologies, thereby bringing accelerated computing to everyone in the world." He showcased deep collaborations with almost all mainstream cloud and data platform vendors, including IBM, Dell, Google Cloud, AWS, Microsoft Azure, and Oracle. For example, NVIDIA's cuDF library accelerated the SQL engine of IBM watsonx.data, making Nestlé's supply chain data refresh five times faster and cutting costs by 83%; in collaboration with Google Cloud, it reduced Snapchat's computing costs by nearly 80%.

This model of "vertical integration + horizontal openness" lets NVIDIA deliver its AI computing optimizations to every corner of the market through a vast ecosystem of partners. Huang specifically noted that 60% of NVIDIA's business comes from hyperscale cloud providers (including their own AI consumption), while the remaining 40% spans regional clouds, sovereign clouds, enterprises, industry, robotics, the edge, and other fields. "The diversity of AI is where its resilience lies," he pointed out. "This is not a technology for a single application, but a completely new foundational computing platform."

OpenClaw: Igniting the "Agent Operating System" Revolution

During the presentation, Huang devoted considerable time to an open-source project: OpenClaw. He ranked its historical importance alongside that of Linux, HTML, and Kubernetes. "OpenClaw fundamentally opens up the 'agent operating system,'" he said with excitement. "Just as Windows enabled us to create personal computers, OpenClaw allows us to create 'personal agents.'"

He elaborated on OpenClaw's capabilities: it is a generative system that can connect and invoke large language models; manage resources, access tools and file systems; perform task scheduling and timed jobs; decompose a complex user instruction into step-by-step executable subtasks; generate and invoke other sub-agents; and support multimodal interactions. "Every company now needs an open-source strategy, an agent system strategy," Huang asserted, "This is the new computer."

However, enterprise-level deployment faces immense security challenges. Agents can access sensitive information, execute code, and communicate with the outside world within corporate networks. "This is obviously not permissible," Huang acknowledged. To address this, NVIDIA collaborated with OpenClaw founder Peter Steinberger to launch the "NVIDIA OpenClaw Reference Design"—Nemo Claw. It integrates a security technology called "Open Shell," which includes a policy engine, network barriers, and privacy routers, ensuring that agents can operate securely and compliantly within enterprises.
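
As an illustration only, the kind of check a policy engine performs can be sketched as a default-deny allow-list in Python. This is not the Open Shell or Nemo Claw API, which the article does not describe. The `Policy` class, the `is_allowed` function, the tool names, and the host names are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional
from urllib.parse import urlparse


@dataclass(frozen=True)
class Policy:
    """Hypothetical allow-list policy: anything not listed is denied."""
    allowed_tools: frozenset
    allowed_hosts: frozenset


def is_allowed(policy: Policy, tool: str, url: Optional[str] = None) -> bool:
    """Return True only if the tool, and the target host if any, are allow-listed."""
    if tool not in policy.allowed_tools:
        return False
    if url is None:
        return True  # purely local action, no network destination to check
    host = urlparse(url).hostname  # None when the URL has no network location
    return host is not None and host in policy.allowed_hosts


# Hypothetical policy for one agent inside a corporate network.
policy = Policy(
    allowed_tools=frozenset({"read_file", "http_get"}),
    allowed_hosts=frozenset({"wiki.internal.example"}),
)

print(is_allowed(policy, "read_file"))
print(is_allowed(policy, "http_get", "https://wiki.internal.example/page"))
print(is_allowed(policy, "http_get", "https://unknown.example/upload"))
print(is_allowed(policy, "run_shell"))
```

The design choice worth noting is default deny: an agent action passes only when both the tool and any outbound destination are explicitly listed, which is one simple way to keep sensitive data from leaving the network.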

Huang predicted that this transformation will reshape the entire enterprise software industry. "In the future, every SaaS company will become an AgaaS (Agent as a Service) company." Enterprise IT architecture will shift from revolving around "files and data" to revolving around "agents and Tokens." He even envisioned that engineers' compensation packages will one day include an "annual Token budget" to amplify their productivity. "Tokens will become the new tool for recruiting in Silicon Valley." Those Tokens will be produced in the "AI factories" that NVIDIA builds together with its partners.

Physical AI and Robotics: Embodiment of Intelligence

Beyond agents in the digital world, Huang also emphasized intelligent entities in the physical world: robots. He announced that NVIDIA's robotics ecosystem now includes nearly all mainstream manufacturers, with 110 robots on display at this GTC. NVIDIA provides three core computing systems for robotics: training computers, computers for synthetic data generation and simulation, and edge computers inside the robots themselves.

In autonomous driving, Huang declared that the field's "ChatGPT moment" has arrived. "We now know it is possible to achieve self-driving cars." He announced four new partners for NVIDIA's "Robotaxi Ready platform": BYD, Hyundai, Nissan, and Geely, which together produce 18 million vehicles a year. NVIDIA has also partnered with Uber to deploy Robotaxi networks in multiple cities.

Even more impressively, Huang demonstrated an interaction with "Olaf," a robot developed in collaboration with Disney that runs on NVIDIA Jetson chips and learned to walk in Omniverse using a Newtonian physics solver. The demonstration showcased what NVIDIA's Isaac Lab robotics training platform, Cosmos world model, and GR00T robot foundation model can do in creating humanoid robots that adapt to physical environments.

In closing, Huang said that through its open model program NVIDIA is providing top-tier models across every frontier of AI: Nemotron for language, Cosmos for understanding the physical world, Alpamayo for autonomous driving, GR00T for general robotics, BioNeMo for biology, and Earth-2 for climate prediction. He also announced the "Nemotron Alliance," which unites leading AI companies including Black Forest Labs, Cursor, LangChain, Mistral, Perplexity, and Sarvam to advance the next generation of large language models.

"We are in the midst of a renaissance, a corporate renaissance," Huang said at the end of his speech, "This is a turning point from a $20 trillion industry to a multi-trillion-dollar industry. What will be provided in the future will not just be tools for human use, but specialized, 'rentable' agents in our areas of expertise." In his vision, Tokens produced by AI factories will become the core energy driving the future digital and physical economy, and NVIDIA's full-stack computing platform will be the cornerstone of this grand future.
