Google Rolls out Ironwood TPU With 9,216-Chip Pods and Liquid Cooling


Google previewed Ironwood at Google Cloud Next ’25 in April and is now widening access, positioning the chip as custom silicon tuned for the “age of inference,” when models are expected to respond, reason, and generate in real time across global cloud regions.

According to a CNBC report, the move folds squarely into a broader power play among hyperscalers racing to own the AI stack from data center to dev toolkit. Under the hood, Ironwood leans on a 3D torus interconnect, liquid cooling for sustained loads, and an improved Sparsecore to accelerate ultra-large embeddings for ranking, recommendations, finance, and scientific computing.

It is engineered to minimize data movement and communication bottlenecks—two culprits that often cap throughput in multi-chip jobs. The raw numbers are designed to turn heads: up to 4,614 TFLOPs (FP8) per chip, 192 GB of HBM with 7.37 TB/s bandwidth, and 1.2 TB/s bidirectional inter-chip bandwidth. Pods scale from 256 chips to a 9,216-chip configuration delivering 42.5 exaflops (FP8) of compute, with full-pod power draw around 10 MW and liquid cooling enabling significantly higher sustained performance than air.
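The headline pod figure follows directly from the per-chip spec: a quick back-of-the-envelope check (a sketch using only the numbers quoted above; the implied per-chip power is derived arithmetic, not a Google-stated figure) confirms the math lines up.

```python
# Sanity-check the reported Ironwood pod figures.
# Inputs are the numbers quoted in the article; derived values
# (pod exaflops, implied per-chip draw) are simple arithmetic.

PER_CHIP_TFLOPS_FP8 = 4_614      # peak FP8 TFLOPS per chip (reported)
POD_CHIPS = 9_216                # maximum pod configuration (reported)
POD_POWER_W = 10e6               # ~10 MW full-pod draw (reported, approximate)

pod_tflops = PER_CHIP_TFLOPS_FP8 * POD_CHIPS
pod_exaflops = pod_tflops / 1e6           # 1 exaflop = 1,000,000 TFLOPS

watts_per_chip = POD_POWER_W / POD_CHIPS  # implied average, incl. cooling share

print(f"Pod peak: {pod_exaflops:.1f} FP8 exaflops")   # ~42.5
print(f"Implied draw: {watts_per_chip:.0f} W per chip")
```

The product of 9,216 chips and 4,614 TFLOPS lands almost exactly on the quoted 42.5 FP8 exaflops, and the 10 MW envelope works out to roughly 1.1 kW per chip averaged across the pod.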

Google says Ironwood is more than 4× faster than the prior Trillium (TPU v6) in overall AI throughput and offers roughly 2× better performance per watt—while clocking nearly 30× the power efficiency of its first Cloud TPU from 2018. In maxed-out form, the company claims a computational edge over top supercomputers such as El Capitan when measured at FP8 exaflops. As always, methodology matters, but the intent is clear.

While it can train, Ironwood’s pitch centers on inference for large language models and Mixture-of-Experts systems—exactly the high-QPS, low-latency work now flooding data centers from North America to Europe and Asia-Pacific. Think chatbots, agents, Gemini-class models, and high-dimension search and recsys pipelines that demand fast memory and tight pod-scale sync.

Integration arrives through Google Cloud’s AI Hypercomputer—pairing the hardware with software like Pathways to orchestrate distributed compute across thousands of dies. That stack already backs consumer and enterprise services from Search to Gmail, and Ironwood slots in as an upgrade path for customers that want a managed, TPU-native route alongside GPUs.

There is a market message baked in: Google is challenging Nvidia’s dominance by arguing that domain-specific TPUs can beat general-purpose GPUs on price-performance and energy use for certain AI tasks. CNBC’s report says early adopters include Anthropic, which plans deployments at million-TPU scale for Claude—an eyebrow-raising signal of how big inference footprints are becoming.

Alphabet CEO Sundar Pichai framed demand as a key revenue driver, citing a 34% jump in Google Cloud revenue to $15.15 billion in Q3 2025 and capex tied to AI buildout totaling $93 billion. “We are seeing substantial demand for our AI infrastructure products… and we are investing to meet that,” he said, noting more billion-dollar deals were signed this year than in the prior two combined.

Ironwood’s broader availability is slated for later in 2025 through Google Cloud, with access requests open now. For enterprises in the U.S., Europe, and across Asia-Pacific weighing power budgets, rack density, and latency targets, the question is less about hype and more about whether Ironwood’s pod-scale FP8 math and cooling profile line up with their production workloads.

  • Where will Ironwood be available? Through Google Cloud in global regions, including North America, Europe, and Asia-Pacific.
  • When does access begin? Access opens in the coming weeks, with broader rollout later in 2025.
  • What workloads is it built for? High-throughput inference for LLMs, MoEs, search, recommendations, finance, and scientific computing.
  • How does it compare with previous TPUs? Google cites 4× higher throughput and 2× better performance per watt than Trillium.

