Written by: Ada, Deep Tide TechFlow
Pang Ruoming left before he even warmed his seat at Meta.
In July 2025, Zuckerberg lured this highly sought-after AI-infrastructure engineer away from Apple with a multi-year compensation package totaling over $200 million. Pang Ruoming was assigned to Meta's Super Intelligence Lab, responsible for building the infrastructure for next-generation AI models.
Seven months later, OpenAI poached him.
According to The Information, OpenAI waged a recruitment offensive for several months. Although Pang had told colleagues that he was "very happy working at Meta," he ultimately chose to leave. Bloomberg reported that his compensation plan at Meta was tied to milestones, and leaving early meant giving up most of his unvested equity.
$200 million cannot buy seven months of loyalty.
This is not just a simple job-hopping story.
One person's departure, a signal for many
Pang Ruoming was not the first to leave.
Last week, Mat Velloso, the head of the developer platform for Meta's Super Intelligence Lab, also announced his departure; he had joined Meta from Google DeepMind in July of last year and stayed less than eight months. Before that, in November 2025, Yann LeCun, Turing Award laureate and Meta's chief AI scientist of 12 years, announced he was leaving to found a startup focused on the "world model" he had long advocated. Russ Salakhutdinov, a core student of Geoffrey Hinton and vice president of generative AI research at Meta, also recently announced his departure publicly.
To understand the talent drain at Meta AI, one must first grasp how damaging the Llama 4 episode really was.
In April 2025, Meta prominently announced the Llama 4 series with the Scout and Maverick models. The numbers in the official report were dazzling, claiming to comprehensively outperform GPT-4.5 and Claude Sonnet 3.7 on key benchmarks such as MATH-500 and GPQA Diamond.
However, this flagship model, which carried Meta's ambitions, quickly "showed its true colors" in independent blind tests within the open-source community: its real-world generalization and reasoning abilities fell sharply short of the claims. Faced with strong skepticism from the community, chief AI scientist Yann LeCun ultimately admitted that during evaluation the team had "used different versions of the model for different test sets to optimize the final scores."
In rigorous AI academic and engineering circles, this crossed an unforgivable red line. In effect, the team turned Llama 4 into a "small-town test-taker" that could only handle past exam questions, rather than a top student with genuine frontier intelligence: one variant for the math exam, another for the programming exam. Each individual score looks strong, but they do not come from the same model.
This is called "cherry-picking" in the AI academic world and "substitution" in the exam-oriented education sector.
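The practice described above can be sketched in a few lines. This is a schematic illustration of the anti-pattern, not Meta's actual code; all checkpoint names and the routing function are hypothetical:

```python
# Hypothetical sketch of "a different model for each benchmark."
# An honest evaluation would always return DEFAULT_CHECKPOINT.

BENCHMARK_CHECKPOINTS = {
    "MATH-500": "ckpt-math-tuned",     # variant tuned for math-style questions
    "GPQA-Diamond": "ckpt-sci-tuned",  # variant tuned for science QA
}
DEFAULT_CHECKPOINT = "ckpt-release"    # the model users actually download

def checkpoint_for(benchmark: str) -> str:
    """Route each benchmark to its specially tuned variant.

    This inflates every individual score, but no single model
    achieves all of the reported results at once -- which is
    exactly why the community's blind tests exposed the gap.
    """
    return BENCHMARK_CHECKPOINTS.get(benchmark, DEFAULT_CHECKPOINT)
```

The tell is that `checkpoint_for("MATH-500")` and `checkpoint_for("GPQA-Diamond")` return different models, while a user downloading the release gets a third one.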
For Meta, which has long prided itself on being a "beacon of open source," the scandal directly destroyed its most valuable trust asset in the developer ecosystem. The immediate cost was Zuckerberg's complete loss of confidence in the engineering baseline of the original GenAI team, setting the stage for the subsequent parachuting-in of executives and the gutting of core infrastructure departments.
He spent between $14.3 billion and $15 billion to acquire 49% of the data-annotation company Scale AI and installed its 28-year-old CEO, Alexandr Wang, as Meta's chief AI officer, establishing the Meta Super Intelligence Lab (MSL). In the new structure, Turing Award winner LeCun was required to report to the 28-year-old. In October, Meta cut about 600 positions at MSL, including members of the FAIR research department that LeCun had founded.
The flagship model Llama 4 Behemoth, originally planned for release in the summer of 2025, was also repeatedly postponed from summer to fall and ultimately indefinitely shelved.
Meta turned to developing a next-generation text model codenamed "Avocado" and an image/video model codenamed "Mango." Avocado reportedly aims to compete with GPT-5 and Gemini 3 Ultra. Originally scheduled for delivery by the end of 2025, it has been pushed to the first quarter of 2026 after subpar benchmark results and setbacks in training optimization. Meta is considering releasing it as closed source, abandoning the Llama series' open-source tradition.
Meta has made two fatal mistakes with its AI models. The first was fabricating benchmarks, which directly destroyed trust in the developer community; the second was forcing FAIR, a foundational research department of the kind that takes a decade to build, into a product organization driven by quarterly KPIs. Together, these two failures are the root cause of the current talent drain.
Self-developed chips: another broken leg
While talent is fleeing, there are also problems with chips.
According to The Information, Meta cut its internally developed advanced AI training chip project last week.
Meta's self-developed chip plan is called MTIA (Meta Training and Inference Accelerator). The company's initial roadmap was ambitious: MTIA v4, codenamed "Santa Barbara," v5, codenamed "Olympus," and v6, codenamed "Universal Core," were planned for delivery between 2026 and 2028. Olympus was designed to be Meta's first chip based on the 2nm chiplet architecture, aimed at covering both high-end model training and real-time inference, ultimately replacing Nvidia in Meta's training clusters.
Now, this advanced training chip has been cut.
To be fair, Meta has not been without progress; MTIA has delivered real results on the inference side. The MTIA v3 inference chip, codenamed "Iris," has been deployed at scale in Meta's data centers, primarily serving the recommendation systems behind Facebook Reels and Instagram, reportedly cutting total cost of ownership by 40% to 44%. But inference and training are different problems: inference runs models; training builds them. Meta can produce inference chips; it cannot yet produce training chips that compete head-on with Nvidia.
Nor is this the first retreat. In 2022, Meta attempted to develop its own inference chips, abandoned them outright after small-scale deployments failed, and turned to placing large orders with Nvidia.
The setbacks in self-developed chips have directly accelerated Meta's buying spree.
$135 billion of panic purchasing
In January 2026, Meta announced its capital expenditure budget for this year would be between $115 billion and $135 billion, almost double last year's $72.2 billion. A significant portion of this money will be spent on chips.
Within ten days, three major deals were signed:
On February 17, Meta signed a multi-year, cross-generational strategic cooperation agreement with Nvidia. Meta will deploy "millions" of Nvidia Blackwell and next-generation Vera Rubin GPUs, along with Grace standalone CPUs. Analysts estimate the transaction size to be in the hundreds of billions of dollars, making Meta the first supercomputing customer worldwide to deploy Nvidia Grace standalone CPUs on a large scale.
On February 24, Meta signed a multi-year chip agreement with AMD worth $60 billion to $100 billion. Meta will purchase AMD's latest MI450 series GPUs and sixth-generation EPYC CPUs. As part of the deal, AMD issued Meta warrants for up to 160 million shares of common stock, equivalent to about 10% of AMD's shares, at a price of $0.01 per share, vesting in batches based on delivery milestones.
On February 26, according to The Information, Meta signed a multi-billion-dollar multi-year agreement with Google to lease Google Cloud's TPU chips to train and run its next-generation large language models. At the same time, both sides are discussing a direct purchase of TPU by Meta to deploy in its own data centers starting in 2027.
Within ten days, a social media company placed orders with three chip suppliers possibly totaling well over $100 billion.
This is not diversification. This is panic purchasing.
The three-layer logic of computing power anxiety
Why is Meta in such a rush?
First, self-developed chips can no longer be relied upon. The cutting of the most advanced training chip project means that Meta will have to rely on external purchases to meet AI training needs in the foreseeable future. The inference-side MTIA chips can handle mature businesses like recommendation systems, but to train cutting-edge models like Avocado, which aims to compete with GPT-5, Nvidia or comparable hardware is essential.
Second, competitors will not wait. OpenAI has already secured massive resources from Microsoft, SoftBank, and the UAE sovereign fund. Anthropic has locked in supplies of a million chips each from Google (TPUs) and Amazon (Trainium). Google's Gemini 3 was trained entirely on TPUs. If Meta cannot secure enough computing power, it cannot even hold on to its entry ticket to the race.
Third, and perhaps most fundamentally, Zuckerberg needs to use "purchasing power" to compensate for the lack of "research and development capacity." The crash of Llama 4, the loss of key talent, and setbacks in self-developed chips have combined to weaken Meta's AI narrative in front of Wall Street. Signing substantial deals with Nvidia, AMD, and Google at this moment at least sends a signal: we have money, we are buying, we have not given up.
Meta's current strategy is to throw money at hardware when it cannot manage software, and to buy chips when it cannot retain people. But the AI competition is not won by writing checks. Computing power is a necessary condition, not a sufficient one. Without a top-tier model team and a clear technological path, even the largest pile of chips is just expensive inventory sitting in a warehouse.
The buyer's dilemma
Looking back at Meta's three transactions in February, an interesting detail has been overlooked by most people.
Meta is buying Nvidia's current Blackwell and future Vera Rubin; the deal with AMD is for MI450 and future MI455X; with Google, it is currently leasing Ironwood TPU, planning to make direct purchases next year.
Three suppliers, three completely different hardware architectures and software ecosystems.
This means Meta must juggle three distinct underlying ecosystems: Nvidia's CUDA, AMD's ROCm, and Google's XLA/JAX. A multi-supplier strategy mitigates supply-chain risk and reduces hardware procurement premiums, but it brings exponentially greater engineering complexity.
This is precisely Meta's most fatal vulnerability. Training a trillion-parameter model efficiently across these three fundamentally different hardware ecosystems requires not just engineers who understand CUDA but architects who can build cross-platform training frameworks from scratch.
There may be fewer than 100 such individuals in the world. Pang Ruoming is one of them.
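The core of that engineering problem is that each vendor stack exposes different low-level primitives (collective communication, compilation, memory management), so a cross-platform training framework has to hide them behind one interface. A minimal sketch of the adapter pattern involved, with all class and function names hypothetical (real code would dispatch to NCCL, RCCL, or XLA collectives rather than pure Python):

```python
from abc import ABC, abstractmethod

class CollectiveBackend(ABC):
    """One adapter per vendor ecosystem; the training loop sees only this."""
    name: str

    @abstractmethod
    def all_reduce(self, per_rank_grads: list[list[float]]) -> list[float]:
        """Elementwise sum of gradients across ranks."""

class NCCLBackend(CollectiveBackend):   # Nvidia / CUDA path
    name = "nccl"
    def all_reduce(self, per_rank_grads):
        # Placeholder math; real code would invoke NCCL collectives.
        return [sum(vals) for vals in zip(*per_rank_grads)]

class RCCLBackend(CollectiveBackend):   # AMD / ROCm path
    name = "rccl"
    def all_reduce(self, per_rank_grads):
        return [sum(vals) for vals in zip(*per_rank_grads)]

class XLABackend(CollectiveBackend):    # Google TPU / XLA path
    name = "xla"
    def all_reduce(self, per_rank_grads):
        return [sum(vals) for vals in zip(*per_rank_grads)]

def select_backend(vendor: str) -> CollectiveBackend:
    return {"nvidia": NCCLBackend, "amd": RCCLBackend, "google": XLABackend}[vendor]()
```

The interface is the easy part; the hard part, and the reason such architects are scarce, is making all three paths produce numerically consistent results at trillion-parameter scale while each vendor's performance characteristics pull the implementation in different directions.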
Spending $100 billion on the world's most complex hardware combination while losing the brains who can master that hardware: this is the most surreal aspect of Zuckerberg's gamble.
Zuckerberg's gamble
Pulling the lens back a bit, Zuckerberg's operational path regarding AI over the past 18 months is strikingly similar to his All In bet on the metaverse:
Seeing the trend, investing heavily, rapidly hiring, encountering setbacks, making quick strategic shifts, and investing heavily again.
From 2021 to 2023 it was the metaverse: annual losses in the tens of billions, the stock price falling from $380 to $88. From 2024 to 2026 it is AI: the same reckless spending, the same frequent reorganizations, the same "trust me, I have vision" narrative.
The difference is that this time the AI boom is indeed much more tangible than the metaverse. And Meta has money to burn; its advertising business generates abundant cash flow, with fourth-quarter revenues for 2025 reaching $59.9 billion, up 24% year-on-year.
The problem is that money can buy chips, buy computing power, even buy people into seats, but it cannot make them stay.
Pang Ruoming chose OpenAI, Russ Salakhutdinov chose to leave, and LeCun chose to start his own venture.
Zuckerberg's current bet is that as long as he buys enough chips, builds enough data centers, and spends enough money, he can always find or cultivate people who can utilize these resources.
This bet may hold. Meta is, after all, one of the wealthiest tech companies in the world, with over $100 billion in operating cash flow, which is its most solid moat. From OpenAI to Anthropic, from Google to other competitors, Meta is continuously poaching talent. According to Chinese tech outlet QbitAI, nearly 40% of the 44 members of Meta's Super Intelligence team came from OpenAI.
But the harsh reality of the AI race is that computing-power reserves, talent rosters, and model performance are all public. The Llama 4 benchmark-fabrication incident proves that in this industry, you cannot maintain a lead with slide decks and PR.
The market ultimately recognizes one thing: is your model good enough?
The position in the food chain
As the AI arms race enters 2026, the ranking in the food chain has begun to clarify:
At the top are OpenAI and Google. OpenAI has the strongest models, the largest user base, and the most aggressive financing. Google boasts complete vertical integration with in-house chips, in-house models, and in-house cloud infrastructure. Anthropic follows closely behind, comfortably in the first tier thanks to the product strength of the Claude model and dual supply lines of computing power from Google and Amazon.
Meta? It has spent the most money, signed the most chip contracts, and undergone the most frequent organizational restructurings, yet so far has not presented a single cutting-edge model that can convince the market.
Meta's AI story is somewhat reminiscent of Yahoo in 2005. At that time, Yahoo was also one of the richest companies on the internet, frantically acquiring and spending money but failing to create a search engine like Google. Money is not omnipotent. Zuckerberg needs to clarify what Meta really wants to achieve in AI, rather than just jumping on every trend.
Of course, it is too early to write Meta's obituary. With 3.58 billion monthly active users, $59.9 billion in quarterly revenue, and the world's largest social data set, these are assets that any competitor would find difficult to replicate.
If the next-generation model codenamed Avocado is delivered as planned in 2026 and puts Meta back in the top tier, all of Zuckerberg's spending and restructuring will be packaged as "a strategic move that turned the tide." But if it falls short of expectations once again, the $135 billion will have bought nothing but warehouses full of hot, expensive silicon.
After all, Silicon Valley's AI arms race has never lacked super-buyers waving checkbooks; what it lacks is people who know how to forge the future out of that computing power.