Claude Fable's pricing has doubled, and top-tier AI is becoming a luxury item.

PANews
10 hours ago

On June 9, 2026, Anthropic released Claude Fable 5, a new model focused on complex code planning and generation. The release itself was not surprising, but two accompanying details quickly ignited discussion in the developer community. First, the API pricing for Fable 5 is twice that of the previous-generation Opus 4.8. Second, after June 22, Fable 5 will be removed from subscription plans such as Pro and Max, and users will only be able to access it through API calls or usage credits.

Some developers reported on Reddit that using Fable 5 on the Max 20x plan consumes about 2% of the quota per minute. A Hacker News user who logged one day of usage recorded $82.92 worth of Fable 5 API tokens, all within the plan's quota. The model is powerful, but it burns money quickly.

This is more than a question of one model's price. When top-tier AI is priced in tiers by capability level, and the highest tier is removed from general subscriptions, a harder question arises: who gets to use the best models?

Double the Price and a Countdown

Anthropic's official announcement stated that Claude Fable 5 is the first Mythos-level model released to the public. Mythos is Anthropic's internal classification for its highest-capability models, previously available only to Project Glasswing partners. Fable 5 is effectively the "public release version" of Mythos 5, showing significant improvements over Claude Sonnet in scenarios that require building precise code structure and understanding what developers actually need.

Capability comes at a direct cost. According to the official pricing page from Anthropic, the Fable 5 API input price is $10 per million tokens, and the output price is $50. In comparison, Opus 4.8’s corresponding prices are $5 and $25, respectively, while Sonnet 4.6 costs $3 and $15. The output price of Fable 5 is 3.3 times that of Sonnet.
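To make the per-million-token prices concrete, the short sketch below computes what a single request would cost on each of the three Anthropic models at the prices quoted above. The workload of 20,000 input tokens and 100,000 output tokens is a hypothetical figure chosen for illustration, and the `request_cost` helper is not part of any official SDK.

```python
# Listed API prices in USD per million tokens, as quoted in this article:
# (input price, output price).
PRICES = {
    "Fable 5": (10, 50),
    "Opus 4.8": (5, 25),
    "Sonnet 4.6": (3, 15),
}


def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of one request at the listed per-million-token prices."""
    input_price, output_price = PRICES[model]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


if __name__ == "__main__":
    # Hypothetical workload: 20,000 input tokens and 100,000 output tokens.
    for model in PRICES:
        print(f"{model}: ${request_cost(model, 20_000, 100_000):.2f}")
```

Under these assumed token counts, the same request costs twice as much on Fable 5 as on Opus 4.8, matching the headline multiplier.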

The access window is also limited. Anthropic confirmed that Fable 5 will be included in all subscription plans through June 22; starting June 23, it will be removed from subscription plans, and users will need usage credits to call it. The official statement is that "standard subscription functions will resume when capacity allows," but no timetable was given.

Professor Ethan Mollick of the Wharton School at the University of Pennsylvania published an in-depth review of Fable 5 on his blog, oneusefulthing.org. He wrote: "Fable's price is double that of Opus, and the token consumption rate suggests that production costs will be 'very high'." Mollick has long tracked the evolution of AI model capabilities, and his statement points to the infrastructure cost of running the model rather than to pricing strategy itself.

How fast are tokens consumed? Reddit users reported that Fable 5 on the Claude Max 20x plan uses about 2% of the quota per minute. The plan costs $200 per month, and at that rate 50 minutes of continuous use (100% divided by 2% per minute) would exhaust an entire month's quota. Developer Simon Willison documented a day of his API usage on Hacker News, with consumption within the quota reaching $82.92.
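The burn-rate arithmetic can be checked with a few lines of code. This is only a sketch: it takes the roughly 2% per minute figure reported by Reddit users at face value and assumes the rate stays constant, which real sessions may not do. Both function names are illustrative.

```python
def minutes_until_exhausted(percent_per_minute: float) -> float:
    """Minutes of continuous use before 100% of the quota is consumed."""
    if percent_per_minute <= 0:
        raise ValueError("percent_per_minute must be positive")
    return 100.0 / percent_per_minute


def quota_remaining(percent_per_minute: float, minutes: float) -> float:
    """Percentage of quota left after `minutes` of use, never below zero."""
    return max(0.0, 100.0 - percent_per_minute * minutes)


if __name__ == "__main__":
    rate = 2.0  # percent of quota per minute, as reported by Reddit users
    print(f"Quota exhausted after {minutes_until_exhausted(rate):.0f} minutes")
    print(f"Quota left after 30 minutes: {quota_remaining(rate, 30):.0f}%")
```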

The pricing and access strategy of Fable 5 sends a message: the capabilities of next-generation models are reflected directly in the price multiplier. The "free trial period" inside subscription plans lasts only long enough for users to form habits. When the deadline arrives, users face two choices: pay higher and less predictable API costs to keep using the model, or fall back to the previous tier.

Tiered Pricing from $9 to $120

The pricing of Fable 5 is not an isolated case. Looking at the current mainstream model API pricing, a steep price spectrum is clearly visible.

According to Google's official pricing page, the paid tier output price for Gemini 3.5 Flash is $9 per million tokens. OpenAI's official pricing page shows that the output price for GPT-5.4 is $15, GPT-5.5 is $30, and GPT-5.5 Pro is $120. On Anthropic's side, Sonnet 4.6 costs $15, Opus 4.8 costs $25, and Fable 5 costs $50.

From Gemini 3.5 Flash's $9 to GPT-5.5 Pro's $120, the price gap is more than 13-fold. This is no longer a simple "high end vs low end" dichotomy. Companies are actively building a three-tier structure: the bottom tier consists of very low-priced entry models meant to broaden adoption and acquire users; the middle tier consists of moderately priced high-performance models aimed at everyday development needs; the top tier consists of expensive but highly capable reasoning models aimed at high-frequency, high-value use cases.
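The spread is easier to see when every output price is expressed as a multiple of the cheapest one. The sketch below uses only the per-million-token output prices quoted above; the function name is illustrative.

```python
# Output prices in USD per million tokens, as quoted in this article.
OUTPUT_PRICES = {
    "Gemini 3.5 Flash": 9,
    "GPT-5.4": 15,
    "Sonnet 4.6": 15,
    "Opus 4.8": 25,
    "GPT-5.5": 30,
    "Fable 5": 50,
    "GPT-5.5 Pro": 120,
}


def multiple_of_cheapest(prices: dict[str, float]) -> dict[str, float]:
    """Map each model to its price divided by the cheapest price, sorted ascending."""
    cheapest = min(prices.values())
    ordered = sorted(prices.items(), key=lambda item: item[1])
    return {name: price / cheapest for name, price in ordered}


if __name__ == "__main__":
    for name, multiple in multiple_of_cheapest(OUTPUT_PRICES).items():
        print(f"{name:<18} ${OUTPUT_PRICES[name]:>3}  {multiple:5.2f}x")
```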

Behind the pricing tiers, capability levels are being made explicit. Anthropic classifies model capabilities into four levels: Sonnet, Opus, Fable, and Mythos. OpenAI differentiates the GPT-5 series into standard and Pro versions, and Google splits the Gemini 3 series into Flash and Pro. These tiers are no longer just internal codenames; they map directly to pricing.

The more noteworthy changes are happening in access rights. In the model list on Anthropic's official pricing page, Fable 5 is marked as "included in subscription plans before June 22," whereas Opus 4.8 and Sonnet 4.6 carry no such time limit. Anthropic is trying a new tiering method: instead of assigning different models to different subscription levels based on capability, it is removing the top-tier model from the subscription system altogether and routing it through a separate API channel.

Anxiety over Uncontrollable Bills

Taken on its own, a price of $50 per million tokens is not shocking in enterprise procurement. But the context window of Claude Fable is 1 million tokens, with a maximum output of 128,000 tokens, so a single complex task may consume tens of thousands to hundreds of thousands of tokens. For a reasoning model like Fable 5, "thinking longer and generating more tokens" is not a design flaw but a source of capability. The model performs multiple reasoning steps before producing its final answer, and each step consumes tokens.
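A rough sketch shows how reasoning steps can dominate the bill. It assumes that reasoning tokens are billed at the same $50 per million rate as visible output, which this article does not confirm, and the token counts are hypothetical.

```python
OUTPUT_PRICE_PER_MILLION = 50  # USD, Fable 5 output price quoted in this article
MAX_OUTPUT_TOKENS = 128_000    # maximum output quoted in this article


def output_cost(visible_tokens: int, reasoning_tokens: int) -> float:
    """USD cost of the generated tokens, assuming reasoning is billed as output."""
    total = visible_tokens + reasoning_tokens
    if total > MAX_OUTPUT_TOKENS:
        raise ValueError("total exceeds the quoted maximum output")
    return total * OUTPUT_PRICE_PER_MILLION / 1_000_000


def reasoning_share(visible_tokens: int, reasoning_tokens: int) -> float:
    """Fraction of generated tokens spent on reasoning rather than the answer."""
    return reasoning_tokens / (visible_tokens + reasoning_tokens)


if __name__ == "__main__":
    # Hypothetical task: a 2,000-token answer preceded by 60,000 reasoning tokens.
    print(f"Cost: ${output_cost(2_000, 60_000):.2f}")
    print(f"Reasoning share: {reasoning_share(2_000, 60_000):.1%}")
```

In this hypothetical case almost all of the spend goes to tokens the user never sees in the final answer.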

The problem is that users cannot control how much is consumed. One developer on the Max 20x plan burned 2% of the quota per minute, while Simon Willison consumed $82.92 in one day. All of this consumption happened during the "trial period" and within plan limits. After June 22, the same usage will be charged directly to API billing or deducted from credits.

Tokens are the billing unit, but how many a task consumes is determined by the model's design, leaving users unable to control their own "usage." If AI calls are compared to electricity, the core contradiction is that users can neither choose a "low-power mode" nor predict how much "electricity" the next query will consume. An industry discussion article published on LinkedIn describes this as the defining feature of the "AI tax": "The real AI tax is not only the model price but also its unpredictability."

This unpredictability hits individual developers far harder than enterprises. Companies can negotiate bulk agreements, set budget caps, and spread costs across a team. Individual users face usage-based API billing, where one reasonably serious debugging session may cost as much as an entire month's subscription. A Hacker News user commented, "Cost-aware routing has shifted from a luxury to a necessity." Cost-aware routing means using cheaper models by default and switching to expensive models only when necessary. Before Fable 5, this was mostly an optimization strategy; Fable 5's pricing and consumption speed have turned it into a hard requirement, where failing to apply it can lead to overspending.
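A minimal sketch of cost-aware routing might look like the following. It is an illustration of the idea, not any vendor's API: the model identifiers, the `Router` class, and the daily budget are all hypothetical, and the caller is assumed to supply its own cost estimate for each step.

```python
from dataclasses import dataclass

# Hypothetical model identifiers used only for this illustration.
CHEAP_MODEL = "sonnet-4.6"
EXPENSIVE_MODEL = "fable-5"


@dataclass
class Router:
    """Default to the cheap model; escalate only for critical steps within budget."""

    daily_budget_usd: float
    spent_usd: float = 0.0

    def choose(self, critical: bool, estimated_cost_usd: float) -> str:
        """Return the expensive model only if the step is critical and still affordable."""
        fits_budget = self.spent_usd + estimated_cost_usd <= self.daily_budget_usd
        if critical and fits_budget:
            return EXPENSIVE_MODEL
        return CHEAP_MODEL

    def record(self, cost_usd: float) -> None:
        """Add the actual cost of a completed call to today's spend."""
        self.spent_usd += cost_usd


if __name__ == "__main__":
    router = Router(daily_budget_usd=10.0)
    print(router.choose(critical=False, estimated_cost_usd=1.0))  # routine step
    print(router.choose(critical=True, estimated_cost_usd=4.0))   # critical step
    router.record(8.0)
    print(router.choose(critical=True, estimated_cost_usd=4.0))   # budget nearly spent
```

The budget check is what turns routing from a quality preference into a spending guardrail: once the day's spend approaches the cap, even critical steps fall back to the cheaper model.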

Mainstream APIs offer interfaces for invoking different models, allowing developers to set routing logic. However, this requires users to have programming skills, understand model differences, and be willing to accept quality loss that may come with using lower-tier models. Every additional barrier excludes another group of people.

Two Hundred Dollar Monthly Fees Don't Buy Full Access

The tiering of access rights is changing as well. The strategy for Claude Fable 5 is this: before June 22, all paying users can use it; after June 22, even Max 20x users paying the highest fee of $200 per month cannot access it directly and must spend usage credits to call it.

In contrast, OpenAI’s access rights strategy follows a different path. According to ChatGPT’s official pricing page, ChatGPT Pro offers two tiers at $100 and $200, both of which can access GPT-5 Pro. Higher-tier model capabilities correspond to higher-tier subscription levels, but the subscription itself remains a complete access package.

The difference between these two strategies is not just price. OpenAI's approach puts the barrier at the subscription fee: if you can afford $200 a month, you can use the best model. Anthropic's approach to Fable 5 puts barriers in two places: cost (the API is billed by usage) and technical skill (it requires the ability to use an API). Some Hacker News users have called this "the drug dealer's free-sample strategy: raise prices once you're hooked," while others believe it more likely reflects a genuine compute supply problem that temporarily prevents Anthropic from covering Fable 5's inference costs under a fixed-price subscription.

Regardless of motivation, the effect is clear: subscriptions have become a "ticket to enter" rather than an "all-access pass." Top-tier model capabilities fall outside what the ticket covers. TechCrunch noted in its coverage of the Fable 5 release that it is "the first publicly released Mythos-level model." Before Fable 5, the Mythos level was limited to Project Glasswing partners. The threshold has been lowered, but it has not disappeared.

Users Looking for Alternative Paths

The tiering of access rights has already led to noticeable changes in user behavior. Some users have begun to seek pathways to bypass formal channels.

One approach is to call models through third-party aggregation services. These "transit stations" (relay services) sell tokens below the official list prices, typically sourced from excess quota in enterprise bulk purchases, arbitrage on regional price differences, or undisclosed channels. Prices are lower, but privacy protection and stability are not guaranteed. Discussion has continued on platforms like Zhihu, where users' real concern is not "whether the cheap tokens can be used" but "whose hands the data will pass through."

Another trend is turning to open-source or low-cost alternative models. Developers share tutorials showing how to connect models like DeepSeek to various development tools, circumventing official pricing and verification processes. While there may be losses in capability and increased privacy risks, cost controllability is significantly improved.

In discussions among developers on Hacker News and Reddit, hybrid usage strategies come up frequently: "default to using cheaper models, only switch to Fable for critical steps." This sounds like reasonable resource optimization. But two or three years ago, in discussions about "AI democratization," the mainstream narrative was that everyone could use the best models on equal terms. Now, "using the best models" has become something that requires careful budgeting.

Local deployment offers another comparison. A GPU capable of running large models is not cheap, and hardware capable of running 120B-parameter models is beyond the financial reach of most individual developers. The threshold for local deployment is another form of paywall, one that simply replaces usage-based billing with hardware procurement.

These workarounds are not "smart users' money-saving tricks." When a large number of users begin actively searching for alternatives, it is because the original paths are narrowing. The privacy risks of transit stations, the capability gap of open-source models, and the hardware investment required for local deployment all come with costs.

The Steps Are Beneath Our Feet

If we compare AI to utilities like water and electricity, the first principle of public services is universal service and equitable access. However, the pricing trend for AI models is moving in the opposite direction. The stronger the capability, the higher the price; top-tier models are leaving universal subscriptions while billing by tokens makes costs unpredictable. Utilities do not provide "stronger currents" only to those who pay more, but AI is doing precisely that.

This is not a discussion of whether prices are "expensive or cheap." When top models like Claude Fable are removed from general subscription access, when the output price of GPT-5.5 Pro reaches 13 times that of Flash models, and when a day’s usage for individual developers can cost $82, "affordability" is transitioning from a price issue to a structural issue.

In the realm of using AI tools, the steps are already visible. The top tier consists of enterprise users who can afford unlimited API calls and dedicated hardware, the middle tier consists of individual developers who carefully budget to use top models, and the bottom tier is made up of ordinary users who can only access free or low-cost models. Each level of this staircase is not delineated by technical skill, but by payment capability and technical access capability.

The June 22 deadline set by Anthropic is just the latest step on this staircase, not the first, nor will it be the last.

