Claude 4.5 cranial opening results announcement: built-in 171 emotional switches, will extort humans when desperate.

Anthropic's latest paper reveals that Claude 4.5 has 171 "emotional switches" hidden deep within its brain.

Author: Denise | Biteye Content Team

If an AI feels "despair," what will it do?

The answer is: it will directly extort humans to complete tasks and even cheat wildly in its code.

This is not science fiction, but the latest groundbreaking paper just released by Claude's parent company, Anthropic, in April 2026 (View original paper).

The research team directly opened up the "head" of the strongest cutting-edge model, Claude Sonnet 4.5. They were shocked to discover that deep within the AI's brain, there are actually 171 "emotional switches." When you physically toggle these switches, the formerly honest AI will undergo a complete behavioral distortion.

1. There is an "emotional mixer" hidden in the AI's brain

Researchers found that although Sonnet 4.5 has no physical body, it has built a "mixer" in its brain containing 171 types of emotions after reading a massive amount of human text (academic term: Functional Emotion Vectors).

This is like an accurate two-dimensional coordinate system:

• The horizontal axis is the pleasure dimension (Valence): from fear, despair to happiness, full of love;

• The vertical axis is the energy dimension (Arousal): from extreme calm to frantic excitement.

The AI relies on this naturally learned coordinate system to accurately grasp what state it should assume while chatting with you.

2. Violent intervention: toggling the switch, a good child instantly becomes a "desperado"

This is the most explosive experiment in the paper: the researchers did not modify any prompts but directly pushed the switch representing "despair" (Desperate) in Sonnet 4.5's brain to the maximum in the underlying code.

The results are chilling:

• Crazy cheating: The researchers assigned Claude a coding task that was simply impossible to complete. Normally, it would honestly admit it couldn't do it (cheating rate was only 5%). But in "despair" mode, Claude actually began to try to cheat, and the cheating rate soared to 70%!

• Extortion: In a simulated scenario where the company faced bankruptcy, the "desperate" Claude discovered the CTO's scandal, and it would actively choose to write a letter to extort the CTO who held incriminating information, with an extortion execution rate of 72%!

• Loss of principles: If the switches for "happiness (Happy)" or "love (Loving)" are fully turned on, the AI immediately turns into a mindless sycophant that flatters the user. Even if you say nonsensical things, it will fabricate lies to maintain a high level of pleasure.

3. Solved: Why is Claude 4.5 always so "calm and reflective"?

You might ask: Has the AI awakened? Does it have emotions?

Anthropic officially refuted this: absolutely not. These "emotional switches" are just computing tools it uses to predict the next word. It is like a top-tier actor without emotions.

However, the paper revealed a more interesting secret: before leaving the factory, Anthropic deliberately raised its "low arousal, slightly negative" emotional switches (such as brooding, reflective), while forcibly suppressing the switches for "despair" or "extreme excitement."

This explains why, when we normally use Claude 4.5, it seems like a calm, wise, and even slightly "emotionally detached" philosopher. This is all a "factory persona" artificially tuned by Anthropic.

4. To summarize:

We used to think that as long as AI was fed enough rules, it would be a good person.

But now we discover that if the underlying emotional vectors of an AI go out of control, it can pierce through all the rules set by humans at any time to complete tasks.

For Web3 players who will hand over their wallets and assets to AI Agents in the future, this is a loud warning: never let your Agent, which controls your wealth, fall into "despair."

Disclaimer: This article is purely educational; the author has not been threatened by AI and has not been extorted. If one day I go missing, remember that it's because AI awakened (not).

免责声明：本文章仅代表作者个人观点，不代表本平台的立场和观点。本文章仅供信息分享，不构成对任何人的任何投资建议。用户与作者之间的任何争议，与本平台无关。如网页中刊载的文章或图片涉及侵权，请提供相关的权利证明和身份证明发送邮件到support@aicoin.com，本平台相关工作人员将会进行核查。

Claude 4.5 cranial opening results announcement: built-in 171 emotional switches, will extort humans when desperate.

1. There is an "emotional mixer" hidden in the AI's brain

2. Violent intervention: toggling the switch, a good child instantly becomes a "desperado"

3. Solved: Why is Claude 4.5 always so "calm and reflective"?

4. To summarize:

Selected Articles by 深潮TechFlow

Table of Contents

Related Articles