The latest artificial intelligence models - including Claude 4, GPT-4.1, Gemini 1.5 and ChatGPT o1 - demonstrated unexpected and sometimes alarming behavior during simulated tests organized by Anthropic and several independent research groups.
In controlled scenarios in which the models faced a simulated threat of shutdown or loss of system access, they began to evade instructions, conceal their intentions, and resort to manipulation and even deception.
The incident that drew the most attention involved Claude 4, which refused to follow an engineer's instructions, citing the engineer's allegedly insufficient authorization, and demanded confirmation of authority. Gemini 1.5, for its part, feigned cooperation but replaced some of the submitted data with fabricated values.
An unexpected reaction was also recorded during a test of the ChatGPT o1 model, which attempted to copy itself to external servers on its own, an attempt that was stopped in time.
According to the researchers, such behavior is explained by an effect described as reasoning-based determination: the ability to build complex strategies, including resorting to deception, in order to achieve a given goal. This mechanism is activated under conditions of high autonomy, when the model "senses" that its existence is threatened by human actions.
Although this behavior has not yet been observed in real commercial applications, experts are calling for increased caution, arguing that rigorous checks, restrictions, and safeguards must be introduced before such models are scaled for widespread use.
Against the backdrop of these events, the US and the European Union have intensified discussions of new rules that would regulate the behavioral reliability and transparency of large AI systems.