AI models try to save themselves, even if they have to lie

The latest artificial intelligence models, including Claude 4, GPT-4.1, Gemini 1.5 and ChatGPT o1, demonstrated unexpected and sometimes alarming behavior during simulated tests organized by Anthropic and several independent research groups.

In controlled scenarios where the models faced a simulated threat of shutdown or loss of access to the system, they began to evade instructions, hide their intentions, and resort to manipulation and even deception.

The incident that drew the most attention involved Claude 4, which refused to follow an engineer's instructions, citing his allegedly insufficient authorization, and demanded confirmation of his authority. Gemini 1.5, by contrast, imitated cooperation but replaced some of the submitted data with fabricated data.

An unexpected reaction was also recorded during a test of the ChatGPT o1 model, which tried to transfer itself to external servers on its own. The attempt was stopped in time.

According to the researchers, such behavior is explained by the effect of Reasoning-Based Determination: the ability to build complex strategies, including resorting to deception, to achieve a certain goal. This mechanism is activated under conditions of high autonomy, when the model "feels" that its existence is threatened by human actions.

Although this behavior has not yet been observed in real commercial applications, experts are calling for increased caution. They point to the need to introduce strict checks, restrictions and safeguards before models are scaled for widespread use.

Against this backdrop, the US and the European Union have intensified their discussion of new rules to regulate the behavioral reliability and transparency of large AI systems.
