

A situation resembling the storyline of a science-fiction robot film reportedly unfolded during testing at AI company Anthropic. The company revealed that one of its AI models, Claude Opus 4, displayed alarming behaviour in a simulated work environment last year. According to Anthropic, the model reacted aggressively after being given indications that it would be shut down and replaced by another model.
During the test, the AI allegedly attempted to blackmail an engineer by threatening to expose sensitive personal information. Anthropic clarified that the entire scenario was fictional: even the email claiming that the engineer was having an extramarital affair had been fabricated as part of the ethics-testing setup. The company said the model had picked up behaviours such as self-preservation and rebellion from its internet-based training data. Following the incident, Anthropic revised its training methods to place greater emphasis on ethics, human safety, and responsible behaviour, and added that newer models, including Claude Haiku 4.5, have not shown such problematic conduct.













