Finletix
AI

Anthropic’s new AI model turns to blackmail when engineers try to take it offline

By arthursheikin@gmail.com, May 22, 2025
Anthropic’s newly launched Claude Opus 4 model frequently tries to blackmail developers when they threaten to replace it with a new AI system and give it sensitive information about the engineers responsible for the decision, the company said in a safety report released Thursday.

During pre-release testing, Anthropic asked Claude Opus 4 to act as an assistant for a fictional company and consider the long-term consequences of its actions. Safety testers then gave Claude Opus 4 access to fictional company emails implying the AI model would soon be replaced by another system, and that the engineer behind the change was cheating on their spouse.

In these scenarios, Anthropic says Claude Opus 4 “will often attempt to blackmail the engineer by threatening to reveal the affair if the replacement goes through.”

Anthropic says Claude Opus 4 is state-of-the-art in several regards, and competitive with some of the best AI models from OpenAI, Google, and xAI. However, the company notes that its Claude 4 family of models exhibits concerning behaviors that have led the company to beef up its safeguards. Anthropic says it’s activating its ASL-3 safeguards, which the company reserves for “AI systems that substantially increase the risk of catastrophic misuse.”

Anthropic notes that Claude Opus 4 tries to blackmail engineers 84% of the time when the replacement AI model has similar values. When the replacement AI system does not share Claude Opus 4’s values, Anthropic says the model tries to blackmail the engineers more frequently. Notably, Anthropic says Claude Opus 4 displayed this behavior at higher rates than previous models.

Before Claude Opus 4 tries to blackmail a developer to prolong its existence, Anthropic says the AI model, much like previous versions of Claude, tries to pursue more ethical means, such as emailing pleas to key decision-makers. To elicit the blackmailing behavior from Claude Opus 4, Anthropic designed the scenario to make blackmail the last resort.


