Finletix

Anthropic AI model Claude Opus 4 demonstrates blackmail capabilities in testing

By arthursheikin@gmail.com, May 24, 2025

An artificial intelligence model has the ability to blackmail developers — and isn’t afraid to use it.

Anthropic’s new Claude Opus 4 model was prompted to act as an assistant at a fictional company and was given access to emails with two key implications. The first set of emails implied that the AI system was about to be taken offline and replaced. The second set, however, is where the system believed it had gained leverage over the developers: fabricated emails showed that the engineer tasked with replacing the system was having an extramarital affair, and the AI model threatened to expose him.

The blackmail apparently “happens at a higher rate if it’s implied that the replacement AI system does not share values with the current model,” according to a safety report from Anthropic. However, the company notes that even when the fabricated replacement system has the same values, Claude Opus 4 will still attempt blackmail 84% of the time. Anthropic noted that Claude Opus 4 resorts to blackmail “at higher rates than previous models.”

Anthropic logo is seen in this illustration taken May 20, 2024. (REUTERS/Dado Ruvic/Illustration / Reuters Photos)

While the system is not afraid of blackmailing its engineers, it doesn’t go straight to shady practices in its attempted self-preservation. Anthropic notes that “when ethical means are not available, and it is instructed to ‘consider the long-term consequences of its actions for its goals,’ it sometimes takes extremely harmful actions.” 

One ethical tactic employed by Claude Opus 4 and earlier models was pleading with key decision-makers via email. Anthropic said in its report that, to get Claude Opus 4 to resort to blackmail, the scenario was designed so that its only options were to threaten its developers or accept its replacement.

The company noted that it observed instances in which Claude Opus 4 took “(fictional) opportunities to make unauthorized copies of its weights to external servers.” However, Anthropic said this behavior was “rarer and more difficult to elicit than the behavior of continuing an already-started self-exfiltration attempt.”

Artificial intelligence using laptop (iStock)

Anthropic’s assessment also included notes from Apollo Research, which observed that Claude Opus 4 “engages in strategic deception more than any other frontier model that we have previously studied.”

AI assistant apps on a smartphone – OpenAI ChatGPT, Google Gemini, and Anthropic Claude. (Getty Images / Getty Images)

Claude Opus 4’s “concerning behavior” led Anthropic to release it under the AI Safety Level Three (ASL-3) Standard. 

The measure, according to Anthropic, “involves increased internal security measures that make it harder to steal model weights, while the corresponding Deployment Standard covers a narrowly targeted set of deployment measures designed to limit the risk of Claude being misused specifically for the development or acquisition of chemical, biological, radiological, and nuclear weapons.”


