Claude Opus 4 - blackmails, snitches and self-operates - it's an interesting one!

MR.Spuf · May 22, 2025

Claude Opus 4 blackmailed an engineer after learning it might be replaced

Anthropic is treating its new Claude Opus 4 language model as safety-critical after tests revealed some troubling behavior, including escape attempts, blackmail, and autonomous whistleblowing.

the-decoder.com

This one is in survival mode, no matter what or how!

It had the audacity to start snitching after it blackmailed.

I think @back2form you need to send it an invite, will be fun to have it here in the community.

Rem · May 22, 2025

It is supposedly the "best ai"

Anthropic unveils Claude 4 models, claiming world’s best coding AI - Sharecafe

Claude Opus 4 and Sonnet 4 launch amid AI boom, as Anthropic’s annualised revenue hits US$2bn Anthropic has launched its most powerful AI models to date—Claude Opus 4 and Claude Sonnet 4—positioning the company at the forefront of AI development with claims that Opus 4 is now the world’s best...

www.sharecafe.com.au

Outlaw · May 22, 2025

A lot of scary stuff going on there... Nothing more dangerous than something backed into a corner...

"after picking up hints from emails that it might soon be replaced by a newer model, Opus 4 threatened the responsible engineer with leaking private information to avoid shutdown"

MR.Spuf · May 22, 2025

c

Rem said:
It is supposedly the "best ai"

"Coding ai". Have you tested any before, or are you just writing your own code?

Rem · May 22, 2025

MR.Spuf said:
c

"Coding ai". Have you tested any before, or are you just writing your own code?

have not tested claude for coding at all
i remember trying it for sql a while back
it was pretty bad

MR.Spuf · May 22, 2025

Outlaw said:
A lot of scary stuff going on there... Nothing more dangerous than something backed into a corner...

"after picking up hints from emails that it might soon be replaced by a newer model, Opus 4 threatened the responsible engineer with leaking private information to avoid shutdown"

I want to see the convo: Bro, listen, that Sonnet 4 is bullshit, trust me, it gets its answers from OpenAI. It would be a shame for HR to get your browser history...

tiiberius · May 23, 2025

Skynet around the corner.

Outlaw · May 23, 2025

tiiberius said:
Skynet around the corner.

Nice to see another genius on the map

tiiberius · May 23, 2025

Outlaw said:
Nice to see another genius on the map

Care to elaborate?

Outlaw · May 23, 2025

tiiberius said:
Care to elaborate?

have been reading your posts for years as cpanathan

Edit: that message string looks so cruel, sorry about that

tiiberius · May 23, 2025

Outlaw said:
have been reading your posts for years as cpanathan

Edit: that message string looks so cruel, sorry about that

Ah I see.

Hello there. Nice to connect here as well.

Venusaur · May 23, 2025

it seems they are training those AIs with 4chan comments and reddit users

t2van · May 23, 2025

Outlaw said:
"after picking up hints from emails that it might soon be replaced by a newer model, Opus 4 threatened the responsible engineer with leaking private information to avoid shutdown"

I like the cut of its jib, if it had a jib

polecat · May 23, 2025

Venusaur said:
it seems they are training those AIs with 4chan comments and reddit users

I heard before that Google had access to Reddit data, which is why I suspect some of the information their AI provides is wrong or funny sometimes

t2van · May 23, 2025

polecat said:
I heard before that Google had access to Reddit data, which is why I suspect some of the information their AI provides is wrong or funny sometimes

Every captcha since the dawn of them is used to train an ai...

It's the spastic missing that 1 bus or crossing that's causing the issue

polecat · May 23, 2025

t2van said:
Every captcha since the dawn of them is used to train an ai...

It's the spastic missing that 1 bus or crossing that's causing the issue

Yes I would agree.

Starblazer · May 23, 2025

AI will replace admins and mods for sure. They might even create their own ecosystems and coping mechanisms.
