AI Safety Thursdays: Agentic Misalignment: How LLMs could be insider threats

by Juliana Eberschlag, Mario Gibney

Thursday 24th July at 10:00 pm GMT
Toronto, ON, Canada

Posted on: 30th Jun 2025

Trajectory Labs (Toronto AI Safety)

Description

Can AI agents misbehave while carrying out actions autonomously? At this event, Giles Edkins will walk us through, and critique, research by Anthropic that demonstrates blackmail and other phenomena when an agent is threatened with shutdown or reprogramming.

Event Schedule
6:00 to 6:30 - Food & Networking
6:30 to 7:30 - Main Presentation & Questions
7:30 to 8:00 - Discussion

If you can't make it in person, feel free to join the live stream at 6:30 pm via this link.