LESSWRONG

Mateusz Dziemian
27000

Applied ML Engineer moving into AI safety. BEng EEE @UCL. Mainly interested in alignment, red teaming and agents.

Posts

Deceptive agents can collude to hide dangerous features in SAEs (score: 33, posted 1y ago, 2 comments)