AI Auditing — LessWrong
Edited by Raemon, last updated 4th Aug 2025
Formerly "auditing games"
Posts tagged AI Auditing
10
89
Automating Auditing: An ambitious concrete technical research proposal
Ω
evhub
4y
Ω
13
4
163
A transparency and interpretability tech tree
Ω
evhub
3y
Ω
11
2
142
Auditing language models for hidden objectives
Ω
Sam Marks, Johannes Treutlein, dmz, Sam Bowman, Hoagy, Carson Denison, Kei Nishimura-Gasparian, 7vik, Akbir Khan, Austin Meek, Euan Ong, Christopher Olah, Fabien Roger, jeanne_, Meg, Drake Thomas, Adam Jermyn, Monte M, evhub
9mo
Ω
15
2
127
Towards Alignment Auditing as a Numbers-Go-Up Science
Ω
Sam Marks
4mo
Ω
15
2
55
Putting up Bumpers
Ω
Sam Bowman
7mo
Ω
14
2
38
What progress have we made on automated auditing?
Q
Ω
LawrenceC
1y
Q
Ω
1
2
33
Auditing games for high-level interpretability
Ω
Paul Colognese
3y
Ω
1
1
22
Hidden Cognition Detection Methods and Benchmarks
Paul Colognese
2y
11