METR (org)
Edited by Ruby, last updated 1st Jul 2024
METR was formerly known as ARC Evals.
Posts tagged
METR (org)
Most Relevant
Karma | Title | Author(s) | Age | Comments
99 | METR's Observations of Reward Hacking in Recent Frontier Models | Daniel Kokotajlo | 3mo | 9
10 | Review of METR’s public evaluation protocol | nahoj, JaimeRV | 1y | 0
241 | METR: Measuring AI Ability to Complete Long Tasks (Ω) | Zach Stein-Perlman | 5mo | 106
153 | ARC Evals new report: Evaluating Language-Model Agents on Realistic Autonomous Tasks (Ω) | Beth Barnes | 2y | 12
139 | METR's Evaluation of GPT-5 (Ω) | GradientDissenter | 1mo | 15
108 | Clarifying METR's Auditing Role (Ω) | Beth Barnes | 1y | 1
90 | Introducing METR's Autonomy Evaluation Resources | Megan Kinniment, Beth Barnes | 2y | 0
68 | Interpreting the METR Time Horizons Post (Ω) | snewman | 5mo | 12
65 | METR is hiring! | Beth Barnes | 2y | 1
64 | CoT May Be Highly Informative Despite “Unfaithfulness” [METR] (Ω) | GradientDissenter | 1mo | 3
59 | Reactions to METR task length paper are insane | Cole Wyeth | 5mo | 43
40 | ARC Evals: Responsible Scaling Policies | Zach Stein-Perlman | 2y | 10
20 | Improved visualizations of METR Time Horizons paper. | LDJ | 6mo | 4
20 | How far along Metr's law can AI start automating or helping with alignment research? (Q) | Christopher King | 6mo | 21
16 | METR: AI models can be dangerous before public deployment | UnofficialLinkpostBot | 7mo | 0