METR (org)
Edited by Ruby, last updated 1st Jul 2024
Formerly ARC Evals
Posts tagged METR (org), most relevant first
99 | METR's Observations of Reward Hacking in Recent Frontier Models | Daniel Kokotajlo | 5mo | 9
97 | Measuring the Impact of Early-2025 AI on Experienced Open-Source Developer Productivity | habryka | 4mo | 43
10 | Review of METR’s public evaluation protocol | nahoj, JaimeRV | 1y | 0
242 | METR: Measuring AI Ability to Complete Long Tasks | Ω | Zach Stein-Perlman | 7mo | 106
153 | ARC Evals new report: Evaluating Language-Model Agents on Realistic Autonomous Tasks | Ω | Beth Barnes | 2y | 12
141 | METR's Evaluation of GPT-5 | Ω | GradientDissenter | 3mo | 15
108 | Clarifying METR's Auditing Role | Ω | Beth Barnes | 1y | 1
90 | Introducing METR's Autonomy Evaluation Resources | Megan Kinniment, Beth Barnes | 2y | 0
70 | Interpreting the METR Time Horizons Post | Ω | snewman | 6mo | 12
65 | METR is hiring! | Beth Barnes | 2y | 1
64 | CoT May Be Highly Informative Despite “Unfaithfulness” [METR] | Ω | GradientDissenter | 3mo | 3
59 | Reactions to METR task length paper are insane | Cole Wyeth | 7mo | 43
40 | ARC Evals: Responsible Scaling Policies | Zach Stein-Perlman | 2y | 10
26 | Improved visualizations of METR Time Horizons paper. | LDJ | 7mo | 4
20 | How far along Metr's law can AI start automating or helping with alignment research? | Q | Christopher King | 7mo | 21