METR (org) — LessWrong
METR (org)
Edited by Ruby, last updated 1st Jul 2024
Formerly ARC Evals
Posts tagged METR (org)
2
100
METR's Observations of Reward Hacking in Recent Frontier Models
Daniel Kokotajlo
6mo
9
2
97
Measuring the Impact of Early-2025 AI on Experienced Open-Source Developer Productivity
habryka
5mo
43
2
10
Review of METR’s public evaluation protocol
nahoj, JaimeRV
1y
0
1
242
METR: Measuring AI Ability to Complete Long Tasks
Ω
Zach Stein-Perlman
8mo
Ω
106
1
153
ARC Evals new report: Evaluating Language-Model Agents on Realistic Autonomous Tasks
Ω
Beth Barnes
2y
Ω
12
1
141
METR's Evaluation of GPT-5
Ω
GradientDissenter
4mo
Ω
15
1
108
Clarifying METR's Auditing Role
Ω
Beth Barnes
2y
Ω
1
1
90
Introducing METR's Autonomy Evaluation Resources
Megan Kinniment, Beth Barnes
2y
0
1
70
Interpreting the METR Time Horizons Post
Ω
snewman
8mo
Ω
13
1
67
Reactions to METR task length paper are insane
Cole Wyeth
8mo
43
1
65
METR is hiring!
Beth Barnes
2y
1
1
64
CoT May Be Highly Informative Despite “Unfaithfulness” [METR]
Ω
GradientDissenter
4mo
Ω
3
1
40
ARC Evals: Responsible Scaling Policies
Zach Stein-Perlman
2y
10
1
30
Improved visualizations of METR Time Horizons paper.
LDJ
9mo
4
1
20
How far along Metr's law can AI start automating or helping with alignment research?
Q
Christopher King
9mo
Q
21