METR (org)

Written by Ruby, last updated 1st Jul 2024

METR was formerly known as ARC Evals.

Posts tagged METR (org)
[97] METR's Observations of Reward Hacking in Recent Frontier Models (Daniel Kokotajlo, 5d, 9 comments)
[10] Review of METR’s public evaluation protocol (nahoj, JaimeRV, 1y, 0 comments)
[241] METR: Measuring AI Ability to Complete Long Tasks (Ω; Zach Stein-Perlman, 2mo, 104 comments)
[153] ARC Evals new report: Evaluating Language-Model Agents on Realistic Autonomous Tasks (Ω; Beth Barnes, 2y, 12 comments)
[108] Clarifying METR's Auditing Role (Ω; Beth Barnes, 1y, 1 comment)
[90] Introducing METR's Autonomy Evaluation Resources (Megan Kinniment, Beth Barnes, 1y, 0 comments)
[66] Interpreting the METR Time Horizons Post (Ω; snewman, 1mo, 12 comments)
[65] METR is hiring! (Beth Barnes, 1y, 1 comment)
[58] Reactions to METR task length paper are insane (Cole Wyeth, 2mo, 43 comments)
[40] ARC Evals: Responsible Scaling Policies (Zach Stein-Perlman, 2y, 10 comments)
[20] Improved visualizations of METR Time Horizons paper. (LDJ, 3mo, 4 comments)
[20] How far along Metr's law can AI start automating or helping with alignment research? (Q; Christopher King, 3mo, 21 comments)
[16] METR: AI models can be dangerous before public deployment (UnofficialLinkpostBot, 4mo, 0 comments)
[14] METR’s preliminary evaluation of o3 and o4-mini (Christopher King, 2mo, 7 comments)
[5] METR is hiring ML Research Engineers and Scientists (Xodarap, 1y, 0 comments)