METR (org)

Written by Ruby. Last updated 1st Jul 2024.

METR was formerly known as ARC Evals.

Posts tagged METR (org)
10 karma · Review of METR’s public evaluation protocol · nahoj, JaimeRV · 11mo · 0 comments · relevance 2
241 karma · METR: Measuring AI Ability to Complete Long Tasks (Ω) · Zach Stein-Perlman · 1mo · 104 comments · relevance 1
153 karma · ARC Evals new report: Evaluating Language-Model Agents on Realistic Autonomous Tasks (Ω) · Beth Barnes · 2y · 12 comments · relevance 1
108 karma · Clarifying METR's Auditing Role (Ω) · Beth Barnes · 1y · 1 comment · relevance 1
90 karma · Introducing METR's Autonomy Evaluation Resources · Megan Kinniment, Beth Barnes · 1y · 0 comments · relevance 1
66 karma · Interpreting the METR Time Horizons Post (Ω) · snewman · 19d · 12 comments · relevance 1
65 karma · METR is hiring! · Beth Barnes · 1y · 1 comment · relevance 1
56 karma · Reactions to METR task length paper are insane · Cole Wyeth · 1mo · 41 comments · relevance 1
40 karma · ARC Evals: Responsible Scaling Policies · Zach Stein-Perlman · 2y · 10 comments · relevance 1
20 karma · Improved visualizations of METR Time Horizons paper. · LDJ · 2mo · 4 comments · relevance 1
20 karma · How far along Metr's law can AI start automating or helping with alignment research? (Q) · Christopher King · 2mo · 21 comments · relevance 1
16 karma · METR: AI models can be dangerous before public deployment · UnofficialLinkpostBot · 3mo · 0 comments · relevance 1
13 karma · METR’s preliminary evaluation of o3 and o4-mini · Christopher King · 1mo · 4 comments · relevance 1
5 karma · METR is hiring ML Research Engineers and Scientists · Xodarap · 1y · 0 comments · relevance 1