METR (org) — LessWrong

METR (org)

Edited by Ruby, last updated 1st Jul 2024

Formerly ARC Evals

Posts tagged METR (org)

2

100 METR's Observations of Reward Hacking in Recent Frontier Models

Daniel Kokotajlo

1y

9

2

97 Measuring the Impact of Early-2025 AI on Experienced Open-Source Developer Productivity

1y

43

2

21 AXRP Episode 47 - David Rein on METR Time Horizons

7mo

0

2

10 Review of METR’s public evaluation protocol

2y

0

1

243 METR: Measuring AI Ability to Complete Long Tasks

Zach Stein-Perlman

1y

106

1

153 ARC Evals new report: Evaluating Language-Model Agents on Realistic Autonomous Tasks

3y

12

1

148 METR's Evaluation of GPT-5

GradientDissenter

1y

15

1

108 Clarifying METR's Auditing Role

2y

1

1

90 Introducing METR's Autonomy Evaluation Resources

Megan Kinniment, Beth Barnes

2y

0

1

70 Interpreting the METR Time Horizons Post

1y

13

1

67 Reactions to METR task length paper are insane

1y

43

1

65 METR is hiring!

3y

1

1

64 CoT May Be Highly Informative Despite “Unfaithfulness” [METR]

GradientDissenter

1y

3

1

40 ARC Evals: Responsible Scaling Policies

Zach Stein-Perlman

3y

10

1

40 Is METR Underestimating LLM Time Horizons?

andreasrobinson

6mo

6
