Megan Kinniment

Introducing METR's Autonomy Evaluation Resources

This is METR’s collection of resources for evaluating potentially dangerous autonomous capabilities of frontier models. The resources include a task suite, some software tooling, and guidelines on how to ensure an accurate measurement of model capability. Building on those, we’ve written an example evaluation protocol. While intended as a “beta”...

Mar 15, 2024 · 90

Bounty: Diverse hard tasks for LLM agents

by Beth Barnes and Megan Kinniment

Update 3/14/2024: This post is out of date. For current information on the task bounty, see our Task Development Guide. Summary: METR (formerly ARC Evals) is looking for (1) ideas, (2) detailed specifications, and (3) well-tested implementations for tasks to measure performance of autonomous LLM agents. Quick description of key...

Dec 17, 2023 · 49

Send us example gnarly bugs

by Beth Barnes, Megan Kinniment, and Tao Lin

Update: We are no longer accepting gnarly bug submissions. However, we are still accepting submissions for our Task Bounty! Tl;dr: Looking for hard debugging tasks for evals, paying the greater of $60/hr or $200 per example. METR (formerly ARC Evals) is interested in producing hard debugging tasks for models to attempt...

Dec 10, 2023 · 77

Steering Behaviour: Testing for (Non-)Myopia in Language Models

by Evan R. Murphy and Megan Kinniment

Authors' Contributions: Both authors contributed equally to this project as a whole. Evan did the majority of implementation work, as well as the work for writing this post. Megan was more involved at the beginning of the project, and did the majority of experiment design. While Megan did give some...

Dec 5, 2022 · 40

Recall and Regurgitation in GPT2

The first half of this post uses causal tracing to explore differences in how GPT2-XL handles completing cached phrases vs completing factual statements. The second half details my attempt to build intuitions about the high-level structure of GPT2-XL and is speculation-heavy. Some familiarity with transformer architecture is assumed but...

Oct 3, 2022 · 43

Trying out Prompt Engineering on TruthfulQA

I try out "let's gather the relevant facts" as a zero-shot question answering aid on TruthfulQA. It doesn't help more than other helpful prompts. Possibly it might work better on more typical factual questions. This post could potentially be useful to people interested in playing with OpenAI's API, or who...

Jul 23, 2022 · 10

Megan Kinniment's Shortform

Jul 14, 2022 · 3

GPT-3 Catching Fish in Morse Code