No discussion of AI capabilities today is complete without referencing the famous METR graph, but what is it actually measuring? In this session we will dive into the nitty-gritty of measuring AI time horizons, discussing METR's methodology and its limitations.
To prepare for the session, please read:
The original METR paper (March 19th, 2025): link (~15min read)
Notes from a HCAST task contributor (includes METR author response in comments) (May 5th, 2025): link (~10min read)
Against the METR graph (Nathan Witkins, Jan 20th, 2026): link (~25min read)
Thomas Kwa on what the METR graph does and doesn't show (Jan 22nd, 2026): link (~15min read)
METR's latest update (Jan 29th, 2026): link (~15min read)