ncase — LessWrong

67

2y

Numberwang: LLMs Doing Autonomous Research, and a Call for Input

by eggsyntax and ncase

Summary: Can LLMs science? The answer to this question can tell us important things about timelines to AGI. In this small pilot experiment, we test frontier LLMs on their ability to perform a minimal version of scientific research, where they must discover a hidden rule about lists of integers by...

Jan 16, 2025 • 73