LESSWRONG
Charlie Steiner's Shortform
throwaway_2025 · 8mo · 30

Capabilities benchmarks can be highly useful in safety applications. ML benchmarks, which you raised, are a great example: strong ML R&D capabilities lie upstream of many potential risks:

  • Labs may begin automating research, which could shorten timelines.
  • These capabilities may increase proliferation risks of techniques used to develop frontier models.
  • In the extremes, these capabilities may increase the risk of uncontrolled recursive self-improvement.

Labs, governments, and everyone else involved should have an accurate understanding of where the capabilities frontier lies, to enable good decision-making. The only quantitatively rigorous way to track that frontier is with good benchmarks.

Capabilities progress is not bottlenecked on benchmarks telling model developers where to improve, so adding more benchmarks is extremely unlikely to make any significant difference to the pace of progress.

Therefore, I think having more capabilities benchmarks is a good thing: it can greatly increase our understanding of model capabilities without making much of a difference to timelines. However, if you are interested in doing safety work, building capabilities benchmarks is probably not the most effective thing you could be doing.
