harrymayne

LLMs Don't Know Their Own Decision Boundaries. Why Is This Important?

TL;DR: We examine whether LLMs can explain their own behaviour with counterfactuals, i.e., whether they can give you scenarios under which they would have acted differently. We find they can't reliably provide high-quality counterfactuals. This (i) is a weird and interesting failure mode, and (ii) should be a concern for high-stakes...

Sep 17, 2025 · 9

Are recent LLMs better at reasoning or better at memorizing?

TL;DR: The LingOly-TOO (L2) Benchmark is carefully designed to counteract memorization in LLMs. It challenges frontier models with unseen questions and answers and makes the case that LLMs are not yet consistent reasoning machines. Links: Paper, Leaderboard, Dataset. Figure 1: LingOly-TOO Benchmark results from the paper....

Mar 7, 2025 · 11

LESSWRONG

The importance of self-explanations

Do recently announced LLMs reason?