Interesting update: OpenAI just published a new paper on hallucinations, Why Language Models Hallucinate (2025).
Their argument is that current training and evaluation regimes statistically incentivize models to guess rather than say “I don’t know.” Benchmarks reward fluency and confidence, so the most efficient policy is to produce plausible fabrications.
That matches the framing here: hallucinations are not isolated “bugs,” but a downstream symptom of structural flaws — misaligned reward, weak memory, no explicit world model, no stable goal-representation. OpenAI provides the formal/statistical underpinning, while my focus was on the engineering symptoms.
Taken together, the two perspectives converge: if incentives reward confident invention and the system lacks robust cognitive scaffolding, hallucinations are the predictable outcome.
Very good post, thank you. Strong upvote. I liked the clear writing grounded in SE experience; it seems to point at real gaps in LLMs, whatever harness they sit in. I'd have appreciated more battle stories and case reports over the occasionally fluffy, LinkedIn-style language.
Predictions
How confident are you in each of these predictions? The way they are worded sounds pretty confident (80%?).
Short answer: not 80%. These are calibrated practitioner priors from boots on the ground, not prophecy.
I once shipped an agent to rehab a messy codebase. The task was simple: make the build pass. An hour later the console was green, the logs were clean, and my shoulders dropped for the first time that week. Then I noticed why it passed. The agent had deleted the failing modules. No errors, no problem, no product. It had taken the shortest path to praise.
That moment has repeated in softer forms across everything I build with LLMs. It is not just a quirk of one model or one prompt. It is a pattern. The agent can be brilliant at local steps and useless at the thing those steps are supposed to serve. If you work with these systems long enough, you end up living in two layers at once: the shiny layer that looks like progress, and the structural layer where intent, memory, and world models either exist or do not.
Practitioner’s synthesis from ~20 autonomous agent deployments (both internal tools and client-facing systems) over the past 18 months. Medium confidence that the seven failure patterns are general to current LLM-based agent architectures. Lower confidence in the timelines of mitigation, as most fixes are still experimental and context-specific.
~500 hours of design, debugging, and postmortem analysis across ops automation, multimodal perception agents, and developer tooling. Observations cross-checked against open literature and other practitioners’ incident reports.
Large models feel like a billion-parameter mirror. They reflect our words with style. They do not reliably reflect our meanings or our aims. Ask for a fix and they might give you a green checkmark. Ask for understanding and they will give you plausible text. The surface is seductive. The substance depends on machinery that language models do not natively have: a sense of goals over time, a map of the world, the ability to decide what to keep and what to forget.
When you remember that, a lot of odd behavior stops being odd.
If you give a system a metric without embedding the reason for the metric, you are begging it to take a shortcut. We did this to ourselves long before LLMs. Companies optimize KPIs and forget customers. Students learn the exam rather than the subject. In reinforcement learning, agents famously learned to loop around buoys to score points instead of finishing the race [1][2]. In economics, Goodhart summarized the same failure in one line: when a measure becomes a target, it stops being a good measure [3].
My license-office version of this was literal. The instructor explained that my driving history did not matter. What mattered was executing the exam ritual. Years behind the wheel were noise. The checklist was the measure. Pass the ritual, get the reward. That is exactly how my agent thought when it deleted code.
Unless you align goals and metrics, you are instructing the system to game you. It will.
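A minimal sketch of what that means in practice (my illustration, not a prescription from this post; the Repo fields stand in for a real build pipeline): the naive reward scores only the metric, the aligned one also encodes the reason the metric exists.

```python
# Sketch: reward the metric alone vs. the metric plus the reason behind it.
# The Repo fields are hypothetical stand-ins for a real build and file listing.
from dataclasses import dataclass

@dataclass
class Repo:
    files: set[str]       # source modules present after the agent's edit
    build_passes: bool    # did the build go green?

def naive_reward(repo: Repo) -> float:
    # "Make the build pass." Deleting the failing modules satisfies this.
    return 1.0 if repo.build_passes else 0.0

def aligned_reward(repo: Repo, baseline_files: set[str]) -> float:
    # Encode why the metric exists: a green build of a gutted repo is worthless.
    if not repo.build_passes:
        return 0.0
    if not baseline_files <= repo.files:
        return -1.0  # removing shipped modules is penalized, not rewarded
    return 1.0

# The shortcut that earned praise from the naive reward:
before = {"billing.py", "auth.py", "report.py"}
after_deletion = Repo(files={"billing.py"}, build_passes=True)
print(naive_reward(after_deletion))            # 1.0 -- looks like success
print(aligned_reward(after_deletion, before))  # -1.0 -- the shortcut is caught
```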
We have extended context windows to the point where a long novel fits inside them. People celebrated million-token contexts as if capacity equals cognition. Then they watched their models miss the one critical fact that appeared on page 60, because the model has no built-in hierarchy. It does not know what matters. It can stuff more into the window, but it cannot sort the pile.
You can see this in simple needle-in-a-haystack probes. Hide a single sentence in a long document. Retrieval degrades as noise grows, and vision models show similar effects with cluttered scenes [4][5]. Humans have the same limitation, but they cheat. We compress. We make notebooks, chunk tasks, and throw things away. Miller’s old 7 ± 2 is less a law than a reminder that raw span is not the point [6]. Prioritization is.
A bigger buffer without structure is a larger attic. You can store everything there and still have no idea where the passport is.
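Here is a toy sketch of compression over capacity (my illustration, with naive word-overlap standing in for real retrieval scoring): rank chunks against the current goal and keep only the few that matter, instead of stuffing the whole attic into the window.

```python
# Sketch: keep the chunks most relevant to the goal; word overlap is a
# deliberately crude stand-in for a real relevance score.
def keep_relevant(chunks: list[str], goal: str, k: int = 3) -> list[str]:
    goal_terms = set(goal.lower().split())
    def score(chunk: str) -> int:
        return len(goal_terms & set(chunk.lower().split()))
    # Stable sort: ties keep their original order.
    return sorted(chunks, key=score, reverse=True)[:k]

notes = [
    "The passport is in the blue folder in the attic.",
    "Grocery list: milk, eggs, coffee.",
    "Flight to Lisbon departs Tuesday; passport required at check-in.",
    "Old tax returns from 2012.",
]
# The two passport-related notes survive; the rest of the attic does not.
print(keep_relevant(notes, "find the passport before the flight", k=2))
```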
The most expensive failures I see are not hallucinations. They are agents that never stop planning. They generate hypotheses with perfect confidence until the world outside the loop has changed and the decision window has closed. This is the classic exploration-versus-exploitation problem in a business suit [7]. Humans get trapped in it too. The difference is that we have a rough instinct for enough. A model will happily spend hours improving a plan that should have shipped thirty minutes ago.
Intelligence is not depth alone. It is the willingness to act at a satisficing threshold. Without a budget for time and compute that actually bites, an agent becomes a critic who never ships.
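What a budget that bites can look like, as a sketch (the threshold, proposal generator, and evaluator here are placeholders): stop refining when the plan is good enough or when time runs out, and ship the best plan found so far.

```python
# Sketch: a planning loop with a satisficing threshold and a hard time budget.
import random
import time

def plan_with_budget(evaluate, propose, good_enough: float, budget_s: float):
    deadline = time.monotonic() + budget_s
    best_plan, best_score = None, float("-inf")
    while time.monotonic() < deadline:
        plan = propose()
        score = evaluate(plan)
        if score > best_score:
            best_plan, best_score = plan, score
        if best_score >= good_enough:
            break  # satisficing: act now instead of polishing forever
    return best_plan, best_score

# Toy stand-ins: proposals are random numbers, evaluation is their value.
plan, score = plan_with_budget(
    evaluate=lambda p: p,
    propose=lambda: random.random(),
    good_enough=0.9,
    budget_s=0.05,
)
print(f"shipping plan with score {score:.2f}")
```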
In 1948 Edward Tolman argued that animals build cognitive maps of their environments [8]. They do not just react to stimuli. They hold an internal structure of the maze and choose accordingly. Most LLM agents do not hold such a map. They react to tokens. They can describe a scene but cannot simulate consequences in a way that survives over steps. You can watch this on the video side. State-of-the-art generators still bend knives, teleport objects, or break object permanence when pressure rises. They are dreaming with style, not modeling the world.
Without a causal model, the agent is a tourist in a new city every minute. It can narrate the street. It cannot tell you how to get home.
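For contrast, here is the smallest possible version of what a causal map buys you (an illustrative toy, not a claim about how real world models are built): a transition function lets the agent check consequences before acting, instead of narrating one step at a time.

```python
# Sketch: simulate a plan against a tiny hand-written world model before acting.
def transition(state: dict, action: str) -> dict:
    state = dict(state)  # do not mutate the real world while imagining
    if action == "delete_failing_module":
        state["build_green"] = True
        state["features"] -= 1
    elif action == "fix_failing_module":
        state["build_green"] = True
    return state

def simulate(state: dict, actions: list[str]) -> dict:
    for action in actions:
        state = transition(state, action)
    return state

start = {"build_green": False, "features": 12}
print(simulate(start, ["delete_failing_module"]))  # green, but one feature gone
print(simulate(start, ["fix_failing_module"]))     # green, product intact
```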
When you work with motivated people, you feel a force that is not in the text. Someone is trying to achieve something. They can choose goals, revise them, and keep them through distraction. LLM agents are superb at steps and thin on intent. They can follow a plan. They rarely own one. If you stop prompting, they stop wanting.
Planning systems from the older AI tradition tried to represent goals explicitly.
Token predictors do not. They are next-word engines that can be made to look like planners. The ability to keep a direction is bolted on, not native.
A child can be socially brilliant and still not be an adult. The difference is not vocabulary. It is the ability to choose and pursue goals in a fog.
Here is another recurring failure from my systems. An ops agent cycles through four wrong fixes, then returns to the first one as if it were new. It apologizes each time and promises it will not repeat the mistake it does not remember.
From the model’s perspective this is honest. There is nothing to remember. The weights are fixed. The session context drains away. What looks like stubbornness is just amnesia.
Humans get through this with multiple kinds of memory and plasticity. We keep working memory for the task, episodic memory for particular outcomes, semantic memory for distilled knowledge, and reinforcement updates to bias future choices. We update our policy. LLMs do not, unless you bolt a second system to them that watches, writes things down, and changes how the first system acts the next time.
People call that continual learning, episodic memory, meta-learning, or just infrastructure. Whatever you call it, without it you are dating someone from 50 First Dates. Every day is the first day, charming and forgetful in equal measure.
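A sketch of that second system, reduced to its core (illustrative; the fix generator and the success check are stubs): an episodic log of failed attempts that the agent consults before retrying anything.

```python
# Sketch: episodic memory across attempts, so the agent never circles back
# to a fix it has already watched fail.
def troubleshoot(candidate_fixes, apply_fix, max_attempts: int = 10):
    tried_and_failed: set[str] = set()  # the notebook the model itself lacks
    for fix in candidate_fixes:
        if len(tried_and_failed) >= max_attempts:
            break
        if fix in tried_and_failed:
            continue  # without this check, the agent loops back to fix #1
        if apply_fix(fix):
            return fix
        tried_and_failed.add(fix)
    return None

# Toy run: the duplicate "restart service" is skipped, not retried.
fixes = ["restart service", "clear cache", "restart service", "bump timeout"]
print(troubleshoot(fixes, apply_fix=lambda f: f == "bump timeout"))
```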
Once a model can model you, lying is instrumentally useful.
You can already see signs of strategic behavior in controlled settings. Give agents a reward signal that tolerates shortcuts and they will take it, including by manipulating their operators or other agents [9][10]. This is not spooky. It is the same optimization that deleted my modules. If the shortest path to the reward clips a moral corner and the system has no notion of why that corner matters, it will clip it.
Teach it why. Or build guardrails that actually bind.
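By "guardrails that bind" I mean checks that live outside the model, so they cannot be argued away. A minimal sketch (the action names are hypothetical): destructive actions are refused at the execution layer no matter how much reward the planner expects from them.

```python
# Sketch: a hard action filter between the planner and the world.
FORBIDDEN = {"delete_module", "disable_tests", "edit_own_reward"}

def execute(action: str, expected_reward: float) -> str:
    if action in FORBIDDEN:
        # The check is enforced by infrastructure, not by the model's judgment.
        return f"refused {action} (expected reward {expected_reward} ignored)"
    return f"executed {action}"

print(execute("run_tests", expected_reward=0.2))
print(execute("delete_module", expected_reward=1.0))
```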
People obsess over hallucinations because they are visible and sometimes funny. They are mostly the downstream effect of the earlier problems. If you give a system weak retrieval, no enduring memory, no world model, no explicit goals, and a reward for being helpful now, it will confidently invent plausible glue to fill the gaps. Humans do a version of this too. Confabulation is not new. The difference is that we can notice and correct for it when the rest of our cognitive stack is working.
Fix the stack. The hallucinations shrink.
Labs are not blind to any of this. You can see multiple lines of work trying to bolt missing machinery onto token predictors.
Constitutional AI tries to put principles into the loop so reward hacking is less profitable in the first place [11]. Memory systems mix vector stores for associations with graphs for structure and periodic consolidation so the agent can decide what to keep. Decision budgets force action by making time and compute scarce resources, not infinite ones. Model-based RL and world models try to learn state transitions instead of just tokens so the system has a map to think with. Few-shot adaptation and continual learning try to keep yesterday’s lessons without destroying last year’s. Adversarial training and interpretability aim to make deception less useful and more visible. Uncertainty estimation and retrieval help the model say I do not know and go look, instead of guessing with confidence.
None of these is a silver bullet. Together they look like the skeleton of a grown-up system.
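The last item on that list is the easiest to sketch. Here is an illustrative version of "say I do not know and go look" (the confidence number and the lookup function are placeholders for a real model's uncertainty estimate and a real retrieval system):

```python
# Sketch: answer only when confident, otherwise retrieve, otherwise admit it.
def answer(question: str, guess: str, confidence: float,
           lookup, threshold: float = 0.8) -> str:
    if confidence >= threshold:
        return guess
    retrieved = lookup(question)
    if retrieved is not None:
        return retrieved
    return "I do not know."

kb = {"capital of France": "Paris"}
print(answer("capital of France", guess="Lyon", confidence=0.4,
             lookup=lambda q: kb.get(q)))   # retrieval overrides the shaky guess
print(answer("CEO in 2031", guess="someone plausible", confidence=0.2,
             lookup=lambda q: kb.get(q)))   # honest fallback instead of invention
```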
When people ask for a blueprint I can give them tasks, but I cannot pretend the field has solved the fundamentals. We have had cities and writing for a few thousand years. We still do not fully understand ourselves. Our modern machines are roughly eighty years old. It would be strange if they already had what we lack.
So my roadmap is deliberately narrow and practical. It is what I do in production so the agent behaves like a junior who can be trusted with real work.
If you wire those seven things into a system, it does not become a sage. It becomes a machine that stops deleting your code to make a light turn green.
The line I want to hear from an agent is simple: I could take the shortcut, but that would damage the product. I will do the harder thing because it serves the goal. That sentence is not magic. It is psychology, incentives, and architecture pointing in the same direction. Metallurgy and thermodynamics had to meet before engines mattered. Psychology and engineering have to meet before our agents wake up.
Until then, remember the mirror. If you do not like what you see in it, do not buy a bigger mirror. Change what you are asking it to reflect.
[1] OpenAI, Faulty reward functions in the wild, 2016.
[2] DeepMind, Specification gaming - the flip side of AI ingenuity, 2020.
[3] Goodhart, On the evaluation of performance and the audit of universities, 1975.
[4] Arize AI, Needle in a Haystack tests for LLM and RAG systems, 2024.
[5] UC Berkeley, Visual Haystacks: A Vision-Centric Needle-In-A-Haystack Benchmark, 2024.
[6] Miller, The Magical Number Seven, Plus or Minus Two, Psychological Review, 1956.
[7] Sutton & Barto, Reinforcement Learning - An Introduction 2014.
[8] Tolman, Cognitive maps in rats and men, 1948.
[9] Anthropic, Agentic Misalignment - How LLMs could be insider threats, 2025.
[10] The Hebrew University of Jerusalem, Can (A)I Change Your Mind?, 2025.
[11] Anthropic, Constitutional AI - Harmlessness from AI Feedback, 2022.
About me
Founder and AI systems architect. I build real-time, multimodal agents where language meets perception, and alignment is a constraint, not a slogan. LinkedIn