Great question. I am relatively new to the conversation on AI, but I have developed several critical physical systems requiring high reliability (in human spaceflight). Could you help me understand why the focus is on the spec, which I presume would be used in post-training alignment? My instinct is to focus instead on the pre-training data set as the foundation for making these models as useful as possible for critical-system work. Using the example you offered, could a focus on Jeff Bezos' training set (both technical and leadership training, including experi...
Twenty years in aerospace have taught me the importance of understanding the limitations of our engineering models. We build systems where a single failure mode, such as combustion instability in a rocket engine or resonance in a wing, can be catastrophic. The extreme environment aerospace systems operate in demands rigor, but this same environment also requires taking smart risks to develop new and improved designs. We are unable to test every scenario, including combined structural, thermal, acoustic, and aerodynamic loading, before first flight. This means the success of our first flights depends on strong engineering models. A good model must be technically sound, but since all models are approximations, they must also declare their limits, so we know when we're taking appropriate risks versus simply gambling. As...
This is a fascinating follow-up to this important research. Two things stand out to me.
Wow, just discovered this site. So many interesting articles! Regarding this question, I wonder whether AI will be capable of flexibly verifying mechanical systems without real-world interaction in robotic form, "experiencing" gravity, friction, impacts, viscosity, etc. This would allow simple real-world "tests" like dropping a ball, dragging a backpack, breaking a brick, or pouring syrup, kind of the way kids learn about the real world! Without this "experience," I suspect it will take detailed, human-curated/tailored verification instructions for each sc...
Very interesting work and exciting results. This resonates with something I was thinking about after reading @RogerDearnaley's section in his post about how aspects of safety pretraining match how we raise children. We wouldn't show young children lots of movies about them being supervillains, and we probably shouldn't do that with our AI pretraining either (Terminator, HAL, etc.). At the risk of going too far down the parenting analogy… it is a notable parallel that showing examples of both good and bad is effective (as @RogerDearnaley points out)… and fo...