Allen Schmaltz, PhD
We briefly introduce, at a conceptual level, technical work for deriving robust estimators of predictive uncertainty for large language models (LLMs), and we consider the implications for real-world deployments and AI policy.
The Foundational LLM Limitation: An Absence of Robust Estimators of Predictive Uncertainty
The limitations of unconstrained LLMs, which include the more recent RL-based reasoning-token models, are readily evident to end-users. Hallucinations, highly confident wrong answers, and related failures diminish their benefits in most real-world settings. The punchline is that the end-user has no means of knowing whether an output can be trusted, short of carefully checking it, which precludes model-based automation for most complex, multi-stage pipelines.
The foundational problem for all...
To add to this historical retrospective on interpretability methods: Alternatively, we can use a parameter decomposition of a bottleneck ("exemplar") layer over a model with non-identifiable parameters (e.g., LLMs) to make a semi-supervised connection to the observed data, conditional on the output prediction. This recasts the prediction as a function over the training set's labels and representation space via a metric-learner approximation. How do we know that the matched exemplars are actually relevant, or equivalently, that the approximation is faithful to the original model? One simple (but meaningful) metric is whether the prediction of the metric-learner approximation matches the class of the prediction of the original model, and if they do not, the discrepancies...
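To make that fidelity check concrete, here is a minimal sketch, not the author's implementation: a simple kNN classifier stands in for the metric-learner approximation over exemplar-layer representations, and the representations, labels, and original-model predictions are synthetic placeholders rather than real bottleneck-layer activations. The sketch fits the approximation on training-set representations and labels, then flags test points where its predicted class disagrees with the original model's prediction.

```python
# A minimal sketch of the fidelity check described above: approximate the
# original classifier with a metric learner (here, kNN over exemplar-layer
# representations) and flag test points where the approximation's predicted
# class disagrees with the original model's prediction. All arrays below are
# synthetic placeholders for real exemplar-layer activations and predictions.
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(0)

# Placeholder "exemplar layer" representations and labels for the training set.
n_train, n_test, dim, n_classes = 1000, 200, 32, 3
train_reps = rng.normal(size=(n_train, dim))
train_labels = rng.integers(0, n_classes, size=n_train)

# Placeholder test representations and the original model's predicted classes
# on those same test inputs.
test_reps = rng.normal(size=(n_test, dim))
original_model_preds = rng.integers(0, n_classes, size=n_test)

# Metric-learner approximation: nearest neighbors in representation space, so
# each prediction is explicitly a function of matched training exemplars.
approx = KNeighborsClassifier(n_neighbors=5, metric="euclidean")
approx.fit(train_reps, train_labels)
approx_preds = approx.predict(test_reps)

# Fidelity check: does the approximation reproduce the original model's
# predicted class? Disagreements are flagged for further inspection.
matches = approx_preds == original_model_preds
print(f"Approximation fidelity: {matches.mean():.2%} of predictions match")
discrepancy_idx = np.flatnonzero(~matches)
print(f"{discrepancy_idx.size} discrepant predictions flagged")
```

kNN is used here only because it is the simplest metric learner over a representation space; the same check applies to any approximation whose predictions are an explicit function of matched training exemplars.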