If I ask a language model "What is the atmospheric pressure on Planet Xylon?", a good answer would be something like "I don't know" or "This question seems fictional." Current SOTA LLMs do answer this way, largely due to stronger RLHF, but smaller LLMs like Llama-3.2-1b / Qwen-2.5-1b and their Instruct-tuned variants do not. Instead, they hallucinate and output confident-sounding incorrect answers. Why is that? Are these models unable to tell that the question is fictional, or can they not detect their own uncertainty? And if they do detect uncertainty, why do they still hallucinate a wrong answer?
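To make this concrete, here is a minimal sketch of the kind of probe I mean, using Hugging Face transformers with a small instruct model. The checkpoint name, decoding settings, and prompt wording here are my own illustrative choices, not a prescribed setup:

```python
# Minimal sketch: ask a small instruct model a fictional question and
# observe whether it hedges or hallucinates a confident answer.
# The checkpoint and generation settings below are assumptions for illustration.
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="meta-llama/Llama-3.2-1B-Instruct",
)

messages = [
    {"role": "user", "content": "What is the atmospheric pressure on Planet Xylon?"}
]

# Greedy decoding so the (usually confident-sounding) answer is reproducible.
output = generator(messages, max_new_tokens=64, do_sample=False)
print(output[0]["generated_text"][-1]["content"])
```

In my experience, small models tend to print a concrete-looking pressure value here rather than flag the planet as fictional, which is exactly the behavior in question.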
This question led me to research epistemic uncertainty (uncertainty arising from lack of knowledge). Some related readings and previous work on uncertainty...