No Answer Needed: Predicting LLM Answer Accuracy from Question-Only Linear Probes
TLDR: This is the abstract, introduction and conclusion to the paper. See here for a summary thread.

Abstract

Do large language models (LLMs) anticipate when they will answer correctly? To study this, we extract activations after a question is read but before any tokens are generated, and train linear probes...
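
The probing setup described in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the "activations" are synthetic vectors standing in for real question-only hidden states, the labels stand in for whether the model later answered correctly, and the probe is plain logistic regression trained by gradient descent. The paper may train its probes differently, and the function names are hypothetical.

```python
import math
import random


def make_synthetic_activations(n, dim, seed=0):
    """Hypothetical stand-in for question-only activations.

    Each vector is Gaussian noise shifted along a fixed direction,
    with the sign of the shift set by the label (1 = answered correctly).
    """
    rng = random.Random(seed)
    xs, ys = [], []
    for _ in range(n):
        y = rng.randint(0, 1)
        shift = 0.5 if y == 1 else -0.5
        xs.append([rng.gauss(0.0, 1.0) + shift for _ in range(dim)])
        ys.append(y)
    return xs, ys


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train_linear_probe(xs, ys, lr=0.1, epochs=200):
    """Fit weights w and bias b by full-batch gradient descent on log loss."""
    dim = len(xs[0])
    n = len(xs)
    w = [0.0] * dim
    b = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * dim
        grad_b = 0.0
        for x, y in zip(xs, ys):
            p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
            err = p - y
            for j in range(dim):
                grad_w[j] += err * x[j]
            grad_b += err
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def probe_accuracy(w, b, xs, ys):
    """Fraction of examples where the sign of the probe logit matches the label."""
    hits = 0
    for x, y in zip(xs, ys):
        logit = sum(wi * xi for wi, xi in zip(w, x)) + b
        hits += int((logit > 0) == (y == 1))
    return hits / len(xs)


# Train on 200 synthetic examples, evaluate on 100 held-out ones.
xs, ys = make_synthetic_activations(300, dim=8)
w, b = train_linear_probe(xs[:200], ys[:200])
acc = probe_accuracy(w, b, xs[200:], ys[200:])
print(f"held-out probe accuracy: {acc:.2f}")
```

With real data, the synthetic generator would be replaced by hidden states captured at the last question token, before generation starts, paired with a correctness label for each question. Evaluating on held-out questions, as the sketch does, is what separates a probe that generalizes from one that memorizes its training set.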