Eliciting base models with simple unsupervised techniques
Authors: Aditya Shrivastava*, Allison Qi*, Callum Canavan*, Tianyi Alex Qiu, Jonathan Michala, Fabien Roger (*Equal contributions, reverse alphabetical) Wen et al. introduced the internal coherence maximization (ICM) algorithm for unsupervised elicitation of base models. They showed that for several datasets, training a base model on labels generated by their algorithm gives similar test accuracy to training on golden labels. To understand which aspects of ICM are most useful, we ran a couple of simple unsupervised elicitation methods that leverage some of the factors that might make ICM work. We compared these baseline methods to training on golden labels for both in-context learning and iterative fine-tuning, using the same datasets as Wen et al. and similar hyperparameters. We find that: * Just using few-shot prompts with random labels recovers 53–93% of the gap between zero-shot accuracy and many-shot prompting with golden labels, and iterative fine-tuning on labels created with this baseline recovers 62–96% of the gap between untrained models and golden fine-tuned models. * The most useful aspects of ICM are * bootstrapping (using predictions from one iteration of few-shot prompting as few-shot examples in the next iteration) * enforcing logical consistency of predictions. * A simple method which combines these recovers 83–100% of the gap between zero-shot accuracy and many-shot prompting with golden labels, and iterative fine-tuning with this method recovers 91–99% of the gap between untrained models and golden fine-tuned models. * These results do not hold if we increase the size of the training set from ~2k data points (as in Wen et al.) to ~30k data points: golden fine-tuning performance increases with dataset size more than unsupervised elicitation performance. * This makes sense, as larger fine-tuning runs likely teach the model something new, they don’t just elicit existing capabilities. There is no strong reason to expect these simple t