Thanks!
Yes, I completely agree that in-context learning (ICL) is the only genuinely new "ability" LLMs seem to display, and that they start computing only when we prompt them.
There seems to be an impression that, when prompted, LLMs might do something different from (or even orthogonal to) what the user requests (see, for example, Technical Report: Large Language Models can Strategically Deceive their Users when Put Under Pressure, reported here by the BBC). We'd probably agree that this was careful prompt engineering (made possible by IC...
I am one of the authors - thank you for taking the time to go through and summarise our paper!
About your question on the instructions vs inherent abilities:
Consider a scenario where we train a model on the task of Natural Language Inference (NLI), using a dataset like The Stanford Natural Language Inference (SNLI) Corpus, and suppose the model performs exceptionally well on this task. While we can now say that the model possesses the computational capability to excel at NLI, this doesn't necessarily indicate that the model has developed inherent em...
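To make the distinction concrete, here is a minimal sketch of what "performs exceptionally well on NLI" means operationally: we only ever measure task accuracy on labelled (premise, hypothesis) pairs. The `toy_predict` classifier below is a hypothetical stand-in (a real one would be a model fine-tuned on SNLI); the point is that a high score from `nli_accuracy` licenses claims about capability on this task, and nothing more.

```python
# Sketch: evaluating a classifier's accuracy on SNLI-style NLI examples.
# A high score shows computational capability on the task, not any
# "inherent" ability beyond it.
from typing import Callable

# SNLI's three gold labels.
LABELS = ("entailment", "contradiction", "neutral")

def nli_accuracy(predict: Callable[[str, str], str],
                 examples: list[tuple[str, str, str]]) -> float:
    """Fraction of (premise, hypothesis, gold_label) triples predicted correctly."""
    correct = sum(predict(premise, hypothesis) == gold
                  for premise, hypothesis, gold in examples)
    return correct / len(examples)

# Hypothetical stand-in predictor: treats a hypothesis that appears verbatim
# in the premise as entailed, everything else as neutral.
def toy_predict(premise: str, hypothesis: str) -> str:
    return "entailment" if hypothesis in premise else "neutral"

examples = [
    ("A man is playing a guitar outside.", "A man is playing a guitar", "entailment"),
    ("A man is playing a guitar outside.", "A man is sleeping", "contradiction"),
]
print(nli_accuracy(toy_predict, examples))  # 0.5 for this toy predictor
```

Whatever the score, the evaluation is confined to the labelled task; inferring abilities outside it requires separate evidence.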
Hi there,
I am one of the authors - thank you for your interest in this paper.
The focus of the paper is the discussion surrounding the "existential threat" posed by latent hazardous abilities. Essentially, our results show that there is no evidence to suggest that models can plan and reason independently of what they are explicitly required to do through their prompts.
Importantly, as mentioned in the paper, there remain other concerns regarding the use of LLMs: for example, the ease with which they can be ...
Just wanted to share that this work has now been peer-reviewed and accepted at ACL 2024.
The arXiv entry has been updated with the published ACL version: https://arxiv.org/abs/2309.01809