Crossposted from the AI Alignment Forum. May contain more technical jargon than usual.
This is a linkpost for https://arxiv.org/abs/2301.07608
Did anyone else see this?
What learning algorithm is in-context learning? Investigations with linear models
Toolformer: Language Models Can Teach Themselves to Use Tools
Timo Schick, Jane Dwivedi-Yu, Roberto Dessì, Roberta Raileanu, Maria Lomeli, Luke Zettlemoyer, Nicola Cancedda, Thomas Scialom
(Submitted: 9 Feb 2023)
This paper shows that an LLM can appropriate arbitrary models (including optimisation models, such as search algorithms) as affordances.
Human-Timescale Adaptation in an Open-Ended Task Space
Adaptive Agent Team, Jakob Bauer, Kate Baumli, Satinder Baveja, Feryal Behbahani, Avishkar Bhoopchand, Nathalie Bradley-Schmieg, Michael Chang, Natalie Clay, Adrian Collister, Vibhavari Dasagi, Lucy Gonzalez, Karol Gregor, Edward Hughes, Sheleem Kashem, Maria Loks-Thompson, Hannah Openshaw, Jack Parker-Holder, Shreya Pathak, Nicolas Perez-Nieves, Nemanja Rakicevic, Tim Rocktäschel, Yannick Schroecker, Jakub Sygnowski, Karl Tuyls, Sarah York, Alexander Zacherl, Lei Zhang
(Submitted: 18 Jan 2023)
This paper goes well beyond the result of "In-context Reinforcement Learning with Algorithm Distillation" (see also: Sam Marks' "Caution when interpreting Deepmind's In-context RL paper") and is a powerful example of mesa-optimisation however you look at it.