Implications of Continual Learning for LLM Agents
Many people regard continual learning (CL) as a key missing capability of LLM systems, and we think its development could have major implications for both the capabilities and the safety of AI agents. Despite this, several important questions about CL remain underexplored:
- What counts as continual learning? Through what pathways might LLM agents acquire CL capabilities? Which limitations of current agents would effective CL mitigate?
- How might CL affect safety and alignment? Which threat models do we need to look out for, and which of the current safety techniques will predictably degrade as agents become stronger continual learners? In what deployment settings might the risks materialize?
- What are some angles of attack for making CL agents safer today, given our substantial uncertainty about the shape those CL agents will take?
This sequence aims to address all of these questions and more.
This sequence is by Rohan Subramani*, Rauno Arike*, Owen Terry, Achu Menon, Zhijing Jin, Francis Rhys Ward, and Seth Herd.