Abstract
How could machines learn as efficiently as humans and animals? How could machines learn to reason and plan? How could machines learn representations of percepts and action plans at multiple levels of abstraction, enabling them to reason, predict, and plan at multiple time horizons? This position paper proposes an architecture and training paradigms with which to construct autonomous intelligent agents. It combines concepts such as a configurable predictive world model, behavior driven through intrinsic motivation, and hierarchical joint embedding architectures trained with self-supervised learning.
Meta's Chief AI Scientist Yann LeCun lays out his vision for what an architecture for generally intelligent agents might look like.
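For concreteness, here is a minimal sketch of the "joint embedding" idea the abstract mentions, in PyTorch. Everything below is my own illustration, not code from the paper: the module names (`Encoder`, `predictor`), the layer sizes, and the EMA target-encoder trick are assumptions chosen to keep the example small.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Hypothetical minimal JEPA-style training step. Two views of a sample are
# embedded; a predictor maps the context embedding to the target embedding,
# and the loss is computed in representation space, not input space.

class Encoder(nn.Module):
    def __init__(self, dim_in=128, dim_emb=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(dim_in, 256), nn.ReLU(), nn.Linear(256, dim_emb)
        )

    def forward(self, x):
        return self.net(x)

context_enc = Encoder()
target_enc = Encoder()  # updated as an EMA of context_enc; receives no gradients
target_enc.load_state_dict(context_enc.state_dict())
for p in target_enc.parameters():
    p.requires_grad_(False)

predictor = nn.Sequential(nn.Linear(64, 64), nn.ReLU(), nn.Linear(64, 64))
opt = torch.optim.Adam(
    list(context_enc.parameters()) + list(predictor.parameters()), lr=1e-3
)

def training_step(x_context, x_target, ema=0.996):
    s_x = context_enc(x_context)            # context embedding
    with torch.no_grad():
        s_y = target_enc(x_target)          # target embedding (stop-gradient)
    loss = F.mse_loss(predictor(s_x), s_y)  # predict in embedding space
    opt.zero_grad()
    loss.backward()
    opt.step()
    # Slow EMA update of the target encoder, one common way to
    # discourage representational collapse
    with torch.no_grad():
        for p_t, p_c in zip(target_enc.parameters(), context_enc.parameters()):
            p_t.mul_(ema).add_(p_c, alpha=1 - ema)
    return loss.item()

# Toy usage: two "views" of the same underlying sample
x = torch.randn(32, 128)
print(training_step(x + 0.1 * torch.randn_like(x), x))
```

The point of the sketch is just the distinguishing feature: the prediction error lives in representation space rather than pixel space, which is what separates a joint embedding predictive architecture from a generative world model.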
My model of Eliezer winces when a proposal for AGI design is published rather than kept secret. Part of me does too.
One upshot, though, is that it gives AI safety researchers and advocates a more tangible case to examine. Architecture-specific risks can be identified, and central concerns like inner alignment can be evaluated against the proposed architecture and, where they still apply, made more concrete and convincing.
I'm still reading the LeCun paper (currently on page 9). One thing it's reminding me of so far is Steve Byrnes' writing on brain-like AGI (and related safety considerations): https://www.lesswrong.com/s/HzcM2dkCq7fwXBej8