The progress and success of AI research over the past decade all seem to point back to some very basic principles about the nature of intelligence discovered last century: 1) the feedback mechanism by Norbert Wiener, 2) information theory by Claude Shannon, 3) universal computation by Alan Turing.

1) Feedback, in a very general sense, is an abstraction of an information closed loop that ensures the continual improvement/adaptation of an intelligent sub-system.

Humans and animals need feedback to improve their skill at doing something. A product/technique needs real-world feedback signals to iteratively improve until it is usable. Any useful autonomous system is a closed-loop feedback system. From this perspective, it is unsurprising that open-loop self-supervised/imitation learning alone does not produce satisfactory results: there is no feedback about the agent's performance in the real world, while this information is necessary for improvement/alignment. An LLM trained via open-loop self-supervised learning, despite unlocking amazing capabilities like in-context learning (ICL) and chain-of-thought (CoT) reasoning, does not know how to properly respond to humans. Similarly, a lack of information feedback is one of the main reasons that advances in autonomous driving and robot learning are slow. A driving policy learned in a purely open-loop manner by imitating human driver demonstrations has a much higher collision rate than the demonstrations themselves. It can only be improved by incorporating an RL loss or some mechanism equivalent to DAgger, because without that it lacks information about what an unsafe/collision state is.
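The DAgger-style loop mentioned above can be sketched in a few lines. Everything below is a toy instantiation for illustration (a 1-D "steer toward zero" task with a nearest-neighbor learner, none of it from any specific library); the point is only the structure of the loop: roll out the *current* policy, let the expert relabel the states the learner actually visits, and refit on the aggregated data, which is precisely the feedback channel that pure open-loop imitation lacks.

```python
import random

def dagger(expert, fit, rollout, n_iters=5):
    """Minimal DAgger loop: aggregate expert labels on learner-visited states."""
    dataset = []
    policy = expert                  # iteration 0 rolls out the expert itself
    for _ in range(n_iters):
        states = rollout(policy)                     # visit learner-induced states
        dataset += [(s, expert(s)) for s in states]  # expert relabels them (feedback)
        policy = fit(dataset)                        # supervised fit on the aggregate
    return policy

# --- Toy instantiation (hypothetical, illustration only) --------------------
# State: position on a line; the "expert" steers toward 0.
expert = lambda s: -1 if s > 0 else 1

def rollout(policy, steps=20, seed=0):
    rng = random.Random(seed)
    s, visited = rng.uniform(-3, 3), []
    for _ in range(steps):
        visited.append(s)
        s += policy(s) + rng.uniform(-0.3, 0.3)  # noisy dynamics drift
    return visited

def fit(dataset):
    # 1-nearest-neighbor "learner": act as the expert did in the closest seen state.
    def policy(s):
        _, a0 = min(dataset, key=lambda sa: abs(sa[0] - s))
        return a0
    return policy

policy = dagger(expert, fit, rollout)
print(policy(2.0), policy(-2.0))  # steers back toward 0: -1 1
```

The key design point is that `rollout(policy)` uses the learner's own state distribution, so the dataset eventually covers the off-demonstration states (the analogue of near-collision states) that open-loop imitation never sees.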

From this perspective, an AGI must be an agent that can autonomously interact with the environment to collect feedback on its behavior through observation, and continuously process and incorporate this information through some mechanism (e.g., RL for AIXI; neural plasticity/Hebbian learning/self-critique for humans and animals; rewriting its own source code for a Gödel machine).

2) Information entropy is closely related to thermodynamic entropy via Landauer's principle, and together they obey the second law of thermodynamics. As such, a closed system (e.g., a machine-learning algorithm) without any information feedback from the external environment (more data, human knowledge) typically cannot improve its level of intelligence (entropy reduction).

The most common approach to closing the information feedback loop is "information backpropagation through the engineer's brain", and one of its most disguised forms is "doing research to add various inductive biases in the form of new models/algorithms". However, as Richard Sutton pointed out, the only approaches that scale are search (computation) and learning (information from data, or from a self-play game through computation). All the inductive biases turn out to bottleneck performance as model/data size scales up. This is no surprise: "information backpropagation through the human brain" is not a very efficient way to improve intelligence, because 1) humans cannot extract information from large amounts of raw data; and 2) humans rely on System 2 thinking to improve algorithms, and System 2, despite being flexible, is serial and slow. As such, humans become the major bottleneck of the information closed loop underlying the algorithm-iteration process. As a result, developing rule-based planning algorithms for autonomous driving is doomed to fail. Only when the whole community gets enough feedback from the slow progress of these fields and starts to focus on accelerating the information feedback loop (the data flywheel) do we start making real progress. From the information-entropy perspective, an intelligent embodied agent is a low-entropy agent, and therefore reducing entropy by adding information (data) and energy (computation) is the only scalable way to get there.

3) Computation is information processing, which establishes information flow. Training a neural network is a process of extracting information from observations of the world/data into biological/artificial neural networks. Computation can also convert energy into information-entropy reduction, accompanied by a corresponding increase in thermodynamic entropy (Landauer's principle).
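For a sense of scale, Landauer's principle puts a hard floor on this energy cost: erasing one bit of information at temperature T dissipates at least k_B T ln 2 of heat. A quick back-of-the-envelope calculation (the 300 K "room temperature" is an illustrative choice):

```python
import math

K_B = 1.380649e-23  # Boltzmann constant, J/K (exact under the 2019 SI)

def landauer_limit(bits, temperature_kelvin=300.0):
    """Minimum energy in joules dissipated to erase `bits` bits at temperature T."""
    return bits * K_B * temperature_kelvin * math.log(2)

print(landauer_limit(1))    # one bit at room temperature: ~2.87e-21 J
print(landauer_limit(8e9))  # a full gigabyte: ~2.3e-11 J
```

Real hardware spends many orders of magnitude more energy per bit than this bound, which is one way to see how far current computation is from the thermodynamic limit.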

The human thinking process is a certain form of computation: it consumes energy and typically reduces the information entropy of the posterior distribution over all possible world models conditioned on all observations (in a very general sense, including all sensed signals and decoded messages such as natural language). Machines have great advantages over humans and animals in terms of computation, largely attributable to their advantage in energy supply. System 2 intelligence is energy-consuming, and humans can only exploit it in a slow, single-threaded process through the global workspace/attention mechanism. In contrast, there are a huge number of parallel attention layers within a transformer, and arguably that might be why GPT can generate logically coherent text rapidly at scale.
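The serial-vs-parallel contrast above is visible directly in the computation pattern of multi-head attention. Here is a minimal NumPy sketch (random projection weights, purely illustrative, not a trained model) in which every head is evaluated by one batched matrix multiply rather than one at a time:

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(z):
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def multi_head_attention(x, n_heads):
    # x: (seq_len, d_model). Weights are random here: this sketches the
    # computation pattern only.
    seq, d_model = x.shape
    d_head = d_model // n_heads
    wq, wk, wv = (rng.normal(size=(d_model, d_model)) / np.sqrt(d_model)
                  for _ in range(3))
    def split(z):
        # (seq, d_model) -> (n_heads, seq, d_head)
        return z.reshape(seq, n_heads, d_head).transpose(1, 0, 2)
    q, k, v = split(x @ wq), split(x @ wk), split(x @ wv)
    # One batched matmul computes the scores for ALL heads at once: the heads
    # run in parallel, unlike the single serial thread of human System 2.
    scores = softmax(q @ k.transpose(0, 2, 1) / np.sqrt(d_head))
    out = scores @ v                        # (n_heads, seq, d_head)
    return out.transpose(1, 0, 2).reshape(seq, d_model)

x = rng.normal(size=(4, 16))
y = multi_head_attention(x, n_heads=4)
print(y.shape)  # (4, 16)
```

The leading `n_heads` batch dimension is the whole point: on parallel hardware all heads (and, across layers, many such blocks) execute concurrently.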

It can be readily seen that almost all the capabilities considered important aspects of intelligence in the AI community can be reduced to the aforementioned principles in some way. But what about generalization? What is generalization, and why do LLMs show strong out-of-distribution (OOD) generalization? To answer this question, let's look at how humans generalize. Arguably, humans generalize by combining induction and deduction. Induction is the process of proposing hypotheses about the world that can explain the observed, while deduction is the process of deriving predictions about the unobserved through systematic logical reasoning based on the induced hypotheses. Formally speaking, optimal induction is inferring the posterior over all possible theories/programs/models of the world conditioned on the observed dataset, with the universal prior preferring hypotheses of lower Kolmogorov complexity (Occam's razor, Solomonoff's theory of inductive inference), while optimal deduction is marginalizing over the world model and deriving the posterior over the unobserved dataset. However, Solomonoff induction is uncomputable, thus intractable to execute exactly. I argue that humans conduct approximate Solomonoff induction by rejection sampling: first proposing plausible hypotheses based on intuition (System 1 thinking), and then applying logical reasoning (reducible to search, thus computation) to systematically reject (System 2 thinking) candidate hypotheses that are incompatible with the observed data. Researchers have shown that SGD approximates Bayesian inference, which is closely related to Solomonoff induction and arguably explains the strong generalization capability of LLMs (theories/programs/models are merely high-level abstractions/conceptualizations, while language is also a system of hierarchical abstraction grounded in all the sensing modalities we perceive from the external world).
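The induction-as-rejection-sampling picture can be caricatured in a few lines. Everything below is a toy assumption, nothing like the full space of programs: hypotheses are affine rules f(n) = a*n + b, "Kolmogorov complexity" is just coefficient magnitude, proposals are sampled from an Occam prior 2^(-complexity) (System 1), and any proposal contradicting the data is rejected (System 2); deduction then reads a prediction off the surviving hypotheses.

```python
import random

def hypotheses():
    # Toy hypothesis class: affine rules f(n) = a*n + b with small integer a, b.
    return [(a, b) for a in range(-5, 6) for b in range(-5, 6)]

def complexity(h):
    # Crude stand-in for Kolmogorov complexity: total coefficient magnitude.
    return abs(h[0]) + abs(h[1])

def consistent(h, data):
    a, b = h
    return all(a * n + b == x for n, x in enumerate(data, start=1))

def approximate_induction(data, n_samples=10_000, seed=0):
    rng = random.Random(seed)
    hs = hypotheses()
    weights = [2.0 ** -complexity(h) for h in hs]   # Occam / universal prior
    accepted = []
    for _ in range(n_samples):
        h = rng.choices(hs, weights=weights)[0]  # System 1: intuitive proposal
        if consistent(h, data):                  # System 2: reject contradictions
            accepted.append(h)
    return accepted

data = [2, 4, 6, 8]
posterior = approximate_induction(data)          # every survivor is f(n) = 2n
a, b = posterior[0]
prediction = a * (len(data) + 1) + b             # deduction: the unobserved 5th term
print(prediction)  # -> 10
```

In this toy case the data pin down a unique consistent hypothesis; with ambiguous data the accepted set would carry a genuine posterior, weighted toward simpler rules by the prior.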

The first principles of intelligence are fundamental and irreducible, as they are tied to fundamental physical laws. However, one of the main reasons that life and intelligence are difficult to deduce and understand from first principles alone is that reductionism is sometimes limited. Any form of intelligence is a complex dynamical system, with universal features including: 1) feedback/entropy flow at multiple hierarchical scales, sometimes resulting in the strong emergence of both bottom-up and top-down causality; 2) self-organization, resulting in i) emergent self-similarity and fractal/scale-free structure (think of the organization from molecules to cells, to organs, to individual humans, to companies, to the whole of human society), and ii) phase transitions and, as Sam Altman put it, the Moore's law of everything. These empirical observations call for a new theory for understanding emergent behavior: a theory of relationships.




