Gradient descent doesn't select for inner search

Does it need to be either "pure search" or "no search"?

My expectation would be that in the limit it learns a ton of heuristics about what usually works, and learns to do a much more efficient search using those heuristics. This would especially be the case if e.g. capabilities researchers give the nets extra options to speed up the search (for instance, it's totally possible to embed a few steps of gradient descent into the inference of a neural network, since gradient descent is differentiable - I don't know if that will eventually be shown to help improve capabilities, but if it will, capabilities researchers would presumably do it).

[-]Ivan Vendrov3y*30

Agreed that "search" is not a binary but more like a continuum, where we might call a program more "search-like" if it is enumerating possible actions and evaluating their consequences, and less "search-like" if it is directly mapping representations of inputs to actions. The argument in this post is that gradient descent (unlike evolution, and unlike human programmers) doesn't select much for "search-like" programs. If we take depth-first search as a central example of search, and a thermostat as the paradigmatic non-search program, gradient descent will select for something more like the thermostat.

it's totally possible to embed a few steps of gradient descent into the inference of a neural network, since gradient descent is differentiable

Agreed, and networks may even be learning something like this already! But in my ontology I wouldn't call an algorithm that performs, say, 5 steps of gradient descent over a billion-parameter space and then outputs an action very "search-like"; the "search" part is generating a tiny fraction of the optimization pressure, relative to whatever process sets up the initial state and the error signal.

Maybe this is just semantics, because for high levels of capability search and control are not fundamentally different (what you're pointing to with "much more efficient search" - an infinitely efficient search is just optimal control, you never even consider suboptimal actions!). But it does seem like for a fixed level of capabilities search is more brittle, somehow, and more likely to misgeneralize catastrophically.

[-]Ramana Kumar3yΩ450

How does this square with Are minimal circuits deceptive?

[-]Ivan Vendrov3y*Ω670

Mostly orthogonal:

Evan's post argues that if search is computationally optimal (in the sense of being the minimal circuit) for a task, then we can construct a task where the minimal circuit that solves it is deceptive.
This post argues against (a version of) Evan's premise: search is not in fact computationally optimal in the context of modern tasks and architectures, so we shouldn't expect gradient descent to select for it.

Other relevant differences are

gradient descent doesn't actually select for low time complexity / minimal circuits; it holds time & space complexity fixed, while selecting for low L2 norm. But I think you could probably do a similar reduction for L2 norm as Evan does for minimal circuits. The crux is in the premise.
I think Evan is using a broader definition of search than I am in this post, closer to John Wentworth's definition of search as "general problem solving algorithm".
Evan is doing worst-case analysis (can we completely rule out the possibility of deception by penalizing time complexity?) whereas I'm focusing on the average or default case.

[-]johnswentworth3y30

What exactly do you think the word "search" means?

[-]Ivan Vendrov3y20

See my answer to tailcalled:

a program is more "search-like" if it is enumerating possible actions and evaluating their consequences

I'm curious if you mean something different by search when you say that we're likely to find policies that look like an "explicit search process + simple objective(s)"

[-]johnswentworth3y74

Yeah, that's definitely not what I mean by search (nor what I think others mean by search, in the context of AI and inner agents).

Roughly speaking, a general search process is something which takes in a specification of some problem or objective (from a broad class of possible problems/objectives), and returns a plan which solves the problem or scores well on the objective. For instance, a gradient descent algorithm takes in an objective, and returns a point which scores well on the objective, for a very broad class of possible objectives; gradient descent is therefore a search method.

Enumerating possible actions and evaluating their consequences is one way to do general search, but it's wildly inefficient; I would typically refer to that as "brute force search". Gradient descent does better by leveraging backprop and gradients; approximately none of the algorithmic work done by gradient descent comes from direct evaluation of the consequences of actions. And there are many other tricks one can use too - like memoization on subsearches, or A*-style heuristic search, or (one meta-level up from A*) relaxation-based methods to discover heuristics. The key point is that these tricks are all very general purpose: they work on a very wide variety of search problems, and therefore produce general-purpose search algorithms which are more efficient than brute force (at least on realistic problems).

More advanced general-purpose search methods seem to rely relatively little on enumerating possible actions and evaluating their consequences. By the time we get to human-level search capabilities, we see human problem-solvers spend most of their effort on nontrivial problems thinking about subproblems, abstractions and analogies rather than thinking directly about particular solutions.

[-]Ivan Vendrov3y*1512

I agree that A* and gradient descent are central examples of search; for realistic problems these algorithms typically evaluate the objective on millions of candidates before returning an answer.

In contrast, human problem solvers typically do very little state evaluation - perhaps evaluating a few dozen possibilities directly, and relying (as you said) on abstractions and analogies instead. I would call this type of reasoning "not very search-like".

On the far end we have algorithms like Gauss-Jordan elimination, which just compute the optimal solution directly without evaluating any possibilities. Calling them "search algorithms" seems quite strange.

a general search process is something which takes in a specification of some problem or objective (from a broad class of possible problems/objectives), and returns a plan which solves the problem or scores well on the objective.

This appears to be a description of any optimization process, not just search - in particular it would include Gauss-Jordan elimination. I guess your ontology has "optimization" and "search" as synonyms, whereas mine has search as a (somewhat fuzzy) proper subset of optimization. Anyways, to eliminate confusion I'll use Abram Demski's term selection in future. Also added a terminology note to the post.

[-]johnswentworth3y40

My ontology indeed has search and a narrow notion of optimization as approximately synonyms; they differ only somewhat in type signature and are easily interchangeable. Conceptually, both take in an objective, and return something which scores highly on the objective. (This is narrower than e.g. Flint's notion of "optimization"; in that ontology it might be called a "general-purpose optimizer" instead.)

Anyway, insofar as any of this is relevant to the arguments for mesa-optimization, it's the notion of search/optimization as general problem solving which applies there.

[-]Quintin Pope3y72

Roughly speaking, a general search process is something which takes in a specification of some problem or objective (from a broad class of possible problems/objectives), and returns a plan which solves the problem or scores well on the objective.

This description of “search” seems far too broad. E.g., it seems to include things like lookup tables, and AFAICT, literally every way to solve problems or satisfy objectives?

Seems like “search” should mean something more specific than “problem solving”.

[-]johnswentworth3y42

It excludes methods specific to a small number of problems. Search is about general problem solving.

Anyway, IIUC this is how the term "search" has historically been used in AI, it is also the notion of "search" which is relevant to the arguments for mesaoptimization in Risks From Learned Optimization, it is also the notion of search which is relevant to general intelligence being A Thing, it is the notion of search which is relevant to the possibility of an ML system suddenly grokking "general search" and thereby undergoing a rapid increase in capabilities, etc.

[-]Vladimir_Nesov3y30

we will select for "explicit search processes with simple objectives"

The actual argument is that small descriptions give higher parameter space volume, and so the things we find are those with short descriptions (low Kolmogorov complexity). The thing with a short description is the whole mesa-optimizer, not just its goal. This is misleading for goals because low Kolmogorov complexity doesn't mean low "complexity" in many other senses, so an arbitrary goal with low Kolmogorov complexity would actually be much more "complicated" than intended base objective. In particular, it probably cares about the real world outside an episode and is thus motivated to exhibit deceptively aligned behavior.

I think "explicit search" is similarly misleading, because most short programs (around a given behavior) are not coherent decision theoretic optimizers. Search would only become properly explicit after the mesa-optimizer completes its own agent foundations alignment research program and builds itself a decision theory based corrigible successor. A mesa-optimizer only needs to pursue the objective of aligned behavior (or whatever it's being selected for), and whether that tends to be its actual objective or the instrumental objective of deceptive alignment is a toss-up, either would do for it to be selected. But in either case, it doesn't need to be anywhere ready to pursue a coherent goal of its own (as aligned behavior is also not goal directed behavior in a strong sense).

[-]Ivan Vendrov3y120

Agreed on "explicit search" being a misleading phrase, I'll replace it with just "search" when I'm referring to learned programs.

small descriptions give higher parameter space volume, and so the things we find are those with short descriptions

I don't think I understand this. GPT-3 is a thing we found, which has 175B parameters, what is the short description of it?

[-]Vladimir_Nesov3y20

I mean relatively short, as in the argument for why overparametrized models generalize. They still do get to ~memorize all training data, but anything else comes at a premium, reduces probability of getting selected for models whose behavior depends on those additional details. (This use of "short" as meaning "could be 500 gigabytes" was rather sloppy/misleading of me, in a comment about sloppy/misleading use of words...)

[-]Ivan Vendrov3y30

Ah I think that's the crux - I believe the overparametrized regime finds generalizing models because gradient descent finds functions that have low function norm, not low description length. I forget the paper that showed this for neural nets but here's a proof for logistic regression.

[-]Vladimir_Nesov3y*20

I'm thinking of a setting where shortest descriptions of behavior determine sets of models that exhibit matching behavior (possibly in a coarse-grained way, so distances in behavior space are relevant). This description-model relation could be arbitrarily hard to compute, so it's OK for shortest descriptions to be shortest programs or something ridiculous like that. This gives a partition of the model/parameter space according to the mapping from models to shortest descriptions of their behavior. I think shorter shortest descriptions (simpler behaviors) fill more volume in the parameter/model space, have more models whose behavior is given by those descriptions (this is probably the crux; e.g. it's false if behaviors are just models themselves and descriptions are exact).

Gradient descent doesn't interact with descriptions or the description-model relation in any way, but since it selects models ~based on behavior, and starts its search from a random point in the model space, it tends to select behaviors from larger elements of the partition of the space of models that correspond to simpler behaviors with shorter shortest descriptions.

This holds at every step of gradient descent, not just when it has already learned something relevant. The argument is that whatever behavior is selected, it is relatively simple, compared to other behaviors that could've been selected by the same selection process. Further training just increases the selection pressure.

[-]Ivan Vendrov3y40

Yeah I think you need some additional assumptions on the models and behaviors, which you're gesturing at with the "matching behaviors" and "inexact descriptions". Otherwise it's easy to find counterexamples: imagine the model is just a single N x N matrix of parameters, then in general there is no shorter description length of the behavior than the model itself.

Yes there are non-invertible (you might say "simpler") behaviors which each occupy more parameter volume than any given invertible behavior, but random matrices are almost certainly invertible so the actual optimization pressure towards low description length is infinitesimal.

[-]Lauro Langosco3y20

I think you overestimate the importance of the genomic bottleneck. It seems unlikely that humans would have been as successful as we are if we were... the alternative to the kind of algorithm that does search, which you don't really describe.

Performing search to optimize an objective seems really central to our (human's) capabilities, and if you want to argue against that I think you should say something about what an algorithm is supposed to look like that is anywhere near as capable as humans but doesn't do any search.

[-]Ivan Vendrov3y83

I disagree that performing search is central to human capabilities relative to other species. The cultural intelligence hypothesis seems much more plausible: humans are successful because our language and ability to mimic allow us to accumulate knowledge and coordinate at massive scale across both space and time. Not because individual humans are particularly good at thinking or optimizing or performing search. (Not sure what the implications of this are for AI).

You're right though, I didn't say much about alternative algorithms other than point vaguely in the direction of hierarchical control. I mostly want to warn people not to reason about inner optimizers the way they would about search algorithms. But if it helps, I think AlphaStar is a good example of an algorithm that is superhuman in a very complex strategic domain but is very likely not doing anything like "evaluating many possibilities before settling on an action". In contrast to AlphaZero (with rollouts), which considers tens of thousands of positions before selecting an action. AlphaZero (just the policy network) I'm more confused about... I expect it still isn't doing search, but it is literally trained to imitate the outcome of a search so it might have similar mis-generalization properties?

[-]Lauro Langosco3y30

(Note that I'm not making a claim about how search is central to human capabilities relative to other species; I'm just saying search is useful in general. Plausibly also for other species, though it is more obvious for humans)

From my POV, the "cultural intelligence hypothesis" is not a counterpoint to importance of search. It's obvious that culture is important for human capabilities, but it also seems obvious to me that search is important. Building printing presses or steam engines is not something that a bundle of heuristics can do, IMO, without gaining those heuristics via a long process of evolutionary trial-and-error. And it seems important that humans can build steam engines without generations of breeding better steam-engine-engineers.

Re AlphaStar and AlphaZero: I've never played Starcraft, so I don't have good intuitions for what capabilities are needed. But on the definitions of search that I use, the AlphaZero policy network definitely performs search. In fact out of current systems it's probably the one that most clearly performs search!

...Now I'm wondering whether our disagreement just comes from having different definitions of search in mind. Skimming your other comments above, it seems like you take a more narrow view of search = literally iterating through solutions and picking a good one. This is fine by me definitionally, but I don't think the fact that models will not learn search(narrow) is very interesting for alignment, or has the implications that you list in the post? Though ofc I might still be misunderstanding you here.

[-]Ivan Vendrov3y10

Yeah it's probably definitions. With the caveat that I don't mean the narrow "literally iterates over solutions", but roughly "behaves (especially off the training distribution) as if it's iterating over solutions", like Abram Demski's term selection.

[-]Vladimir_Nesov3y20

AlphaZero (just the policy network) I'm more confused about... I expect it still isn't doing search, but it is literally trained to imitate the outcome of a search so it might have similar mis-generalization properties?

This suggests that the choice of decision theory that amplifies a decision making model (in the sense of IDA/HCH, or just the way MCTS is used in training AlphaZero) might influence robustness of its behavior far off-distribution, even if its behavior around the training distribution is not visibly sensitive to choice of decision theory used for amplification.

Though perhaps this sense of "robustness" is not very appropriate, and a better one should be explicitly based on reflection/extrapolation from behavior in familiar situations, with the expectation that all models fail to be robust sufficiently far off-distribution (in the crash space), and new models must always be prepared in advance of going there.

[-]Noosphere893y*11

My thinking is that one of the biggest reasons humans managed to dominate is basically 3x more brainpower combined with ways to get rid of the heat necessary to support brainpower, which requires sweating all over the body.

Essentially it's the scaling hypothesis applied to biological systems.

And since intelligence can be used for any goal, it's not surprising that intelligence's main function was cultural.

^{^}

Unclear how true this is - in the limit of capabilities, both search and control processes could be lethal. But intuitively, for a fixed level of capabilities, it seems like search processes are more brittle and likely to mis-generalize catastrophically than control processes.

LESSWRONG
LW

LESSWRONG
LW

47

Gradient descent doesn't select for inner search

47

Ω 24

47

Ω 24

Recap: compression argument for inner optimizers

Compactness, Complexity, and Compute

(Speculative) Implications for interpretability and safety