When does rationality-as-search have nontrivial implications?

[-]Vaniver7yΩ6220

This seems broadly right to me, but it seems to me like metaheuristics (in the numerical optimization sense) are practical and have a structure like the one that you're describing. Neural architecture search is the name people are using for this sort of thing in contemporary ML.

What's different between them and the sort of thing you describe? Well, for one, the softening is even stronger; rather than a performance-weighted average across all strategies, it's a performance-weighted sampling strategy that has access to all strategies (but will only actually evaluate a small subset of them). But it seems like the core strategy--be both doing object-level cognition and meta-level cognition about how you're doing object-level cognition--is basically the same.
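To make the contrast concrete, here is a minimal Python sketch of the two schemes. Everything in it is an illustrative assumption rather than how any particular metaheuristic works: the toy strategies, the made-up scores, the softmax weighting, and the function names are all hypothetical.

```python
import math
import random

def softmax_weights(scores, temperature=1.0):
    """Turn performance scores into normalized weights (higher score, more weight)."""
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp((s - top) / temperature) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def weighted_average_prediction(strategies, scores, x):
    """The idealized version: evaluate every strategy, average by performance."""
    weights = softmax_weights(scores)
    return sum(w * f(x) for w, f in zip(weights, strategies))

def sampled_prediction(strategies, scores, x, budget, rng):
    """The softened version: draw `budget` strategies by weight, evaluate only those."""
    weights = softmax_weights(scores)
    chosen = rng.choices(range(len(strategies)), weights=weights, k=budget)
    return sum(strategies[i](x) for i in chosen) / budget

# Hypothetical toy strategies: each one guesses a slope for the target y = 2x.
slopes = (0.5, 1.0, 2.0, 3.0)
strategies = [lambda x, a=a: a * x for a in slopes]
scores = [-abs(a - 2.0) for a in slopes]  # made-up record of past performance

full = weighted_average_prediction(strategies, scores, 1.0)
cheap = sampled_prediction(strategies, scores, 1.0, budget=2, rng=random.Random(0))
```

In this sketch the sampled version is just a Monte Carlo estimate of the full average, so it agrees with it in expectation while only ever calling `budget` strategies.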

It remains unclear to me whether the right way to find these meta-strategies is something like "start at the impractical ideal and rescue what you can" or "start with something that works and build new features"; it seems like modern computational Bayesian methods look more like the former than the latter. When I think about how to describe human epistemology, it seems like computationally bounded Bayes is a promising approach (where probabilities change both by the standard updates among hypotheses that already exist, and by new, yet-to-be-formalized operations that add or remove hypotheses; you want to be able to capture "Why didn't you assign high probability to X?" "Because I didn't think of it; now that I have, I do."). But of course I'm using my judgment that already works to consider adding new features here, rather than having built how to think out of rescuing what I can from the impractical ideal of how to think.
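A minimal sketch of what such a bounded update could look like, in Python. The class name `BoundedBayes`, the `prior_mass` parameter, and the rule of shrinking existing priors proportionally when a hypothesis is added are all illustrative assumptions, not an established formalization of these operations.

```python
import math

class BoundedBayes:
    """Posterior over an explicit, finite hypothesis set that can grow or shrink."""

    def __init__(self, hypotheses):
        # hypotheses: dict mapping name -> likelihood function p(obs | hypothesis).
        # Likelihoods are assumed strictly positive so that log() is defined.
        self.likelihood = dict(hypotheses)
        n = len(self.likelihood)
        self.prior = {name: 1.0 / n for name in self.likelihood}
        self.loglik = {name: 0.0 for name in self.likelihood}
        self.history = []

    def update(self, obs):
        """Standard Bayesian update among hypotheses that already exist."""
        self.history.append(obs)
        for name, lik in self.likelihood.items():
            self.loglik[name] += math.log(lik(obs))

    def add_hypothesis(self, name, lik, prior_mass):
        """'I didn't think of it; now that I have, I do.' (assumed rule)
        Existing priors shrink proportionally to make room for prior_mass,
        and the new hypothesis is scored on all data seen so far."""
        for other in self.prior:
            self.prior[other] *= 1.0 - prior_mass
        self.likelihood[name] = lik
        self.prior[name] = prior_mass
        self.loglik[name] = sum(math.log(lik(o)) for o in self.history)

    def remove_hypothesis(self, name):
        """Forget a hypothesis and renormalize the remaining priors."""
        for table in (self.likelihood, self.prior, self.loglik):
            del table[name]
        total = sum(self.prior.values())
        for other in self.prior:
            self.prior[other] /= total

    def posterior(self):
        top = max(self.loglik.values())  # shift for numerical stability
        unnorm = {n: self.prior[n] * math.exp(self.loglik[n] - top) for n in self.prior}
        total = sum(unnorm.values())
        return {n: u / total for n, u in unnorm.items()}

def bernoulli(p):
    """Likelihood of a coin flip (1 = heads, 0 = tails) under bias p."""
    return lambda obs: p if obs == 1 else 1.0 - p

agent = BoundedBayes({"fair": bernoulli(0.5), "heads_biased": bernoulli(0.8)})
for flip in [0, 0, 0, 0, 1, 0, 0, 0]:  # mostly tails
    agent.update(flip)
before = agent.posterior()  # stuck choosing between two poor explanations

agent.add_hypothesis("tails_biased", bernoulli(0.2), prior_mass=1 / 3)
after = agent.posterior()  # the newly considered hypothesis takes most of the mass
```

The design choice worth noticing is that adding a hypothesis is not itself a Bayesian update: the agent has to pick `prior_mass` by some rule outside the probability calculus, which is the part that still needs formalizing.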

[-]nostalgebraist7yΩ470

> But it seems like the core strategy--be both doing object-level cognition and meta-level cognition about how you're doing object-level cognition--is basically the same.
>
> It remains unclear to me whether the right way to find these meta-strategies is something like "start at the impractical ideal and rescue what you can" or "start with something that works and build new features"; it seems like modern computational Bayesian methods look more like the former than the latter.

I'd argue that there's usually a causal arrow from practical lore to impractical ideals first, even if the ideals also influence practice at a later stage. Occam's Razor came before Solomonoff; "change your mind when you see surprising new evidence" came before formal Bayes. The "core strategy" you refer to sounds like "do both exploration and exploitation," which is the sort of idea I'd imagine goes back millennia (albeit not in those exact terms).

One of my goals in writing this post was to formalize the feeling I get, when I think about an idealized theory of this kind, that it's a "redundant step" added on top of something that already does all the work by itself -- like taking a decision theory and appending the rule "take the actions this theory says to take." But rather than being transparently vacuous, like that example, these theories are vacuous in a more hidden way, and the redundant steps they add tend to resemble legitimately good ideas familiar from practical experience.

Consider the following (ridiculous) theory of rationality: "do the most rational thing, and also, remember to stay hydrated :)". In a certain inane sense, most rational behavior "conforms to" this theory, since the theory is parasitic on whatever existing notion of rationality you had, and staying hydrated is generally a good idea and thus does not tend to conflict with rationality. And whenever staying hydrated is a good idea, one could imagine pointing to this theory and saying "see, there's the hydration theory of rationality at work again." But, of course, none of this should actually count in the "hydration theory's" favor: all the real work is hidden in the first step ("do the most rational thing"), and insofar as hydration is rational, there's no need to specify it explicitly. This doesn't quite map onto the $R / S$ schema, but it captures the way in which I think these theories tend to confuse people.

If the more serious ideals we're talking about are like the "hydration theory," we'd expect them to have the appearance of explaining existing practical methods, and of retrospectively explaining the success of new methods, while not being very useful for generating any new methods. And this seems generally true to me: there's a lot of ensemble-like or regularization-like stuff in ML that can be interpreted as Bayesian averaging/updating over some base space of models, but most of the excitement in ML is in these base spaces. We didn't get neural networks from Bayesian first principles.

[-]Richard_Ngo7y50

This was a very useful and well-explained idea. Strongly upvoted.

[-]DanielFilan7yΩ340

I think that this is a slightly wrong account of the case for Solomonoff induction. The claim is not just that Solomonoff induction predicts computable environments better than computable predictors, but rather that the Solomonoff prior is an enumerable semimeasure that is also a mixture over every enumerable semimeasure, and therefore predicts computable environments at least as well as any other enumerable semimeasure. So, using your notation, $R \in S = \{\text{all enumerable semimeasures}\}$. It still fails as a theory of embedded agency, since it only predicts computable environments, but it's not true that we must only compare it to prediction strategies strictly weaker than itself. The paper (Non-)Equivalence of Universal Priors has a decent discussion of this.
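For concreteness, the "at least as well as" claim is the standard dominance argument for mixtures. The symbols $\xi$ (the mixture), $\nu$ (a member of $S$), and $w_\nu$ (its weight) are notation assumed here, not taken from the post. Writing the mixture as

$$
\xi(x) \;=\; \sum_{\nu \in S} w_\nu \, \nu(x), \qquad w_\nu > 0, \quad \sum_{\nu \in S} w_\nu \le 1,
$$

every term in the sum is nonnegative, so for each $\nu \in S$ and every finite string $x$,

$$
\xi(x) \;\ge\; w_\nu \, \nu(x)
\quad\Longrightarrow\quad
-\log \xi(x) \;\le\; -\log \nu(x) + \log \frac{1}{w_\nu}.
$$

That is, the mixture's cumulative log loss exceeds that of any $\nu \in S$ by at most a constant that depends on $\nu$ but not on $x$. Because $\xi$ is itself an enumerable semimeasure, the bound covers a class that contains $\xi$.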

[-]DanielFilan7yΩ230

Although it's also worth noting that as per Theorem 16 of the above paper, not all universally dominant enumerable semimeasures are versions of the Solomonoff prior, so there's the possibility that the Solomonoff prior only does well by finding a good non-Solomonoff distribution and mimicking that.

[-]cousin_it7y40

Not sure that theorem gives us very much. Yeah, a mixture of all programs must include some programs that stop without outputting anything, so M(empty string) must be strictly greater than M(0)+M(1). But we can also make a semimeasure where M(empty string)=1, M(0)=M(1)=1/2 by fiat, and otherwise defer to a mixture. So it can't itself be a mixture of all programs, but will be just as good for sequence prediction. That's all the theorem says. Basically, if a Swiss army knife solves all problems, we shouldn't be surprised by the existence of other tools (like a Swiss army knife with added fishing hook) that also solve all problems.

[-]DanielFilan7y10

Yes, it's true that the theorem doesn't show that there's anything exciting that's interestingly different from a universal mixture, just that AFAIK we can't disprove that, and the theorem forces us to come up with a non-trivial notion of 'interestingly different' if we want to.

[-]cousin_it7y40

AIXI-like agents can be embedded in uncomputable worlds. So I'm not sure your post has much to do with embeddedness. You're just pointing out that AIXI is a poor metaphor when there are resource constraints, no matter if the agent is embedded or not. Sure, I agree with that.

[-]nostalgebraist7y40

My argument isn’t specialized to AIXI — note that I also used LIA as an example, which has a weaker R along with a weaker S.

Likewise, if you put AIXI in a world whose parts can do uncomputable things (like AIXI), you have the same pattern one level up. Your S is stronger, with uncomputable strategies, but by the same token, you lose AIXI’s optimality. It’s only searching over computable strategies, and you have to look at all strategies (including the uncomputable ones) to make sure you’re optimal. This leads to a rule R distinct from AIXI, just as AIXI is distinct from a Turing machine.

I guess it’s conceivable that this hits a fixed point at this level or some higher level? That would be abstractly interesting but not very relevant to embeddedness in the kind of world I think I inhabit.

[-]cousin_it7y80

Have you seen papers like this one? Embedded AIXIs converge on Nash equilibrium against each other; that's optimal enough, so you don't need to go up another level. I agree it's not very relevant to our world, but there's no difference in terms of embeddedness; the only difference is resource constraints.

[-]nostalgebraist7y80

I was not aware of these results -- thanks. I'd glanced at the papers on reflective oracles but mentally filed them as just about game theory, when of course they are really very relevant to the sort of thing I am concerned with here.

We have a remaining semantic disagreement. I think you're using "embeddedness" quite differently than it's used in the "Embedded World-Models" post. For example, in that post (text version):

> In a traditional Bayesian framework, “learning” means Bayesian updating. But as we noted, Bayesian updating requires that the agent start out large enough to consider a bunch of ways the world can be, and learn by ruling some of these out.
>
> Embedded agents need resource-limited, logically uncertain updates, which don’t work like this.
>
> Unfortunately, Bayesian updating is the main way we know how to think about an agent progressing through time as one unified agent. The Dutch book justification for Bayesian reasoning is basically saying this kind of updating is the only way to not have the agent’s actions on Monday work at cross purposes, at least a little, to the agent’s actions on Tuesday.
>
> Embedded agents are non-Bayesian. And non-Bayesian agents tend to get into wars with their future selves.

The 2nd and 4th paragraphs here are clearly false for reflective AIXI. And the 2nd paragraph implies that embedded agents are definitionally resource-limited. There is a true and important sense in which reflective AIXI can be "embedded" -- that was the point of coming up with it! -- but the Embedded Agency sequence seems to be excluding this kind of case when it talks about embedded agents. This strikes me as something I'd like to see clarified by the authors of the sequence, actually.

I think the difference may be that when we talk about "a theory of rationality for embedded agents," we could mean "a theory that has consequences for agents equally powerful to it," or we could mean something more like "a theory that has consequences for agents of arbitrarily low power." Reflective AIXI (as a theory of rationality) explains why reflective AIXI (as an agent) is optimally designed, but it can't explain why a real-world robot might or might not be optimally designed.

[-]Logan Zoellner1y20

It seems fair to worry about smuggling hypercomputation into optimal agents via assuming global optimization (due to the undecidability of the halting problem). But there are plenty of domains where the global optimum is just the asymptotic limit of local search.

For example, suppose I have to predict the outcome of a biased coin flip: observing a bunch of actual coin flips and then computing p(heads) from them converges just fine to the right answer.
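A quick simulation of that coin-flip case, as a sketch: the bias of 0.7, the number of flips, and the seed are arbitrary choices made for illustration.

```python
import random

def running_estimates(flips):
    """Yield the empirical frequency of heads after each flip (1 = heads)."""
    heads = 0
    for n, flip in enumerate(flips, start=1):
        heads += flip
        yield heads / n

def simulate(p_heads, n_flips, seed=0):
    """Draw n_flips independent flips from a coin with the given bias."""
    rng = random.Random(seed)
    return [1 if rng.random() < p_heads else 0 for _ in range(n_flips)]

flips = simulate(p_heads=0.7, n_flips=100_000)
estimates = list(running_estimates(flips))

# The estimate wanders early on and settles near the true bias as data accumulates.
for n in (10, 1_000, 100_000):
    print(n, round(estimates[n - 1], 3))
```

No global search over hypotheses is needed here: the estimator improves locally with each observation, and its limit is the optimum.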

Even in Turing-complete domains, search sort of just works, even though the halting problem tells us that proving we've found an optimal solution may be impossible.
