Consider how actors lacking some previously-assumed perfection can approach that perfection in some limit (asymptotic performance / equilibrium / ...). A big reason to care about such limit properties is to undergird arguments about performance in the real world. For example, the big-O performance of an algorithm is used (with caveats) to anticipate its performance on large amounts of real-world data.

Sometimes, when we're doing conceptual cleanup to be able to make limit arguments, we end up with formalisms that directly give us interesting properties in the intermediate stage. We may be able to throw away the arguments from limit behavior, and thus stop caring much about the limit or the formalisms we approximate there. This is the sense in which 'the ideal fades into the background'.

[-]abramdemski5y40

Yep, that's a good way to explain it!

[-]Charlie Steiner5yΩ360

I feel like the logical inductor analogy still has more gas in the tank. Can we further limit the computational power and ask about the finite-time properties of some system that tries to correct its own computationally-tractable systematic errors? I feel like there's some property of "not fooling yourself" that this should help with.

[-]Ronny Fernandez5y30

This might be sort of missing the point, but here is an ideal and maybe not very useful not-yet-theory of rationality improvements I just came up with.

There are a few black boxes in the theory. The first takes you and returns your true utility function, whatever that is. Maybe it's just the utility function you endorse, and that's up to you. The other black box is the space of programs that you could be. Maybe it's limited by memory, maybe by run time; maybe it's any finite state machine with fewer than 10^20 states, or Python programs less than 5000 characters long. In any case, it is some limited set of programs that take your sensory data and motor output history as input and return a motor output. The limitations could be whatever; they don't have to be like this.

Then you take one of these ideal rational agents with your true utility function and the right prior, and you give them the decision problem of designing your policy, but they can only use policies that are in the limited space of bounded programs you could be. Their expected utility assignments over that space of programs are then our measure of the rationality of a bounded agent. You could also give the ideal agent access to your data and see how that changes their ranking, if it does. If you can change yourself such that the program you become is assigned higher expected utility by the agent, then that is an improvement.
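
To make the proposal concrete, here is a minimal toy sketch, not a serious model. Everything in it is an assumption chosen for illustration: the "world" is a coin of unknown bias, the "true utility function" counts correct predictions of the next bit, and the bounded program space is the four lookup tables that map the previous bit to a prediction. The "ideal agent" is just exact enumeration under the prior.

```python
from itertools import product

# Hypothetical toy setup: every name and number below is an illustrative assumption.
PRIOR = {0.2: 0.5, 0.8: 0.5}  # prior over the coin's bias, P(bit == 1)
HORIZON = 4                   # number of bits observed in one episode


def sequence_probability(bits, bias):
    """Probability of a bit sequence under a coin with the given bias."""
    p = 1.0
    for b in bits:
        p *= bias if b == 1 else 1.0 - bias
    return p


def utility(policy, bits):
    """Stand-in 'true utility': number of correctly predicted bits.

    The policy is a lookup table from the previous bit to a prediction,
    which is the whole of its (very bounded) memory.
    """
    score = 0
    prev = 0  # fixed dummy observation before the first bit
    for b in bits:
        if policy[prev] == b:
            score += 1
        prev = b
    return score


def expected_utility(policy):
    """The ideal agent's evaluation: exact expectation under the prior."""
    total = 0.0
    for bias, weight in PRIOR.items():
        for bits in product((0, 1), repeat=HORIZON):
            total += weight * sequence_probability(bits, bias) * utility(policy, bits)
    return total


# The bounded program space: all four lookup tables {previous bit -> prediction}.
BOUNDED_POLICIES = [
    dict(zip((0, 1), actions)) for actions in product((0, 1), repeat=2)
]


def rank_policies():
    """Bounded policies, sorted from highest to lowest expected utility."""
    return sorted(BOUNDED_POLICIES, key=expected_utility, reverse=True)


def is_improvement(old_policy, new_policy):
    """A self-modification counts as an improvement if the ideal agent rates it higher."""
    return expected_utility(new_policy) > expected_utility(old_policy)
```

In this toy, `rank_policies()[0]` is the "repeat the previous bit" table, and `is_improvement` compares any two programs in the space. The exhaustive enumeration in `expected_utility` is exactly the part a bounded agent cannot carry out for itself in any realistic setting.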

[-]abramdemski5y20

The main thing I want to point out is that this is an idealized notion of non-idealized decision theory. In other words, it's still pretty useless to me as a bounded agent without some advice about how to approximate it. I can't very well turn into this max-expected-value bounded policy.

But there are other barriers, too. Figuring out what utility function I endorse is a hard problem. And we face challenges of embedded decision theory; how do we reason about the counterfactuals of changing our policy to the better one?

Modulo those concerns, I do think your description is roughly right, and carries some important information about what it means to self-modify in a justified way rather than cargo-culting.

[-]Cole Wyeth10mo20

We don't want to talk about partial rationality; we want notions of rationality which bounded agents can fully satisfy.

Why expect this kind of thing to exist? It seems to me that the ideas of computational boundedness and optimality are naturally in tension.
