What Would I Do? Self-prediction in Simple Algorithms

[-]Alexei5y100

Do you find the solutions to problems like this manually or are there tools out there that can help or completely solve it?

[-]Scott Garrabrant5y60

Somewhere in between? I have a reliable intuition about what will happen. It comes before I can construct the proof, but it can reliably be turned into one. All of the proofs that these agents do what I say they do can be found by asking:

Assume that the probability does not converge as I say it does. How can I use this to make money if I am allowed to see (continuously) the logical inductor's beliefs, and bet against them?

For example, in the first example, if the probability were greater than $1 / 2 + δ$ infinitely often, I could wait until the probability is greater than $1 / 2 + δ$ , then bet that the agent goes right. This bet will always pay out and double my money, and I can do this forever.
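A minimal sketch of the betting argument above, under assumptions that are not spelled out in the comment: the probability in question is that of going left, the agent goes right whenever that probability exceeds 1/2, and a share of "agent goes right" costs one minus that probability and pays 1. The function name and the `stake` parameter are hypothetical.

```python
def profit_from_betting_right(prob_left_by_day, delta=0.05, stake=1.0):
    """Total profit from staking `stake` on "right" each day P(left) > 1/2 + delta.

    Assumption: on such days the agent really does go right, so every bet pays out.
    """
    profit = 0.0
    for p_left in prob_left_by_day:
        if p_left > 0.5 + delta:
            price_right = 1.0 - p_left    # below 1/2 - delta
            shares = stake / price_right  # more than 2 shares per unit staked
            profit += shares - stake      # each share pays 1
    return profit
```

If the condition holds on infinitely many days, the profit in this toy model grows without bound, which is the kind of exploitation a logical inductor is supposed to rule out.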

[-]orthonormal5y60

Re: Ben's question but much lower-level, there's some extent to which a logical inductor keeps having to accept actual failures sometimes as the price of preventing spurious counterfactuals, where in our world we model the consequences of certain actions without ever doing them.

It's a free lunch type of thing; we assume our world has way more structure than just "the universe is a computable environment", so that we can extrapolate reliably in many cases. Strictly logically, we can't assume that: the universe could be set up to violate physics and reward you personally if you dive into a sewer, but you'll never find that out because that's an action you won't voluntarily take.

So if the universe appears to run by simple rules, you can often do without exploration; but if you can't assume it is and always will be run by those rules, then you need to accept failures as the price of knowledge.

[-]Ben Pace5y*40

Something about this feels compelling... I need to do some empiricism to understand what my counterfactuals are. By the time a real human gets to the 5-and-10 problem they’ve done enough, but if you just appear in a universe and it’s your first experience, I’m not too surprised you need to actually check these fundamentals.

(I’m not sure if this actually matches up philosophically with the logical inductors.)

[-]ESRogs5y50

One point is that, while here we're doing ε-exploration, there are ways to do better, such that the rate at which you explore is density zero, such that you explore less and less for large n

Was wondering about this as I was reading. Can I just have ε be $1 / 2^{n}$? Does that cause ε to shrink too fast, such that I don't actually explore?

In general, what are the limits on how fast ε can shrink, and how does that depend on program complexity, or the number of output options?

[-]Charlie Steiner5y60

From a logic perspective you'd think any epsilon>0 would be enough to rule out the "conditioning on a falsehood" problem. But I second your question, because Scott makes it sound like there's some sampling process going on that might actually need to do the thing. Which is weird - I thought the sampling part of logical inductors was about sampling polynomial-time proofs, which don't seem like they should depend much on epsilon.

[-]Scott Garrabrant5y50

Having $ε$ be $1 / 2^{n}$ won't work.

Surprisingly, having $ε$ go to 0 at any quickly computable rate won't work. For example, if $ε = 1 / n$ you could imagine having a logical inductor built out of a collection of traders where one trader has almost all the money and says that on days of the form $A (m)$ , utility conditioned on going left is 0 (where $A$ is a fast growing function). Then, you have a second trader that forces the probability on day $A (m)$ of the statement that the agent goes left to be slightly higher than $1 / A (m)$ . Finally, you have a third trader that forces the expected utility conditioned on right to be very slightly above 0 on days of the form $A (m)$ .

The first trader never loses money, since the condition is never met. The second trader only loses a bounded amount of money, since it is forcing the probability of a sentence that will be false to be very small. The third trader similarly only loses a bounded amount of money. The exploration clause will never trigger, and the agent will never go left on any day of the form $A (m)$ .
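One way to see why the second trader's losses can stay bounded, as a rough sketch: if it pays on the order of $1 / A (m)$ on each day $A (m)$, its total cost is about the sum of $1 / A (m)$, which converges when $A$ grows fast. The choice $A (m) = 2^{m}$ and the helper name below are illustrative assumptions, not part of the construction above.

```python
def total_cost(days):
    """Sum of 1/n over the given days: a rough stand-in for the trader's total loss."""
    return sum(1.0 / n for n in days)

# Assumed stand-in for a fast-growing A: A(m) = 2**m.
sparse_days = [2 ** m for m in range(1, 40)]
all_days = range(1, 2 ** 20)

sparse_total = total_cost(sparse_days)  # geometric series, stays below 1
dense_total = total_cost(all_days)      # harmonic series, keeps growing with the horizon
```

On the sparse days the total stays below 1 no matter how many terms are added, whereas paying $1 / n$ on every day would cost an unbounded amount.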

The issue here is that we need to not only explore infinitely often, we need to explore infinitely often on all simple subsets of days. Even if the probability goes to 0 slowly, you can just look at a subset of days that is sufficiently sparse.

There are ways around this that allow us to make a logical induction agent that explores with density 0 (meaning that the limit as $N$ goes to infinity of the proportion of days $n < N$ that the agent explores is 0). This is done by explicitly exploring infinitely often on every quickly computable subset of days, while still having the probability of exploring go to 0.
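To make "density 0" concrete, here is a small sketch with a hypothetical schedule that explores exactly on perfect-square days. It only illustrates the definition of density: this schedule does not explore infinitely often on every quickly computable subset of days (it never explores on the non-square days), so it is not the construction described above.

```python
import math

def explores(n):
    """Hypothetical schedule: explore exactly on perfect-square days."""
    r = math.isqrt(n)
    return r * r == n

def exploration_density(N):
    """Proportion of days n with 1 <= n < N on which the schedule explores."""
    return sum(explores(n) for n in range(1, N)) / (N - 1)
```

Under this schedule the agent explores infinitely often, yet `exploration_density(N)` shrinks roughly like one over the square root of `N`.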

[-]Daniel Kokotajlo5y40

Noob question: If Pn(An=="left") --> 1/2, doesn't that mean that as n gets bigger, Pn(An=="left") gets closer and closer to 1/2? And doesn't that leave open the question of how it approaches 1/2? It could approach from above, or below, or it could oscillate above and below. Given that you say it converges to randomly choosing, I'm guessing it's the oscillation thing. So is there some additional lemma you glossed over, about the way it converges?

[-]Scott Garrabrant5y60

It does not approach it from above or below. As $N$ goes to infinity, the proportion of $n < N$ for which $A_{n}$ =="Left" need not converge to 1/2, but it must have 1/2 as a limit point, so the proportion of $n < N$ for which $A_{n}$ =="Left" is arbitrarily close to 1/2 infinitely often. Further, the same is true for any easy to compute subsequence of rounds.

So, unfortunately it might be that $A_{n}$ goes left many many times in a row e.g. for all $n$ between $10^{10}$ and $10^{100}$ , but it will still be unpredictable, just not locally independent.
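A toy sequence can show how a running proportion may have 1/2 as a limit point without converging to it. The sequence below is a made-up example (blocks of doubling length, alternating all-Left and all-Right). It is perfectly predictable, so it illustrates only the limit-point statement, not the unpredictability of the actual agent.

```python
def running_left_proportions(num_blocks):
    """Proportion of "Left" choices after each round, for a toy sequence.

    Block k has 2**k rounds: all "Left" if k is even, all "Right" if k is odd.
    """
    proportions = []
    lefts = rounds = 0
    for k in range(num_blocks):
        for _ in range(2 ** k):
            rounds += 1
            lefts += (k % 2 == 0)
            proportions.append(lefts / rounds)
    return proportions

props = running_left_proportions(16)
last_block = props[len(props) // 2:]  # rounds belonging to the final block
```

Within each block the proportion swings between roughly 2/3 and 1/3, passing close to 1/2 on the way, so it keeps returning to 1/2 without ever settling there.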

[-]Pattern5y20

However, it turns out this algorithm might not choose left — because if it always chooses right, it might not have well calibrated probabilities about what would happen if it chose left instead.

The ambiguity between choosing right (as in well*) and choosing right (the direction) is slightly confusing here.

*Which would be the direction left.

It also seems like a proving program would be able to get around needing to explore if it had access to the source code for a scenario.