Probabilistic Löb theorem

[-][anonymous]13y40

[This comment is no longer endorsed by its author]

Even though Löb's theorem can't be derived by a translation of the standard proof with two epsilon large bounds, might there still be a different proof?

Again, Christiano et al. supposedly prove the existence of a coherent distribution with certain properties. (Someone who knows game theory better than I do should work out what the proof depends on.) Any such distribution necessarily violates the probabilistic Löb's theorem, for roughly the reason given in the OP's second-to-last section.

Is there an intuitive justification for the two epsilon large bound other than that it stops Loeb's theorem from being derived?

Not as far as this lay-reader can see. Again, the distribution must satisfy derivation principle #1. I can't tell if it must obey #3, though if it does that would seem to rule out a stronger version of #2.

[-]Cyan13y20

Do you folks at MIRI think this result is as significant an advance toward FAI as, say, Cox's theorem or Pearl-style causal inference?

[-]jsteinhardt13y100

I'm not at MIRI but I was at part of the workshop and my answer would be no (although I do think the result is worthwhile).

Although I would also place Cox's theorem and graphical models on pretty different standing; the latter seems much more important to me than the former.

[-]CarlShulman13y30

Yes, Cox's result gives a different supporting argument for the use of probability, but didn't introduce new ways of doing probabilistic reasoning, whereas graphical models have had major applications in efficient probabilistic reasoning in practice.

[-]Stuart_Armstrong13y40

This result isn't on its own (it's a possibility/impossibility result about a probability system that may not exist) - but if a variant of Paul's probability can work, then that would be a significant result.

[-]hairyfigment13y20

P('¬F') ≥ a. Since probabilities sum to one, the system can prove that P('F') < 1-a.

You mean, because we can always replace the first "a" with some b>a?

This seems like a very cool - and bizarre - result.

[-]Stuart_Armstrong13y00

Sorry, a bit sloppy with the > and ≥ - since you can always choose a slightly larger "a", it still works out. I've corrected the text.

[-]AlexMennen13y10

And yet F is not only false, the system can disprove it!

Maybe this would be obvious if I knew anything about logic, but how do you know the system is consistent?

[-]hairyfigment13y00

The result linked at the beginning shows that there exists, in principle, a coherent probability distribution with certain properties. Edit: in particular, it assigns probability 0 to F or any other contradiction. And while it doesn't always (ever?) know the exact probability it assigns, it does know that P(F)<1-a for any a<1. That statement itself has probability 1. Therefore the part about violating the probabilistic Löb's theorem clearly holds.

I can't tell at a glance if the distribution satisfies derivation principle #3, but it certainly satisfies #1.

[-]Stuart_Armstrong13y00

We don't - generally we build systems where we can show "system X is consistent iff Peano Arithmetic is consistent". And we assume that PA is consistent (or we panic).

[-]AlexMennen13y10

Sorry, my phrasing was bad; I actually do know that much about logic. But how do you know that this system is consistent iff Peano Arithmetic is consistent?

[-]Stuart_Armstrong13y00

We don't have that system yet! Just that that is what we generally do with the systems we have.

[-]Adele Lopez13y00

So assuming that the Löb's theorem problem is actually solved, the next step is to define an ideal AI which can self modify without weakening its mathematical system? Is it clear how to do this once the Löb obstruction is dealt with, or is this another very difficult math problem?

[-]Stuart_Armstrong13y00

Nope, it's pretty much done - you can trust your future copies probabilistically, without weakening. However, that would require a P that's both defined and computable - not a trivial task.

[-][anonymous]13y00

What does P mean here?

[This comment is no longer endorsed by its author]

[-]redxaxder13y00

Why would you leave out quantifiers? Requiring the reader to stick their own existential or universal quantification in the necessary places isn't very nice.

Is this the correct interpretation of your assumptions? If not, what is it? I am not interested in figuring out which axioms are required to make your proof (which is also missing quantifiers) work.

If ⊢ A, then ⊢ ∀(a < 1)(□_a A).
⊢ □_a A → ∀(c < 1)(∃(b > 0)(□_c □_a+b A)).
⊢ □_a (A → B) → ∃(b > 0)(□_b A → □_a+b B).

[-]Larks13y30

" So the following derivation principles seem reasonable, where the latin indexes (a,b,c...) are meant to represent numbers that can be arbitrarily close to zero"

So they are universally quantified, but in the meta language.

[-]redxaxder13y00

Thank you.

[-]Stuart_Armstrong13y00

As Larks said, we can quantify (the meta language looking in), but the system itself can't quantify. Because then the system could reason that "∀x>0, P(A)<x" means "P(A)=0", which is the kind of thing that causes bad stuff to happen. Here, the system can show "P(A)<x" separately for any given x>0, but can't prove the same statement with the universal quantifier.

[-]redxaxder13y00

Is it unreasonable of me to be annoyed at that kind of writing?

If I understand what's going on correctly, you have a real-indexed schema of axioms and each of them is in your system.

When I read the axiom list the first time I saw that the letters were free variables (in the language you and I are writing in) and assumed that you did not intend for them to be free variables in the formula. My suggestion of how to bind the variables (in the language we are writing in) was (very) wrong, but I still think that it's unclear as written.

Am I confused?

[-]Stuart_Armstrong13y00

Is it unreasonable of me to be annoyed at that kind of writing?

No, it's perhaps not the best explained post I've done. Though it was intended more for technical purposes.

Am I confused?

Not any more, I hope!

[-]ShardPhoenix13y00

I'm still a bit vague on this Löb business. Is it a good thing or a bad thing (from a creating AI perspective) that "Löb's theorem fails" for the probabilistic version?

edit: The old post linked suggests that it's good that it doesn't apply here.

[-]Qiaochu_Yuan13y80

It's a good thing. Löb's theorem is an obstacle (the "Löbstacle," if you will).

[-]Stuart_Armstrong13y30

Löb's theorem means an agent cannot trust future copies of itself, or simply identical copies of itself, to only prove true statements.

[-]jsteinhardt13y10

Er, I don't think this is right. Löb's theorem says that an agent cannot trust future copies of itself, unless those future copies use strictly weaker axioms in their reasoning system.

[-]Stuart_Armstrong13y10

The "can" has now been changed into "cannot". D'oh!

[-]hairyfigment13y00

Good for creating AGI, maybe bad for surviving it. Hopefully the knowledge will also help us predict the actions of strong self-modifying AI.

It does seem promising to this layman, since it removes the best reason I could imagine for considering that last goal impossible.

[-]Decius13y00

It seems like you use (1-a-b) when what seems right to me is ((1-a)(1-b)).

That's important, because I can construct an infinite series such that the sum of that series is smaller than epsilon, but not an infinite series of terms (edit: all of which are between 0 and 1) with a product that converges to anything but 0.

Plus, any system capable of proving a contradiction can disprove it; the ability to prove that 'no statement which implies a contradiction is provable' is almost the equivalent of the claim that 'all provable statements are true'.

[-]jsteinhardt13y30

That's important, because I can construct an infinite series such that the sum of that series is smaller than epsilon, but not an infinite series of terms >1 with a product that converges.

Let a(n) be any series whose sum converges and let b(n) = exp(a(n)). Then the product of b(n) converges iff the sum of a(n) converges. In particular, we can take b(n) = exp(1/2^n) for n = 0 to infinity, whose product converges to exp(2).
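A quick numerical sanity check of that example, offered as a rough sketch rather than a proof. The cutoff `N = 40` and the helper name `partial_product` are illustrative assumptions; they are not part of the argument above.

```python
import math

def partial_product(terms):
    """Multiply a finite list of terms together."""
    result = 1.0
    for t in terms:
        result *= t
    return result

# b(n) = exp(1/2^n) for n = 0 .. N-1; every term is greater than 1.
N = 40  # assumed cutoff, chosen only for illustration
b = [math.exp(1 / 2**n) for n in range(N)]
product = partial_product(b)

# The exponents sum to 2 - 2^(1-N), so the partial products approach exp(2).
print(product, math.exp(2))
```

The partial product is exp of the partial sum of the exponents, which is why convergence of the sum and convergence of the product go together here.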

[-]Decius13y00

My claim was intended to be limited to infinite products where each term is between 0 and 1.

exp(1/2^0)=exp(1)=e>1
exp(epsilon)>1
b(n)>1 for all positive n.

The addition (or removal) of a finite number of nonzero terms at the beginning of an infinite product does not change whether or not the product converges.

Plain language: if you remove enough of the first terms of an infinite product that converges, the product of remaining terms must converge to 1. I'm rusty on my math and LaTeX, but given an infinite product that converges, we can prove that the first N terms have a product within epsilon of the product of the series (by the definitions of limit and converge), which means that the remaining infinite number of terms must have a product within some different epsilon of 1 (again, simply by definitions).

Given an infinite series of numbers all between 0 and 1 and dropping the first N terms, I can find an epsilon such that the product of the remaining terms is always at least that epsilon away from 1. That epsilon is 1-(the N+1th term).

I've made some error somewhere, since the infinite product of b(n)=.01 converges (to 0). I intuit that I may have proven only that the infinite product does not converge to 1. I suspect that there's a step somewhere around the point where I used the phrase 'some different epsilon'.

[-]Kindly13y10

if you remove enough of the first terms of an infinite product that converges, the product of remaining terms must converge to 1

This is not sufficiently precise. More precisely: if T(n) is the tail product of all but the first n terms, then for each fixed n, T(n) is a convergent product, and as n approaches infinity, T(n) should converge to 1 (assuming the initial infinite product converges to a nonzero value).

I think the flaw in your proof is the confusion between convergence of T(n) for a fixed n, as an infinite product, and convergence of T(n), the value of that infinite product, as n goes to infinity.

Also, a more general family of counterexamples is the following: take any positive series a(n) whose sum converges to A, and let b(n) = exp(-a(n)). Then 0<b(n)<1 and the product of b(n) converges to exp(-A). (Obviously this is more or less recycled from the grandparent.) For example, b(n) = exp(-1/2^n).

[-]Decius13y00

Concur: the infinite product of 0<b(n)<1 is in [0,1). That makes sense for probabilities.

[-][anonymous]13y00

Here's a counterexample to your claim:

Let a(n) be a decreasing series which tends to a nonzero limit (for example, a(n) = 1 + 1/n)

Then let b(0) = a(0), b(n) = a(n) / a(n-1)
a(n) is decreasing so for all n>0, b(n) < 1

But the product of the first N b(n)s is a(N), so the infinite product of b(n) converges to the same thing as a(n), which is nonzero.
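The telescoping can be checked exactly with rational arithmetic. This is only a sketch: it starts the index at n = 1 instead of n = 0, which is an assumption made because 1 + 1/n is undefined at n = 0, and `N = 1000` is an arbitrary illustrative cutoff.

```python
from fractions import Fraction

def a(n):
    # The example sequence a(n) = 1 + 1/n; indexing from n = 1 is an assumption,
    # since the formula is undefined at n = 0.
    return 1 + Fraction(1, n)

def b(n):
    # b(1) = a(1), and b(n) = a(n) / a(n-1) for n > 1.
    return a(1) if n == 1 else a(n) / a(n - 1)

N = 1000  # arbitrary cutoff for illustration
product = Fraction(1)
for n in range(1, N + 1):
    product *= b(n)

# The product of the first N terms telescopes to a(N).
print(product, a(N))
```

Every term after the first is strictly between 0 and 1, and the partial products track a(N), which tends to 1 rather than 0.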

[-]Decius13y00

So, a first term greater than 1, and the remainder converges to 1/b(0)?

I did only show that the infinite product of 0<n<1 cannot be one. It can be 1-epsilon for any epsilon.

I don't think that particular a(n) converges, but that doesn't invalidate your point that b(n) selected such that the product from 0 to n equals the sum of a(n) from 0 to n must have a product that converges to the sum of a(n). But the sum of a(n) would have to be decreasing for the terms of b(n) to be between 0 and 1.

[-][anonymous]13y00

I wasn't summing a(n). It's the sequence a(n) that converges, not its sum, and the partial products of the b(n) are equal to the a(n), not the partial sums of the a(n).

Certainly an infinite product of 0<n<1 can't be one. Nobody's disputing that.

[-]Decius13y00

Oh, then it sounds like we are in perfect agreement that my initial claim was wrong; however, we can now generate an infinite series of probabilities less than one whose product remains higher than 1-epsilon for any epsilon. If 1-epsilon is used as the determinator of what the system proves true, Löb's theorem holds.

[-]Stuart_Armstrong13y00

(1-a)(1-b) = 1-a-b+ab > 1-a-b.

Hence P(A) > (1-a)(1-b) implies P(A) > (1-a-b). Since I let these quantities get arbitrarily close to zero, the quadratic difference term doesn't matter.
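As a small check of how little the quadratic term matters, the sketch below compares the two bounds exactly. The sample values of a and b and the function names are hypothetical, chosen only for illustration.

```python
from fractions import Fraction

def product_bound(a, b):
    """Lower bound on P(A) obtained by multiplying: (1-a)(1-b)."""
    return (1 - a) * (1 - b)

def sum_bound(a, b):
    """Lower bound on P(A) obtained by adding improbabilities: 1-a-b."""
    return 1 - a - b

# Hypothetical sample values for a and b; exact fractions avoid rounding noise.
samples = [(Fraction(1, 10), Fraction(1, 10)),
           (Fraction(1, 100), Fraction(1, 50)),
           (Fraction(1, 1000), Fraction(1, 1000))]

# The gap between the two bounds is exactly a*b in each case.
gaps = [product_bound(a, b) - sum_bound(a, b) for a, b in samples]
for (a, b), gap in zip(samples, gaps):
    print(a, b, gap)
```

The gap is always positive, so the product bound is the stronger one, and it shrinks quadratically as a and b approach zero.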

[-]Decius13y00

P(A) ≥ (1-a)(1-b) and a,b > 0 together imply P(A) > (1-a-b). That's a (very slightly) stronger form. I noticed what I thought was an error where you were adding improbabilities together instead of multiplying.

[-]Stuart_Armstrong13y00

The proof doesn't need the stronger form, however.

A)	⊢ □_c (L → (□_bL → Q))	1st derivation, def of L
B)	⊢ □_d L → □_d+c(□_bL → Q)	3rd derivation, A)
C)	⊢ □_d L → (□_e□_bL → □_d+c+eQ)	3rd derivation, B)
D)	⊢ □_d L → (□_e□_d+fL)	2nd derivation
E)	⊢ □_d L → □_d+c+eQ	C), D), setting d+f=b
F)	⊢ □_d L → Q	E), def of Q, setting d+c+e=a

G)	⊢ L	F), def of L, with d=b
H)	⊢ □_b L	1st derivation, G)
I)	⊢ Q	3rd derivation, B)
