

I think Vanessa would argue that "Bayesianism" is not really an option. The non-realizability problem in Bayesianism is not just some weird special case, but the normal state of things: Bayesianism assumes that we have hypotheses fully describing the world, which we very definitely don't have in real life. IB tries to be less demanding, and the laws in the agent's hypothesis class don't necessarily need to be that detailed. I am relatively skeptical of this, and I believe that for an IB agent to work well, the laws in its hypothesis class probably also need to be infeasibly detailed. So fully "adopting Bayes" and fully "adopting infra-Bayes" are both impossible. We probably won't have such a nice mathematical model for the messy decision process a superintelligence actually adopts; the question is whether thinking about it as an approximation of Bayes or of infra-Bayes gives us a clearer picture. It's a hard question: IB has the advantage that its laws need to be less detailed, and the disadvantage that, as I think you rightly point out, it is unnecessarily paranoid. My personal guess is that nothing besides the basic insight of Bayesianism ("the agent seems to update on evidence, sort of following Bayes' rule") will actually be useful in understanding the way an AI will think.

  1. No idea. I don't think it's computationally very tractable. If I understand correctly, Vanessa hopes there will be computationally feasible approximations, but there hasn't been much research into computational complexity yet, because there are more basic unsolved questions.
  2. I'm pretty sure the answer is no. An IB agent (with enough compute) plans for the long run and doesn't enter a chain of deals that leaves it worse off than doing nothing. In general, IB solves the "not exactly a Bayesian expected utility maximizer, but still can't be Dutch booked" problem by potentially refusing to take either side of a bet: if it has Knightian uncertainty about whether a probability is lower or higher than 50%, it will refuse to bet at even odds either for or against. This is something that humans actually often do, and I agree with Vanessa that a decision theory can be allowed to do that.
  3. I had a paragraph about it:
    "Here is where convex sets come in: The law constrains Murphy to choose the probability distribution of outcomes from a certain set in the space of probability distributions. Whatever the loss function is, the worst probability distribution Murphy can choose from the set is the same as if he could choose from the convex hull of the set. So we might as well start by saying that the law must be constraining Murphy to a convex set of probability distributions."
    As far as I can tell, this is the reason behind considering convex sets. This makes convexity pretty central: laws are very central, and now we are assuming that every law is a convex set in the space of probability distributions.
  4. Vanessa said that her guess is yes. In the terms of the linked Arbital article, IB is intended to be an example of "There could be some superior alternative to probability theory and decision theory that is Bayesian-incoherent". Personally, I don't know. I think the article's "A cognitively powerful agent might not be sufficiently optimized" possibility feels more likely in the current paradigm: I can absolutely imagine the first AIs to become a world-ending threat not being very coherent. Also, IB is just an ideal; real-world smart agents will be at best approximations of infra-Bayesian agents (the same holds for Bayesianism). Vanessa's guess is that understanding IB better will still give us useful insights into these real-world models if we view them as IB approximations. I'm pretty doubtful, but maybe. Also, I feel that the problem I write about in my post on the monotonicity principle points at some deeper problem in IB, which makes me doubt that sufficiently optimized agents will actually use (approximations of) the minimax thinking prescribed by IB.
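
The bet-refusal behavior in point 2 and the convex-hull remark in point 3 can be illustrated with a toy sketch. It is not the actual infra-Bayesian formalism: it assumes a single binary event, a hypothetical law saying only that its probability lies between 0.4 and 0.6, and an agent that scores each option by its worst-case expected payoff. All names (`worst_case`, `choose`, the 0.4 to 0.6 interval) are made up for illustration.

```python
# Toy model only: one binary event, even-odds bets, worst-case (maximin) choice.
# LOW and HIGH are hypothetical bounds standing in for a "law" that constrains Murphy.
LOW, HIGH = 0.4, 0.6


def expected_value(p_win: float, stake: float = 1.0) -> float:
    """Expected payoff of an even-odds bet that is won with probability p_win."""
    return p_win * stake - (1.0 - p_win) * stake


def worst_case(payoff, credal_set):
    """Murphy picks the probability in the set that minimizes the agent's payoff."""
    return min(payoff(p) for p in credal_set)


def choose(options, credal_set):
    """Return the name of the option with the best worst-case expected payoff."""
    return max(options, key=lambda name: worst_case(options[name], credal_set))


# The agent's options, each mapping P(heads) to an expected payoff.
options = {
    "bet_heads": lambda p: expected_value(p),
    "bet_tails": lambda p: expected_value(1.0 - p),
    "abstain": lambda p: 0.0,
}

# Murphy restricted to the two extreme points of the set...
extremes = [LOW, HIGH]
# ...versus Murphy choosing from (a sample of) their convex hull.
hull = [(1 - i / 100) * LOW + (i / 100) * HIGH for i in range(101)]

print(choose(options, extremes))
print(choose(options, hull))
for name, payoff in options.items():
    print(name, worst_case(payoff, extremes), worst_case(payoff, hull))
```

Because expected payoff is linear in the probability, Murphy's best choice always sits at an extreme point of the set, so the two endpoints and the sampled hull give the same worst case for every option. At even odds both sides of the bet have a negative worst case, so this toy agent abstains, whereas a Bayesian with any single point estimate other than exactly 0.5 would take one side.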

Personally I like Unsong's God, and I think His approach is better than tiling the Universe with copies of the same optimal entity (or copies of an optimal neighborhood where each being can encounter enough diversity to satisfy them in their own neighborhood). 

The Unsong approach might still lead to uncomfortable outcomes, with some people tortured so that other people can have positive experiences different from the ones already tried (hence the solution to the problem of evil in Unsong). But I think that if suffering is given big enough negative utilities, the system probably wouldn't create people with overall very net-negative lives (and might put suffering p-zombie robots in the world if that's really necessary for other people to have novel positive experiences). These are just my guesses, and I'm not confident that we can actually get this right; as I mentioned, I wouldn't want to create any kind of utilitarian sovereign superintelligence. But I think that the weird asymmetry baked into infra-Bayesianism, namely that it can't give negative utility to any event, makes the whole problem significantly harder and points at a weakness of IB.

I'm pretty sure that's not how it works. By looking around, the agent very soon learns that some things are not maximally horrible: for example, the chair in the room is not broken (so presumably there is some kind of law constraining Murphy to keep the chair intact, at least for now). Why would the agent break the chair, then? Why would that be better than what would happen otherwise?

Okay, maybe I was somewhat unfair in saying there are no results. Still, I think it's good to distinguish "internal results" and "external results". Take the example of complex analysis: we have many beautiful results about complex holomorphic functions, like Cauchy's integral formula. I call these internal results. But what made complex analysis so widely studied is that it could be used to produce some external results, like calculating the integral under the bell curve or proving the prime number theorem. These are questions that interested people even before holomorphic functions were invented, so answering them gave legitimacy to the new complex analysis toolkit. Obviously, Cauchy's integral formula and the like are very useful too, as we couldn't reach the external results without first understanding the toolkit itself better through the internal results. But my impression is that John was asking for an explanation of the external results, as they are of more interest in an introductory post.

I count the work on Newcomb as an external result: "What learning process can lead to successfully learning to one-box in Newcomb's game?" is a natural question someone might ask without having heard of infra-Bayesianism, and I think IB gives a relatively natural framework for that (although I haven't looked into this deeply, and I don't know exactly how natural or hacky it is). On the other hand, of the linked results, I think the 1st, 4th and 5th are definitely internal results, I don't understand the 3rd so I can't comment on it, and the 2nd is Newcomb, which I acknowledge. Similarly, I think IBP itself tries to answer an external question (formalizing naturalized induction), but I'm not convinced it succeeds in that, and I think the theorems are mostly internal results, and not something I would count as external evidence. (I know less about this, so maybe I'm missing something.)

In general, I don't deny that IB has many internal results, which I acknowledge to be a necessary first step. But I think that John was looking for external results here, and in general my impression is that people seem to believe that there are more external results than there really are (did I mention the time I got a message from a group of young researchers asking if I thought "if it is currently feasible integrating multiple competing scientific theories into a single infra-Bayesian model"?). So I think it's useful to be clearer about the fact that we don't have that many external results.

Thanks to Vanessa for writing this. I find it a useful summary of the goals and directions of LTA, which was sorely missing until now. Readers might also be interested in my write-up A mostly critical review of infra-Bayesianism, which tries to give a more detailed explanation of a subset of the questions above and of how much progress has been made towards solving them so far. I also give my thoughts on and criticism of Infra-Bayesian Physicalism, the theory on which PSI rests.

I will also edit my post to include a link to this one. So far, I have advised people to read Embedded agency for the motivating questions, but now I can recommend this post too.

I still think that the hot stove example is a real problem, although maybe an unavoidable one. My example starts with "I learned that the hot stove always burns my hand." This is not the exploration phase anymore: the agent has already observed the stove burning its hand many times. Normally, this would be enough to never touch the hot stove again, but if some unexplained nice things happen in the outside world, there is suddenly no guarantee that the IB agent won't start touching the stove again. Maybe this is unavoidable, but I maintain that it's a weird behavior pattern that the weather outside might make you lose the guarantee of not touching a stove you have already learned will burn you. I think this is somewhat different from the truly unavoidable thing, which is that every agent needs to do some exploration, and that sometimes leads to it burning its hand.