- No idea. I don't think it's computationally very tractable. If I understand correctly, Vanessa hopes there will be computationally feasible approximations, but there hasn't been much research into computational complexity yet, because there are more basic unsolved questions.
- I'm pretty sure the answer is no. An IB agent (with enough compute) plans for the long run and doesn't go into a chain of deals that leaves it worse off than not doing anything. In general, IB solves the "not exactly a Bayesian expected utility maximizer but still can't be Dutch booked" problem by pot

Regarding 4: given that infra-Bayesianism is maximally paranoid, shouldn't it
have lower performance relative to decision-making theories like regular Bayes
under many non-adversarial conditions? If the training set does not contain many
instances of adversarial information, then shouldn't we expect agents to adopt
Bayes instead of infra-Bayes?
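
A toy illustration of the trade-off this question points at, as a minimal sketch. It compares plain Bayesian expected utility with worst-case (maximin) expected utility over a set of distributions. The states, actions, payoffs, prior, and credal set are all hypothetical, and maximin over two point probabilities is only a crude stand-in for the full infra-Bayesian decision rule.

```python
from typing import Dict, List

# Hypothetical payoffs: utility of each action in each state of the world.
UTILITY: Dict[str, Dict[str, float]] = {
    "risky": {"benign": 1.0, "hostile": 0.0},
    "safe": {"benign": 0.6, "hostile": 0.6},
}


def expected_utility(action: str, p_benign: float) -> float:
    """Expected utility of an action when the world is benign with probability p_benign."""
    u = UTILITY[action]
    return p_benign * u["benign"] + (1.0 - p_benign) * u["hostile"]


def bayes_choice(p_benign: float) -> str:
    """Maximize expected utility under a single prior."""
    return max(UTILITY, key=lambda a: expected_utility(a, p_benign))


def maximin_choice(credal_set: List[float]) -> str:
    """Maximize the worst-case expected utility over a set of priors.

    Expected utility is linear in p_benign, so checking the endpoints of an
    interval of priors is enough.
    """
    return max(
        UTILITY,
        key=lambda a: min(expected_utility(a, p) for p in credal_set),
    )


prior = 0.9              # assumed Bayesian prior that the world is benign
credal_set = [0.3, 0.9]  # assumed endpoints of the set of priors considered

bayes_action = bayes_choice(prior)
robust_action = maximin_choice(credal_set)

for true_p in (0.9, 0.3):
    print(
        f"true p_benign={true_p}: "
        f"Bayes ({bayes_action}) earns {expected_utility(bayes_action, true_p):.2f}, "
        f"maximin ({robust_action}) earns {expected_utility(robust_action, true_p):.2f}"
    )
```

In this toy setup the Bayesian choice does better when its prior happens to be right and worse when the world is more hostile than the prior assumed. That is the trade-off in miniature; it says nothing about what a trained agent would actually converge to.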

Personally I like Unsong's God, and I think His approach is better than tiling the Universe with copies of the same optimal entity (or copies of an optimal neighborhood where each being can encounter enough diversity to satisfy them in their own neighborhood).

The Unsong approach might still lead to uncomfortable outcomes with some people tortured to make other people have different positive experiences than the ones already tried (hence the solution to the Problem of evil in Unsong), but I think that with giving big enough negative utilities to suffe...

I'm pretty sure that's not how it works. By looking around, it very soon learns that some things are not maximally horrible, like the chair in the room is not broken (so presumably there is some kind of law constraining Murphy to keep the chair intact at least for now). Why would the agent break the chair then, why would that be better than what would happen otherwise?

Okay, maybe I was somewhat unfair in saying there are no results. Still, I think it's good to distinguish "internal results" and "external results". Take the example of complex analysis: we have many beautiful results about complex holomorphic functions, like Cauchy's integral formula. I call these internal results. But what made complex analysis so widely studied is that it could be used to produce some external results, like calculating the integral under the bell curve or proving the prime number theorem. These are questions that interested people even b...

I partially agree, but the distinction between "internal" and "external" results
is more fuzzy and complicated than you imply. Ultimately, it depends on the
original problem you started with. For example, if you only care about prime
numbers, then most results of complex analysis are "internal", with the
exception of results that imply something about the distribution of prime
numbers. However, if complex functions are a natural way to formalize the
original problem, then the same results become "external".
In our case, the original problem is "creating a mathematical theory of
intelligent agents". (Or rather, the problem is "solving AI alignment", or
"preventing existential risk from AI", or "creating a flourishing future for
human civilization", but let's suppose that the path from there to "creating a
mathematical theory of intelligent agents" is already clear; in any case that's
not related specifically to IB.) Infra-Bayesianism is supposed to be an actual
ingredient in this theory of agents, not just some tool brought from the
outside. In this sense, it already starts out as somewhat "external".
To give a concrete example, you said that results about IB multi-armed bandits
are "internal". While I agree that these results are only useful as very
simplistic toy models, they are potentially necessary steps towards stronger
regret bounds in the future. At what point does it become "external"? Taking it
to the extreme, I can imagine regret bounds so powerful, that they would serve
as substantial evidence that an algorithm satisfying them is AGI or close to
AGI. Would such a result still be "internal"?! Arguably not, because AGI
algorithms are very pertinent to what we're interested in!
You can also take the position that any result without direct applications to
existing, practical, economically competitive AI systems is "internal". In such
case, I am comfortable with a research programme that only has "internal"
results for a long time (although not everyone wou

Thanks to Vanessa for writing this. I find it a useful summary of the goals and directions of LTA, which was sorely missing until now. Readers might also be interested in my write-up "A mostly critical review of infra-Bayesianism", which tries to give a more detailed explanation of a subset of the questions above, and of how much progress has been made towards their solutions so far. I also give my thoughts on and criticism of Infra-Bayesian Physicalism, the theory on which PSI rests.

I will also edit the post to include a link to this post. So far, I advised people ...

I still think that the hot stove example is a real problem, although maybe unavoidable. My example starts with "I learned that the hot stove always burns my hand." This is not the exploration part anymore, the agent already observed the stove burning its hand many times. Normally, this would be enough to never touch the hot stove again, but if some unexplained nice things happen in the outside world, there is suddenly no guarantee that the IB agent doesn't start touching the stove again. Maybe this is unavoidable, but I maintain it's a weird behavior patte...

Well, it's true that the IB regret bound doesn't imply not touching the hot
stove. But when I think of a natural example of an algorithm satisfying an IB
regret bound, it is something like UCB: every once in a while it chooses a
hypothesis which seems plausible given previous observations and follows its
optimal policy. There is no reason such an algorithm would touch the hot stove,
unless there is some plausible hypothesis according to which it is beneficial to
touch the hot stove... assuming the optimal policy it follows is "reasonable":
see definition of "robustly optimal policy" below.
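
To make the UCB analogy concrete, here is a minimal sketch of the classic UCB1 rule for ordinary stochastic bandits. It is not an infra-Bayesian algorithm, and the two arms and their deterministic rewards are hypothetical. The optimistic index plays the role of the "plausible hypothesis": an arm is tried only while the data still allow it to be the best one.

```python
import math
from typing import Callable, Dict


def ucb1(arms: Dict[str, Callable[[], float]], horizon: int) -> Dict[str, int]:
    """Run classic UCB1 for `horizon` steps and return the pull count per arm.

    Rewards are assumed to lie in [0, 1].
    """
    pulls = {name: 0 for name in arms}
    totals = {name: 0.0 for name in arms}
    for t in range(1, horizon + 1):
        untried = [name for name in arms if pulls[name] == 0]
        if untried:
            # Pull every arm once before using the index.
            choice = untried[0]
        else:
            # Optimistic index: empirical mean plus a confidence radius.
            choice = max(
                arms,
                key=lambda a: totals[a] / pulls[a]
                + math.sqrt(2.0 * math.log(t) / pulls[a]),
            )
        totals[choice] += arms[choice]()
        pulls[choice] += 1
    return pulls


# Hypothetical arms: touching the stove always pays 0, leaving it alone pays 1.
arms = {"touch_stove": lambda: 0.0, "leave_stove": lambda: 1.0}
pulls = ucb1(arms, horizon=10_000)
print(pulls)
```

In this sketch the stove is touched only a small number of times, growing roughly logarithmically with the horizon, and only while the confidence radius still leaves open that touching might be the better arm.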
The interesting question is: can we come up with a natural formal desideratum
strictly stronger than IB regret bounds that would rule out touching the hot
stove? Some ideas:
* Let's call a policy "robustly optimal" for an ultra-POMDP if it is the limit
of policies optimal for this ultra-POMDP with added noise (compare to the
notion of "trembling game equilibrium").
* Require that your learning algorithm converges to the robustly optimal policy
for the true hypothesis. Converging to the exact optimal policy is a
desideratum that is studied in the RL theory literature (under the name
"sample complexity").
* Alternatively, require that your learning algorithm converges to the robustly
optimal policy for a hypothesis close to the true hypothesis under some
metric (with the distance going to 0).
* Notice that this desideratum implies an upper regret bound, so it is strictly
stronger.
* Conjecture: the robustly Bayes-optimal policy has the property above, or at
least has it whenever it is possible to have it.
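
One possible way to write the ideas above in symbols, offered only as a sketch. All notation is an assumption rather than something taken from the comment: $M$ is an ultra-POMDP, $M_\varepsilon$ is $M$ with noise of magnitude $\varepsilon$ added, $\Pi^*(\cdot)$ is the set of optimal policies, $\pi_t$ is the policy the learner follows at time $t$, $H^*$ is the true hypothesis, $d$ is the unspecified metric on hypotheses, and $\rho$ is some distance on policies. How the noise is added and which topology is used for convergence of policies are left open, as they are in the text.

```latex
% Robustly optimal policies: limits of policies optimal for the noisy problems
\Pi^{\mathrm{rob}}(M) = \left\{ \pi \;:\; \exists\, \varepsilon_k \to 0,\ 
  \pi_k \in \Pi^*(M_{\varepsilon_k}),\ \pi_k \to \pi \right\}

% First desideratum: converge to a robustly optimal policy for the true hypothesis
\lim_{t \to \infty} \rho\!\left(\pi_t,\ \Pi^{\mathrm{rob}}(H^*)\right) = 0

% Alternative desideratum: converge along hypotheses H_t that approach the truth
\exists\, (H_t)_t :\quad d(H_t, H^*) \to 0
  \quad\text{and}\quad
  \rho\!\left(\pi_t,\ \Pi^{\mathrm{rob}}(H_t)\right) \to 0
```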

I think Vanessa would argue that "Bayesianism" is not really an option. The non-realizability problem in Bayesianism is not just some weird special case, but the normal state of things: Bayesianism assumes that we have hypotheses fully describing the world, which we very definitely don't have in real life. IB tries to be less demanding, and the laws in the agent's hypothesis class don't necessarily need to be that detailed. I am relatively skeptical of this, and I believe that for an IB agent to work well, the laws in its hypothesis class probably also nee...