AI alignment researcher supported by MIRI and LTFF. Working on the learning-theoretic agenda. Based in Israel. See also LinkedIn.
E-mail: vanessa DOT kosoy AT {the thing reverse stupidity is not} DOT org
The topic of this thread is: In naive MWI, it is postulated that all Everett branches coexist. (For example, if I toss a quantum fair coin $n$ times, there will be $2^n$ branches with all possible outcomes.) Under this assumption, it's not clear in what sense the Born rule is true. (What is the meaning of the probability measure over the branches if all branches coexist?)
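To make the tension concrete, here is a minimal sketch that compares naive branch counting with Born weights. It assumes a coin whose Born probability of heads is `p` (the biased case is an illustrative assumption, not part of the thread), and the function name `branch_statistics` is hypothetical.

```python
from math import comb

def branch_statistics(n, p, tol):
    """Compare two ways of measuring the branches whose heads-frequency is within tol of p.

    Returns (fraction of the 2**n branches by plain counting,
             total Born weight of those same branches).
    """
    count_near = 0
    weight_near = 0.0
    for heads in range(n + 1):
        if abs(heads / n - p) <= tol:
            # comb(n, heads) branches share this number of heads.
            count_near += comb(n, heads)
            weight_near += comb(n, heads) * p**heads * (1 - p) ** (n - heads)
    return count_near / 2**n, weight_near

# For a fair coin the two measures agree; for a biased coin they come apart.
fair = branch_statistics(20, 0.5, 0.1)
biased = branch_statistics(100, 0.9, 0.05)
```

With a fair coin, counting branches and weighting them by the Born measure give the same answer. With a biased coin, almost all of the Born weight sits on a vanishing fraction of the branches, so counting coexisting branches cannot by itself explain the observed frequencies.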
Your reasoning is invalid, because in order to talk about updating your beliefs in this context, you need a metaphysical framework which knows how to deal with anthropic probabilities (e.g. it should be able to answer puzzles in the vein of the anthropic trilemma according to some coherent, well-defined mathematical rules). IBP is such a framework, but you haven't proposed any alternative, not to mention an argument for why that alternative is superior.
The problem is this requires introducing a special decision-theory postulate that you're supposed to care about the Born measure for some reason, even though Born measure doesn't correspond to ordinary probability.
Not sure what you mean by "this would require a pretty small universe".
If we live in naive MWI, an IBP agent would not care for good reasons, because naive MWI is a "library of babel" where essentially every conceivable thing happens no matter what you do.
Also not sure what you mean by "some sort of sampling". AFAICT, quantum IBP is the closest thing to a coherent answer that we have, by a significant margin.
The solution is here. In a nutshell, naive MWI is wrong, not all Everett branches coexist, but a lot of Everett branches do coexist s.t. with high probability all of them display expected frequencies.
My model is that the concept of "morality" is a fiction which has 4 generators that are real:
Is it possible to replace the maximin decision rule in infra-Bayesianism with a different decision rule? One surprisingly strong desideratum for such decision rules is the learnability of some natural hypothesis classes.
In the following, all infradistributions are crisp.
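Fix finite action set $A$ and finite observation set $O$. For any $k\in\mathbb{N}$ and $\gamma\in(0,1)$, let

$$M_k^\gamma:(A\times O)^\omega\to\Delta(A\times O)^k$$

be defined by

$$M_k^\gamma(h\mid d):=(1-\gamma)\sum_{n=0}^{\infty}\gamma^n\,\mathbf{1}[h=d_{n:n+k}]$$

In other words, this kernel samples a time step $n$ out of the geometric distribution with parameter $\gamma$, and then produces the sequence of length $k$ that appears in the destiny $d$ starting at $n$.
For any continuous[1] function $D:\Box(A\times O)^k\to\mathbb{R}$, we get a decision rule. Namely, this rule says that, given infra-Bayesian law $\Lambda$ and discount parameter $\gamma$, the optimal policy is

$$\pi^*_{D\Lambda}:=\arg\max_{\pi}D\left(M^\gamma_{k*}\Lambda(\pi)\right)$$

The usual maximin is recovered when we have some reward function $r:(A\times O)^k\to\mathbb{R}$ and the corresponding $D$ is

$$D_r(\Theta):=\min_{\theta\in\Theta}\mathbb{E}_\theta[r]$$

Given a set $\mathcal{H}$ of laws, it is said to be learnable w.r.t. $D$ when there is a family of policies $\{\pi_\gamma\}_{\gamma\in(0,1)}$ such that for any $\Lambda\in\mathcal{H}$

$$\lim_{\gamma\to1}\left(\max_{\pi}D\left(M^\gamma_{k*}\Lambda(\pi)\right)-D\left(M^\gamma_{k*}\Lambda(\pi_\gamma)\right)\right)=0$$

For $D_r$ we know that e.g. the set of all communicating[2] finite infra-RDPs is learnable. More generally, for any $\epsilon\in[0,1]$ we have the learnable decision rule

$$D^\epsilon_r(\Theta):=\epsilon\max_{\theta\in\Theta}\mathbb{E}_\theta[r]+(1-\epsilon)\min_{\theta\in\Theta}\mathbb{E}_\theta[r]$$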
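The verbal description of the kernel can be sketched in a few lines. This is only an illustration under stated assumptions: the destiny is represented as a function from a time step to an (action, observation) pair, the geometric distribution is taken to be $P(n)=(1-\gamma)\gamma^n$, and the names `sample_window`, `window_probability`, and `alternating` are hypothetical.

```python
import random

def sample_window(destiny, k, gamma, rng=random):
    """Sample n with P(n) = (1 - gamma) * gamma**n, then return (n, destiny[n:n+k])."""
    n = 0
    # Each extra step is taken with probability gamma, which yields the geometric law.
    while rng.random() < gamma:
        n += 1
    window = tuple(destiny(n + i) for i in range(k))
    return n, window

def window_probability(destiny, h, gamma, horizon=1000):
    """Probability that the kernel outputs h, with the sum over n truncated at `horizon`."""
    k = len(h)
    total = 0.0
    for n in range(horizon):
        if tuple(destiny(n + i) for i in range(k)) == tuple(h):
            total += (1 - gamma) * gamma**n
    return total

def alternating(n):
    """Toy destiny: the action is always 'a', the observation alternates 0, 1, 0, 1, ..."""
    return ("a", n % 2)

n, window = sample_window(alternating, k=3, gamma=0.9)
p = window_probability(alternating, [("a", 0)], gamma=0.5)
```

For the toy destiny, the length-1 window `("a", 0)` appears at every even time step, so its probability is $(1-\gamma)\sum_{m}\gamma^{2m}=1/(1+\gamma)$, which is $2/3$ for $\gamma=0.5$.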
This is the "mesomism" I talked about before.
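As a rough illustration of how the maximin and mesomism rules can disagree, the sketch below represents a crisp infradistribution by a finite list of extreme points. Each extreme point is a dict from outcomes to probabilities, and the outcomes stand in for length-$k$ histories after the kernel has already been applied. The weight parameter `eps` on the optimistic term, the toy data, and all function names are assumptions made for the example.

```python
def expectation(theta, r):
    """Expected reward of r under a single distribution theta (dict: outcome -> prob)."""
    return sum(prob * r(outcome) for outcome, prob in theta.items())

def maximin_value(Theta, r):
    """Worst-case expected reward over the credal set. Expectation is linear,
    so checking the extreme points is enough."""
    return min(expectation(theta, r) for theta in Theta)

def mesomism_value(Theta, r, eps):
    """Convex mix of best case (weight eps) and worst case (weight 1 - eps)."""
    values = [expectation(theta, r) for theta in Theta]
    return eps * max(values) + (1 - eps) * min(values)

def best_policy(policy_to_Theta, D):
    """Pick the policy whose induced credal set scores highest under decision rule D."""
    return max(policy_to_Theta, key=lambda pi: D(policy_to_Theta[pi]))

def reward(outcome):
    return 1.0 if outcome == "win" else 0.0

# Toy "law": each policy induces a set of possible outcome distributions.
laws = {
    "safe": [{"win": 0.5, "lose": 0.5}],
    "risky": [{"win": 0.9, "lose": 0.1}, {"win": 0.2, "lose": 0.8}],
}

pessimist = best_policy(laws, lambda T: maximin_value(T, reward))
mixed = best_policy(laws, lambda T: mesomism_value(T, reward, 0.8))
```

Here the maximin rule picks the unambiguous policy, while the mix with `eps = 0.8` prefers the ambiguous one. Setting `eps = 0` recovers maximin exactly.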
Also, any monotonically increasing $D$ seems to be learnable, i.e. any $D$ s.t. for $\Theta_1\subseteq\Theta_2$ we have $D(\Theta_1)\leq D(\Theta_2)$. For such decision rules, you can essentially assume that "nature" (i.e. whatever resolves the ambiguity of the infradistributions) is collaborative with the agent. These rules are not very interesting.
On the other hand, decision rules of the form $D_{r_1}+D_{r_2}$ are not learnable in general, and neither are decision rules of the form $D_r+D'$ for $D'$ monotonically increasing.
Open Problem: Are there any learnable decision rules that are not mesomism or monotonically increasing?
A positive answer to the above would provide interesting generalizations of infra-Bayesianism. A negative answer to the above would provide an interesting novel justification of the maximin. Indeed, learnability is not a criterion that was ever used in axiomatic constructions of decision theory[3], AFAIK.
[1] We can try considering discontinuous functions as well, but it seems natural to start with continuous ones. If we want the optimal policy to exist, we usually need $D$ to be at least upper semicontinuous.
[2] There are weaker conditions than "communicating" that are sufficient, e.g. "resettable" (meaning that the agent can always force a return to the initial state), and some even weaker conditions that I will not spell out here.
[3] I mean theorems like VNM, Savage etc.
First, given nanotechnology, it might be possible to build colonies much faster.
Second, I think the best way to live is probably as uploads inside virtual reality, so terraforming is probably irrelevant.
Third, it's sufficient that the colonists are uploaded or cryopreserved (via some superintelligence-vetted method) and stored someplace safe (whether on Earth or in space) until the colony is entirely ready.
Fourth, if we can stop aging and prevent other dangers (including unaligned AI), then a timeline of decades is fine.
I don't know whether we live in a hard-takeoff singleton world or not. I think there is some evidence in that direction, e.g. from thinking about the kind of qualitative changes in AI algorithms that might come about in the future and their implications for the capability growth curve, and also about the possibility of recursive self-improvement. But the evidence is definitely far from conclusive (in any direction).
I think that the singleton world is definitely likely enough to merit some consideration. I also think that some of the same principles apply to some multipole worlds.
"Commit to not make anyone predictably regret supporting the project or not opposing it" is worrying only by omission -- it's a good guideline, but it leaves the door open for "punish anyone who failed to support the project once the project gets the power to do so".
Yes, I never imagined doing such a thing, but I definitely agree it should be made clear. Basically, don't make threats, i.e. don't try to shape others' incentives in ways that they would be better off precommitting not to go along with.
Sort of obvious but good to keep in mind: Metacognitive regret bounds are not easily reducible to "plain" IBRL regret bounds when we consider the core and the envelope as the "inside" of the agent.
Assume that the action and observation sets factor as $A=A_0\times A_1$ and $O=O_0\times O_1$, where $(A_0,O_0)$ is the interface with the external environment and $(A_1,O_1)$ is the interface with the envelope.
Let $\Lambda:\Pi\to\Box(\Gamma\times(A\times O)^\omega)$ be a metalaw. Then, there are two natural ways to reduce it to an ordinary law:
However, requiring low regret w.r.t. either of these is not equivalent to requiring low regret w.r.t. $\Lambda$:
Therefore, metacognitive regret bounds hit a "sweet spot" of strength vs. feasibility which produces genuinely more powerful agents than IBRL[1].
[1] More precisely, more powerful than IBRL with the usual sort of hypothesis classes (e.g. nicely structured crisp infra-RDPs). In principle, we can reduce metacognitive regret bounds to IBRL regret bounds using non-crisp laws, since there's a very general theorem for representing desiderata as laws. But these laws would have a very peculiar form that seems impossible to guess without starting with metacognitive agents.