Maybe the easiest way to understand UDT and TDT is:
Comparing UDT and TDT directly, the main differences seem to be that UDT does not do Bayesian updating on sensory inputs and does not make use of causality. There seems to be general agreement that Bayesian updating on sensory inputs is wrong in a number of situations, but disagreement and/or confusion about whether we need causality. Gary Drescher put it this way:
Plus, if you did have a general math-counterfactual-solving module, why would you relegate it to the logical-dependency-finding subproblem in TDT, and then return to the original factored causal graph? Instead, why not cast the whole problem as a mathematical abstraction, and then directly ask your math-counterfactual-solving module whether, say, (Platonic) C's one-boxing counterfactually entails (Platonic) $1M? (Then do the argmax over the respective math-counterfactual consequences of C's candidate outputs.)
(Eliezer didn't give an answer. ETA: He did answer a related question here.)
Why haven't SI and LW attracted or produced any good strategists? I've been given to understand (from someone close to SI) that various people within SI have worked on Singularity strategy but only produced lots of writings that are not of an organized, publishable form. Others have attempted to organize them but also failed, and there seems to be a general feeling that strategy work is bogged down or going in circles and any further effort will not be very productive. The situation on LW seems similar, with people arguing in various directions without much feeling of progress. Why are we so bad at this, given that strategic thinking must be a core part of rationality?
I finally decided it's worth some of my time to try to gain a deeper understanding of decision theory...
Question: Can Bayesians transform decisions under ignorance into decisions under risk by assuming the decision maker can at least assign probabilities to outcomes using some kind of ignorance prior(s)?
Details: "Decision under uncertainty" is used to mean various things, so for clarity's sake I'll use "decision under ignorance" to refer to a decision for which the decision maker does not (perhaps "cannot") assign probabilities to some of the possible outcomes, and I'll use "decision under risk" to refer to a decision for which the decision maker does assign probabilities to all of the possible outcomes.
There is much debate over which decision procedure to use when facing a decision under ignorance when there is no act that dominates the others. Some proposals include: the leximin rule, the optimism-pessimism rule, the minimax regret rule, the info-gap rule, and the maxipok rule.
However, there is broad agreement that when facing a decision under risk, rational agents maximize expected utility. Because we have a clearer procedure for dealing w...
You could always choose to manage ignorance by choosing a prior. It's not obvious whether you should. But as it turns out, we have results like the complete class theorem, which imply that EU maximization with respect to an appropriate prior is the only "Pareto efficient" decision procedure (any other decision rule can be replaced by one that does at least as well in every possible world, and strictly better in some).
This analysis breaks down in the presence of computational limitations; in that case it's not clear that a "rational" agent should have even an implicit representation of a distribution over possible worlds (such a distribution may be prohibitively expensive to reason about, much less integrate exactly over), so maybe a rational agent should invoke some decision rule other than EU maximization.
The situation is sort of analogous to defining a social welfare function. One approach is to take a VNM utility function for each individual and then maximize total utility. At face value it's not obvious if this is the right thing to do--choosing an exchange rate between person A's preferences and person B's preferences feels pretty arbitrary and potentially destructive (just like choosing prior odds between possible world A and possible world B). But as it turns out, if you do anything else then you could have been better off by picking some particular exchange rate and using it consistently (again, modulo practical limitations).
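Returning to the "Pareto efficient" claim above, here's a toy numeric sketch of what it means (my own construction for illustration, not a statement of the complete class theorem itself): a rule that isn't EU maximization under any prior ends up dominated.

```python
# Toy illustration: a decision rule that maximizes EU under *no* prior is dominated.
# Two possible worlds (w1, w2) and three candidate actions with these payoffs:
payoffs = {
    "a1": {"w1": 1.0, "w2": 0.0},
    "a2": {"w1": 0.0, "w2": 1.0},
    "a3": {"w1": 0.4, "w2": 0.4},   # a "cautious" rule that hedges badly
}

# a3 is never the EU maximizer: for any prior p = P(w1), max(p, 1-p) >= 0.5 > 0.4.
for p in [i / 10 for i in range(11)]:
    eu = {a: p * v["w1"] + (1 - p) * v["w2"] for a, v in payoffs.items()}
    best = max(eu, key=eu.get)
    print(f"P(w1)={p:.1f}  best action by EU: {best}")

# And a3 is indeed dominated: the 50/50 mixture of a1 and a2 earns 0.5 in
# *every* world, strictly more than a3's 0.4, so switching away from a3
# helps no matter which world is actual.
mixture = {w: 0.5 * payoffs["a1"][w] + 0.5 * payoffs["a2"][w] for w in ("w1", "w2")}
print("50/50 mixture of a1 and a2:", mixture, "vs a3:", payoffs["a3"])
```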
What AlexMennen said. For a Bayesian there's no difference in principle between ignorance and risk.
One wrinkle is that even Bayesians shouldn't have prior probabilities for everything, because if you assign a prior probability to something that could indirectly depend on your decision, you might lose out.
A good example is the absent-minded driver problem. While driving home from work, you pass two identical-looking intersections. At the first one you're supposed to go straight, at the second one you're supposed to turn. If you do everything correctly, you get utility 4. If you goof and turn at the first intersection, you never arrive at the second one, and get utility 0. If you goof and go straight at the second, you get utility 1. Unfortunately, by the time you get to the second one, you forget whether you'd already been at the first, which means at both intersections you're uncertain about your location.
If you treat your uncertainty about location as a probability and choose the Bayesian-optimal action, you'll get demonstrably worse results than if you'd planned your actions in advance or used UDT. The reason, as pointed out by taw and pengvado, is that your probability of arriving at the second intersection depends on your decision to go straight or turn at the first one, so treating it as unchangeable leads to weird errors.
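For concreteness, here's a minimal sketch of the plan-in-advance calculation for these payoffs, treating the committed policy as a single probability p of going straight at whichever intersection you find yourself at (the usual way the planning version is set up; the specific payoffs are the ones from the comment above):

```python
# Planning view of the absent-minded driver: commit in advance to going
# straight with probability p at any intersection (you can't tell them apart).
# Payoffs from the comment: turn at the first -> 0, straight then turn -> 4,
# straight at both -> 1.
def expected_utility(p):
    return (1 - p) * 0 + p * (1 - p) * 4 + p * p * 1

# Scan for the best committed policy.
best_p = max((i / 1000 for i in range(1001)), key=expected_utility)
print(f"best p ~= {best_p:.3f}, EU ~= {expected_utility(best_p):.3f}")
# Analytically: d/dp [4p - 3p^2] = 4 - 6p = 0  =>  p = 2/3, EU = 4/3.
# An agent that instead re-derives a "probability that this is the first
# intersection" on the spot and acts on that can talk itself into a different
# policy -- the "demonstrably worse results" the comment above points at.
```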
So if you're a Bayesian decision-maker, doesn't that mean that you only ever face decisions under risk, because at the very least you're assigning ignorance priors to the outcomes for which you're not sure how to assign probabilities?
Correct. A Bayesian always has a probability distribution over possible states of the world, and so cannot face a decision under ignorance as you define it. Coming up with good priors is hard, but to be a Bayesian, you need a prior.
This question may come off as a bit off topic: people often say cryonics is a scam. What is the evidence for that, and to the contrary? How should I gather it?
The thing is, cryonics is a priori awfully suspect. It appeals to one of our deepest motives (not dying), is very expensive, has unusual payment plans, and is just plain weird. So the prior of it being a scam designed to rip us off is quite high. On the other hand, reading about it here, I acquired a very strong intuition that it is not a scam, or at least that Alcor and CI are serious. The problem is, I don't have solid evidence I can tell others about.
Now, I doubt the scam argument is the main reason why people don't buy it. But I'd like to get that argument out of the way.
Alcor: Improperly trained personnel, unkempt and ill-equipped facilities.
...[...] Saul Kent invited me over to his home in Woodcrest, California to view videotapes of two Alcor cases which troubled him – but he couldn’t quite put his finger on why this was so.[...] Patients were being stabilized at a nearby hospice, transported to Alcor (~20 min away) and then CPS was discontinued, the patients were placed on the OR table and, without any ice on their heads, they were allowed to sit there at temperatures a little below normal body temperature for 1 to 1.5 hours, while burr holes were drilled, [...] smoke could be seen coming from the burr wound! Since the patient had no circulation to provide blood to carry away the enormous heat generated by the action of the burr on the bone, the temperature of the underlying bone (and brain) must have been high enough to literally cook an egg. In one case, a patient’s head was removed in the field and, because they had failed to use a rectal plug, the patient had defecated in the PIB. The result was that feces had contaminated the neck wound, and Alcor personnel were seen pouring saline over the stump of the neck whilst holding the patient’s severed head [...]
If I understand correctly, I can extract those flags, in descending order of redness:
That also suggests some signs of trustworthiness:
I'd like to have more such green and red flags, but this is starting to look actionable. Thank you.
Question: Why don't people talk about Ems / Uploads as just as disastrous as uncontrolled AGI? Has there been work done or discussion about the friendliness of Ems / Uploads?
Details: Robin Hanson seems to describe the Em age like a new industrial revolution. Eliezer seems to, well, he seems wary of them but doesn't seem to treat them like an existential threat. Though Nick Bostrom sees them as an existential threat. A lot of people on Lesswrong seem to talk of it as the next great journey for humanity, and not just a different name for uFAI. For my pa...
Persons A and B each hold a belief about proposition X.
Person A has purposively sought out, and updated on, evidence related to X since childhood.
Person B has sat on her couch and played video games.
Yet both A and B have arrived at the same degree-of-belief in proposition X.
Does the Bayesian framework equip its adherents with an adequate account of how Person A should be more confident in her conclusion than Person B?
The only viable answer I can think of is that every reasoner should multiply every conclusion by some measure of epistemic confidence, and re-normalize. But I have not yet encountered such a pervasive account of confidence-measurement from leading Bayesian theorists.
When discussing the Repugnant Conclusion, Eliezer commented:
...I have advocated that "lives barely worth living" always be replaced with "lives barely worth celebrating" in every discussion of the 'Repugnant' Conclusion, to avoid equilibrating between "lives almost but not quite horrible enough to imply that a pre-existing person should commit suicide despite their intrinsic desire to live" versus "lives which we celebrate as good news upon learning about them, and hope to hear more such news in the future, but only to a
Do you know any game (video or board game, singleplayer or multiplayer, for adults or kids, I'm interested in all) that makes good use of rationality skills, and trains them?
For example, we could imagine a "Trivial Pursuit" game in which you give your answer along with how confident you are in it. If you're confident in it, you earn more if you're right, but you lose more if you're wrong.
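One way to make that "earn more if right, lose more if wrong" payoff precise is a proper scoring rule. Here's a sketch using a logarithmic score (my choice of rule, not something specified in the comment), under which stating your honest confidence maximizes your expected score:

```python
import math

# Score an "answer + stated confidence" pair with a logarithmic scoring rule.
def log_score(confidence, correct):
    """confidence: stated probability (strictly between 0 and 1) that your answer is right."""
    p = confidence if correct else 1 - confidence
    return math.log2(p)  # log2(1) = 0 is the best possible score; confident wrong answers cost a lot

# Expected score if your true chance of being right is `truth` but you state `stated`.
def expected_score(truth, stated):
    return truth * log_score(stated, True) + (1 - truth) * log_score(stated, False)

truth = 0.7
for stated in (0.5, 0.6, 0.7, 0.8, 0.9, 0.99):
    print(f"state {stated:.2f}: expected score {expected_score(truth, stated):+.3f}")
# The expected score peaks at stated == truth, so bluffing extra confidence hurts.
```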
Role-playing games do teach quite a bit about probabilities; they help you "feel" what a 1% chance is, or what it means to have a higher expected value but also a higher standard deviation. Card games like poker probably do too, even if I never played much poker.
The board game "Wits and Wagers" might qualify for what you are looking for. Game play is roughly as follows: A trivia question is asked and the answer is always a number (e.g., "How many cups of coffee does the average American drink each year?", "How wide, in feet, is an American football field?"). All the players write their estimate on a slip of paper and then then they are arranged in numerical order on the board. Everybody then places a bet on the estimate they like the best (it doesn't have to be your own). The estimates near the middle have a low payback (1:1, 2:1) and the estimates near the outside have a larger payback (4:1). If your estimate is closest to the actual number or if you bet on that one, will get a payback on your bet.
In the discussion about AI-based vs. upload-based singularities, and the expected utility of pushing for WBE (whole-brain emulation) first, has it been taken into account that an unfriendly AI is unlikely to do something worse than wiping out humanity, while the same isn't necessarily true in an upload-based singularity? I haven't been able to find discussion of this point, yet (unless you think that Robin's Hardscrapple Frontier scenario would be significantly worse than nonexistence, which it doesn't feel like, to me).
[ETA: To be clear, I'm not trying to...
WRT CEV: What happens if my CEV is different than yours? What's the plan for resolving differences between different folks' CEVs? Does the FAI put us all in our own private boxes where we each think we're getting our CEVs, take a majority vote, or what?
Is there a complete list of known / theoretical AI risks anywhere? I searched and couldn't find one.
I can see how the money pump argument demonstrates the irrationality of an agent with cyclic preferences. Is there a more general argument that demonstrates the irrationality of an agent with intransitive preferences of any kind (not merely one with cyclic preferences)?
A little bit of googling turned up this paper by Gustafsson (2010) on the topic, which says that indifference allows for intransitive preferences that do not create a strict cycle. For instance, A>B, B>C, and C=A.
The obvious solution is to add a small bonus epsilon to break the indifference. If A>B, then there exists some e>0 small enough that A>B+e. And if e>0 and C=A, then C+e>A. Since adding the same bonus to both sides preserves B>C, we also have B+e>C+e. So A>B+e, B+e>C+e, and C+e>A, which gives you a strict cycle that allows for money pumping. Gustafsson calls this the small-bonus approach.
Gustafsson suggests an alternative, using lotteries and applying the principle of dominance. Consider the 4 lotteries:
Lottery 1: heads you get A, tails you get B
Lottery 2: heads you get A, tails you get C
Lottery 3: heads you get B, tails you get A
Lottery 4: heads you get C, tails you get A
Lottery 1 > Lottery 2, because if it comes up tails you prefer Lottery 1 (B>C) and if it comes up heads you are indifferent (A=A).
Lottery 2 > Lottery 3, because if it comes up heads you prefer Lottery 2 (A>B) and if it comes up tails you are indifferent (C=A).
Lottery 3 > Lottery 4, because if it comes up heads you prefer Lottery 3 (B>C) and if it comes up tails you are indifferent (A=A).
Lottery 4 > Lottery 1, because if it comes up tails you prefer Lottery 4 (A>B) and if it comes up heads you are indifferent (C=A).
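If you want to check the cycle mechanically, here's a small sketch that encodes the base relation and the statewise dominance comparison (my own illustration of Gustafsson's argument, not code from the paper):

```python
# Check Gustafsson's lottery cycle mechanically.
# Base preferences: A > B, B > C, C = A (indifference), with no strict cycle.
strictly_prefer = {("A", "B"), ("B", "C")}
indifferent = {("C", "A"), ("A", "C"), ("A", "A"), ("B", "B"), ("C", "C")}

def at_least_as_good(x, y):
    return (x, y) in strictly_prefer or (x, y) in indifferent

# Each lottery maps a coin outcome to a prize.
lotteries = {
    1: {"heads": "A", "tails": "B"},
    2: {"heads": "A", "tails": "C"},
    3: {"heads": "B", "tails": "A"},
    4: {"heads": "C", "tails": "A"},
}

def dominates(i, j):
    """Lottery i dominates lottery j: at least as good in every state, strictly better in some."""
    states = ("heads", "tails")
    weakly = all(at_least_as_good(lotteries[i][s], lotteries[j][s]) for s in states)
    strictly = any((lotteries[i][s], lotteries[j][s]) in strictly_prefer for s in states)
    return weakly and strictly

for i, j in [(1, 2), (2, 3), (3, 4), (4, 1)]:
    print(f"Lottery {i} dominates Lottery {j}: {dominates(i, j)}")
# Prints True four times: a strict preference cycle over the lotteries,
# even though the base preferences contain no strict cycle.
```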
Don't know if this has been answered, or where to even look for it, but here goes.
Once FAI is achieved and we are into the Singularity, how would we stop this superintelligence from rewriting its "friendly" code to something else and becoming unfriendly?
We wouldn't. However, the FAI knows that if it changed its code to unFriendly code, then unFriendly things would happen. It's Friendly, so it doesn't want unFriendly things to happen, so it doesn't want to change its code in such a way as to cause those things - so a proper FAI is stably Friendly. Unfortunately, this works both ways: an AI that wants something else will want to keep wanting it, and will resist attempts to change what it wants.
There's more on this in Omohundro's paper "Basic AI Drives"; relevant keyword is "goal distortion". You can also check out various uses of the classic example of giving Gandhi a pill that would, if taken, make him want to murder people. (Hint: he does not take it, 'cause he doesn't want people to get murdered.)
Dragging up anthropic questions and quantum immortality: suppose I am Schrodinger's cat. I enter the box ten times (each time it has a .5 probability of killing me), and survive. If I started with a .5 belief in QI, my belief is now 1024/1025.
But if you are watching, your belief in QI should not change. (If QI is true, the only outcome I can observe is surviving, so P_me(I survive | QI) = 1. But someone else can observe my death even if QI is true, so P_you(I survive | QI) = 1/1024 = P_you(I survive | ~QI).)
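The numbers check out; here's the Bayes calculation spelled out (just the arithmetic from the two paragraphs above):

```python
from fractions import Fraction

# From the cat's point of view: surviving is the only observable outcome under QI.
prior_qi = Fraction(1, 2)
p_survive_given_qi = Fraction(1)
p_survive_given_not_qi = Fraction(1, 2) ** 10   # ten independent 0.5 chances

posterior_qi = (prior_qi * p_survive_given_qi) / (
    prior_qi * p_survive_given_qi + (1 - prior_qi) * p_survive_given_not_qi
)
print(posterior_qi)   # 1024/1025

# From the outside observer's point of view, P(survive | QI) and P(survive | ~QI)
# are both 1/1024, so the likelihood ratio is 1 and the prior is unchanged.
```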
Aumann's agreement theorem says that if we share ...
The use of external computation (like a human using a computer to solve a math problem or an AI expanding its computational resources) is a special case of inferring information about mathematical statements from your observations about the universe.
What is the general algorithm for accomplishing this in terms of pure observations (no action/observation cycles, just passive observation)? How does the difficulty of the mathematical statements you can infer to be probably true relate to the amount of computation you have expended approximating Solomonoff induction?
I'm a bit late on this, obviously, but I've had a question that I've always felt was a bit too nonsensical to bring up (and no doubt addressed somewhere in the sequences that I haven't found), but it kinda bugs me.
Do we have any ideas/guesses/starting points about whether "self-awareness" is some kind of weird quirk of our biology and evolution, or whether it would be an inevitable consequence of any general AI?
I realize that's not a super clear definition- I guess I'm talking about that feeling of "existing is going on here" and you c...
The climactic realization is gung vzzrqvngr rkcrevrapr vf havgnel, ohg gur zvaq dhvpxyl qvivqrf vg vagb jung vf vafvqr gur frys naq bhgfvqr gur frys.
...That's the sound made by a poorly maintained motorcycle.
Question on posting norms: What is the community standard for opening a discussion thread about an issue discussed in the sequences? Are there strong norms regarding minimum / maximum length? Is formalism required, or frowned on, or just optional? Thanks
Say you start from merely the axioms of probability. From those, how do you get to the hypothesis that "the existence of the world is probable"? I'm curious to look at it in more detail because I'm not sure if it's philosophically sound or not.
Has Eliezer written about what theory of meaning he prefers? (Or does anyone want to offer a guess?)
I've also been doing searches for topics related to the singularity and space travel (this thought came up after playing a bit of Mass Effect ^ _ ^). It would seem to me that biological restrictions on space travel wouldn't apply to a sufficiently advanced AI. This AI could colonize other worlds using near speed of light travel with minimal physical payload and harvest the raw materials on some new planet using algorithms programmed in small harvesting bots. If this is possible then it seems to me that unfriendly AI might not be that much of a threat since ...
Previously: round 1, round 2, round 3
From the original thread:
Ask away!