wgd

50

Note that the original question wasn't "Is it right for a pure altruist to have children?", it was "Would a pure altruist have children?". And the answer to that question most definitely depends on the beliefs of the altruist being modeled. It's also a more useful question, because it leads us to explore which beliefs matter and how they affect the decision (the alternative being that we all start arguing about our personal beliefs on all the relevant topics).

40

This sounds like a sufficiently obvious failure mode that I'd be extremely surprised to learn that modern index funds operate this way, unless there's some worse downside that they would encounter if their stock allocation procedure was changed to not have that discontinuity.

00

I think the important insight you may be missing is that the AI, if intelligent enough to recursively self-improve, can predict what the modifications it makes will do (and if it can't, then it doesn't make that modification, because creating an unpredictable child AI would be a bad move according to almost any utility function, even that of a paperclipper). And it evaluates the suitability of these modifications using its utility function. So assuming the seed AI is built with a sufficiently solid understanding of self-modification and what its own code is doing, it will more or less automatically work to create more powerful AIs whose actions will also be expected to fulfill the original utility function, no "fixed points" required.

There is a hypothetical danger region where an AI has sufficient intelligence to create a more powerful child AI, isn't clever enough to predict the actions of AIs with modified utility functions, and isn't self-aware enough to realize this and compensate by, say, not modifying the utility function itself. Obviously the space of possible minds is sufficiently large that there exist minds with this problem, but it probably doesn't even make it into the top 10 most likely AI failure modes at the moment.

30

Could someone explain the reasoning behind answer A being the correct choice in Question 4? My analysis was to assume that, since 30 migraines a year is still pretty terrible (for the same reason that the difference in utility between 0 and 1 migraines per year is larger than the difference between 10 and 11), I should treat the question as asking "Which option offers more migraines avoided per unit money?"

```
Option A: $350 / 70 migraines avoided = $5 per migraine avoided
Option B: $100 / 50 migraines avoided = $2 per migraine avoided
```

And when I did the numbers in my head I thought it was obvious that the answer should be B. What exactly am I missing that led the upper tiers of LWers to select option A?

20

My understanding is that it was once meant to be almost tvtropes-like, with a sort of back-and-forth linking between pages about concepts on the wiki and posts which refer to those concepts on the main site (in the same way that tvtropes gains a lot of its addictiveness from the back-and-forth between pages for tropes and pages for shows/books/etc).

20

I think we're in agreement then, although I've managed to confuse myself by trying to actually do the Shannon entropy math.

In the event we don't care about birth orders we have two relevant hypotheses which need to be distinguished between (boy-girl at 66% and boy-boy at 33%), so the message length would only need to be about 0.9 bits, if I'm applying the math correctly for the entropy of a discrete random variable. So in one somewhat odd sense Sarah would actually know more about the gender than George does.

Which, given that the original post said

> Still, it seems like Sarah knows more about the situation, where George, by being given more information, knows less. His estimate is as good as knowing nothing other than the fact that the man has a child which could be equally likely to be a boy or a girl.

may not actually be implausible. Huh.
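
The 0.9-bit figure can be sanity-checked with a few lines of Python. This is only a sketch: `entropy_bits` is a helper name made up for the illustration, and the 2/3 vs 1/3 split is the order-ignoring distribution assumed in the comment above.

```python
import math

def entropy_bits(probs):
    """Shannon entropy in bits; zero-probability outcomes contribute nothing."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# Two hypotheses once birth order is ignored (assumed split from the comment):
# mixed boy-girl at 2/3, boy-boy at 1/3.
h_unordered = entropy_bits([2 / 3, 1 / 3])
print(round(h_unordered, 3))  # roughly 0.918
```

So "0.9 bits" is the right ballpark for that distribution, slightly less than the 1 bit needed for a 50/50 split.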

140

The standard formulation of the problem is such that *you* are the one making the bizarre contortions of conditional probabilities by asking a question. The standard setup has no children with the person you meet, he tells you only that he has two children, and you ask him a question rather than him volunteering information. When you ask "Is at least one a boy?", you set up the situation such that the conditional probabilities of various responses are very different.

In this new experimental setup (which is in very real fact a different problem from either of the ones you posed), we end up with the following situation:

```
h1 = "Boy then Girl"
h2 = "Girl then Boy"
h3 = "Girl then Girl"
h4 = "Boy then Boy"
o = "The man says yes to your question"
```

With a different set of conditional probabilities:

```
P(o | h1) = 1.0
P(o | h2) = 1.0
P(o | h3) = 0.0
P(o | h4) = 1.0
```

And it's relatively clear just from the conditional probabilities why we should expect to get an answer of 1/3 in this case now (because there are three hypotheses consistent with the observation and they all predict it to be equally likely).
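
For anyone who wants to see the arithmetic, here is a minimal sketch of that update. It assumes a uniform 25% prior over the four hypotheses; the `posterior` helper and the variable names are made up for the illustration, and exact fractions are used to avoid rounding noise.

```python
from fractions import Fraction

def posterior(prior, likelihood):
    """Bayes' rule over a finite hypothesis set: P(h | o) = P(h) * P(o | h) / P(o)."""
    joint = {h: prior[h] * likelihood[h] for h in prior}
    evidence = sum(joint.values())  # P(o), the normalizing constant
    return {h: joint[h] / evidence for h in joint}

# Assumed uniform prior over h1..h4 (BG, GB, GG, BB).
prior = {h: Fraction(1, 4) for h in ("h1", "h2", "h3", "h4")}

# P(o | h) when o = "the man says yes to 'Is at least one a boy?'"
likelihood_yes = {
    "h1": Fraction(1),
    "h2": Fraction(1),
    "h3": Fraction(0),
    "h4": Fraction(1),
}

post_question = posterior(prior, likelihood_yes)
for h, p in post_question.items():
    print(h, p)
```

Under these assumptions h1, h2, and h4 each come out at 1/3 and h3 at 0, which is the usual 1/3 answer for "both are boys".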

50

I agree that George definitely does know more information overall, since he can concentrate his probability mass more sharply over the 4 hypotheses being considered, but I'm fairly certain you're wrong when you say that Sarah's distribution is 0.33-0.33-0-0.33. I worked out the math (which I hope I did right or I'll be quite embarrassed), and I get 0.25-0.25-0-0.5.

I think your analysis in terms of required message lengths is arguably wrong, because the purpose of the question is to establish the genders of the children and *not* the order in which they were born. That is, the answer to the question "What gender is the child at home?" can always be communicated in a single bit, and we don't *care* whether they were born first or second for the purposes of the puzzle. You have to send >1 bit to Sarah *only* if she actually cares about the order of their births (And specifically, your "1 or 2 bits, depending" result is made by assuming that we don't care about the birth order if they're boys. If we care whether the boy currently out walking is the eldest child regardless of the other child's gender we have to always send Sarah 2 bits).

Another way to look at that result is that when you simply want to ask "What is the probability of a boy or a girl at home?" you are adding up two disjoint ways-the-world-could-be for each case, and this adding operation obscures the difference between Sarah's and George's states of knowledge, leading to them both having the same distribution over that answer.
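
A rough sketch of the message-length point, using Sarah's 0.25-0.25-0-0.5 distribution from above. It assumes the four numbers are ordered as boy-girl, girl-boy, girl-girl, boy-boy and that the child seen out walking is a boy; `entropy_bits` is a helper name made up for the illustration.

```python
import math

def entropy_bits(probs):
    """Shannon entropy in bits; zero-probability outcomes contribute nothing."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# Sarah's posterior over (boy-girl, girl-boy, girl-girl, boy-boy), assumed ordering.
sarah_ordered = [0.25, 0.25, 0.0, 0.5]

# Collapse birth order: with a boy out walking, the child at home is a girl
# under either mixed hypothesis and a boy under boy-boy.
sarah_home = [sarah_ordered[0] + sarah_ordered[1], sarah_ordered[3]]

bits_ordered = entropy_bits(sarah_ordered)  # order matters
bits_home = entropy_bits(sarah_home)        # only the gender at home matters
print(bits_ordered, bits_home)
```

With these numbers the ordered case needs 1.5 bits on average (consistent with "1 or 2 bits, depending"), while the gender of the child at home alone needs 1 bit.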

70

I'll just note in passing that this puzzle is discussed in this post, so you may find it or the associated comments helpful.

I think the specific issue is that in the first case, you're assuming that each of the three possible orderings yields the same chance of your observation (the child out walking with him is a boy). If you assume that his choice of which child to go walking with is random, then the fact that you see a boy makes each of the mixed (boy, girl) and (girl, boy) possibilities less likely, so together they are only as likely as the (boy, boy) one.

Let's define (imagining, for the sake of simplicity, that Omega descended from the heavens and informed you that the man you are about to meet has two children who can both be classified into ordinary gender categories):

```
h1 = "Boy then Girl"
h2 = "Girl then Boy"
h3 = "Girl then Girl"
h4 = "Boy then Boy"
o = "The man is out walking with a boy child"
```

Our initial estimates for each should be 25% before we see any evidence. Then if we make the aforementioned assumption that the man doesn't like one child more than the other:

```
P(o | h1) = 0.5
P(o | h2) = 0.5
P(o | h3) = 0.0
P(o | h4) = 1.0
```

And then we can apply Bayes' theorem to figure out the posterior probability of each hypothesis:

```
P(h1 | o) = P(h1) * P(o | h1) / P(o)
P(h2 | o) = P(h2) * P(o | h2) / P(o)
P(h3 | o) = P(h3) * P(o | h3) / P(o)
P(h4 | o) = P(h4) * P(o | h4) / P(o)
(where P(o) = P(o | h1)*P(h1) + P(o | h2)*P(h2) + P(o | h3)*P(h3) + P(o | h4)*P(h4))
```

The denominator is a constant factor which works out to 0.5 (meaning "before making that observation I would have assigned it 50% probability"), and overall the math works out to:

```
P(h1 | o) = P(h1) * P(o | h1) / 0.5 = 0.25
P(h2 | o) = P(h2) * P(o | h2) / 0.5 = 0.25
P(h3 | o) = P(h3) * P(o | h3) / 0.5 = 0.0
P(h4 | o) = P(h4) * P(o | h4) / 0.5 = 0.5
```

So the result in the former case is the same as in the latter: seeing one child offers you no information about the gender of the other (unless you assume that the man hates his daughter and never goes walking with her, in which case you get the original 1/3 chance of it being a boy).
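
If the algebra feels slippery, a quick Monte Carlo run makes the same point. This is just a sketch under the assumptions stated above (each child is independently a boy or girl with equal probability, and the father picks which child to walk with at random); the `simulate` function and its parameters are made up for the illustration.

```python
import random

def simulate(n_trials=200_000, seed=0):
    """Estimate P(both boys) under the 'walking' and 'asked a question' setups."""
    rng = random.Random(seed)
    walk_boy = walk_boy_and_other_boy = 0
    any_boy = both_boys = 0
    for _ in range(n_trials):
        children = [rng.choice("BG"), rng.choice("BG")]  # eldest, youngest

        # Walking setup: the father takes a uniformly random child along.
        seen = rng.randrange(2)
        if children[seen] == "B":
            walk_boy += 1
            if children[1 - seen] == "B":
                walk_boy_and_other_boy += 1

        # Question setup: he answers yes to "Is at least one a boy?"
        if "B" in children:
            any_boy += 1
            if children == ["B", "B"]:
                both_boys += 1
    return walk_boy_and_other_boy / walk_boy, both_boys / any_boy

p_walk, p_question = simulate()
print(p_walk, p_question)
```

The first estimate should land near 1/2 and the second near 1/3, so the difference between the two answers comes entirely from how the observation was generated.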

The lesson to take away here is the same lesson as the usual Bayesian vs. frequentist debate, writ very small: if you're getting different answers from the two approaches, it's because the frequentist solution is slipping in unstated assumptions which the Bayesian approach forces you to state outright.

This is an extremely clear explanation of something I hadn't even realized I didn't understand. Thank you for writing it.