Open Thread, April 27-May 4, 2014

You know the drill - If it's worth saying, but not worth its own post (even in Discussion), then it goes here.

And, while this is an accidental exception, future open threads should start on Mondays until further notice.

Seth Roberts is dead.

I was considering the Shangri-La diet, but now I'm nervous.

According to information his family graciously posted to his blog, the cause of death was occlusive coronary artery disease with cardiomegaly.

http://blog.sethroberts.net/

Does that make it more likely or less likely that his death was related to his diet?

This is really sad. He definitely was something else when it came to self-experimentation.

The commenters are more concerned about the possible effects of high doses of omega-3.

His blog is back-- it's had occasional down time for a while. The archive copy was down, though.

Probably a good idea to save anything you think is especially important.

It's very sad news and I still ask myself what to make of it. Seth influenced my own QS journey a lot. In the end it seems like extrapolating health from the kind of data he gathered wasn't possible.

His approach would be expected to optimize for common situations, which may not be the same as optimizing for rare situations. I've been working on a theory that health is not a single thing.

For all I know, he had some intrinsic cardio-vascular problems, and his self-experimentation led to him living longer than he otherwise would have.

I've been working on a theory that health is not a single thing.

That's an interesting way of phrasing the sentence.

The issue is that Seth himself based his behavior on the idea that health is a bit like intelligence and it's possible to generalize from a few factors most of the useful information.

Intuitively, it seems likely to me that his death is related to one or more of his self-experiments with supplements. This is based on the observation that it's pretty unusual for 60 year old men to collapse and die, particularly if they have no serious self-reported health problems. Calculating an actual probability seems like it would be pretty hard.

Edit: I suppose there is also an outside chance that this is a hoax. Has the death been reported in any newspapers?

60-yo men die all the time; anytime someone who writes on diet dies, someone is going to say 'I wonder if this proves/disproves his diet claims', no matter what the claims were or their truth. They don't, of course, since even if you had 1000 Seth Roberts, you wouldn't have a particularly strong piece of evidence on correlation of 'being Roberts' and all-cause mortality, and his diet choices were not randomized, so you don't even get causal inference. More importantly, if Roberts had died at any time before his actuarial life expectancy (in the low 80s, I'd eyeball it, given his education, ethnicity, and having survived so long already), people would make this claim.

OK, so let's be a little more precise and play with some numbers.

Roberts published The Shangri-la Diet in 2006. If he's 60 now in 2014 (8 years later), then he was 52 then. Let's say people would only consider his death negatively if he died before his actuarial life expectancy, and I'm going to handwave that as 80; then he has 28 years to survive before his death stops looking bad.

What's his risk of dying if his diet makes zero difference to his health one way or another? Looking at http://www.ssa.gov/OACT/STATS/table4c6.html from 52-80, the per-year risk of death goes from 0.006337 to 0.061620. What's the cumulative risk? We can, I think, calculate the probability of surviving the whole span as (1 - 0.006337) * ... * (1 - 0.061620); the cumulative risk of death is one minus that. A little copy-paste, a little Haskell, and:

> foldr1 (*) $ map (1-) [0.006337,0.006837,0.007347,0.007905,0.008508,0.009116,0.009723,
                         0.010354,0.011046,0.011835,0.012728,0.013743,0.014885,0.016182,
                         0.017612,0.019138,0.020752,0.022497,0.024488,0.026747,0.029212,
                         0.031885,0.034832,0.038217,0.042059,0.046261,0.050826,0.055865,
                         0.061620]
0.5065374918662645

So roughly speaking, Roberts had maybe a 50% chance of surviving from publishing his diet book to a ripe old age. (Suppose Roberts's ideas had halved his risk of death in each time period, which we can implement with a call to map (/2). It's not quite as simple as dividing 50% by 2, but when you rerun the probability, then he'd have a 71% chance of survival, or more relevantly, he still has a 29% chance of dying in that timespan.)
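
The halving scenario can be checked with a short sketch. It reuses the 29 per-year death probabilities quoted above; the names `deathRates`, `survival`, and `scaleRisk` are illustrative and not part of the original GHCi session, and multiplying every yearly probability by a constant is only one possible reading of "halved his risk".

```haskell
-- Per-year death probabilities for ages 52-80, copied from the list above.
deathRates :: [Double]
deathRates =
  [ 0.006337, 0.006837, 0.007347, 0.007905, 0.008508, 0.009116, 0.009723
  , 0.010354, 0.011046, 0.011835, 0.012728, 0.013743, 0.014885, 0.016182
  , 0.017612, 0.019138, 0.020752, 0.022497, 0.024488, 0.026747, 0.029212
  , 0.031885, 0.034832, 0.038217, 0.042059, 0.046261, 0.050826, 0.055865
  , 0.061620 ]

-- Probability of surviving every year in the list.
survival :: [Double] -> Double
survival = product . map (1 -)

-- Multiply each yearly death probability by a constant factor
-- (an assumed, simplistic model of a diet changing mortality).
scaleRisk :: Double -> [Double] -> [Double]
scaleRisk k = map (* k)
```

In GHCi, `survival deathRates` gives the roughly 51% figure above, and `survival (scaleRisk 0.5 deathRates)` comes out at about 71%.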

In summary: Life sucks, and diet gurus can be expected to die all the time no matter whether their ideas are great or horrible, so their deaths tell us so little that discussing it at all is probably biasing our beliefs through an anchoring or salience effect.

60-yo men die all the time; anytime someone who writes on diet dies, someone is going to say 'I wonder if this proves/disproves his diet claims', no matter what the claims were or their truth.

Agreed.

More importantly, if Roberts had died at any time before his actuarial life expectancy (in the low 80s, I'd eyeball it, given his education, ethnicity, and having survived so long already), people would make this claim.

Not sure about that, for example if he had died at the age of 81 in a car accident. Although I appreciate your effort, I am not sure that you have the reference class of events correct. The evidence suggests that Roberts died (1) suddenly; (2) due to failure of some bodily system; (3) at an age which is well under his life expectancy. The prior probability of this happening has got to be far less than the prior probability of him simply dying from any cause before his actuarial life expectancy.

At the same time, he was apparently consuming large amounts of butter, omega fatty acids from flax seeds, and other esoteric things. Of course it's difficult to even begin estimating the risk inherent in doing such things.

Ironically, Seth Roberts was a big believer in "n=1 experiments."

Do you have an estimate of the probability that Roberts's death is related to his supplement regime?

Not sure about that, for example if he had died at the age of 81 in a car accident. Although I appreciate your effort, I am not sure that you have the reference class of events correct.

The all-cause mortality figures were chosen for convenience. I'm sure one could dig up more appropriate figures that exclude accident, homicide, etc. But the reference class is still going to be pretty broad: if Roberts had committed suicide, had developed cancer, had a stroke rather than heart attack (or whatever), had a fall, people would be speculating on biological roots ('perhaps he was going senile thanks to the oils' or 'he claimed the flax seed oil was helping balance, but he fell all the same!'). And I'm not sure that the better figures would be that much lower: this isn't a young cohort - few elderly people are murdered or die in car accidents, AFAIK, and mortality is primarily from diseases and other health problems.

The prior probability of this happening has got to be far less than the prior probability of him simply dying from any cause before his actuarial life expectancy.

As I've pointed out, the prior is quite high that he would die in a 'suspicious' way.

Do you have an estimate of the probability that Roberts's death is related to his supplement regime?

No, and I refuse to give one on a problem which reflects motivated cognition on the part of many people based on heavily-selected evidence & post hoc reasoning. Any estimate would anchor me and bias my future thinking on diet matters. The story is far too salient, the evidence far too weak.

I'm sure one could dig up more appropriate figures that exclude accident, homicide, etc. But the reference class is still going to be pretty broad: if Roberts had committed suicide, had developed cancer, had a stroke rather than heart attack (or whatever), had a fall, people would be speculating on biological roots

I would have to agree with that; however, some causes of death are more suspicious than others. In this case, he apparently died suddenly, at an age where sudden death is rather unusual in people with no self-reported history of serious health problems. Also, this kind of sudden death is usually the result of cardiovascular problems, i.e. heart attack or stroke. Last, he was known to be consuming a lot of concentrated fat on a regular basis (half a stick of butter a day; and perhaps olive oil and flax seed on top of it); fatty foods have long been suspected of playing a role in cardiovascular problems, the idea being that they cause lipids to build up in the blood stream and clog up the works.

It would be very tricky to do the equations, if it's possible at all, but it seems reasonable to think it's likely that his supplement regimen played a role in his demise.

As I've pointed out, the prior is quite high that he would die in a 'suspicious' way.

Well do you agree that what happened is more 'suspicious' than if he had died at the age of 75 from lung cancer?

No, and I refuse to give one on a problem which reflects motivated cognition on the part of many people based on heavily-selected evidence & post hoc reasoning.

Suit yourself, but it strikes me as confusing that I would make a claim and you would respond with a calculation which seems to address the claim but actually doesn't. It makes me think you are trying to subtly change the subject. Which is fine, but I think you should be explicit about it. Otherwise it seems like you are attacking a strawman.

In this case, he apparently died suddenly, at an age where sudden death is rather unusual in people with no self-reported history of serious health problems. Also, this kind of sudden death is usually the result of cardiovascular problems, i.e. heart attack or stroke. Last, he was known to be consuming a lot of concentrated fat on a regular basis (half a stick of butter a day; and perhaps olive oil and flax seed on top of it); fatty foods have long been suspected of playing a role in cardiovascular problems, the idea being that they cause lipids to build up in the blood stream and clog up the works.

Again, this is post hoc reasoning conjured upon observing the exact particulars of his death, and so suspect even without considering additional questions like whether fat is all it's cracked up to be, what his medical tests were saying, etc.

Well do you agree that what happened is more 'suspicious' than if he had died at the age of 75 from lung cancer?

Yes.

Suit yourself, but it strikes me as confusing that I would make a claim and you would respond with a calculation which seems to address the claim but actually doesn't.

My calculation addresses a major part of the Bayesian calculation: the probability of an observed event ('death') conditional on the hypothesis ('his diet is harmful') being false. Since dying aged 52-80 is so common, that sharply limits how much could ever be inferred from observing dying.

Again, this is post hoc reasoning conjured upon observing the exact particulars of his death

Actually I don't know the exact particulars of the death. But I do agree with what I think is your basic point here -- it's extremely easy to make these sorts of connections with the benefit of hindsight and that ease might be coloring my analysis. At the same time, I do think that -- in fairness -- the death is pretty high on the 'suspicious' scale so I stand by my earlier claim.

My calculation addresses a major part of the Bayesian calculation:

Perhaps, but it seems to me you are throwing the baby out with the bathwater a bit here by ignoring the facts which make this death quite a bit more 'suspicious' than other deaths of men in that age range. More importantly, you don't seem to dispute that your calculation doesn't really address my claim.

Look, I agree with your basic point -- the premature death of a diet guru, per se, doesn't say much about the efficacy or danger of the diet guru's philosophy. No calculation is necessary to convince me.

More importantly, you don't seem to dispute that your calculation doesn't really address my claim.

I did dispute that:

My calculation addresses a major part of the Bayesian calculation...that sharply limits how much could ever be inferred from observing [Roberts] dying.

(A simple countermeasure to avoid biasing yourself with anecdotes: spend time reading in proportion to sample size. So you're allowed to spend 10 minutes reading about Roberts's 1 death if you then spend 17 hours repeatedly re-reading a study on how fat consumption did not predict increased mortality in a sample of 100 men.)

I did dispute that:

My calculation addresses a major part of the Bayesian calculation...that sharply limits how much could ever be inferred from observing [Roberts] dying.

I wouldn't call it "major" because (1) you refuse to assign a probability to an event I stated I thought was likely; and (2) the main point of your calculation was pretty non-controversial and even without a calculation I doubt anyone would seriously dispute it.

Let's do this: Is there anything I stated with which you disagree? If so, please quote it. TIA.

I wouldn't call it "major" because (1) you refuse to assign a probability to an event I stated I thought was likely;

It puts an upper bound as I said. Plug the specific conditional I calculated into Bayes theorem and see what happens. Or look at a special case: suppose conditional on the diet not being harmful, Roberts had a 50% chance of dying before 80; now, what is the maximal amount in terms of odds or decibels or whatever that you could ever update your prior upon observing Roberts's death assuming the worsened diet risk is >50%? Is this a large effect size? Or small?

(Now take into account everything you know about correlations, selection effects, the plausibility of the underlying claims about diet, what is known about Roberts's health, how likely you are to hear about deaths of diet gurus, etc...)
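
The special case above can be worked through with a small sketch. It assumes the simplest two-hypothesis setup ("diet harmful" vs. "diet harmless") and that the only observation is death before 80; the function names are illustrative, not from the thread.

```haskell
-- Largest possible likelihood ratio
--   P(death | harmful) / P(death | harmless),
-- reached only if a harmful diet made death before 80 certain.
maxLikelihoodRatio :: Double -> Double
maxLikelihoodRatio pDeathIfHarmless = 1 / pDeathIfHarmless

-- Strength of evidence in decibels: 10 * log10 of the likelihood ratio.
decibels :: Double -> Double
decibels ratio = 10 * logBase 10 ratio

-- Bayes' theorem in odds form: posterior odds = prior odds * likelihood ratio.
-- Takes and returns probabilities rather than odds.
posterior :: Double -> Double -> Double
posterior prior ratio = odds / (1 + odds)
  where
    odds = prior / (1 - prior) * ratio
```

With a 50% chance of death under "harmless", `maxLikelihoodRatio 0.5` is 2, which is about 3 decibels; under these assumptions a prior of 10% that the diet is harmful could rise to at most about 18% (`posterior 0.1 2`).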

(2) the main point of your calculation was pretty non-controversial and even without a calculation I doubt anyone would seriously dispute it.

One would think so.

It puts an upper bound as I said.

So what? One can trivially put an upper and lower bound on any probability: No probability can exceed 1 or be lower than 0. But it ain't "major" to say so. On the contrary, it's trivial.

Anyway, please answer my question: Was there anything in my original post with which you disagreed? If so, please quote it. TIA.

Seth Roberts' last article

It was nice to know all that but I did wonder: Was I killing myself? Fortunately I could find out. A few months before my butter discovery, I had gotten a “heart scan” – a tomographic x-ray of my circulatory system. These scans are summarized by an Agatston score, a measure of calcification. Your Agatston score is the best predictor of whether you will have a heart attack in the next few years. After a year of eating a half stick of butter every day, I got a second heart scan. Remarkably, my Agatston score had improved (= less calcification), which is rare. Apparently my risk of a heart attack had gone down.

Some ambiguity about the Agatston score

Agatston's overview of his test

Thank you for your post, which raises some interesting questions. Of course at this point it is not known if Roberts died of a heart attack, although the smart money is on a cardio-vascular problem - heart attack, stroke, aneurism, etc.

The first question is whether the Agatston score is as good as it's made out to be by Doctor Agatston. Another question is whether it is skillful in the case of Roberts himself. Probably none of the people who were studied were eating half a stick of butter a day, along with lots of flax seeds, extra light olive oil, and who knows what else.

I'm not a doctor, but a quick search on Wikipedia turns up that the most common cause of sudden death in people over 30 is coronary artery atheroma (arteriosclerosis), but other common causes are genetically determined or at least have a significant genetic component. I suppose some of these are easier to detect (hypertrophic cardiomyopathy perhaps?), so we can probably rule them out for somebody like Roberts who constantly monitored his health and bragged about how healthy he was. Other conditions are probably more difficult to detect with standard tests.

The puzzle has a lot of pieces missing, to be sure. Another question is whether Roberts was telling the whole truth about his health. Or about his diet for that matter. It's not even out of the question that he had gained a lot of weight.

So roughly speaking, Roberts had maybe a 50% chance of surviving from publishing his diet book to a ripe old age.

If his actuarial life expectancy was 80 and he had died at 79 it wouldn't have looked particularly suspicious. But according to your data, his probability of dying between 52 and 60 was only about 7.5%, which is not terribly low, but still enough to warrant reasonable doubt, especially considering the circumstances of his death.
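
The 7.5% figure can be reproduced from the same table, assuming it covers the first nine yearly rates (ages 52 through 60 inclusive). The helper name `deathWithin` is illustrative.

```haskell
-- Per-year death probabilities for ages 52-80, as quoted earlier in the thread.
deathRates :: [Double]
deathRates =
  [ 0.006337, 0.006837, 0.007347, 0.007905, 0.008508, 0.009116, 0.009723
  , 0.010354, 0.011046, 0.011835, 0.012728, 0.013743, 0.014885, 0.016182
  , 0.017612, 0.019138, 0.020752, 0.022497, 0.024488, 0.026747, 0.029212
  , 0.031885, 0.034832, 0.038217, 0.042059, 0.046261, 0.050826, 0.055865
  , 0.061620 ]

-- Probability of dying at some point during the first n years of the table:
-- one minus the probability of surviving each of those years.
deathWithin :: Int -> [Double] -> Double
deathWithin n rates = 1 - product (map (1 -) (take n rates))
```

In GHCi, `deathWithin 9 deathRates` comes out at roughly 0.075, while `deathWithin 29 deathRates` gives the roughly 49% all-cause figure for the whole 52-80 span.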

But according to your data, his probability of dying between 52 and 60 was only about 7.5%, which is not terribly low, but still enough to warrant reasonable doubt, especially considering the circumstances of his death.

I think the more interesting question is the probability of a man in his age range (who is not obese; not a smoker; and has no serious self-reported history of health problems) suddenly collapsing and dying. I don't know the answer to this question, but it's a pretty unusual event.

By the way, here is a video of Seth Roberts speaking about his butter experiment a few years ago. Seth Roberts mentions that he eats a half a stick of butter a day on top of his Omega-3 regimen. (And probably this is on top of daily consumption of raw olive oil).

http://vimeo.com/14281896

At around 11:00, an apparent cardiologist concedes that the butter regimen may very well improve brain function but he warns Roberts that he is risking clogging up the arteries in his brain and points out that Roberts's brain function won't be so great if he has a stroke. Roberts is pretty dismissive of the comment and points out that there is reason to believe the role of fat consumption in atherosclerosis is over-emphasized or mistaken.

Still, if someone suddenly collapses and dies, from what I understand it's usually a cardiovascular problem -- a blood clot; stroke; aneurism; heart attack, internal bleeding, etc. And Roberts was consuming copious amounts of foods which are widely believed to have a big impact on the cardiovascular system.

It's silly to ignore this information when assessing probabilities. Here's an analogy: Suppose that Prince William has a newborn son and you are going to place a bet on what the child's name will be. You might reason that the most common male given name in the world is "Mohamed" and therefore the smart money is on "Mohamed." Of course you would lose your money.

The flaw in this type of reasoning is that when assessing probabilities, there is a requirement that you use all available information.

I imagine Gwern would respond that he is merely setting an upper bound. But that's silly and pointless too. If 90% of male children in Saudi Arabia are named "Mohamed," we can infer that the probability the Royal Baby will be named "Mohamed" does not exceed 90%. But so what? That's trivial.

but still enough to warrant reasonable doubt, especially considering the circumstances of his death.

I disagree (reasonable doubt under what assumptions? in what model? can you translate this to p-values? would you take that p-value remotely seriously if you saw it in a study where n=1?), and I've already pointed out many systematic biases and problems with attempting to infer anything from Roberts's death.

I'm not saying we can scientifically infer from his premature death that his diet was unhealthy.

I'm saying that his premature death is informal evidence that his diet at best didn't have a significant positive impact on life expectancy, and at worst was actively harmful. I can't quantify how much, but you were the one who attempted a quantitative argument and I've just criticized your argument, namely your strawman definition of "suspicious death", using your own data and assumptions, hence it seems odd that you now ask me for assumptions and p-values.

Edit: I suppose there is also an outside chance that this is a hoax. Has the death been reported in any newspapers?

Yes, fittingly from Ryan Holiday: http://betabeat.com/2014/04/personal-science-pioneer-seth-roberts-passes-away/

But I don't think a normal newspaper would do more fact checking than the people who read Seth's blog and comment on it.

I just graduated from FIU with a bachelor's in philosophy and a minor in mathematics. I'd like to thank my parents, God and Eliezer Yudkowsky (whose The Sequences I cited in each of the five papers I had to turn in during my final semester).

I can't tell whether the serial comma joke here is intentional.

Grats! Hope you have a job lined up.

God and Eliezer Yudkowsky

Redundant? Mutually exclusive? I can't decide.

I have to say, I seriously don't get the Bayesian vs Frequentist holy wars. It seems to me the ratio of importance to education of its participants is ridiculously low.

Bayesian and frequentist methods are sets of statistical tools, not sacred orders to which you pledge a blood oath. Just understand the usage of each tool, and the fact that virtually any model of something that happens in the real world is going to be misspecified.

It's because Bayesian methods really do claim to be more than just a set of tools. They are supposed to be universally applicable.

I have to say, I seriously don't get the Bayesian vs Frequentist holy wars.

This is a bit of an exaggeration.

Additionally, you are only talking about the 'sets of statistical tools', where in my experience the bigger disagreement often lies in whether a person accepts that probabilities can be subjective or not; and yes, this does matter.

Can you please give an example of where the possible subjectivity of probabilities matters? I mean this in earnest.

'From my point of view the probability for X is Y, but from his point of view at the time it would've been Z'. (subjective) vs 'The Probability for X is Y' ('objective').

Honestly though, frequentists use subjective probabilities all the time and you can argue that frequentism is just as subjective as Bayesianism, so even that disagreement is quite muddy.

Can you be more concrete? When would this matter for two people trying to share a model and make predictions of future events?

Part of it is that Bayesianism claims to be not just a better statistical tool, but a new and better epistemology, a replacement and improvement over Aristotelian logic.

There are a bunch of issues involved. It's hard to speak about them because the term Bayesianism encompasses a wide array of ideas and every time it's used it might refer to a different subset of that cluster of ideas.

Part of LW is that it's a place to discuss how an AGI could be structured. As such we care about the philosophic level of how you come to know that something is true, and there is an interest in going as basic as possible when looking at epistemology. There are issues about objective knowledge versus "subjective" Bayesian priors that are worth thinking about.

We live at a time where up to 70% of scientific research can't be replicated. Frequentism might not be to blame for all of that, but it does play its part. There are issues such as the Bem paper about porno-precognition, where frequentist techniques did suggest that porno-precognition is real but analysing Bem's data with Bayesian methods suggested it's not.

There are further issues in that a lot of additional assumptions are loaded into the word Bayesianism if you use that word on LessWrong. What Bayesianism taught me speaks about a bunch of issues that only indirectly have something to do with Bayesian tools vs. Frequentist tools.

Let's say I want to decide how much salt I should eat. I do follow the consensus that salt is bad and therefore have some prior that salt is bad. Then a new study comes along and says that low salt diets are unhealthy. If I want to make good decisions I have to ask: How much should I update? There is no good formal way of making such decisions. We lack a good framework for doing this. Bayes rule is the answer to that problem that provides the promise of a solution. The solution of waiting a few years and then reading a meta review is unsatisfying.

In the absence of a formal way to do the reasoning, many people do use informal ways of updating towards new evidence. Cognitive bias research suggests that the average person isn't good at this.

Just understand the usage of each tool, and the fact that virtually any model of something that happens in the real world is going to be misspecified.

That sentence is quite easy to say but it effectively means there is no such thing as pure absolute objective truth. If you use tools A you get truth X and if you use tools B you get truth Y. Neither X nor Y is "more true". That's not an appealing conclusion to many people.

Full disclosure: I have papers using B (on structure learning using BIC, which is an approximation to a posterior of a graphical model), and using F (on estimation of causal effects). I have no horse in this race.


Bayes rule is the answer to that problem that provides the promise of a solution.

See, this is precisely the kind of stuff that makes me shudder, that regularly appears on LW, in an endless stream. While Scott Alexander is busy bible thumping data analysts on his blog, people here say stuff like this.

Bayes rule doesn't provide shit. Bayes rule just says that p(A | B) p(B) = p(B | A) p(A).
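
To make the identity concrete, here is a minimal check on a made-up joint distribution over two binary events; the numbers and names are purely illustrative. Both sides reduce to the joint probability p(A, B), which is why the rule by itself adds no information beyond the probabilities you feed into it.

```haskell
-- Joint probabilities P(A = a, B = b) for two binary events (made-up numbers).
joint :: Bool -> Bool -> Double
joint True  True  = 0.12
joint True  False = 0.18
joint False True  = 0.28
joint False False = 0.42

-- Marginals, obtained by summing out the other event.
pA, pB :: Double
pA = joint True True + joint True False
pB = joint True True + joint False True

-- Conditionals, by definition.
pAgivenB, pBgivenA :: Double
pAgivenB = joint True True / pB
pBgivenA = joint True True / pA

-- The two sides of p(A | B) p(B) = p(B | A) p(A).
lhs, rhs :: Double
lhs = pAgivenB * pB
rhs = pBgivenA * pA
```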

Here's what you actually need to make use of info in this study:

(a) Read the study.

(b) See if they are actually making a causal claim.

(c) See if they are using experimental or observational data.

(d) Experimental? Do we believe the setup? Are we in a similar cohort? What about experimental design issues? Observational? Do they know what they are doing, re: causality-from-observational-data? Is their model that permits this airtight (usually it is not, see Scott's post on "adjusting for confounders". Generally to really believe that adjusting for confounders is reasonable you need a case where you know all confounders are recorded by definition of the study, for instance if doctors prescribe medicine based only on recorded info in the patient file).

(e) etc etc etc

I mean what exactly did you expect, a free lunch? Getting causal info and using it is hard.


p.s. If you are skeptical about statistics papers that adjust for confounders, you should also be skeptical about missing data papers that assume MAR (missing at random). It is literally the same assumption.

You might want to read a bit more precisely. I did choose my words when I said "promise of a solution" instead of "a solution".

In particular MetaMed speaks about wanting to produce a system of Bayesian analysis of medical papers. (Bayesian mathematical assessment of diagnosis)

I mean what exactly did you expect, a free lunch? Getting causal info and using it is hard.

You miss the point. When it comes to interviewing candidates for a job, we found out that unstructured human assessment doesn't work that well.

It could very well be that the standard unstructured way of reading papers is not optimal and that we should have Bayesian belief nets into which we plug inputs such as whether the study is experimental or observational.

Whether MetaMed or someone else succeeds at that task and provides a good improvement on the status quo isn't certain but there are ideas to explore.

Is it clear that MetaMed as a group of self-professed Bayesians provides a useful service? Maybe, maybe not. On the other hand the philosophy on which MetaMed operates is not the standard philosophy on which the medical establishment operates.

I don't know how Metamed works (and it's sort of their secret sauce, so they probably will not tell us without an NDA). I am guessing it is some combination of doing (a) through (e) above for someone who cannot do it themselves, and possibly some B stats. Which seems like a perfectly sensible business model to me!

I don't think the secret sauce is in the B stats part of what they are doing, though. If we had a hypothetical company called "Freqmed" that also humanwaved (a) through (e), and then used F stats I doubt they would get non-sensible answers. It's about being sensible, not your identity as a statistician.


I can be F with Bayes nets. Bayes nets are just a conditional independence model.


I don't know how successful Metamed will be, but I honestly wish them the best of luck. I certainly think there is a lot of crazy out there in data analysis, and it's a noble thing to try to make money off of making things more sensible.


The thing is, I don't know about a lot of the things that get talked about on LW. I do know about B and F a little bit, and about causality a little bit. And a huge chunk of stuff people say is just plain wrong. So I tell them it's wrong, but they keep going and don't change what they say at all. So how should I update -- that folks in this rationalist community generally don't know what they are talking about and refuse to change?

It's like wikipedia -- the first sentence in the article on confounders is wrong on wikipedia (there is a very simple 3 node example that violates that definition). The talk page on Bayesian networks is a multi-year tale of woe and ignorance. I once got into an edit war with a resident bridge troll for that article, and eventually gave up and left, because he had more time. What does that tell me about wikipedia?

If we had a hypothetical company called "Freqmed"

But we don't. MetaMed did come out of a certain kind of thinking. The project had a motivation.

I do know about B and F a little bit, and about causality a little bit.

Just because you know what the people in the statistics community mean when they say "Bayesian" doesn't automatically mean that you know what someone on LW means when he says Bayesian.

If you look at "What Bayesianism taught me", there's a person who changed their beliefs through learning about Bayesianism. Do the points he makes have something to do with Frequentism vs. Bayesianism? Not directly. On the other hand, he did change major beliefs about how he thinks about the world and epistemology.

That means that the term Bayesianism as used in that article isn't completely empty.

It's about being sensible

Sensiblism might be a fun name for a philosophy. At the first LW meetup I attended, one of the participants had a scooter. My first question was about his traveling speed and how much time he effectively saves by using it. On that question he gave a normal answer.

My second question was about the accident rate of scooters. He replied something along the lines of: "I really don't know, I should research the issue more in depth and get the numbers." That's not the kind of answer normal people give when faced with a question about the safety of their mode of travel.

You could say he's simply sensible while the 99% of the population out there that would answer the question differently isn't. On the other hand it's quite difficult to explain to those 99% that they aren't sensible. If you prod them a bit they might admit that knowing accident risks is useful for making a decision about one's mode of travel but they don't update on a deep level.

Then people like you come and say: "Well of course we should be sensible. There's no need to point it out explicitly or to give it a fancy name. Being sensible should go without saying."

The problem is that in practice it doesn't go without saying, and speaking about it is hard. Calling it Bayesianism might be a very confusing way to speak about it, but it seems to be an improvement over having no words at all. Maybe tabooing Bayesianism as a word on LW would be the right choice. Maybe the word produces more problems than it solves.

It's like wikipedia -- the first sentence in the article on confounders is wrong on wikipedia.

"In statistics, a confounding variable (also confounding factor, a confound, or confounder) is an extraneous variable in a statistical model that correlates (directly or inversely) with both the dependent variable and the independent variable." is that sentence at the moment. How would you change the sentence? There's no reason why we shouldn't fix that issue right now.

How would you change the sentence? There's no reason why we shouldn't fix that issue right now.

Counterexamples to a definition (this example is under your definition but is clearly not what we mean by confounder) are easier than a definition. A lot of analytic philosophy is about this. Defining "intuitive terms" is often not as simple as it seems. See, e.g.:

http://arxiv.org/abs/1304.0564

If you think you can make a "sensible" edit based on this paper, I will be grateful if you did so!


re: the rest of your post, words mean things. B is a technical term. I think if you redefine B as internal jargon for LW you will be incomprehensible to stats/ML people, and you don't want this. Communication across fields is hard enough as it is ("academic coordination problem"), let's not make it harder by not using standard terminology.

Maybe tabooing Bayesianism as a word on LW would be the right choice. Maybe the word produces more problems than it solves.

I am 100% behind this idea (and in general, taboo technical terms unless you really know a lot about them).

Counterexamples to a definition are easier than a definition. See, e.g.:

But they don't solve the problem of Wikipedia being in your judgement wrong about this point.

re: the rest of your post, words mean things. B is a technical term.

If you look at the dictionary you will find that most words have multiple meanings. Their meanings also happen to evolve over time.

Let's see if I can precommit to not posting here anymore.

It's about being sensible, not your identity as a statistician.

Speaking of, an interesting paper which distinguishes the Fisher approach to testing from the Neyman-Pearson approach and shows how you can unify/match some of that with Bayesian methods.

We live at a time where up to 70% of scientific research can't be replicated. Frequentism might not be to blame for all of that, but it does play its part. There are issues such as the Bem paper about porno-precognition, where frequentist techniques did suggest that porno-precognition is real but analysing Bem's data with Bayesian methods suggested it's not.

It seems to me that there's a bigger risk from Bayesian methods. They're more sensitive to small effect sizes (doing a frequentist meta-analysis you'd count a study that got a p=0.1 result as evidence against; doing a Bayesian one it might be evidence for). If the prior isn't swamped then it's important and we don't have good best practices for choosing priors; if the prior is swamped then the Bayesianism isn't terribly relevant. And simply having more statistical tools available and giving researchers more choices makes it easier for bias to creep in.

Bayes' theorem is true (duh) and I'd accept that there are situations where Bayesian analysis is more effective than frequentist, but I think it would do more harm than good in formal science.
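
The "swamped prior" point can be illustrated with a toy conjugate model. This is only a minimal sketch and not anything proposed in the thread: the Beta priors, the function name `posterior_mean`, and the made-up counts are all assumptions chosen for illustration.

```python
def posterior_mean(prior_a, prior_b, successes, trials):
    """Posterior mean of a success rate under a Beta(prior_a, prior_b) prior
    after observing `successes` out of `trials` (standard Beta-binomial update)."""
    return (prior_a + successes) / (prior_a + prior_b + trials)


# Two hypothetical analysts with opposite priors (illustrative values only).
skeptic = (1, 9)    # prior mean 0.1
optimist = (9, 1)   # prior mean 0.9

# Same observed frequency (60%) at two very different sample sizes.
for successes, trials in [(6, 10), (600, 1000)]:
    s = posterior_mean(*skeptic, successes, trials)
    o = posterior_mean(*optimist, successes, trials)
    print(f"n={trials}: skeptic={s:.3f}, optimist={o:.3f}, gap={o - s:.3f}")
```

With ten observations the two posteriors stay far apart, so the choice of prior drives the conclusion. With a thousand observations they nearly coincide and both sit close to the observed frequency, which is the case where the prior is swamped.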

doing a frequentist meta-analysis you'd count a study that got a p=0.1 result as evidence against

Why would you do that? If I got a p=0.1 result doing a meta-analysis, I wouldn't be surprised at all, since things like random-effects mean it takes a lot of data to turn in a positive result at the arbitrary threshold of 0.05. And as it happens, in some areas, an alpha of 0.1 is acceptable: for example, because of the poor power of tests for publication bias, you can find respected people like Ioannidis using that particular threshold (I believe I last saw that in his paper on the binomial test for publication bias).

If people really acted that way, we'd see odd phenomena where people saw successive meta-analyses on whether grapes cure cancer: 0.15 that grapes cure cancer (decreases belief grapes cure cancer), 0.10 (decreases), 0.07 (decreases), someone points out that random-effects is inappropriate because studies show very low heterogeneity and the better fixed-effects analysis suddenly reveals that the true p-value is now at 0.05 (everyone's beliefs radically flip as they go from 'grapes have been refuted and are quack alt medicine!' to 'grapes cure cancer! quick, let's apply to the FDA under a fast track'). Instead, we see people acting more like Bayesians...

And simply having more statistical tools available and giving researchers more choices makes it easier for bias to creep in.

Is that a guess, or a fact based on meta-studies showing that Bayesian-using papers cook the books more than NHST users with p-hacking etc?

everyone's beliefs radically flip as they go from 'grapes have been refuted and are quack alt medicine!' to 'grapes cure cancer! quick, let's apply to the FDA under a fast track'

Turns out I am overoptimistic and in some cases people have done just that: interpreted a failure to reject the null (due to insufficient power, despite being evidence for an effect) as disproving the alternative in a series of studies which all point the same way, only changing their minds when an individually big enough study comes out. Hauer says this is exactly what happened with a series of studies on traffic mortalities.

(As if driving didn't terrify me enough, now I realize traffic laws and road safety designs are being engineered by vulgarized NHST practitioners who apparently don't know how to patch the paradigm up with emphasis on power or meta-analysis.)

doing a frequentist meta-analysis you'd count a study that got a p=0.1 result as evidence against

No. The most basic version of meta-analysis is, roughly, that if you have two p=0.1 studies, the combined conclusion is p=0.01.
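
As a rough check on the arithmetic: multiplying the two p-values gives 0.01, but the product is not itself a combined p-value. The sketch below applies two standard combination rules, Fisher's method and Stouffer's method with equal weights, under the assumption that the studies are independent and the p-values are one-sided. The function names are hypothetical, chosen for this illustration.

```python
import math
from statistics import NormalDist


def fisher_combined_p(p_values):
    """Fisher's method: -2 * sum(ln p) is chi-squared with 2k degrees of
    freedom under the null, assuming k independent studies."""
    k = len(p_values)
    half_stat = -sum(math.log(p) for p in p_values)  # chi-squared statistic / 2
    # Closed-form chi-squared survival function for even degrees of freedom 2k.
    return math.exp(-half_stat) * sum(
        half_stat**i / math.factorial(i) for i in range(k)
    )


def stouffer_combined_p(p_values):
    """Stouffer's method, equal weights: sum the z-scores of one-sided
    p-values and rescale by sqrt(k)."""
    norm = NormalDist()
    z = sum(norm.inv_cdf(1 - p) for p in p_values) / math.sqrt(len(p_values))
    return 1 - norm.cdf(z)


print(fisher_combined_p([0.1, 0.1]))
print(stouffer_combined_p([0.1, 0.1]))
```

For two p=0.1 studies these come out at roughly 0.056 (Fisher) and 0.035 (Stouffer), not 0.01. The direction of the point still holds: two weak results pointing the same way combine into stronger evidence than either alone, not into evidence against.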

To all your points about the overloading of "Bayesian", fair enough. I guess I just don't see why that overloading is necessary.

We lack a good framework for doing this. Bayes' rule is the answer to that problem that provides the promise of a solution. The solution of waiting a few years and then reading a meta-review is unsatisfying.

Sure, Bayes' rule provides a formalization of updating beliefs based on evidence, but you can still be dead wrong. In particular, setting a prior on any given issue isn't enough. You have to be prepared to update for evidence of the form "I am really bad at setting priors". And really, priors are just a (possibly arbitrary) way of digesting existing evidence. Sometimes they can be very useful (avoiding privileging the hypothesis) but sometimes they are quite arbitrary.

There are issues such as the Bem paper about porno-precognition, where frequentist techniques did suggest that porno-precognition is real but analysing Bem's data with Bayesian methods suggested it's not.

According to the Slate Star Codex article, Bem's results stand up to Bayesian analysis quite well (that is, they have a strong Bayes factor). The only exception he mentioned was "I begin with a very low prior for psi phenomena, and a higher prior for the individual experiments and meta-analysis being subtly corrupt"; but there's nothing especially helpful about this in actually fixing the experimental design and meta-analysis.

Part of LW is that it's a place to discuss how an AGI could be structured. As such we care about the philosophic level of how you come to know that something is true. As such there's an interest in going as basic as possible when looking at epistemology.

How you get from AGI to epistemology eludes me. As long as the AGI can accurately model its interactions with the environment, that's really all it needs (or can hope) to do.

That sentence is quite easy to say, but it effectively means there's no such thing as pure absolute objective truth. If you use tools A you get truth X and if you use tools B you get truth Y. Neither X nor Y is "more true". That's not an appealing conclusion to many people.

One of them is more useful for prediction and inference. They can guide you towards observing mechanisms useful for future hypothesis generation. That's all you can hope for. Especially in the case of "are low-salt diets healthy". A "Yes" or "No" to that question will never be truthful, because "health" and "for what segments of the population" and "in conjunction with what other lifestyle factors" are left underspecified. And you'll never get rid of the kernel of doubt that the low-sodium lobby has been the silent force behind all the anti-salt research this whole time.

The best you can do is provide enough evidence that anyone who points out your hypothesis is not truth can be reasonably called a pedant or conspiracy theorist, but not 100% guaranteed wrong.

As you might see, I am a fan of the idea of Dissolving epistemology.

Can you point to examples of these "holy wars"? I haven't encountered something I'd describe like that, so I don't know if we've been seeing different things, or just interpreting it differently.

To me it looks like a tension between a method that's theoretically better but not well-established, and a method that is not ideal but more widely understood so more convenient - a bit like the tension between the metric and imperial systems, or between flash and html5.

To me it looks like a tension between a method that's theoretically better


It's because Bayesian methods really do claim to be more than just a set of tools. They are supposed to be universally applicable.


[etc.]

Ugh. Here is a good heuristic:

"Not in stats or machine learning? Stop talking about this."

Dude, I'm being genuinely curious about what "holy wars" he's talking about. So far I got:

  • a definition of "holy war" in this context
  • a snotty "shut up, only statisticians are allowed to talk about this topic"

... but zero actual answers, so I can't even tell if he's talking about some stupid overblown bullshit, or if he's just exaggerating what is actually a pretty low-key difference in opinion.

A "holy war" between Bayesians and frequentists exists in the modern academic literature for statistics, machine learning, econometrics, and philosophy (this is a non-exhaustive list).

Bradley Efron, who is arguably the most accomplished statistician alive, wrote the following in a commentary for Science in 2013 [1]:

The term "controversial theorem" sounds like an oxymoron, but Bayes' theorem has played this part for two-and-a-half centuries. Twice it has soared to scientific celebrity, twice it has crashed, and it is currently enjoying another boom. The theorem itself is a landmark of logical reasoning and the first serious triumph of statistical inference, yet is still treated with suspicion by most statisticians. There are reasons to believe in the staying power of its current popularity, but also some signs of trouble ahead.

[...]

Bayes' 1763 paper was an impeccable exercise in probability theory. The trouble and the subsequent busts came from overenthusiastic application of the theorem in the absence of genuine prior information, with Pierre-Simon Laplace as a prime violator. Suppose that in the twins example we lacked the prior knowledge that one-third of twins are identical. Laplace would have assumed a uniform distribution between zero and one for the unknown prior probability of identical twins, yielding 2/3 rather than 1/2 as the answer to the physicists' question. In modern parlance, Laplace would be trying to assign an "uninformative prior" or "objective prior", one having only neutral effects on the output of Bayes' rule. Whether or not this can be done legitimately has fueled the 250-year controversy.

Frequentism, the dominant statistical paradigm over the past hundred years, rejects the use of uninformative priors, and in fact does away with prior distributions entirely. In place of past experience, frequentism considers future behavior. An optimal estimator is one that performs best in hypothetical repetitions of the current experiment. The resulting gain in scientific objectivity has carried the day, though at a price in the coherent integration of evidence from different sources, as in the FiveThirtyEight example.

The Bayesian-frequentist argument, unlike most philosophical disputes, has immediate practical consequences.
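
The 1/2 versus 2/3 figures in the twins example can be reproduced with a few lines of exact arithmetic. This is a sketch under stated assumptions: the excerpt above gives only the prior (one-third of twins are identical) and the two answers, so the observation (the twins are seen to be the same sex) and the likelihoods used here are assumptions about the setup, and `posterior_identical` is a hypothetical helper name.

```python
from fractions import Fraction


def posterior_identical(prior_identical):
    """P(identical | twins are the same sex) by Bayes' rule.

    Assumed likelihoods: identical twins are always the same sex,
    fraternal twins are the same sex half the time.
    """
    like_identical = Fraction(1)
    like_fraternal = Fraction(1, 2)
    numerator = prior_identical * like_identical
    evidence = numerator + (1 - prior_identical) * like_fraternal
    return numerator / evidence


print(posterior_identical(Fraction(1, 3)))  # informative prior from the quote
print(posterior_identical(Fraction(1, 2)))  # mean of a uniform prior on the proportion
```

Because the likelihood is linear in the unknown proportion of identical twins, averaging over a uniform prior on that proportion gives the same answer as plugging in its mean of 1/2. That is why the second call stands in for Laplace's approach here.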

In another paper published in 2013, Efron wrote [2]:

The two-party system [Bayesian and frequentist] can be upsetting to statistical consumers, but it has been a good thing for statistical researchers — doubling employment, and spurring innovation within and between the parties. These days there is less distance between Bayesians and frequentists, especially with the rise of objective Bayesianism, and we may even be heading toward a coalition government.

The two philosophies, Bayesian and frequentist, are more orthogonal than antithetical. And of course, practicing statisticians are free to use whichever methods seem better for the problem at hand — which is just what I do.

Thirty years ago, Efron was more critical of Bayesian statistics [3]:

A summary of the major reasons why Fisherian and NPW [Neyman-Pearson-Wald] ideas have shouldered Bayesian theory aside in statistical practice is as follows:

  1. Ease of use: Fisher’s theory in particular is well set up to yield answers on an easy and almost automatic basis.
  2. Model building: Both Fisherian and NPW theory pay more attention to the preinferential aspects of statistics.
  3. Division of labor: The NPW school in particular allows interesting parts of a complicated problem to be broken off and solved separately. These partial solutions often make use of aspects of the situation, for example, the sampling plan, which do not seem to help the Bayesian.
  4. Objectivity: The high ground of scientific objectivity has been seized by the frequentists.

None of these points is insurmountable, and in fact, there have been some Bayesian efforts on all four. In my opinion a lot more such effort will be needed to fulfill Lindley’s prediction of a Bayesian 21st century.

The following bit of friendly banter in 1965 between M. S. Bartlett and John W. Pratt shows that the holy war was ongoing 50 years ago [4]:

Bartlett: I am not being altogether facetious in suggesting that, while non-Bayesians should make it clear in their writings whether they are non-Bayesian Orthodox or non-Bayesian Fisherian, Bayesians should also take care to distinguish their various denominations of Bayesian Epistemologists, Bayesian Orthodox and Bayesian Savages. (In fairness to Dr Good, I could alternatively have referred to Bayesian Goods; but, oddly enough, this did not sound so good.)

Pratt: Professor Bartlett is correct in classifying me a Bayesian Savage, though I might take exception to his word order. On the whole, I would rather be called a Savage Bayesian than a Bayesian Savage. Of course I can quite see that Professor Bartlett might not want to admit the possibility of a Good Bayesian.

For further reading I recommend [5], [6], [7].

[1]: Efron, Bradley. 2013. “Bayes’ Theorem in the 21st Century.” Science 340 (6137) (June 7): 1177–1178. doi:10.1126/science.1236536.

[2]: Efron, Bradley. 2013. “A 250-Year Argument: Belief, Behavior, and the Bootstrap.” Bulletin of the American Mathematical Society 50 (1) (April 25): 129–146. doi:10.1090/S0273-0979-2012-01374-5.

[3]: Efron, B. 1986. “Why Isn’t Everyone a Bayesian?” American Statistician 40 (1) (February): 1–11. doi:10.1080/00031305.1986.10475342.

[4]: Pratt, John W. 1965. “Bayesian Interpretation of Standard Inference Statements.” Journal of the Royal Statistical Society: Series B (Methodological) 27 (2): 169–203. http://www.jstor.org/stable/2984190.

[5]: Senn, Stephen. 2011. “You May Believe You Are a Bayesian but You Are Probably Wrong.” Rationality, Markets and Morals 2: 48–66. http://www.rmm-journal.com/htdocs/volume2.html.

[6]: Gelman, Andrew. 2011. “Induction and Deduction in Bayesian Data Analysis.” Rationality, Markets and Morals 2: 67–78. http://www.rmm-journal.com/htdocs/volume2.html.

[7]: Gelman, Andrew, and Christian P. Robert. 2012. “‘Not Only Defended but Also Applied’: The Perceived Absurdity of Bayesian Inference”. Statistics; Theory. arXiv (June 28).

Dude, I'm being genuinely curious about what "holy wars" he's talking about.

For lots of "holy war" anecdotes, see The Theory That Would Not Die by Sharon Bertsch McGrayne.

...I can't even tell if he's talking about some stupid overblown bullshit, or if he's just exaggerating what is actually a pretty low-key difference in opinion.

Do you consider personal insults, accusations of fraud, or splitting academic departments along party lines to be "a pretty low-key difference in opinion"? If so, then it is "overblown bullshit," otherwise it isn't.

Ilya responded to your second paragraph, not the first one. Metric vs. imperial or Flash vs. HTML5 are not good analogies.

The term "holy war" or "religious war" is often used to describe debates where people advocate for a side with an intensity disproportionate to the stakes (e.g. the proper pronunciation of "gif", vi vs. emacs, surrogate vs. natural primary keys in an RDBMS). That's how I read the OP, and it's fitting in context.

Sure, I'm just not sure which debates he's referring to ... is it on LessWrong? Elsewhere?

Can you point to examples of these "holy wars"? I haven't encountered something I'd describe like that, so I don't know if we've been seeing different things, or just interpreting it differently.

Various bits of Jaynes's "Confidence intervals vs Bayesian intervals" seem holy war-ish to me. Perhaps the juiciest bit (from pages 197-198, or pages 23-24 of the PDF):

I first presented this result to a recent convention of reliability and quality control statisticians working in the computer and aerospace industries; and at this point the meeting was thrown into an uproar, about a dozen people trying to shout me down at once. They told me, "This is complete nonsense. A method as firmly established and thoroughly worked over as confidence intervals can't possibly do such a thing. You are maligning a very great man; Neyman would never have advocated a method that breaks down on such a simple problem. If you can't do your arithmetic right, you have no business running around giving talks like this".

After partial calm was restored, I went a second time, very slowly and carefully, through the numerical work [...] with all of them leering at me, eager to see who would be the first to catch my mistake [...] In the end they had to concede that my result was correct after all.

To make a long story short, my talk was extended to four hours (all afternoon), and their reaction finally changed to: "My God – why didn't somebody tell me about these things before? My professors and textbooks never said anything about this. Now I have to go back home and recheck everything I've done for years."

This incident makes an interesting commentary on the kind of indoctrination that teachers of orthodox statistics have been giving their students for two generations now.

During our Hamburg Meetup we discussed selection pressure on humans. We agreed that there is almost none on mutations affecting health in general due to medicine. But we agreed that there is tremendous pressure on contraception. We identified four ways evolution works around contraception. We discussed what effects this could have on the future of society. The movie Idiocracy was mentioned. This could be a long term (a few generations) existential risk.

The four ways evolution works around contraception:

  • Biological factors. Examples are hormones compensating for the contraceptive effects of the pill, or allergies to condoms. These are easily recognized, measured and countered by the much faster operating pharma industry. There are also few ethical issues with this.

  • Subconscious mental factors. Factors mostly leading to non-use or misuse of contraception. Examples are carelessness, impulsiveness, fear, and insufficient understanding of contraceptive usage. These are what some fear will lead to collective stultification. There are ethical injunctions against 'curing' these factors even where that is medically/therapeutically possible.

  • Conscious mental factors. Factors leading to explicit family planning, e.g. children/family as terminal goals. These lead to a conscious use of contraception. The effect is less pronounced but likely leads to healthy and better educated children. These are actively encouraged, but my personal impression is that this is less an area susceptible to education (because it depends on one's terminal goals).

  • Group selection factors. These are factors favoring groups which collectively have more children. The genetic effects are likely weak here but the memetic effects are strong. A culture with social norms against contraception or for large families is likely to out-birth other groups.

Any mistakes? Do you agree? Are we missing something?

EDIT: Fixed link, typos

Group selection factors. These are factors favoring groups which collectively have more children. The genetic effects are likely weak here but the memetic effects are strong. A culture with social norms against contraception or for large families is likely to out-birth other groups.

These will by far be the strongest. See for example the birth rates of religious people versus anyone else.

These discussions all have the same problem. They misapprehend how slow evolution is. Long before any such selection can take place, the human genome is going to get rewritten end to end by deliberate technological intervention. Or heck, people will just stop dying - universal survival means no selection.

This means that the only thing that matters for the persistence of any human trait is how much they are valued. Uhm. Including how much they are valued by already-modified humans. A few rounds of iteration on that theme and I can guarantee at least one thing about future humanity: They will be one hundred percent satisfied with their physical incarnation. (because otherwise, it'd get changed.)

Think it through. How long do you think it will take before we master genetic engineering and decide to use it? 50 years? 500? 5000? Because at datum 5000, evolution will have done bugger-all to the genome. I mean, lactose tolerance might be a bit more common... but overall? Tech is fast. Social and legal change is slower, but compared to evolution? Blindingly fast. And these are some weak-sauce selective pressures. Most people do have kids. Failing at contraception does not shift the lifetime number of children reliably upwards, it just fucks you over economically. And kids are expensive.

Small changes to genotype don't imply small changes to phenotype.

Evolution is slow. It takes generations. Depending on the selection pressure these may be quite few. Assume sexual drive were the only determining factor for reproductive fitness (which probably is a good approximation for some animals) and you introduce a 95% successful 'contraception' (e.g. a genetic modification to avoid reproduction - this has been done for mosquitoes) and guess how many generations it takes to work around it. Now humans use 95% reliable contraceptives - but their usage is regulated by complex processes, so no simple analysis suffices (just think of the misinterpretation of the baby-bust/pill-gap).
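
The "guess how many generations" thought experiment can be made concrete with a deliberately crude toy model. Everything here is an assumption for illustration: a haploid population, a heritable "workaround" variant with fitness 1, everyone else at a fixed relative fitness, no mutation, no drift, and a hypothetical starting frequency of 0.1%. It says nothing about humans, where, as noted above, usage is regulated by far more complex processes.

```python
def generations_to_majority(start_freq, relative_fitness):
    """Generations until a 'workaround' variant exceeds 50% of the population.

    Toy haploid selection model: the workaround variant has fitness 1 and all
    others have `relative_fitness` (0.05 mimics a 95% effective block on
    reproduction). Requires 0 < start_freq and relative_fitness < 1.
    """
    freq, generations = start_freq, 0
    while freq <= 0.5:
        # Standard replicator update: share of next generation's offspring.
        freq = freq / (freq + (1 - freq) * relative_fitness)
        generations += 1
    return generations


print(generations_to_majority(0.001, 0.05))  # very strong selection
print(generations_to_majority(0.001, 0.9))   # much weaker selection
```

Under the 95% assumption the variant becomes the majority within a handful of generations, while a 10% fitness difference takes dozens. The outcome is driven almost entirely by the assumed strength of selection, which is exactly the quantity in dispute for humans.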

Additionally, we don't have to limit ourselves to genetic evolution. We could also consider memetic evolution - the one invoked somewhat imprecisely in point 4. Memes evolve faster. It could happen that meme-complexes joining birth-control and anti-science out-breed progress within a few generations.

Sure, after 500 years we'd likely have the technological means - if anyone is still interested in technology then. And for some, 500 may be a more likely date than 50.

It takes many generations. Human generations are quite long.

Without a technological civilization, the old-time pressures of hunger and violence will dominate everything else, which in some ways favors various means of birth control. Because having 6 kids and having all of them die due to splitting available resources too many ways is not a successful strategy. Therefore, your projection only makes sense in a continuing technological civilization, in which case engineering happens.

And again: most people have kids. Successful use of birth control allows you to control the timing and number of said kids. The mosquito analogy holds no water whatsoever. If you want to model the selective advantages and disadvantages of this, you are going to need extensive real world data over generations and a computing model projecting forward, and you would still be making stuff up.

Therefore, your projection only makes sense in a continuing technological civilization, in which case engineering happens.

Agreed. But the speed of technology is estimated quite variably. And at least currently there are already ethical (read: memetic) constraints on applying technology to reproduction. So one could argue that the selection pressure is already doing its work.

you are going to need extensive real world data [for] projecting forward.

Agreed. What do you propose? Assuming it's too complicated to contemplate?

.... Yes. I mean, if you want to do a phd's worth of work, there are existing datasets one could mine - but the time horizon (since the legalization of birth control) is so short and the social context regarding reproduction has been shifting so heavily during this period that any predictions you make would end up being barely guesses. Fortunately, the subset of plausible futures in which this matters is absurdly small. The world would essentially have to enter into technological and social stasis for many thousands of years, and well. Uhm. No.

The Marching Morons has a lot to answer for, really, since variations on this idea crop up like weeds, and it is a pretty absurd scenario.

The Marching Morons has a lot to answer for, really, since variations on this idea crop up like weeds,

This is kind of a relevant argument because it means this, despite my non-political phrasing, is really a political topic, because the opinion coalition effects are possibly much stronger than any solid predictions to be had. Or to rephrase this: any actual biological effect is outweighed by memetic effects.

it is a pretty absurd scenario.

I beg to differ.

A statistical look at whether bike helmets make sense-- concludes that there are some strong arguments against requiring bike helmets, and that drivers give less room to cyclists wearing helmets.

The Amanda Knox prosecution saga continues: if the original motive does not hold, deny the need for a motive.

When I'm procrastinating on a project by working on another, sexier project, it feels exactly like a love triangle where all three participants are inside my head, with all the same pleading, promises and infidelities. I wish that told us something new about procrastination or love!

Why engineering hours should not be viewed as fungible-- increasing speed/preventing bottlenecks is important enough to be worth investing in. An example of how to be utilitarian without being stupid about it.

Any recommendations for discussions of how to figure out what's important to measure?