Imagine you are a Good Scientist. You know about p-hacking and the replication crisis. You want to follow all best practices. You want to be doing Good Science!

You're designing an experiment to detect if there's a correlation between two variables.

Your Good Scientist has gone off the rails already. Why do they want to know if there's a correlation between two variables? What use is a correlation?

I am not seeing where your Bayesian Scientist is doing any better. He's dropped p-values and adopted a prior, but he's still just looking for correlations and expressing results according to the Bayesian ritual instead of the Frequentist ritual. But nobody cares whether smokers tend to be taller or shorter than non-smokers. They care about whether smoking stunts growth. A Truly Good Scientist needs to be looking for causal structures and mechanisms.

[-]mukashi4y50

Looking for causal structures and mechanisms does entail (among other things) measuring correlations. Would your criticism still be valid if he had used a different example? He could have chosen anything else; the example was only there to illustrate a point.

[-]Yair Halberstadt4y10

Exactly as mukashi was saying, the correlation is purely an example of something I want to find out about the world. The process of drawing inferences from correlations could be improved too, but that's a different topic, and not really relevant to the central point of this post.

[-]Richard_Kennaway4y40

The point I'm raising is independent of the example. "Looking for a correlation" is never the beginning of an enquiry, and, pace mukashi, is not necessarily a part of the enquiry. What is this Scientist really wanting to study? What is the best way to study that?

I work with biologists who study plants, trying to work out how various things happen, such as the development of leaf shapes, or the development of the different organs of flowers, or the process of building cell walls out of cellulose fibrils. Whatever correlations they might measure from time to time, those are subordinate to questions of what genes are being expressed where, and how biological structures get assembled.

[-]Yair Halberstadt4y10

That may be the case, but I think that is peripheral to the point of this post. If for some reason I wanted to find out the value of a variable (and this variable could be anything, including a correlation), how would I go about doing it?

[-]Richard_Kennaway4y20

I am taking the point of the post to be as indicated in the title and the lead: creating a model for doing Empirical Science. Finding out the value of a variable — especially one with no physical existence, like a correlation between two other variables — is a very small part of science.

[-]ChristianKl4y20

Finally, in order to come to a conclusion via Bayesian updating, you first need a prior. The problem is that there's no practical objective way to come up with a prior (Solomonoff induction is not practical). This means that you can come to any conclusion you like based on the available evidence by choosing a suitable prior.
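
To make the prior-sensitivity worry concrete, here is a minimal sketch. It assumes a Beta-Binomial model, which the post does not specify, and the data (7 successes in 10 trials) and the three priors are hypothetical numbers chosen only for illustration.

```python
def posterior_mean(prior_a, prior_b, successes, failures):
    """Posterior mean of a Beta(prior_a, prior_b) prior after binomial data.

    The Beta prior is conjugate to the binomial likelihood, so the posterior
    is Beta(prior_a + successes, prior_b + failures).
    """
    a = prior_a + successes
    b = prior_b + failures
    return a / (a + b)


# Hypothetical evidence: 7 successes in 10 trials.
successes, failures = 7, 3

# Hypothetical priors: one flat, two strongly opinionated.
priors = {
    "uniform Beta(1, 1)": (1, 1),
    "sceptical Beta(1, 50)": (1, 50),
    "credulous Beta(50, 1)": (50, 1),
}

results = {
    name: posterior_mean(a, b, successes, failures)
    for name, (a, b) in priors.items()
}

for name, mean in results.items():
    print(f"{name}: posterior mean = {mean:.3f}")
```

With the same ten observations, the three priors give posterior means of roughly 0.67, 0.13 and 0.93. With this little data the prior dominates; with enough data the three would converge, which is the usual counterargument.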

One way to deal with that would be to require scientists to specify their priors when registering their study setup. It doesn't increase the complexity of registering the study very much but it does give you a prior. It also allows you to look at all the studies by a given scientist to find out how well calibrated their priors are.
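
One possible way to check calibration is sketched below. It assumes each registered prior is a single probability that the study's hypothesis is true, and that every study later resolves to true or false. Both are simplifications, and the numbers are made up. The Brier score is just one scoring rule that could be used here.

```python
def brier_score(forecasts, outcomes):
    """Mean squared error between stated probabilities and 0/1 outcomes.

    Lower is better; always answering 0.5 scores exactly 0.25.
    """
    if len(forecasts) != len(outcomes):
        raise ValueError("forecasts and outcomes must have the same length")
    return sum((p - o) ** 2 for p, o in zip(forecasts, outcomes)) / len(forecasts)


# Hypothetical record for one scientist: the prior probability registered
# for each study, and whether the hypothesis later held up (1) or not (0).
registered_priors = [0.9, 0.8, 0.7, 0.6, 0.9]
outcomes = [1, 0, 1, 0, 1]

score = brier_score(registered_priors, outcomes)
baseline = brier_score([0.5] * len(outcomes), outcomes)

print(f"scientist: {score:.3f}, always-50% baseline: {baseline:.3f}")
```

A scientist whose score is no better than the 50% baseline is getting little out of their stated priors, which is the kind of track record registration would expose.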
