"For unsolved problems, you can only find the correct causative variable with a "firehose of information." Then you can go on to prove you're right via a properly controlled experiment."


That second part often doesn't happen. For [bio]medical experiments it is just too expensive. Data mining ensues, and any variables with significant p-values are then published. The medical journals are rife with this, which is one reason 30-50% of medical research proves unrepeatable.

Never underestimate the human tendency to do the easiest thing rather than the correct one. Science can be painstakingly hard to get right, but the pressures to publish are high. I've seen it first hand in biotech, where the obvious questions to ask of the "result" were ignored.

Measuring lots of other things leads to the pathology of data mining rather than to a search for the correct causative variable. The better experimental technique is to investigate each confounding variable in turn and try to ensure it is eliminated. Sometimes this can be hard, but that is no excuse for reporting noise instead of doing properly CONTROLLED experiments.

Data mining is so problematic that medical journals have insisted that the experimental hypothesis be defined in advance, so that unexpected variables with significant p-values are not reported in its place (p-hacking).
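To make the data mining problem concrete, here is a minimal sketch of what happens when you measure lots of variables that have no real effect and then pick out the significant ones. Everything in it is an assumption for illustration: the function names are made up, the data are pure Gaussian noise with known unit variance, and the p-value comes from a simple two-sided z-test rather than whatever test a real study would use.

```python
import math
import random


def two_sided_p_value(group_a, group_b):
    """Two-sided z-test p-value for a difference in means.

    Assumes both groups are drawn from distributions with known unit variance
    (an illustrative simplification, not a general-purpose test).
    """
    n_a, n_b = len(group_a), len(group_b)
    mean_diff = sum(group_a) / n_a - sum(group_b) / n_b
    std_err = math.sqrt(1.0 / n_a + 1.0 / n_b)
    z = mean_diff / std_err
    # P(|Z| > |z|) for a standard normal Z
    return math.erfc(abs(z) / math.sqrt(2.0))


def count_false_positives(num_variables=200, samples_per_group=50,
                          alpha=0.05, seed=0):
    """Count how many pure-noise variables look 'significant' at level alpha."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(num_variables):
        # Both groups come from the same distribution: there is no real effect.
        treated = [rng.gauss(0.0, 1.0) for _ in range(samples_per_group)]
        control = [rng.gauss(0.0, 1.0) for _ in range(samples_per_group)]
        if two_sided_p_value(treated, control) < alpha:
            hits += 1
    return hits


if __name__ == "__main__":
    n = 200
    hits = count_false_positives(num_variables=n)
    print(f"{hits} of {n} noise variables had p < 0.05")
```

With a threshold of 0.05, roughly 5% of the noise variables should come out "significant" on average, so the more variables you measure, the more publishable-looking results you get from nothing. Defining the hypothesis in advance means only one of those tests counts.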

I would far rather experimenters do the 1-bit experiment and then, if the result doesn't falsify the hypothesis, think about other explanations for the result and check those variables in the same way. Good experimentation is not for the lazy.