Related: Science: do it yourself, Some Heuristics for Evaluating The Soundness of the Academic Mainstream in Unfamiliar Fields, The Neglected Virtue of Scholarship

There's been some recent discussion about the value of answering your questions by reading the established scholarly literature in a field. On the one hand, if you never read other researchers' work, you'll be trying to reinvent the wheel, and you may well get things very wrong. The best way to learn physics is to pick up a physics textbook, not to try to deduce the laws of nature on your own. On the other hand, sometimes the leading scientists are wrong. Or sometimes you have a question that has never been studied scientifically. It's not always enough just to look at what the experts say; sometimes you need a little independent thought, or even the "DIY science" approach of looking at the data yourself and drawing your own conclusions. Ideally, these two approaches aren't really rivals; they're complementary. "DIY science" gives you insights into how to conduct scholarship -- what resources to seek out, and what claims to trust.

Looking up other people's research is scholarship, not science. Scholarship isn't bad. In most fields scholarship is useful, and in technical fields it's a prerequisite to doing science. But scholarship has its drawbacks.  If there is no high-quality body of scientific research on a question (say, "How do people get very rich?") then looking up "expert" opinion isn't especially useful. And if there is a body of scientific research, but you have reason to suspect that the scientific community isn't well-informed or evenhanded, then you can't just rely on "experts."  But you don't want to be mindlessly contrarian either.  You need some independent way to evaluate whom to believe.

Take a topic like global warming.  Is the scientific literature accurate? Well, I don't know. To know the answer, I'd have to know more about geophysics myself, be able to assess the data myself, and compare my "DIY science" to the experts and see if they match. Or, I'd have to know something about the trustworthiness of peer-reviewed scientific studies in general -- how likely they are to be true or false -- and use that data to inform how much I trust climate scientists. Either way, to have good evidence to believe or not believe scientists, I'd need data of my own.

The phrase "DIY science" makes it sound like there's some virtue in going it alone. All alone, no help from the establishment. I don't think there's any virtue in that. Help is useful! And after all, even if you tabulate your own data, it's often data that someone else gathered. (For example, you could learn how people become billionaires by compiling stats on Fortune 500 lists.)  This isn't My Side of the Mountain science. It's not idealizing isolation.  The "do it yourself" step is the step where you draw your own conclusion directly from data.  You don't have to neglect scholarship -- you just have to think for yourself at some point in the process.

The trouble with doing all scholarship but no science is that you have no way to assess the validity of what you read. You can use informal measures (prestige? number of voices in agreement? most cited? most upvotes?), but how do you know whether those informal measures correlate with the truth of an argument? At some point you have to look at some kind of data and draw your own conclusion from it. Critically assessing scientific literature eventually requires you to do some DIY science. Here, let's look at the paper's data section. Do its conclusions match its actual data? Here, let's look at this discipline's past record at predicting future events. Does it have a good track record? Here, let's look at these expert recommendations. How often are they put into practice in the real world, and with what success?

One way of uniting scholarship and DIY science is the category of metastudies -- how often is such-and-such class of experts or publications correct?

Political pundits' predictions are hardly better than random, and educational credentials don't help accuracy.

As much as 90% of the published medical information that doctors rely upon may be wrong.

To get really meta: most CDC meta-analyses are methodologically flawed.
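If you wanted to run a tiny meta-check of this kind yourself, the arithmetic is simple. Here is a minimal sketch that scores a set of documented forecasts against what actually happened; the forecasts in it are invented placeholders, not real data.

```python
# Sketch: scoring a class of experts' predictions against outcomes.
# The forecasts below are made-up placeholders; in practice you'd compile
# them from predictions that were documented before the events occurred.

forecasts = [
    # (stated probability that the event would happen, did it happen?)
    (0.90, True),
    (0.80, False),
    (0.70, True),
    (0.60, False),
    (0.95, True),
]

# Brier score: mean squared error between stated probability and outcome.
# 0.0 is perfect calibration; 0.25 is what always saying "50/50" earns.
brier = sum((p - (1.0 if happened else 0.0)) ** 2
            for p, happened in forecasts) / len(forecasts)

# Hit rate: how often the "more likely than not" call turned out right.
hits = sum((p > 0.5) == happened for p, happened in forecasts)

print(f"Brier score: {brier:.3f}")
print(f"Hit rate:    {hits}/{len(forecasts)}")
```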

Another thing you can do to unite scholarship and DIY science is looking for patterns, commonalities and disagreements in existing literature.  Thanks to modern large databases, you can sometimes do this statistically.

IHOP searches the PubMed literature by genes or proteins.

Google ngrams allow you to search word prevalence to get a rough idea of trends in writing over time.

The digital humanities are a new field that involves running somewhat more subtle statistics on historical and scholarly documents.

If you want to attach a number to "scientific consensus," look at the Science Citation Index.  Many of these tools are restricted to universities, but you can, for example, measure how highly cited a journal or an author is.
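If you don't have university access, you can get a rough (and incomplete) version of the same number at home. Here is a sketch using the free Crossref REST API rather than the Science Citation Index itself; its counts only reflect what Crossref indexes, so treat them as lower bounds, and the query string is just an example.

```python
# Sketch: rough citation counts via the public Crossref API (a free,
# partial stand-in for a proper citation index like SCI).
import requests

def rough_citation_counts(query, rows=5):
    resp = requests.get(
        "https://api.crossref.org/works",
        params={
            "query.bibliographic": query,
            "rows": rows,
            "sort": "is-referenced-by-count",
            "order": "desc",
        },
        timeout=30,
    )
    resp.raise_for_status()
    for item in resp.json()["message"]["items"]:
        title = (item.get("title") or ["(untitled)"])[0]
        count = item.get("is-referenced-by-count", 0)
        print(f"{count:6d}  {title[:80]}")

rough_citation_counts("citation distortions unfounded authority")
```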

EDIT DUE TO MORENDIL: it's possible, using graph theory alone, to notice misleading information in a scientific subfield by looking at the pattern of citations.  Looking at the citation graph, it's possible to observe bias (a systematic tendency to under-cite contradictory evidence), amplification (the prevalence of a belief in the scientific community becomes stronger due to new links to a few influential and highly-linked papers, even though no new data is being presented), and invention (the propagation of claims that are not actually backed by data anywhere.) Negative results (the lack of conclusive results) systematically fail to propagate through scholarly networks.  
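As a toy illustration of what "looking at the pattern of citations" can mean in practice (nothing like the full analysis in the work linked above, and with an invented graph), you can ask questions as simple as: do the papers that support a claim ever cite the papers that dispute it, and which papers accumulate authority purely by being linked?

```python
# Toy sketch of one citation-graph check: do papers that support a claim
# ever cite the papers that dispute it? Papers and edges here are invented.
import networkx as nx

stance = {  # which conclusion each (made-up) paper reaches
    "A": "support", "B": "support", "C": "support",
    "D": "support", "E": "dispute", "F": "dispute",
}

G = nx.DiGraph()
G.add_nodes_from(stance)
G.add_edges_from([  # edge (X, Y) means "paper X cites paper Y"
    ("B", "A"), ("C", "A"), ("C", "B"), ("D", "B"), ("D", "C"), ("F", "E"),
])

supporters = [p for p, s in stance.items() if s == "support"]
disputers = {p for p, s in stance.items() if s == "dispute"}

# Citation bias: how many supporting papers cite *any* disputing paper?
cite_contrary = sum(
    any(cited in disputers for cited in G.successors(p)) for p in supporters
)
print(f"{cite_contrary} of {len(supporters)} supporting papers cite contrary evidence")

# Amplification: in-degree shows which papers accumulate authority purely
# through being linked, whether or not they contain new data.
for paper, times_cited in sorted(G.in_degree(), key=lambda kv: -kv[1]):
    print(paper, times_cited)
```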

We'd need more sophisticated data analysis than now exists, but one of my dreams is that one day we could develop tools that search the existing literature on a search term, say, "Slavery caused the American Civil War," and allow you to estimate how contentious that claim is, how many sources are for and against it, and what the sources' citation rates and links to other phrases tell you about who holds what opinions, somewhat automating the process of reading and making sense of what other people wrote.

An even more ambitious project: making a graph of which studies invalidate or cast doubt on which other studies, on a very big scale, so you could roughly pinpoint the most certain or established areas of science. This would require some kind of systematic method of deducing implication, though.

This can get more elaborate than you might prefer, but the point is that if you really want to know how valid a particular idea you've read is, there are quantitative ways to get closer to answering that question.  

The simplest check, of course, is the sanity test: run some simple figures to see if the "expert" view is even roughly correct.  Does past stock performance actually predict future stock performance?  Well, the data's out there. You can check.  (N.B. I haven't done this.  But I would if I were investing.)  Is cryonics worth the money?  Come up with some reasonable figures for probability and discount rates and do a net present value calculation.  (I have done this one.)  Or consider blatantly unscientific data-gathering: a survey on when and how authors get published.   Online polls and other informal data-gathering are bad methodology, but way better than nothing, and often better than intuitive "advice." If the expert "consensus" is failing your sanity tests, either you're making a mistake or they are.
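To make the shape of such a back-of-the-envelope check concrete, here is a minimal sketch of a net-present-value comparison; every figure in it is an arbitrary placeholder to be replaced with numbers you actually believe, not the ones I used.

```python
# Back-of-the-envelope expected-value check for a recurring expense.
# Every figure below is a placeholder; plug in numbers you actually believe.

annual_cost = 500.0         # e.g. yearly membership / insurance premium
years_paying = 40           # how long you expect to keep paying
discount_rate = 0.05        # your personal discount rate
p_payoff = 0.05             # probability the thing pays off at all
payoff_value = 2_000_000.0  # what a successful outcome is worth to you

# Present value of the stream of payments.
pv_costs = sum(annual_cost / (1 + discount_rate) ** t
               for t in range(years_paying))

# Expected present value of the payoff (assume it arrives at the end).
pv_payoff = p_payoff * payoff_value / (1 + discount_rate) ** years_paying

print(f"PV of costs:           {pv_costs:,.0f}")
print(f"Expected PV of payoff: {pv_payoff:,.0f}")
print("Worth it?", pv_payoff > pv_costs)
```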

A recurring question is how and when to be contrarian -- when to reject the mainstream expert judgments. One obvious example is when you suspect experts are biased to protect interests other than truth. But experts can be biased while still being right. (For example, biologists certainly have systematic biases, but that doesn't mean their embrace of the theory of evolution is the result of irrational bias.) One way or another, you're stuck with a hard epistemic problem when evaluating claims. You can test them against your own data -- but then you have to decide how much you trust your own back-of-the-envelope computations or informal data collection. You can test them against the "consensus" or bulk of the literature -- but then you have to decide whether you trust the consensus. You can test them against the track record of the field -- but then you have to decide whether you trust the very meta-analysis you're using. There's no single magic bullet. But it's probably worth it, if you're seriously curious about the truth of a claim, to try a few of these different approaches.

 

20 comments

One set of heuristics I use...

"You claim to be an expert. Can you

  1. Explain things that are not explicable in other ways (where the explanation is simpler than the result)?

  2. Fix broken things so they work again?

  3. Design and/or build things or perform procedures that have unusually useful results?

  4. Predict the future (with predictions documented in advance)?

  5. Test your theories empirically (-1 if the standard of truth is simple 'consensus')?"

As a case study, consider economics. I would give economists 0/5 on this test. Freudian psychoanalysis would also get minus 1 out of 5.

On global warming, on my reading I would give the received wisdom 4/5 and the sceptics 0/5.

[anonymous]

I get very different results using the same heuristics. They may be overly vague, or too open to interpretation. It's not clear that they're useful for much other than rationalization.

Here, I'll apply it point by point to microeconomics.

1: For example, microecon explains prices, essentially correctly, as arising out of supply and demand.

2: Adam Smith and many other economists have offered valuable and true advice about how to fix the economy, advice which worked repeatedly. Advice such as opening foreign trade.

3: Isn't this just a variation on "fix"? See 2.

4: Economists can easily predict the results of various policies, such as, to give one instance, price controls. To give another, the establishment of monopolies.

5: Microecon has been put to the empirical test repeatedly - see 4. A satisfied prediction is a passed empirical test.

Are these points, any or all, debatable? I don't think so. But the fact is you got 0/5 and I got 5/5. Makes me wonder about the real value of the heuristics as something other than rationalization.

It makes sense to consider microeconomics separately from macroeconomics, as the results are quite different. Maybe the heuristics are OK; you just need to clearly identify the field of study.

I was thinking mostly of macro-economics, but here goes:

1: For example, microecon explains prices, essentially correctly, as arising out of supply and demand.

Actually only in certain quite limited contexts. For example it has very little success in explaining prices in asset markets. In situations of monopoly or oligopoly it has little explanatory power - game theory is more relevant.

In situations of perfect competition and perfect, free knowledge of price and quality, it does work, but such situations are rarer than most economists seem to think.

And even worse, blindness to the limitations of their theories leads to an inability to see the facts, e.g. the unwillingness to notice market bubbles.

[good] Advice such as opening foreign trade.

The theory behind this basically assumes a static form of comparative advantage, that all players are time-discounting profit maximizers, etc. Again, economists are blind to the failure of free trade when its consequences are adverse, e.g. the deskilling of the US workforce and the jobs that are never coming back, and when the assumptions are not met, e.g. when countries employ protectionist policies strategically in support of objectives that go beyond time-discounting profit maximization.

4: Economists can easily predict the results of various policies, such as, to give one instance, price controls.

Economic theory would have us believe that price controls cause a mismatch between supply and demand and the development of shortages. This is often not the case, e.g. when suppliers are making profits in excess of their costs.

More generally economics as it is practised is generally blind to unmet assumptions, distributional effects, institutional effects and irrational behavior.

I think one very important principle you ignore here is the implications of what might be called the generalized weak efficient markets hypothesis. When it appears that DIY research based on publicly available information offers a potential for huge gains, the first thing one should ask oneself is why everyone (or at least a large number of smart and capable people) isn't already doing the same thing with obvious success. Often this is enough to write off the idea as a priori unworthy of consideration. Thus, for example, it can be safely concluded without further consideration that it's an infeasible idea to try getting rich based on studying the publicly available information about billionaires, or to improve one's investment strategy by studying the behavior of the stock market.

There are of course exceptions, but they're very few and far between, so one should embark on a project that contradicts this principle only if there is some very good indication that it might be an exceptional situation. (For example, it might be that the idea really is too clever to have ever occurred to anyone, or there might be some systematic biases limiting its adoption, or it might be that everyone is in fact already doing it, only you've been oblivious about it so far. However, for most people, only the third thing is less than extremely rare.)

The principle is very general, in my opinion useful in a great multitude of cases and with a whole bunch of interesting implications. I'll consider writing an article about it.

If it were possible to learn how to get rich by studying people who had done so, this would not imply that it would be easy to get rich, or that people we can observe getting rich have obviously learned to do so in this way.

  • Figuring out how to synthesize the data is work.
  • Actually synthesizing the data, having figured out how to do so, is work.
  • Implementing what you have learned to actually get rich is likely to be long-term, hard work.

Generally, the efficient market hypothesis does not imply that it is impossible to find opportunities to do work to create value. It implies that you are not likely to find opportunities to get much better than a typical return of value for work.

JGWeissman:

Generally, the efficient market hypothesis does not imply that it is impossible to find opportunities to do work to create value. It implies that you are not likely to find opportunities to get much better than a typical return of value for work.

I agree. In fact, the principle can be refined even further: you're not likely to find opportunities much better than a typical career path accomplished by people whose abilities and qualities are comparable to yours. So if you think you've found a great opportunity, it's not at all impossible that you're correct, but it does mean that you're either very lucky or exceptionally capable.

These points may seem trivial, but in reality, I'm baffled by how many people don't seem to understand them. The most obvious example is all those people who keep insisting that they can beat the stock market, but I've seen many others too.

I agree with your general principle:

There are of course exceptions, but they're very few and far between, so one should embark on a project that contradicts this principle only if there is some very good indication that it might be an exceptional situation. (For example, it might be that the idea really is too clever to have ever occurred to anyone, or there might be some systematic biases limiting its adoption, or it might be that everyone is in fact already doing it, only you've been oblivious about it so far. However, for most people, only the third thing is less than extremely rare.)

The factor I focus on when considering the potential for exceptions is money. If an exception would have given participants a way to get rich (or powerful, or laid), then I assume that it would have been considered already. So I wouldn't even bother looking at stock market 'exceptions', for example. I would look at medical 'exceptions' so long as the standard variant was the one with more financial success.

if you really want to know how valid a particular idea you've read is, there are quantitative ways to get closer to answering that question.

The ultimate in quantitative analysis is to have a system predict what your opinion should be on any arbitrary issue. The TakeOnIt website does this by applying a collaborative filtering algorithm on a database of expert opinions. To use it you first enter opinions on issues that you understand and feel confident about. The algorithm can then calculate which experts you have the highest correlation in opinion with. It then extrapolates what your opinion should be on issues you don't even know about, based on the assumption that your expert agreement correlation should remain constant. I explained the concept in more detail a while ago on Less Wrong here, but have since actually implemented the feature. Here are TakeOnIt's predictions of Eliezer's opinions. The more people add expert opinions to the database, the more accurate the predictions become.
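Here's a toy sketch of that correlate-then-extrapolate idea (not TakeOnIt's actual implementation, and with made-up experts, issues, and opinions):

```python
# Minimal sketch of the correlate-then-extrapolate idea (not TakeOnIt's
# actual algorithm). Opinions are on a -2..+2 disagree/agree scale; the
# experts, issues, and numbers are all invented.

def pearson(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs) ** 0.5
    vy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (vx * vy) if vx and vy else 0.0

experts = {
    "expert_a": {"cryonics": 2, "many_worlds": 2, "astrology": -2, "god": -2},
    "expert_b": {"cryonics": -1, "many_worlds": 0, "astrology": -2, "god": 1},
}
me = {"cryonics": 2, "astrology": -2, "god": -2}  # issues I've rated myself

# Correlate my stated opinions with each expert on the issues we share...
weights = {}
for name, opinions in experts.items():
    shared = [i for i in me if i in opinions]
    weights[name] = pearson([me[i] for i in shared],
                            [opinions[i] for i in shared])

# ...then predict my opinion on an issue I haven't rated, as a
# correlation-weighted average of the experts who have rated it.
issue = "many_worlds"
num = sum(w * experts[n][issue] for n, w in weights.items() if issue in experts[n])
den = sum(abs(w) for n, w in weights.items() if issue in experts[n])
print(f"Predicted opinion on {issue}: {num / den:+.2f}" if den else "no data")
```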

Note that the website currently requires you to publicly comment on an issue in order to get your opinion predictions. A few people have requested that you should be able to enter your opinion without having to comment. If enough people want this, I'll implement that feature.

one of my dreams is that one day we could develop tools that ... allowed you to estimate how contentious that claim was, how many sources were for and against it... and links to ... tell you about who holds what opinions, and allowed you to somewhat automate the process of reading and making sense of what other people wrote.

That's more or less the goal of TakeOnIt. I'd stress that the biggest challenge here is populating the database of expert opinions rather than building the tools.

An even more ambitious project: making a graph of which studies invalidate or cast doubt on which other studies, on a very big scale, so you could roughly pinpoint the most certain or established areas of science. This would require some kind of systematic method of deducing implication, though.

Each issue on TakeOnIt can be linked to any other issue by adding an "implication" between two issues. Green arrows link supporting positions; red arrows link contradictory positions. So for example, the issue of cryonics links to several other issues, such as the issue of whether information-theoretic death is the most real interpretation of death (which if true, supports the case for cryonics).

[anonymous]

I remember TakeOnIt and I like the principle. The downside is that the opinions and relationships have to be put in by hand, which means that it'll take time and work to fill it up with enough experts to really model the whole body of expert opinion. But it's a great site.

This talk about scholarship and DIY science reminds me of the classical distinction between empiricism and rationalism in philosophy. It seems that we've rightly identified this empiricism/rationalism thing as a false dichotomy here on LW. I'm glad to see the same happening for scholarship and DIY science, now too.

As a general comment based on my own experience, there is an enormous value in studying existing art to know precisely what science and study has actually been done -- not what people state has been done. And at least as important, learning the set of assumptions that have driven the current body of evidence.

This provides an enormous amount of context as to where you can actually attack interesting problems and make a difference. Most of my personal work has been based on following chains of reasoning that invalidated an ancient assumption that no one had revisited in decades. I wasn't clever; it was really a matter of no one having asked "why?" in many years.

An even more ambitious project: making a graph of which studies invalidate or cast doubt on which other studies

I found this article very interesting: "How citation distortions create unfounded authority: analysis of a citation network " - http://www.bmj.com/content/339/bmj.b2680.full

[anonymous]

You could do worse than stick to scientists that work for companies in a competitive industry. A favorite example of mine is chemists who work for Dupont or some such company.

My reasoning is this: if a chemist is bad and he is designing, say, a new glue, then his glue won't stick and he'll be fired. If he is not fired because his boss is stupid then his boss will be fired then he'll be fired. If the president of the company won't fire him and his boss, then the customers won't buy the glue (because it doesn't stick) and the company will go bankrupt and the chemist and his boss and president will lose their jobs. If the customers buy the glue then they'll have accidents and their ability to keep up their stupidity will be diminished, and meanwhile, smart people who are able to tell whether glue sticks will pay their money to companies that hire competent chemists.

In short, then, one way or another, the bad chemist will lose his job and the good chemist will get a raise. This assumes a competitive industry. In a monopolistic bureaucracy, all bets are off. In a regulated industry all bets are off.

This gets completely past any reliance on a human, individual or collective, arbiter of truth, such as "consensus". If "consensus opinion" is wrong, i.e., if majority opinion is wrong, then the majority will gradually be fired, and/or lose business, and/or subject itself to harm, and so the majority will decrease, and the minority which is not wrong will increase and become the majority. The arbiter of truth here is not "consensus" but natural selection (of a sort).

At any given time, this process has already been well underway for a long time. So, for example, we can be sure that right now, the majority of chemists employed by companies in competitive industries know what they're talking about. We can generalize to scientists generally: a company that is managing to survive in a competitive, unregulated industry isn't going to hire a scientist if the scientist doesn't add value at least equivalent to his salary. It can't afford to. So the scientist has to know his stuff.

We can extend this to those schools which train scientists who work in industry. They have to teach valid science, they can't teach garbage.

[anonymous]

For chemists making glue, fine. The stickiness of glue is obvious enough to everyone that making sticky glue will work better than employing scientists to claim that your glue is sticky.

For things like pharmaceuticals, tobacco, etc, you're far more likely to encounter distortion.

Yes, if the companies are just competing on brands of otherwise identical tobacco products, then we should expect uniform bias; and that's what we see.

But the pattern of bias in pharmaceuticals surprises me. One might expect that competing companies would be biased towards their own products. If that were so, we could extract unbiased estimates by comparing across drug companies (at least for patented drugs). But that's not what we see. There might be a small bias towards their own drugs, but it is swamped by a large bias towards in-patent drugs, regardless of owner, against off-patent drugs.

Yes, I can think of explanations, like that they are cooperating in not using up the public good of FDA credulity, but this isn't what I would have predicted ahead of time.

[anonymous]

That surprises me too. Do you have a citation for it?

In fact, I'm surprised that drug companies do studies on each other's drugs often enough that the effect can be discerned.

I don't mean to imply that I have a good grasp on the biases, just that they are surprising. The particular effect with patents happened with SSRIs, that they fell apart as their patents expired; I probably imply too much generality.

Drug companies study each others' drugs all the time, because FDA approval of a particular drug requires the claim that it is better, at least for some population. A typical phase 3 study compares the company's own drug, a similar recent drug, and the standard treatment that the two drugs are trying to displace.

[anonymous]

Maybe the bias there is in expecting that the competitor's drug is similar to their own (therefore also good), and that it's newer than the standard treatment (and so more advanced and better).

[anonymous]

Your first link in the related section isn't reaching across the whole title, i.e. Science: do it yourself.

Just a minor editing error, but it did catch my attention in the wrong spot :)