[ Question ]

Under what circumstances is "don't look at existing research" good advice?

by Kaj_Sotala · 1 min read · 13th Dec 2019 · 19 comments



In How I do research, TurnTrout writes:

[I] Stare at the problem on my own, ignoring any existing thinking as much as possible. Just think about what the problem is, what's confusing about it, what a solution would look like. In retrospect, this has helped me avoid anchoring myself. Also, my prior for existing work is that it's confused and unhelpful, and I can do better by just thinking hard.

The MIRI alignment research field guide has a similar sentiment:

It’s easy to fall into a trap of (either implicitly or explicitly) conceptualizing “research” as “first studying and learning what’s already been figured out, and then attempting to push the boundaries and contribute new content.”

The problem with this frame (according to us) is that it leads people to optimize for absorbing information, rather than seeking it instrumentally, as a precursor to understanding. (Be mindful of what you’re optimizing in your research!) [...]

... we recommend throwing out the whole question of authority. Just follow the threads that feel alive and interesting. Don’t think of research as “study, then contribute.” Focus on your own understanding, and let the questions themselves determine how often you need to go back and read papers or study proofs.

Approaching research with that attitude makes the question “How can meaningful research be done in an afternoon?” dissolve. Meaningful progress seems very difficult if you try to measure yourself by objective external metrics. It is much easier when your own taste drives you forward.

And I'm pretty sure that I have also seen this notion endorsed elsewhere on LW: do your own thinking, don't anchor on the existing thinking too much, don't worry too much about justifying yourself to established authority. It seems like a pretty big theme among rationalists in general.

At the same time, it feels like there are fields where nobody would advise this, or where trying to do this is a well-known failure mode. TurnTrout's post continues:

I think this is pretty reasonable for a field as young as AI alignment, but I wouldn't expect this to be true at all for e.g. physics or abstract algebra. I also think this is likely to be true in any field where philosophy is required, where you need to find the right formalisms instead of working from axioms.

It is not particularly recommended that people try to invent their own math instead of studying existing math. Trying to invent your own physics without studying real physics just makes you into a physics crank, and most fields seem to have some version of "this is an intuitive assumption that amateurs tend to believe, but is in fact wrong, though the reasons are sufficiently counterintuitive that you probably won't figure it out on your own".

But "do this in young fields, not established ones" doesn't seem quite right either. For one, philosophy is an old field, yet it seems reasonable that we should indeed sometimes do it there. And it seems that even within established fields where you normally should just shut up and study, there will be particular open questions or subfields where "forget about all the existing work and think about it on your own" ought to be good advice.

But how does one know when that is the case?



9 Answers

My field is theoretical physics, so this is where my views come from. (Disclaimer: I have not had a research position since finishing my PhD in General Relativity some 10 years ago.) Assuming you want to do original research, and you are not a genius like Feynman (in which case you would not be interested in my views anyway; what do you care what other people think?):

  • Map the landscape first. What is known, which areas of research are active, which are inactive. No need to go super deep, just get the feel for what is where.
  • Gain a basic understanding of why the landscape is the way it is. Why are certain areas being worked on? Is it fashion, ease of progress, tradition, something else? Why are certain areas ignored or stagnant? Are they too hard, too boring, unlikely to get you a research position, just overlooked, or something else?
  • Find a promising area that is not well researched, does not appear super hard, yet interests you. An interdisciplinary outlook can be useful here.
  • Figure out what you are missing to make a meaningful original contribution there. Evaluate what it would take to learn the prerequisites. Alternate between learning and trying to push the original research.
  • Most likely you will gain unexpected insights, not into the problem you are trying to solve, but into the reason why it's not being actively worked on. Go back and reevaluate whether the area is still promising and interesting. Odds are, your new perspective will lead you to get excited about something related but different.
  • Repeat until you are sure that you have learned something no one else has. Whether a question no one asked, or a model no one constructed or applied in this case, or maybe a map from a completely unrelated area.
  • Do a thorough literature search on the topic. Odds are, you will find that someone else tried it already. Reevaluate. Iterate.
  • Eventually you might find something where you can make a useful original contribution, no matter how small. Or you might not. Still, you will likely end up knowing more and having a valuable perspective and a skill set.

Physics examples: don't go into QFT, string theory, or loop quantum gravity. There is no way you can do better than, say, Witten and Maldacena and thousands of theorists with IQ 150+ and the energy and determination of a raging rhino. Quantum foundations might still have some low-hanging fruit, but the odds are against it. I have no idea about condensed matter research. A positive example: numerical relativity hit a sweet spot about 15 years ago, because the compute and the algorithms converged and there were only a few groups doing it. Odds are something similar is possible again; you just need to find where.

Also, Kaj, your research into multi-agent models of the mind, for example, might yield something really exciting and new, if looked at in the right way, whatever that is.

I basically disagree with the recommendation almost always, including for AI alignment. I do think that

The problem [...] is that it leads people to optimize for absorbing information, rather than seeking it instrumentally, as a precursor to understanding.

I often see the sentiment, "I'm going to learn linear algebra, probability theory, computational complexity, machine learning and deep RL, and then I'll have the prerequisites to do AI safety". (Possible reasons for this: the 80K AI safety syllabus, CHAI's bibliography, a general sense that you have to be an expert before you can do research.) This sentiment seems wrong to me; you definitely can and should think about important questions before learning everything that could potentially be considered "background".

The advice

let the questions themselves determine how often you need to go back and read papers or study proofs.

sounds to me like "when you feel like existing research would be useful, then go ahead and look at it, but don't feel like it's necessary", whereas I would say "as soon as you have questions, which should be almost immediately, one of the first things you should do is find the existing research and read it". The justification for this is the standard one -- people have already done a bunch of work that you can take advantage of.

The main disadvantage of this approach is that you lose the opportunity to figure things out from first principles. When you work from first principles, you often explore many branches that don't pan out, which builds intuition about why things are the way they are. You don't get that intuition nearly as well from reading about research, and once you already know the answer, you can't go back and rediscover it from first principles. But this first-principles reasoning is extremely expensive (in time), and is almost never worthwhile.

Another potential disadvantage is that you might be incorrectly convinced that a technique is good, because you don't spot the flaws in it when reading existing research, even though you could have figured it out from first principles. My preferred solution is to become good at noticing flaws (e.g. by learning how to identify and question all of the assumptions in an argument), rather than to ignore research entirely.

Side note: In the case of philosophy, if you're trying to get a paper published, then I'm told you often want to make some novel argument (since that's what gets published), which makes existing research less useful (or only useful for figuring out what not to think about). If you want to figure out the truth, I expect you would do well to read existing research.

TL;DR: Looking at existing research is great because you don't have to reinvent the wheel, but make sure you need the wheel in the first place before you read about it (i.e. make sure you have a question you are reading existing research to answer).

ETA: If your goal is "maximize understanding of X", then you should never look at existing research about X, and figure everything out from first principles. I'm assuming that you have some reason for caring about X that means you are willing to trade off some understanding for getting it done way faster.

IMO the correct rule is almost always: first think about the problem yourself, then go read everything about it that other people did, and then do a synthesis of everything you learned inside your mind. Some nuances:

  • Sometimes thinking about the problem yourself is not useful because you don't have all the information to start. For example: you don't understand even the formulation of the problem, or you don't understand why it is a sensible question to ask, or the solution has to rely on empirical data which you do not have.

  • Sometimes you can so definitively solve the problem during the first step (unprimed thinking) that the rest is redundant. Usually this is only applicable if there are very clear criteria to judge the solution, for example: mathematical proof (but, beware of believing you easily proved something which is widely considered a difficult open problem) or something easily testable (for instance, by writing some code).

  • As John S. Wentworth observed, even if the problem has already been definitively solved by others, thinking about it yourself first will often help you learn the state of the art later, and is a good exercise for your mind regardless.

  • The time you should invest in the first step depends on (i) how much progress you realistically expect to make and (ii) how much progress you expect other people to have made by now. If this is an open problem on which many talented people have worked for a long time, then expecting to make fast progress yourself is unrealistic, unless you have some knowledge to which most of those people had no access, or your talent in this domain is truly singular. In this case you should think about the problem enough to understand why it is so hard, but usually not much longer. If this is a problem on which only a few people have worked, or only for a short time, or which is obscure enough that you doubt it got the attention of talented researchers, then making comparatively fast progress can be realistic. Still, I recommend proceeding to the second step (learning what other people did) once you reach the point where you feel stuck (on the "metacognitive" level, when you don't believe you will get unstuck soon: beware of giving up too easily).

After the third step (synthesis), I also recommend doing some retrospective: what have those other researchers understood that I didn't, how did they understand it, and how can I replicate it myself in the future.

Trying to invent your own physics without studying real physics just makes you into a physics crank

This is demonstrably not (always) the case. Famously, Richard Feynman recommends that students always derive physics and math from scratch when learning. In fact his Nobel prize was for a technique (Feynman diagrams) which he developed on the fly in a lecture he was attending. What the speaker was saying didn’t make sense to him so he developed what he thought was the same theory using his own notation. Turns out what he made was more powerful for certain problems, but he only realized that much later when his colleagues questioned what he was doing on the whiteboard. (Pulled from memory from one of Feynman’s memoirs.)

One of the other comments here recommends against this unless you are a Feynman-level genius, but I think the causality is backwards on this. Feynman’s gift was traditional rationality, something which comes through very clearly in his writing. He tells these anecdotes in order to teach people how to think, and IMHO his thoughts on thinking are worth paying attention to.

Personally I always try to make sure I can derive again what I learn from first principles or the evidence. Only when I’m having particular trouble, or I have the extra time do I try to work it out from scratch in order to learn it. But when I do I come away with a far deeper understanding.

I suspect it's mostly proportional to the answer to the question "how much progress can you expect to make building on the previous work of others?" in a particular field. This is why (for example) philosophy is weird (you can make a lot of progress without paying attention to what previous folks have said), physics and math benefit from study (you can do a lot more cool stuff if you know what others know), and AI safety may benefit from original thinking (there's not much worth building off of (yet)).

I basically agree with Vanessa:

the correct rule is almost always: first think about the problem yourself, then go read everything about it that other people did, and then do a synthesis of everything you learned inside your mind.

Thinking about the problem myself first often helps me understand existing work, as it is easier to see the motivations, and solving already-solved problems is good training.

I would argue this is the case even in physics and math. (My background is in theoretical physics, and during my high-school years I took some pride in not memorizing physics and re-deriving everything when needed. It stopped being a good approach for physics from roughly 1940 onward, and somewhat backfired.)

The mistake members of "this community" (LW/rationality/AI safety) sometimes make is skipping the second step, or bouncing off it when it turns out to be hard.

A second mistake is not doing the third step properly, which leads to a somewhat strange and insular culture that can be off-putting to external experts (e.g. people partially crediting themselves for discoveries that are already known to outsiders).

Depends what your goals are, of course. If your goal is "fundamental understanding", then what I usually go for is: read stuff until there's an interesting question that I can see on my own, and then think about that question without reading further, and keep thinking on my own until I'm mostly out of thought-provoking questions.

I think one important context for not reading the existing literature first is calibration. Examining the difference between how you are thinking about a question and how others have thought about the same question can be instructive in a couple of ways. You might have found a novel approach that is worth exploring, or you might be way off in your thinking. Perhaps you've stumbled upon an obsolete way of thinking about something. Figuring out how your own thinking process lines up with the field can be extremely instructive, and super useful if you want your eventual original work to be meaningful. At the very least, you can identify your own common failure modes and work to avoid them.

The fastest and easiest way to accomplish all this is by using a sort of research loop where you collect your own thoughts and questions, then compare them with the literature and try to reconcile the two, then repeat. If you just read all the literature first, you have no way to calibrate your explorations when you finally get there.

I think this is mainly a function of how established the field is and how much time you're willing to spend on the subject. The point of thinking about a field before looking at the literature is to avoid getting stuck in the same local optima as everyone else. However, making progress by yourself is far slower than just reading what everyone has already figured out.

Thus, if you don't plan to spend a large amount of time in a field, it's far quicker and more effective to just read the literature. However, if you're going to spend a large amount of time on the problems in the field, then you want to be able to "see with fresh eyes" before looking at what everyone else is doing. This prevents everyone's approaches from clustering together.

Likewise, in a very well established field like math or physics, we can expect everyone to have already clustered around the "correct answer". It doesn't make as much sense to try to look at the problem from a new perspective, because we already have a very good understanding of the field. This reasoning breaks down once you get to the unsolved problems in the field. In that case, you want to do your own thinking to make sure you don't immediately bias yourself toward solutions that others are already working on.