Literature Review For Academic Outsiders: What, How, and Why

by namespace, 9th May 2020

A few years ago I wrote a comment on LessWrong about how most authors on the site probably don't know how to do a literature review:

On the one hand, I too resent that LW is basically an insight porn factory near completely devoid of scholarship.

On the other hand, this is not a useful comment. I can think of at least two things you could have done to make this a useful comment:

  1. Specified even a general direction of where you feel the body of economic literature could have been engaged. I know you might resent doing someone else’s research for them if you’re not already familiar with said body, but frankly the norm right now is to post webs spun from the fibrous extrusions of people’s musing thoughts. The system equilibrium isn’t going to change unless some effort is invested into moving it. Notice you could write your comment on most posts while only changing a few words.
  2. Provided advice on how one might go about engaging with ‘the body of economic literature’. Many people are intelligent and reasonably well informed, but not academics. Taking this as an excuse to mark them swamp creatures beyond assistance is both lazy and makes the world worse. You could even link to reasonably well written guides from someone else if you don’t want to invest the effort (entirely understandable).

I also linked a guide from Harvard's library (Garson & Lillvik, 2012) on how to do a literature review. But this guide makes extensive use of Flash video, which makes the content increasingly hard to access. Even if Flash were alive and well, video is not necessarily the most comfortable format. Worse still, I remember feeling that a great deal of tacit knowledge was left out of the guide, an omission that wouldn't be apparent to someone who isn't already familiar with academic culture. Even if the guide were a perfect representation of how to do an academic literature review, the priorities and types of work put together by LessWrong authors are more outsider science (Dance, 2008) than they are Harvard. For this reason I've had writing a guide to literature review aimed towards academic outsiders on my to-do list for a while.

At the same time I'm not interested in reinventing the wheel. This guide is going to focus specifically on filling in the knowledge gaps I would expect from someone who has never set foot on a college campus. The other aspects have been discussed in detail elsewhere, and where they come up I'll link to external guides.

What is a literature review?

As a process, 'literature review' is a way to become familiar with what work has already been done in a particular field or subject by searching for and studying previous work. As a document, a 'literature review' (often a small portion of a larger work) summarizes and analyzes the body of previous work that was encountered during that process, often in the context of some new work that you're doing.

Why do literature review?

Literature reviews tend to come up in two major contexts: as a preliminary study to help contextualize a novel work, or as a work in itself that summarizes the state of a field or synthesizes concepts to create new ideas. Most of my research falls into the latter category. I'm a big fan of putting together existing evidence and ideas to synthesize models (namespace, 2020). Gwern also tends to do work in this style (Branwen, 2020). I suspect that a lot of authors on LessWrong are attempting to do this, but fail to really say anything useful because they haven't figured out how to incorporate thorough evidence into their argument. When I did a review of all my notes from 2015, I found the number one failure mode I'd fall into was not paying attention to prior art. This was because I did not have heuristics like:

  • If it's hard to write about or you get stuck, you should probably do more research
  • If I want to write a post on something and I haven't checked the relevant literature for it yet I should probably do that as part of writing the post
  • Encountering or generating a cool mental model (Constantin, 2018) is a useful cue to consult the literature
  • If I'm trying to deal with a hard technical problem I should look at what work has already been done

The Benefits Of Literature Review

Literature review provides many benefits, such as:

  • Build Off The State Of The Art: Unless you make it a habit to look at what work already exists on a subject, you'll say what others have already said and do what others have already done. Your cognition is slow and expensive, and that makes leveraging the work of others extremely valuable. It is tempting to think that the established experts are idiots and you can beat them all with your own cleverness. Sometimes, this is actually true (Harford, 2019) but it's not something you should be counting on as a rule. Some literatures are mud moats (Smith, 2017), but other literatures are priceless treasures. Without access to the mathematics literature you would need to be a prodigy like Ramanujan to make new contributions. In my 2015 notes there was an episode where I tried designing a package manager. I filled many pieces of paper with thoughts on resolving dependency conflicts. Never did it occur to me to look at what methods were already used by existing systems like .deb or .rpm, let alone research papers that might tell me about theoretical methods.
  • Providing Context: Cultural artifacts exist in some kind of context, whether historical, social, or intellectual. Without provenance a 5,000 year old sword is just a rusty piece of metal (Giuliani-Hoffman, 2020). The same principle applies to intellectual work: without a justifying context, artifacts are parsed as garbage (Foddy, 2017). The literature can help you provide context for your ideas and ground them in something other than just your personal experience.
  • Learn From The Mistakes Of Others: Bismarck famously remarked that fools learn from experience and wise men learn from the mistakes of others. Even if previous work has failed to make significant progress it can often serve as a reference of promising-sounding ideas that won't work. This familiarity is often a crucial component of the 'cleverness' that sets you apart from others. The Wright Brothers were very familiar with the established work on aerodynamic theory (Benson, 2014). Their rapid-iteration approach to airplane design quickly revealed that real world test flights defied their expectations, leading them to develop a new way to measure the performance of airplane parts. Once this was done the data enabled them to easily invent the airplane. Without that starting data to work from, it would have taken the Wrights longer to realize that data was the bottleneck to making an airplane.
  • Common Language: Scholars develop a shared language to discuss their studies. These vernaculars are a key marker of group membership (Hossenfelder, 2016). Authors that use the right words generally have standing and authors that use their own ad-hoc vocabulary are generally considered cranks. Even beyond credibility, writing in the standard language used by other authors makes it more likely you'll get expert feedback on your work.
  • Unknown Unknowns: Until you go looking, you often just plain don't know what you don't know about a subject. For example in my essay on fuzzies and saddies (Zealot, 2020) I didn't know that literature on morale was relevant to the research question until I started looking at the psychology of soldiers. Often when you start looking at previous work you have a "wait this exists?" moment that significantly alters the way you approach it.

Literature Review As Accessible Contribution

One question I hear often is: "How can I contribute to the rationality project without institutional resources?". Literature reviews are an accessible contribution that builds skills. Some of the best posts on LessWrong are literature reviews. The research skills that you build while doing it are extremely valuable, and will help you in most things you might want to pursue. It doesn't require very much money, and can be performed from the comfort of your home. All these traits make it nearly ideal for people who want to contribute but don't have a lot of resources, or who have to spend most of their time on school or work. Literature review does take time however, so like any volunteer work it's necessary that the person undertaking it have spare time and energy to work with. If this sounds interesting to you, feel free to private message me on LessWrong or join this blog's Discord server and I'll do my best to help.

The Document Universe

As a phenomenological definition, the document universe is the set of artifacts which are easy to access inside of academic review spaces like museums, libraries, reading/viewing rooms, or a home office. It is the spatial environment in which the literature exists. Learning to navigate this environment is essential to getting good at literature review.

People Are Documents Too

When you want to know more about a subject but aren't sure where to begin, the classic advice is to ask a librarian. Human beings are a key part of the document universe. They are intentionally created artifacts that contain knowledge, and that knowledge is backed by a full general intelligence. It's no coincidence that Socrates didn't like writing. People are arguably the most important part of the document universe. Knowledge does very little if it isn't contained inside someone.

Because of their high value and short shelf life it can be hard to get access to knowledgeable people (Hossenfelder, 2016). It's not impossible, however (Dance, 2008). The received wisdom is that most scholars are eager to discuss their work so long as you respect their time. Eric Raymond's classic essay (Raymond, 2014) on asking good questions is oriented more towards "How do I X with program Y?" type queries, but with some mental rearranging it applies just as well to plenty of other queries. For academic questions in particular it's important that you do your best to understand the science and the language used by the science (Hossenfelder, 2016). Failure to do that is likely to get you spam filtered as a crank.

Academic Sources Are Underadvertised

Most web users don't seem to be aware of academic sources. I remember when I was younger feeling a vague malaise as I browsed the Internet, because all the knowledge seemed to be diffuse and informal. When I read books it was clear that they were high quality sources of knowledge, but the Internet felt barren of that. It turned out this was mostly just because I was looking in the wrong place. The academic section of the document universe is publicly indexed by Google Scholar which makes it much easier to find high quality sources on most subjects.

Traditional Bibliography

The vast majority of the history of scholarship happened before the existence of electronic computers, let alone widespread high-capacity networked electronic computers. That means the formal norms of scholarship evolved in an environment quite alien to our current era of cheap access and full text search. In this section we'll review some of their features in that context.

Citation Trees As Central Dogma Of Academia

In school you were probably told that you had to cite your sources, and that failing to do so was plagiarism. Plagiarism is usually defined as "stealing someone else's work without credit", but in the context of citations this definition is very misleading. Grade schools like the concept because it lets them clearly define how much copying is cheating, with the unfortunate side effect that smart kids categorize the practice as schoolhouse ritual rather than valuable technique. By contrast, in a functional literature where works are written to be read, academic citation norms provide a genealogy of ideas. These days we're pretty used to digital documents that directly reference other pages, videos, etc., but before the Internet was widespread academia alone had the benefit of author-provided citations. Academic citation formats are platform agnostic. They're content addressed rather than location based: the goal of an academic citation is to give you enough information to reliably locate a specific source in the document universe. This is why they tend to get so tedious. A book might have 12 editions with multiple authors and undergo a title change, and only one version contains the passage you reference. All the annoying details in citation formats were put there in response to bibliographic failures and lookup complications with simpler formats.

Within a single work citations provide context for readers and leads for further reading, but it's when you have a whole literature that the practice really shines. It becomes possible to follow citations backwards to see the progression of ideas, move horizontally to find related work, and use modern database systems to find work downstream that cites a document as an ancestor. The genealogy aspect of academic citations also improves the signal-to-noise ratio by eliminating unimproved duplicate work, and makes it easier to associate ideas with the authors that originated them. All of this makes the academic sections of the document universe much more pleasant to navigate than the informal universe of newspaper articles and blog posts. From a contributor's standpoint there's also more security: the norms are built to get your ideas hooked into a network of associated work which future scholars will consult during their reviews (Hanson, 2007). Outside of that Eden it's possible your effort will just get lost in the noise.
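To make the three directions of travel concrete, here is a minimal sketch in Python. The five-paper literature (A through E) and every function name are made up for illustration; real citation databases like Google Scholar expose this kind of navigation through their own interfaces, not through this code.

```python
from collections import defaultdict

# Hypothetical toy literature: each work maps to the works it cites.
CITES = {
    "A": [],
    "B": ["A"],
    "C": ["A"],
    "D": ["B", "C"],
    "E": ["C"],
}

def build_cited_by(cites):
    """Invert the citation map so we can look downstream from a work."""
    cited_by = defaultdict(set)
    for work, refs in cites.items():
        for ref in refs:
            cited_by[ref].add(work)
    return cited_by

def ancestors(work, cites):
    """Backwards: every work reachable by following citations."""
    seen, stack = set(), list(cites.get(work, []))
    while stack:
        ref = stack.pop()
        if ref not in seen:
            seen.add(ref)
            stack.extend(cites.get(ref, []))
    return seen

def descendants(work, cited_by):
    """Downstream: every work that cites `work`, directly or indirectly."""
    seen, stack = set(), list(cited_by.get(work, ()))
    while stack:
        citing = stack.pop()
        if citing not in seen:
            seen.add(citing)
            stack.extend(cited_by.get(citing, ()))
    return seen

def related(work, cites, cited_by):
    """Horizontal: other works sharing at least one reference with `work`."""
    found = set()
    for ref in cites.get(work, []):
        found |= cited_by.get(ref, set())
    found.discard(work)
    return found

cited_by = build_cited_by(CITES)
print(sorted(ancestors("D", CITES)))          # what D builds on
print(sorted(descendants("A", cited_by)))     # what builds on A
print(sorted(related("B", CITES, cited_by)))  # works sharing B's references
```

The "horizontal" move here is just one possible definition (shared references); an informal blog post with no citations would show up in none of these three queries, which is the sense in which it can get lost in the noise.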

Unfortunately because the web is a disaster (Binstock, 2012) we're not really liberated from citations by the presence of hyperlinks. In an ideal world the web would be content addressed so that if a source stopped providing a document it could be seamlessly served by a backup provider like the Internet Archive. Instead we address by location, so if the domain hosting this blog changes hands and they put up a new site all the links to my posts break. If I decide I don't want to pay hosting costs anymore, all the links break. If the servers have a technical malfunction even though they're technically still on in some dusty computer lab somewhere, all the links break. As you might imagine this happens a lot (Branwen, 2019), so it's not viable to rely on links to identify content. Traditional citations at least provide for the possibility that there is a second copy somewhere which can be found with a search engine. The most savvy netizens do their best to ensure (Branwen, 2019) there is a second copy somewhere. Because these problems are unlikely to be fixed any time soon, if you plan to write lasting content you had best get familiar with citation formats.
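The difference between addressing by location and addressing by content can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the two "hosts" are plain dictionaries standing in for an original server and a backup provider, the `fetch` helper is hypothetical, and neither the web nor the Internet Archive actually works this way.

```python
import hashlib

def content_address(document: bytes) -> str:
    """Derive an address from the document's bytes, not from where it lives."""
    return hashlib.sha256(document).hexdigest()

def fetch(address, hosts):
    """Try each host in turn; accept a copy only if its hash matches the address."""
    for host in hosts:
        data = host.get(address)
        if data is not None and content_address(data) == address:
            return data
    raise KeyError(f"no host has a valid copy of {address}")

# Hypothetical hosts: the original site and a backup archive.
original_host = {}
archive_host = {}

doc = b"Literature Review For Academic Outsiders: What, How, and Why"
address = content_address(doc)
original_host[address] = doc
archive_host[address] = doc

# The original host disappears (domain lapses, server dies, etc.).
original_host.clear()

# A content address still resolves, because any surviving copy will do.
assert fetch(address, [original_host, archive_host]) == doc
```

A location-based link is the opposite case: it names one host, so the lookup fails as soon as that host stops answering, even though an identical copy exists elsewhere. A traditional citation sits somewhere in between, since it describes the work well enough that a person with a search engine can play the role of `fetch`.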

Library Science As Conceptual Foundation Of The Academic Document Universe

Underlying the usefulness of a citation tree is physical infrastructure which houses, indexes, and curates documents. This type of work has traditionally been performed under the moniker of library science, even if in recent times it has mostly been done by a distributed system of bloggers, cooperating scientists, server hosts, and for-profit firms like Google. The old systems still exist however, and they're the environment of adaptation for the current academic tradition. This makes it useful to know the principles of traditional library science so that you can better model academic-document-space. I recommend the book The Intellectual Foundation Of Information Organization by Elaine Svenonius (Svenonius, 2000) to get that understanding. Published in 2000, it was written just before digital documents were set to disrupt the academic ecosystem. It captures the full powers of the old ways in amber.

The Intellectual Foundation… is a particularly useful book for the scholar because it is designed to be read by the designers of future library systems. This means that it focuses less on the details of particular designs (which we probably don't care about very much by this point) but more on the principles which an effective system should satisfy and the "why?" behind them. These principles define the territory which citations describe, and will help you grok certain aspects of traditional scholarship.

How To Do Literature Review

I'd be a hypocrite if I didn't bother to look at what others have already written about doing a literature review. This talk with Dr. Candace Hastings (Hastings, 2009) on doing literature review is decent. It spends a lot of time on how to use sources in your writing once you've found them. She also explains how you can use citation counts to find the most important scholars in the field you're looking at. Guidelines for writing a literature review by Helen Mongan-Rallis (Mongan-Rallis, 2018) is a well written page on this topic for academics.

Every time I do research I perform a simple thought experiment: assuming somewhere in the world exists evidence that would prove or disprove my hypothesis, where is it? I tend to visualize this as a shot of earth from space, and then 'zooming in' on the sense data that would show me what I want to know. The literalism of this visualization is important because it emphasizes the sensory basis of evidence. Things happen in the world, and artifacts of their presence are left over afterwards. Physical remnants, images captured by cameras and sketch artists, written observations. This 'object level' phenomenological universe is what you're trying to get information about by looking at the literature.

A key consequence of this is that 'the literature' is not always what's output by academics. If I was studying martial arts, I would be looking into the history of martial arts as it's practiced by martial artists, in whatever mediums they use to record and disseminate information. Memory is a human activity, and your first priority should be to find the most effective and relevant sources for whatever you're looking at.

For tips on actually finding sources on the Internet (commonly known as 'Google-Fu') I recommend Gwern's page (Branwen, 2020) on the subject.

When You Don't Know The Name of Your Literature, or Missing and Biased Literatures

One of the more pernicious problems for literature review can be not knowing the name of the relevant literature. I often find myself posing research questions where it isn't clear how I would find previous work. The inciting post (Hoffman, 2018) that convinced me to write this one discusses a phenomenon that seems unlikely to be studied by economists. If I were doing literature review as part of writing that post, I would ask myself "What does the universe look like where we had the world wars and then wartime mobilization never stopped?". Then, I would aggressively dig in to find decisive places where looking at what happened before and after the world wars would prove or disprove my thesis. It's not enough to identify two points and then draw a trend line. That's not what it looks like to thoroughly justify yourself (namespace, 2020). As a thoroughly justified hypothesis looms closer and closer to theory, arguing against it should begin to feel like your debate partner is reality itself.

For the specific problem of a literature you simply don't know the name of, your best bet is often to ask others. Many times I've wanted to post a Request For Literature (RFL) on LessWrong, but felt without context the concept wouldn't really make sense to most readers. Hopefully after publishing this I'll be able to link it for context and that won't be a problem.

I didn't know what literature to look at for my essay on Fuzzies and Saddies (Zealot, 2020), where the thesis is outside both the Overton window and our current social reality. How do you look at the literature for something like that? Well, one of the benefits of living in a consistent universe (namespace, 2020) is that it can take a lot of effort to reliably censor all information that would point towards a real phenomenon. Because our censorship is largely of the distributed kind based on social pressure, it's largely ad-hoc and doesn't hold up well against the historical record or clever inference. I took notes on how I found the book on Missionary Morale.

Example Research Session: Finding The Book On Missionary Morale

Research Question (roughly): What makes some people seem to derive satisfaction and utility from being put into hellish situations like WW1?

Immediate question: Where would I be able to find information relating to this question, where would it be recorded and how would it be framed?

Thought: "What about studies on how soldiers' attitudes about war change after they've been to war? [Most soldiers will probably dislike it, but some do like it and this might be studied as a pathology]"

Search (Google Scholar): soldiers' attitudes toward war

First result: Stouffer, S. A., Suchman, E. A., DeVinney, L. C., Star, S. A., & Williams Jr, R. M. (1949). The American soldier: Adjustment during army life (Studies in social psychology in World War II, Vol. 1).

Look for thing, find thing. Read through it some, then:

Observation: There's a chapter on morale. The thing I am researching, "motivation through suffering, being fueled by harrowing circumstances, asceticism, keeping spirits up in the face of a hostile universe", is very closely related to and overlaps with the study of morale. Therefore I can look at morale studies to get a better look at this subject.

Read through book's study on effect of exposure to combat on morale, realize that it doesn't seem to be very useful to me.

Principle of Pain: Why isn't this useful to me?

Answer: The thing causing the drop in morale is of the wrong structure, these studies are about exposure to short bursts of extreme stress and danger which is not the situation my audience will be encountering in their lives.

Principle of Balance: Okay then, what would be useful to me (be of similar circumstances)?

Constraint: Needs to be a population which it's likely there will be studies on.

Hypothesis: Military intelligence officers, since their job is closer to the research aspect of things while still being in a population whose morale will be studied.

Hypothesis: Spy morale, spies need to exist in a foreign place pretending to be someone they're not while their real job is to do something else which is adversarial to the people in their immediate environment. The sort of alienation and lack of belonging that causes seems like a probable fit for how it actually feels to be researching things that only you care about in your immediate environment over a long period of time while at a deep cultural gulf between yourself and the people around you.

Principle of Exhaustion: Ph.D. burnout, paratrooper morale (esp. if there are cases where single paratroopers are dropped into an area and have to be on their own, snipers?), Evangelical/Missionary morale/burnout, Wilderness survival/etc. morale

I go look up stuff on spy morale, forgot to take notes during this.

Observation: MICE to RASCALS talks about 'operational psychology', which might have material on agent attrition and factors relating to it.

Principle of Balance: What do counterintelligence officers do to dissuade potential spies?

Observation: Undercover police work involves similar stuff in a domestic context which is less secret than international espionage.

Find paper on undercover police work, read it.

Principle of Pain: This still isn't quite what I want, because the point here is to condition an officer to play a role which they then need to be pulled out of later without too much damage. Though I guess that could be relevant, it's just not the core of the thing.

Decide to move on and look at missionaries.

Search (Google Scholar): missionary morale

Read the first search result, which is a book from 1920 on literally this subject (Miller, 1920).

Bibliography

  1. Garson, D., & Lillvik, C. (2012). The literature review: A research journey. Research guides at Harvard Library. https://guides.library.harvard.edu/c.php?g=310271&p=2071512
  2. Dance, A. (2008). Outsider science. Symmetry Magazine. https://www.symmetrymagazine.org/article/marchapril-2008/outsider-science
  3. namespace. (2020, February 1). "Memento mori", Said the confessor. The Last Rationalist. http://thelastrationalist.com/memento-mori-said-the-confessor.html
  4. Branwen, G. (2020, May 8). Embryo selection for intelligence. https://www.gwern.net/Embryo-selection
  5. Constantin, S. (2018, December 14). Player vs. character: A two-level model of ethics. LessWrong. https://www.greaterwrong.com/posts/fyGEP4mrpyWEAfyqj/player-vs-character-a-two-level-model-of-ethics
  6. Harford, T. (2019, August 14). The penny post revolutionary who transformed how we send letters. BBC News. https://www.bbc.com/news/business-48844278
  7. Smith, N. (2017, May 15). Vast literatures as mud moats. Noahpinion. https://noahpinionblog.blogspot.com/2017/05/vast-literatures-as-mud-moats.html
  8. Giuliani-Hoffman, F. (2020, March 25). 5,000-year-old sword is discovered by an archaeology student at a Venetian monastery. CNN Style. https://www.cnn.com/style/article/5000-year-old-sword-discovered-in-italy-trnd/index.html
  9. Foddy, B. (2017). Getting Over It with Bennett Foddy [Desktop & Mobile video game]. Humble Bundle: Bennett Foddy.
  10. Benson, T. (2014, June 12). Overview of Wright brothers discoveries. Re-Living the Wright Way. https://wright.nasa.gov/discoveries.htm
  11. Hossenfelder, S. (2016, May 19). The holy grail of crackpot filtering: How the arXiv decides what’s science – and what’s not. Backreaction. https://backreaction.blogspot.com/2016/05/the-holy-grail-of-crackpot-filtering.html
  12. Zealot, E. (2020, April 21). Fuzzies and saddies part one: X-risk and motivation. The Last Rationalist. https://www.thelastrationalist.com/fuzzies-and-saddies-part-one-x-risk-and-motivation.html
  13. Hossenfelder, S. (2016, August 11). What I learned as a hired consultant to autodidact physicists. Aeon Ideas. https://aeon.co/ideas/what-i-learned-as-a-hired-consultant-for-autodidact-physicists
  14. Raymond, E.S., & Moen, R. (2014, May 21). How to ask questions the smart way. http://www.catb.org/~esr/faqs/smart-questions.html
  15. Hanson, R. (2007, July 17). Blogging doubts. Overcoming Bias. http://www.overcomingbias.com/2007/07/blogging-doubts.html
  16. Binstock, A. (2012, July 10). Interview with Alan Kay. Dr Dobb's. https://www.drdobbs.com/architecture-and-design/interview-with-alan-kay/240003442
  17. Branwen, G. (2019, January 5). Archiving URLs. https://www.gwern.net/Archiving-URLs
  18. Svenonius, E. (2000). The intellectual foundation of information organization. The MIT Press.
  19. Hastings, C. (2009, September 25). Get lit: The literature review. YouTube. https://www.youtube.com/watch?v=9la5ytz9MmM
  20. Mongan-Rallis, H. (2018, April 19). Guidelines for writing a literature review. https://www.d.umn.edu/~hrallis/guides/researching/litreview.html
  21. Branwen, G. (2020, January 21). Internet search tips. https://www.gwern.net/Search
  22. Hoffman, B.R. (2018, May 23). There is a war. LessWrong. https://www.greaterwrong.com/posts/DtS6x5r54sEx7e2tP/there-is-a-war
  23. namespace. (2020, March 30). Necessity and warrant. The Last Rationalist. https://www.thelastrationalist.com/necessity-and-warrant.html
  24. namespace. (2020, March 23). On necessity. The Last Rationalist. https://www.thelastrationalist.com/on-necessity.html
  25. Miller, G.A. (1920). Missionary morale. Google Books (orig. New York, Cincinnati: The Methodist Book Concern).
