Jargon is underrated in its importance to the framework of science. Luis Reyes-Galindo points out that jargon quite often ends up literally determining the boundaries of a field. Given the disparity between how little attention is paid to jargon as a subject and its scholarly import, I decided to test the hypothesis that LessWrongers practice weak scholarship with regard to jargon: in particular, that for many important terms the true source of the knowledge has not been transmitted to community members. Far from being a pedantic issue, this would imply deep problems with the way that LWers handle knowledge. Without a connection back to original sources, literature review on the part of community members could be severely hampered.
I started with a weak hypothesis and a strong hypothesis.
Weak Hypothesis: There will be at least one term in this list whose origin respondents overwhelmingly misidentify. Specifically, at least 80% of respondents will choose the wrong response on at least one term or phrase.
Strong Hypothesis: There will be at least one term in this list whose origin respondents overwhelmingly misidentify. In addition, at least one third (4) of the words or phrases will have 50% or more of respondents incorrectly identify their origin.
These hypotheses were chosen in advance, largely based on my gut intuition about the severity of the problem and what would constitute 'sufficient evidence' to me that a problem existed.
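The two hypotheses can be written down as explicit pass/fail checks. The following is a minimal sketch, assuming the per-term share of wrong answers is already available as a fraction between 0 and 1. The function names, the constant names, and the placeholder rates are all hypothetical and are not the survey's data.

```python
from typing import Mapping

WEAK_THRESHOLD = 0.80    # weak hypothesis: some term with >= 80% wrong answers
STRONG_THRESHOLD = 0.50  # strong hypothesis: >= 50% wrong answers ...
STRONG_MIN_TERMS = 4     # ... on at least one third (4) of the twelve terms


def weak_hypothesis_holds(wrong_rates: Mapping[str, float]) -> bool:
    """True if at least one term was misidentified by >= 80% of respondents."""
    return any(rate >= WEAK_THRESHOLD for rate in wrong_rates.values())


def strong_hypothesis_holds(wrong_rates: Mapping[str, float]) -> bool:
    """True if the weak condition holds and >= 4 terms reach the 50% mark."""
    majority_wrong = sum(
        1 for rate in wrong_rates.values() if rate >= STRONG_THRESHOLD
    )
    return weak_hypothesis_holds(wrong_rates) and majority_wrong >= STRONG_MIN_TERMS


# Invented placeholder rates, for illustration only (NOT the survey results).
toy_rates = {
    "term_a": 0.85,
    "term_b": 0.60,
    "term_c": 0.55,
    "term_d": 0.50,
    "term_e": 0.20,
}

print(weak_hypothesis_holds(toy_rates), strong_hypothesis_holds(toy_rates))
```

Fixing the thresholds as constants before looking at any responses mirrors the point of choosing the hypotheses in advance.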
The methodology for this survey was preregistered here.
A list of search terms I used on Google Scholar for lit review before writing this article can be found here.
A survey was administered to 53 LWers from various chatrooms as well as my personal friends list. The survey contained twelve terms or phrases I felt were especially ambiguous as to their origin (i.e., they lacked obvious 'tells' of LW or academic origin), including:
- Inside View/Outside View
- Epistemic Learned Helplessness
- Chinese Robber Fallacy
- Motte and Bailey
- Map and Territory
- Terminal vs Instrumental Values/Goals
- Illusion of Transparency
At least one term was chosen to sound especially academic and one term was chosen to sound especially 'LessWrong Diaspora' to provide a baseline. These terms are HtuSvryq and BofreireRssrpg respectively (rot13). All terms or phrases were taken from the Jargon Dictionary that I host. When I chose these terms I did not know the origin of several of them myself, which was fine because that could be ascertained at analysis time.
The chatrooms I pulled participants from are:
- Brier: An invite-only Discord server run by me.
- #lesswrong: Freenode LessWrong IRC channel.
- SlateStarCodex Discord: The 'official' Discord server of Scott Alexander's blog.
- LessWrong Discord: The unofficial Discord server of the community blog of the same name.
- Exegesis: Tumblr Diaspora Community server.
- LessWrongers Slack: A Slack server run by Elo.
For each term, users were asked whether it originated from LessWrong, academia, or neither. (While it is theoretically possible for a term to originate from both at the same time, I'm not aware of this actually happening, so I did not consider the possibility in my survey.) To determine the results, I use a simple plurality of responses compared against a 'correct' answer assigned to each term. Whichever answer received the most responses is the one users are determined to have 'chosen' in aggregate.
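The plurality scoring described above can be sketched in a few lines. This is only an illustration under stated assumptions: the function name `score_term`, the way ties are broken (first answer encountered, which is how `Counter.most_common` orders equal counts), and the toy responses are all hypothetical and are not the survey's data.

```python
from collections import Counter
from typing import Sequence, Tuple

OPTIONS = ("LessWrong", "academia", "neither")  # the three answer choices


def score_term(responses: Sequence[str], correct: str) -> Tuple[str, float]:
    """Return (plurality answer, share of respondents who answered wrongly)."""
    if not responses:
        raise ValueError("no responses to score")
    counts = Counter(responses)
    # Plurality: the single most common answer. Ties fall to the answer
    # seen first, which is an assumption; the write-up does not cover ties.
    plurality_answer, _ = counts.most_common(1)[0]
    wrong = len(responses) - counts[correct]
    return plurality_answer, wrong / len(responses)


# Invented toy responses for one term (NOT the survey data).
toy_responses = ["LessWrong"] * 6 + ["academia"] * 3 + ["neither"] * 1
aggregate_choice, wrong_rate = score_term(toy_responses, correct="academia")
print(aggregate_choice, wrong_rate)
```

Note that the plurality answer and the wrong-answer rate are separate quantities: a term can have the correct answer as its plurality while a majority of respondents, split across the two wrong options, still miss it.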
In the following table, green means an answer is the one I marked 'correct'. Red means that the share of wrong answers exceeded the 50% threshold in my strong hypothesis.
Both my weak hypothesis and strong hypothesis were validated. In the case of Map and Territory, I will ignore the results because of ambiguity: it is quite possible to classify Alfred Korzybski in either the academic or the non-academic camp. On the Chinese Robber Fallacy, however, 77% of respondents misidentified the origin. While this is not quite 80%, it is close enough for me to consider the weak hypothesis essentially validated. My strong hypothesis was also validated, given that 4 terms (excluding Map/Territory) were significantly over the 50% threshold to count as decidedly wrong.
My original purpose for this research was to see if there would be any value in adding an etymology section to the jargon dictionary. I think that the outcome of this survey implies the answer is yes. One potential goal of the jargon dictionary is to act as a Rosetta Stone between LessWrong Diaspora jargon and academic terminology. The purpose of this would be to make literature review easier for people trying to 'dig deep' on rationality concepts for their research.
Beyond that, these results imply a potentially significant alienation of LessWrongers from the originators of concepts. As Samo Burja points out, the underlying principles that generated an idea are of incredible importance. In failing to transmit the sources of knowledge, it's quite possible we're slowing progress by making it non-obvious where to go for more. Worse still, the problem is not necessarily easy to fix. Motte & Bailey, for example, which Scott properly cites in his post on the subject, got a 64% incorrect response rate. Here I feel it is only appropriate to draw attention to the problem, but I welcome potential solutions in the comments.
Reyes-Galindo, Luis. (2016). Automating the Horae: Boundary-work in the age of computers. arXiv:1603.03824 [physics.soc-ph]
Burja, Samo. (2018, March 8). On the Loss and Preservation of Knowledge. Retrieved from https://www.lesserwrong.com/posts/nnNdz7XQrd5bWTgoP/on-the-loss-and-preservation-of-knowledge
Alexander, Scott. (2014, November 3). All in all, another brick in the motte. Retrieved from http://slatestarcodex.com/2014/11/03/all-in-all-another-brick-in-the-motte/