This was originally written in December 2021. I think it's good
practice to run posts by orgs before making them public, and I did in
this case. This has some benefits: orgs aren't surprised, they can
prepare a response in advance if they want to, they can point out
errors before things become public, etc, and I think it's generally
worth doing. In this case Leverage folks pointed out some errors,
omissions, and bad phrasing in my post, which I've fixed, and I'm
thankful for their help. Pre-publication review does also have
downsides, however, and in this case as the email conversations grew
to 10k+ words over three weeks I ran out of time and motivation.
A month ago I came across this in my list of blog drafts and
decided to publish it as-is with a note at the top explaining the
situation. This means that it doesn't cover any more recent Leverage
developments, including their Experiences
Inquiry Report and On
Intention Research paper, both published in April 2022. I shared
this post again with Leverage, and while I've made edits in response
to their feedback they continue to disagree with my conclusion.
In the original pre-publication discussion with Geoff, one of the
topics was whether we could make our disagreement more concrete with a
bet. For example, research that launches a new subfield generally
gets lots of citations; the Concrete Problems paper, for instance, had
accumulated thousands of citations within six years. If Leverage 1.0's
research ends up having this kind of foundational impact, that could
be a clear way to tell. When I gave Leverage a second pre-publication
heads up, Geoff
and I talked more and we were able to nail down some terms: if a
Leverage paper drawing primarily on their pre-2019 research has 100+
citations from people who've never worked for Leverage by 2032-10-01,
then I'll donate $100 to a charity of Geoff's choosing; if not
then Geoff will do the same for a charity of my choosing. I've
listed this on my bets page.
In 2011, several people I knew through Boston-area effective
altruism and rationality meetups
started an organization called Leverage Research. Their
goal was to "make the world a much better place, using the most
effective means we can", and they worked on a wide range of projects,
but they're probably best known for trying to figure out how to make
people more productive/capable/successful by better understanding how
people think and interact. They initially lived and worked in a series
of shared houses, first in Brooklyn and then in Oakland; I visited the
latter for an evening in early 2014. The core project ("Leverage
1.0") disintegrated in 2019, with some portions continuing, including
a training/coaching organization and Leverage 2.0 (early stage
science). In this post I'm only looking at Leverage 1.0, and
specifically at their psychology research program.
In mid-December 2021, Leverage's former head of
operations, Cathleen, wrote In
Defense of Attempting Hard Things: and my story of the Leverage
ecosystem (comments), giving a detailed history with extensive
thoughts on many aspects of the project. I remember her positively
from my short 2014 visit, and I'm really glad she took the time to
write this up.
There are many directions from which people could approach
Leverage 1.0, but the one that I'm most interested in is lessons for
people considering attempting similar things in the future.
My overall read of Cathleen's post is that she (and many other
ex-Leverage folks) view the project as one where a group of people
took an unorthodox approach to research, making many deep and
important discoveries about how people think and relate to each other.
I've read the Connection Theory paper and the four research reports
Geoff has published (see Appendix 3), however, and I don't see
anything in them that backs up these claims about the originality and
insight of their psychology research. While there are a range of
reasons why people might not write up even novel and valuable
results, I think the most likely explanation is that there weren't
discoveries on the level they're describing.
Geoff is still gradually writing up Leverage 1.0-era results, so it's
possible that something will come out later that really is impressive.
While this isn't what I'm expecting, if it happens I'll need to
retract most of what follows. [2022-09: this is essentially what
Geoff and I bet on above.]
If there weren't any big psychology research breakthroughs, however,
why would they think there were? Putting together my reading of
Cathleen's post (Appendix 1), Larissa's post (Appendix 2), and a few
other sources (Appendix 3), here's what I see as the most likely
story: "The core problem was that Leverage 1.0 quickly became much too
internally focused. After their Connection Theory research did not
receive the kind of positive response they were hoping for they
stopped seeing publishing as a way to get good feedback. With an
always-on dynamic and minimal distinction between living and working
space during their formative years, their internal culture, practice,
and body of shared knowledge diverged from mainstream society,
academia, and the communities they branched from. They quickly got to
where they felt people outside the group didn't have enough background
to evaluate what they were doing. Without enough deep external
engagement, however, it was too hard for them to tell if their
discoveries were actually novel or valuable. They ended up putting
large amounts of effort into research that was not just illegible, but
not very useful. They gave up their best sources of external
calibration so they could move faster, but then, uncalibrated, put
lots of effort into things that weren't valuable."
[2022-09: I'm not saying that the people researching psychology at
Leverage were poor thinkers. Instead my model is that when people are
operating without good feedback loops they very often do work that
isn't useful but believe that it is. This is part of why I was
pessimistic on circa-2015 AI safety work (and is still a reason
I'm skeptical of a lot of AI safety work today) and worried about a
dynamic of inattentive funders
for meta-EA projects (also still a problem). Similarly, I think the
replication crisis was primarily a problem of researchers thinking they had
meaningful feedback loops when they didn't.]
In assessing a future research project I wouldn't take "that looks a
lot like Leverage" as any sort of strong argument: Leverage 1.0 was a
large effort over many years, encompassing many different
approaches. Instead I would specifically look at its output and
approach to external engagement: if they're not publishing research I
would take that as a strong negative signal for the project.
Likewise, in participating in a research project I would want to
ensure that we were writing publicly and opening our work to engaged
and critical feedback.
Appendix 1: some extracts from Cathleen's
post that crystallized the above for me:
"[T]he pace of discovery and development and changes in the
structure and composition of the team was too fast to allow for people
to actually keep up unless they were in the thick of it with us"
"From the outside (and even sometimes from the inside) this
would look like unproductive delusion, but in fact it was intentional
and managed theoretical exploration. And it led to an enormous amount
of what many in the group came away believing were accurate and
groundbreaking theories of how the mind works and how a personality is
shaped by life."
"For a small group of untrained people to independently
derive/discover so much in a handful of years does, I think, indicate
something quite unusual about Geoff's ability to design a productive […]"
"I think it's worth pausing to appreciate just how bad the
conflict leading up to the dissolution, as well as the dissolution
itself, was for a number of people who had been relying on the
Leverage ecosystem for their life plans: their friends, their personal
growth, their livelihood, their social acceptance, their romantic
prospects, their reputations, their ability to positively impact the world."
The entire "What to do when society is wrong about something?" section.
Appendix 2: the same for Larissa's post:
"From the outside, Leverage's research was understandably confusing
because they were prioritising moving through a wide range of research
areas as efficiently as possible rather than communicating the results
to others. This approach was designed to allow them to cover more
ground with their research and narrow in quickly on areas that seemed
the most promising."
"Notably, Leverage's focus was never particularly on sharing research
externally. Sometimes this was because it was a quick exploration of a
particular avenue or seemed dangerous to share. Often though it was a
time trade-off. It takes time to communicate your research well, and
this is especially challenging when your research uses unusual
methodology or starting assumptions."
"[T]here was a trade-off in time spent conducting research versus time
spent communicating it. As we didn't invest time early on in
communicating about our work effectively, it only became harder over
time as we built up our models and ontologies."
"One of the additional adverse effects of our poor public communication
is that when Leverage staff have interacted with people, they often
didn't understand our work and had a lot of questions and concerns
about it. While this was understandable, I think it sometimes led
staff to feel attacked which I suspect, in some cases, they handled
poorly, becoming defensive and perhaps even withdrawing from engaging
with people in neighbouring communities. If you don't build up
relationships and discuss updates to your thinking inferential
distance builds up, and it becomes easy to see some distant, amorphous
organisation rather than a collection of people."
Appendix 3: earlier public discussion of Leverage 1.0, which I've also
drawn on in trying to understand what happened:
January 2012: Geoff, Leverage's primary founder and Executive
Director, writes Introducing
Leverage Research. The post directs people to the website for
more information about their research, which has a link to download Connection
Theory: the Current Evidence. The
comments have a lot of skeptical discussion of Connection Theory, with
back-and-forth from Geoff.
September 2012: Peter writes A
Critique of Leverage Research's Connection Theory. His conclusion
is that the evidence presented is pretty weak and that it's in
conflict with a lot of what we do know about psychology. The comments
again have good engagement. At some point before Alyssa's 2014 post (below)
Leverage removed the Connection Theory paper from their site.
[2022-10: Cathleen tells me this was in the 2013 redesign of their
site, and linked me to before
and after captures.]
April 2014: Alyssa writes The
Problem With Connection Theory, digging deeper into some of the
claims of the Connection Theory paper. She argues that the paper
generally oversells its evidence, and highlights that several
predictions which one would not normally judge as correct are
counted as positive evidence. Jasen, a Leverage employee, responds in
the comments to say Alyssa is criticizing an obsolete document.
January 2015: Evan comments
with his understanding of why Leverage hasn't shared much publicly,
including that he thinks "Leverage Research perceives it as difficult
to portray their research at any given time in granular detail. That
is, Leverage Research is so dynamic an organization at this point that
for it to maximally disclose the details of its current research would
be an exhaustive and constant effort."
[2022-10: Cathleen tells me that in June 2016 she asked the
Internet Archive to exclude Leverage so that people would focus on the
new content on their website. Because the site isn't
included in the Archive, I wasn't able to evaluate the historical
content of their website in putting together this post or formulating
my hypothesis above. She also linked me to a pair of archive.today captures, one from 2013 and another from 2018 showing blog posts published
in 2016. I haven't evaluated these captures.]
August 2018: Ryan writes Leverage
Research: reviewing the basic facts anonymously as "throwaway",
and then following up as "anonymoose" (both of which he's since publicly
confirmed). His high-level point is that Leverage seemed to have
produced very little given the amount of time and money put into the
project. Geoff replies
that he had been planning to publish some of their results shortly.
November 2019: Larissa, Leverage's incoming communications
person, posts Updates
from Leverage Research: history, mistakes and new focus, expanding on
a comment she had made in September. She discusses the history, the
dissolution of the original project, and current plans. I was
especially interested in her discussion of the causes and effects of
Leverage's approach to external engagement.
December 2020 through October 2021: Geoff links a series of
four "Leverage 1.0 Research Reports" on his personal site, three on
psychology and one on intelligence amplification.
I haven't seen any discussion of these. I'm very glad he wrote them
up and made them public, but I also don't see in them the kind of
breakthroughs I would expect from how Cathleen wrote about Leverage 1.0's work.
September 2021: Someone anonymous notices that Geoff is
fundraising, and posts Common
knowledge about Leverage Research 1.0. They argue that Leverage
was a harmful "high demand group". Lots of different perspectives in
the comments.
September 2021: Larissa posts Updates
from Leverage Research: History and Recent Progress. In the
section on Leverage's Exploratory Psychology Program she discusses
their plans to release psychological research tools over
the next few months.
October 2021: An anonymous former Leverage employee writes about their
experience there, and how it "really mismatched the picture of
Leverage described by" the 'Common Knowledge' post.
October 2021: Zoe, one of their former researchers, posted about her
experience there. See the corresponding
LessWrong post for discussion. Also see Geoff's response.
December 2021: Jonathan, another former researcher, posts Leverage
Research: Context, Analysis, and Takeaway. The "Utopic Mania and
Closed-offness" section was the most interesting to me, but it is
sufficiently metaphorical that I don't really understand it.
December 2021: Cathleen, Leverage's former COO, wrote In
Defense of Attempting Hard Things: and my story of the Leverage
ecosystem. This is the article that prompted my post, and I'm sad
about how little coverage and consideration it received compared to
some of the less informative posts above.
Comment via: facebook
Some important implications of this:
As you said, AI safety lacks good feedback loops, compared to capabilities feedback loops. Thus one of three scenarios holds: either AI safety doesn't matter at all (we can't build AGI, or it's easy to align by default), we are doomed because feedback loops can't be done in AI alignment/safety, or we succeed by default. It's similar to John Wentworth's post "When Iterative Design Fails".
Now, John Wentworth's stories about gunpowder and the medieval lord overstate things, but if we look at modern weapons vs medieval lords, it's usually a win for the modern soldiers unless a severe skewing of numbers occurs (more like 1:100 or more) or the modern force has too small a frontage.
Another implication is that I understand why academia/meta work is stereotyped by populists as being out of touch with reality, even if I suspect that this stereotype is actually at least somewhat wrong.
Where can I read an introduction to Connection Theory (i.e. a thesis statement, main claims, etc)?
I'd say it's for historical interest only at this point, but in Appendix 3 I link a copy of the paper that Alyssa archived: https://www.scribd.com/document/219774356/Evidence-for-Connection-Theory (v2.1, last revised 2011-09-21) It includes an introduction to the ideas.