This was originally written in December 2021. I think it's good practice to run posts by orgs before making them public, and I did in this case. This has some benefits: orgs aren't surprised, they can prepare a response in advance if they want to, they can point out errors before things become public, etc, and I think it's generally worth doing. In this case Leverage folks pointed out some errors, omissions, and bad phrasing in my post, which I've fixed, and I'm thankful for their help. Pre-publication review does also have downsides, however, and in this case as the email conversations grew to 10k+ words over three weeks I ran out of time and motivation.

A month ago I came across this in my list of blog drafts and decided to publish it as-is with a note at the top explaining the situation. This means that it doesn't cover any more recent Leverage developments, including their Experiences Inquiry Report and On Intention Research paper, both published in April 2022. I shared this post again with Leverage, and while I've made edits in response to their feedback they continue to disagree with my conclusion.

In the original pre-publication discussion with Geoff, one of the topics was whether we could make our disagreement more concrete with a bet. For example, research that launches a new subfield generally gets lots of citations, such as the Concrete Problems paper (1,644 citations at 6 years), and if Leverage 1.0's research ends up having this kind of foundational impact this could be a clear way to tell. When I gave Leverage a second pre-publication heads up, Geoff and I talked more and we were able to nail down some terms: if a Leverage paper drawing primarily on their pre-2019 research has 100+ citations from people who've never worked for Leverage by 2032-10-01, then I'll donate $100 to a charity of Geoff's choosing; if not then Geoff will do the same for a charity of my choosing. I've listed this on my bets page.

In 2011, several people I knew through Boston-area effective altruism and rationality meetups started an organization called Leverage Research. Their initial goal was to "make the world a much better place, using the most effective means we can", and they worked on a wide range of projects, but they're probably best known for trying to figure out how to make people more productive/capable/successful by better understanding how people think and interact. They initially lived and worked in a series of shared houses, first in Brooklyn and then in Oakland; I visited the latter for an evening in early 2014. The core project ("Leverage 1.0") disintegrated in 2019, with some portions continuing, including Paradigm (training/coaching) and Leverage 2.0 (early stage science). In this post I'm only looking at Leverage 1.0, and specifically at their psychology research program.

In mid-December 2021, Leverage's former head of operations, Cathleen, wrote In Defense of Attempting Hard Things: and my story of the Leverage ecosystem (LW comments), giving a detailed history with extensive thoughts on many aspects of the project. I remember her positively from my short 2014 visit, and I'm really glad she took the time to write this up.

There are many directions from which people could approach Leverage 1.0, but the one that I'm most interested in is lessons for people considering attempting similar things in the future.

My overall read of Cathleen's post is that she (and many other ex-Leverage folks) view the project as one where a group of people took an unorthodox approach to research, making many deep and important discoveries about how people think and relate to each other. I've read the Connection Theory paper and the four research reports Geoff has published (see Appendix 3), however, and I don't see anything in them that backs up these claims about the originality and insight of their psychology research. While there are a range of reasons why people might not write up even novel and valuable results, I think the most likely explanation is that there weren't discoveries on the level they're describing.

Geoff is still gradually writing up Leverage 1.0-era results, so it's possible that something will come out later that really is impressive. While this isn't what I'm expecting, if it happens I'll need to retract most of what follows. [2022-09: this is essentially what Geoff and I bet on above.]

If there weren't any big psychology research breakthroughs, however, why would they think there were? Putting together my reading of Cathleen's post (Appendix 1), Larissa's post (Appendix 2), and a few other sources (Appendix 3), here's what I see as the most likely story: "The core problem was that Leverage 1.0 quickly became much too internally focused. After their Connection Theory research did not receive the kind of positive response they were hoping for they stopped seeing publishing as a way to get good feedback. With an always-on dynamic and minimal distinction between living and working space during their formative years, their internal culture, practice, and body of shared knowledge diverged from mainstream society, academia, and the communities they branched from. They quickly got to where they felt people outside the group didn't have enough background to evaluate what they were doing. Without enough deep external engagement, however, it was too hard for them to tell if their discoveries were actually novel or valuable. They ended up putting large amounts of effort into research that was not just illegible, but not very useful. They gave up their best sources of external calibration so they could move faster, but then, uncalibrated, put lots of effort into things that weren't valuable."

[2022-09: I'm not saying that the people researching psychology at Leverage were poor thinkers. Instead my model is that when people are operating without good feedback loops they very often do work that isn't useful but believe that it is. This is part of why I was pessimistic about circa-2015 AI safety work (and is still a reason I'm skeptical of a lot of AI safety work today) and worried about a dynamic of inattentive funders for meta-EA projects (also still a problem). Similarly, I think the replication crisis was primarily a problem of researchers thinking they had meaningful feedback loops when they didn't.]

In assessing a future research project I wouldn't take "that looks a lot like Leverage" as any sort of strong argument: Leverage 1.0 was a large effort over many years, encompassing many different approaches. Instead I would specifically look at its output and approach to external engagement: if they're not publishing research I would take that as a strong negative signal for the project. Likewise, in participating in a research project I would want to ensure that we were writing publicly and opening our work to engaged and critical feedback.

Appendix 1: some extracts from Cathleen's post that crystallized the above for me:

  • "[T]he pace of discovery and development and changes in the structure and composition of the team was too fast to allow for people to actually keep up unless they were in the thick of it with us"

  • "From the outside (and even sometimes from the inside) this would look like unproductive delusion, but in fact it was intentional and managed theoretical exploration. And it led to an enormous amount of what many in the group came away believing were accurate and groundbreaking theories of how the mind works and how a personality is shaped by life."

  • "For a small group of untrained people to independently derive/discover so much in a handful of years does, I think, indicate something quite unusual about Geoff's ability to design a productive research program."

  • "I think it's worth pausing to appreciate just how bad the conflict leading up to the dissolution, as well as the dissolution itself, was for a number of people who had been relying on the Leverage ecosystem for their life plans: their friends, their personal growth, their livelihood, their social acceptance, their romantic prospects, their reputations, their ability to positively impact the world."

  • The entire "What to do when society is wrong about something?" section.

Appendix 2: the same for Larissa's post:

  • "From the outside, Leverage's research was understandably confusing because they were prioritising moving through a wide range of research areas as efficiently as possible rather than communicating the results to others. This approach was designed to allow them to cover more ground with their research and narrow in quickly on areas that seemed the most promising."

  • "Notably, Leverage's focus was never particularly on sharing research externally. Sometimes this was because it was a quick exploration of a particular avenue or seemed dangerous to share. Often though it was a time trade-off. It takes time to communicate your research well, and this is especially challenging when your research uses unusual methodology or starting assumptions."

  • "[T]here was a trade-off in time spent conducting research versus time spent communicating it. As we didn't invest time early on in communicating about our work effectively, it only became harder over time as we built up our models and ontologies."

  • "One of the additional adverse effects of our poor public communication is that when Leverage staff have interacted with people, they often didn't understand our work and had a lot of questions and concerns about it. While this was understandable, I think it sometimes led staff to feel attacked which I suspect, in some cases, they handled poorly, becoming defensive and perhaps even withdrawing from engaging with people in neighbouring communities. If you don't build up relationships and discuss updates to your thinking inferential distance builds up, and it becomes easy to see some distant, amorphous organisation rather than a collection of people."

Appendix 3: earlier public discussion of Leverage 1.0, which I've also drawn on in trying to understand what happened:

  • January 2012: Geoff, Leverage's primary founder and Executive Director, writes Introducing Leverage Research. The post directs people to the website for more information about their research, which has a link to download Connection Theory: the Current Evidence. The comments have a lot of skeptical discussion of Connection Theory, with back-and-forth from Geoff.

  • September 2012: Peter writes A Critique of Leverage Research's Connection Theory. His conclusion is that the evidence presented is pretty weak and that it's in conflict with a lot of what we do know about psychology. The comments again have good engagement. At some point before Alyssa's 2014 post (below) Leverage removed the Connection Theory paper from their site. [2022-10: Cathleen tells me this was in the 2013 redesign of their site, and linked me to before and after captures.]

  • April 2014: Alyssa writes The Problem With Connection Theory, digging deeper into some of the claims of the Connection Theory paper. She argues that the paper generally oversells its evidence, and highlights that several predictions which one would not normally judge as correct are counted as positive evidence. Jasen, a Leverage employee, responds in the comments to say Alyssa is criticizing an obsolete document.

  • January 2015: Evan comments with his understanding of why Leverage hasn't shared much publicly, including that he thinks "Leverage Research perceives it as difficult to portray their research at any given time in granular detail. That is, Leverage Research is so dynamic an organization at this point that for it to maximally disclose the details of its current research would be an exhaustive and constant effort."

  • [2022-10: Cathleen tells me that in June 2016 she asked the Internet Archive to exclude Leverage so that people would focus on the new content on their website. Because the site isn't included in the Archive, I wasn't able to evaluate the historical content of their website in putting together this post or formulating my hypothesis above. She also linked me to a pair of captures, one from 2013 and another from 2018 showing blog posts published in 2016. I haven't evaluated these captures.]

  • August 2018: Ryan writes Leverage Research: reviewing the basic facts anonymously as "throwaway", and then following up as "anonymoose" (both of which he's since publicly confirmed). His high-level point is that Leverage seemed to have produced very little given the amount of time and money put into the project. Geoff replies that he had been planning to publish some of their results shortly.

  • November 2019: Larissa, Leverage's incoming communications person, posts Updates from Leverage Research: history, mistakes and new focus, expanding on a comment she had made in September. She discusses history, dissolution of the original project, and current plans. I was especially interested in her discussion of the causes and effects of Leverage's approach to external engagement.

  • December 2020 through October 2021: Geoff links a series of four "Leverage 1.0 Research Reports" on his personal site, three on consensus (1, 2, 3) and one on intelligence amplification (4). I haven't seen any discussion of these. I'm very glad he wrote them up and made them public, but I also don't see in them the kind of breakthroughs I would expect from how Cathleen wrote about Leverage 1.0's work.

  • September 2021: Someone anonymous notices that Geoff is fundraising, and posts Common knowledge about Leverage Research 1.0. They argue that Leverage was a harmful "high demand group". Lots of different perspectives in the comments.

  • September 2021: Larissa posts Updates from Leverage Research: History and Recent Progress. In the section on Leverage's Exploratory Psychology Program she discusses their plans to release psychological research tools over the next few months.

  • October 2021: An anonymous former Leverage employee writes about their experience there, and how it "really mismatched the picture of Leverage described by" the 'Common Knowledge' post.

  • October 2021: Zoe, one of their former researchers, posted about her experience there. See the corresponding LessWrong post for discussion. Also see Geoff's response.

  • December 2021: Jonathan, another former researcher, posts Leverage Research: Context, Analysis, and Takeaway. The "Utopic Mania and Closed-offness" section was the most interesting to me, but it is sufficiently metaphorical that I don't really understand it.

  • December 2021: Cathleen, Leverage's former COO, wrote In Defense of Attempting Hard Things: and my story of the Leverage ecosystem. This is the article that prompted my post, and I'm sad about how little coverage and consideration it received compared to some of the less informative posts above.

Comment via: facebook


Some important implications of this:

As you said, AI safety lacks good feedback loops, especially compared to capabilities work. Thus three scenarios are possible: either AI safety doesn't matter at all (we can't build AGI, or it's aligned by default), we are doomed because good feedback loops can't be built for AI alignment/safety, or we succeed by default. It's similar to John Wentworth's post When Iterative Design Fails.

Now, John Wentworth's stories about gunpowder and the medieval lord overstate things: if we compare modern weapons against medieval forces, it's usually a win for the modern soldiers unless the numbers are severely skewed (more like 1:100 or more) or the modern force has too small a frontage.

Another implication is that I now understand why academia/meta work is stereotyped by populists as out of touch with reality, even if I suspect that stereotype is at least somewhat wrong.

Where can I read an introduction to Connection Theory (i.e. a thesis statement, main claims, etc)?

I'd say it's for historical interest only at this point, but in Appendix 3 I link a copy of the paper that Alyssa archived (v2.1, last revised 2011-09-21). It includes an introduction to the ideas.