I've been a LessWrong organizer since 2011, with roughly equal focus on the cultural, practical and intellectual aspects of the community. My first project was creating the Secular Solstice and helping groups across the world run their own version of it. More recently I've been interested in improving my own epistemic standards and helping others to do so as well.




[Part 2] Amplifying generalist research via forecasting – results from a preliminary exploration

It's unclear to me whether I should think of the forecasters as more replaceable than Elizabeth. If they're all generalist researchers, having "a bunch of generalist researchers do generalist research for the same amount of time as the original researcher" doesn't seem obviously scalable.

(That said, my current belief is that this work was pretty interesting and important overall)

From Personal to Prison Gangs: Enforcing Prosocial Behavior

I didn't notice until just recently that this post fits into a similar genre as what (I think) the Moral Mazes discussion is pointing at (which may be different from what Zvi thinks).

Where one of the takeaways from Moral Mazes might be: "if you want your company to stay aligned, try not to grow the levels of hierarchy too much, or be extremely careful when you do."

"Don't grow the layers of hierarchy" is, in practice, perhaps a similar injunction to "don't grow the company too much at all" (since you need hierarchy to scale).

Immoral Mazes posits a specific failure due to middle managers being disconnected from reality, and evolving an internal ecosystem that then sets out to protect itself. This post points at an (upstream?) issue where, regardless of middle management, the fundamental reality is that people cannot rely on repeated interactions to build trust.

I had actually totally forgotten, until just now, the final paragraph and key point of this post:

> So if we want to e.g. reduce regulation, we should first focus on the underlying socioeconomic problem: fewer interactions. A world of Amazon and Walmart, where every consumer faces decisions between a million different products, is inevitably a world where consumers do not know producers very well. There’s just too many products and companies to keep track of the reputation of each. To reduce regulation, first focus on solving that problem, scalably. Think amazon reviews - it’s an imperfect system, but it’s far more flexible and efficient than formal regulation, and it scales.
>
> Now for the real problem: online reviews are literally the only example I could come up with where technology offers a way to scale-up reputation-based systems, and maybe someday roll back centralized control structures or group identities. How can we solve these sorts of problems more generally? Please comment if you have ideas.

I think this maps to one of the key (according to me) problems raised by the Immoral Mazes sequence: we don't know how to actually identify and reward competence among middle managers, so all we have are easily goodhartable metrics. (And in the case of middle management, there's a deep warping that happens because the thing that got goodharted on was "office politics".)

Unfortunately... well, nobody commented with ideas on this post, and I don't know that anyone has since come up with a way to track the competence of management either.

The actionable place where this matters in my local environment is EA grantmakers awarding grants to researchers, whose work is often pretty hard to evaluate. I think this is a serious bottleneck to scaling efforts.

I notice that forecasting is one of the few domains where rationalsphere-folk are experimenting with scalable solutions for evaluation. I've been somewhat pessimistic about forecasting, but I think this might have convinced me to allocate more attention to it.

...huh. I would not have expected this post to be closely associated in my head with amplifying generalist forecasting. But, now I think it is.

From Personal to Prison Gangs: Enforcing Prosocial Behavior

I just wanted to say "thanks for actually doing an epistemic spot check here". I think I currently endorse John's explanation of why he doesn't think "sharp increase in prisoners" is the thing to be looking for, but doing any kind of serious spot check is a big chunk of work that's often not as rewarding as it should be. Have a strong upvote.

The Great Karma Reckoning

Insofar as "total karma should try to approximate ideal 'Rationalist Social Status'", I think ideally it would incorporate things like "how often do they introduce novel information that turns out to actually be true." 

And this suggests a bunch of things like "tracking their predictions / success" and "noticing when their early ideas were relevant to things that later get well established as true." Those all seem like important things to figure out how to do. But I'm not sure whether they fit into the abstraction of what karma currently is. One key goal of karma is "dole out bits of reward that create a positive incentive gradient to follow", which is a pretty different goal than "track total idealized social status."

Currently we go out of our way not to make people's total karma super prominent (it doesn't appear when you mouse over their username, nor does your karma appear at the top of each page, like it did on old LessWrong).

Partial summary of debate with Benquo and Jessicata [pt 1]

This was the first major, somewhat adversarial doublecrux that I've participated in.

(Perhaps this is the wrong framing. I had participated in many other significant, somewhat adversarial doublecruxes before. But, I dunno, this felt significantly harder than all the previous ones, to the point where it feels like a difference in kind.)

It was a valuable learning experience for me. My two key questions for "Does this actually make sense as part of the 2019 Review Book" are:

  • Is this useful to others for learning how to doublecrux, pass ITTs, etc, in a lowish-trust setting?
  • Are the takeaways on the object-level disagreement useful to others?

On the object level, my tl;dr takes the form of "which blogposts should someone write as a followup?", which I think are:

  1. Criticism and accusations are importantly different, and should be distinguished. (I think some miscommunication came from people implicitly lumping these together.)
  2. Harmony matters for group truthseeking, but Alice telling Bob "you should be nicer; here's how to say the same thing more nicely" is really scary if 
    Alice in fact didn't understand exactly what Bob was trying to say. 

    (I realize Benquo/Jessica probably still disagree with my beliefs/emphasis on the first part. But this was a concrete update I made while reviewing the post, and a mistake I think I was making a lot. Even if I later change my mind about how much harmony matters for group truthseeking, it'd still be necessary for the post to directly address the benefits in order to be understood by past-me.)

I think there are more points worth lifting out of here, but I'm not sure how specific they are to the particular people in this conversation, rather than generally useful.

On "how did this go as a doublecrux", I notice:

  1. Well, I learned a lot, at least on the object level.
  2. We didn't reach total, or even significant, agreement. Benquo and Jessica, I think, began commenting less sometime after this.

    That might be fine. I don't think total agreement was Benquo/Jessica's goal (I think their goal was more like "figure out whether the LessWrong team is aligned with them enough to be worth investing in LessWrong", and I think they succeeded at that).
  3. I'm a bit sad about the outcome, but not sure whether I should be. I do think that if this had been my second rodeo, I could have done better. (I'm guessing everyone involved burned through most of their budget for intense disagreements with people who didn't seem aligned, on avoidable mistakes, before the final conversations. If we'd had another month of energy, we might have figured out enough common ground to become more closely allied.)
  4. That said, this was the fuel that eventually output Noticing Frames and Propagating Facts into Aesthetics, which I'm pretty happy with.

On "can other people learn from this as a doublecrux?"

I... don't know. I think maybe, but that's mostly up to other people.

Note on Framing:

I notice that a large chunk of the text of this post is direct quotes from Benquo and Jessicata, but it's wrapped in a post where I control the frame. If this were considered for inclusion in the book, I'd be interested in having them write reviews of their year-later takeaways, written in their own frames.

Some notes regarding object level ideas in this post and the discussion:

(each quoted section is basically a new topic)

Benquo: (emphasis mine)

> What I see as under threat is the ability to say in a way that's actually heard, not only that opinion X is false, but that the process generating opinion X is untrustworthy, and perhaps actively optimizing in an objectionable direction. Frequently, attempts to say this are construed primarily as moves to attack some person or institution, pushing them into the outgroup. Frequently, people suggest to me an "equivalent" wording with a softer tone, which in fact omits important substantive criticisms I mean to make, while claiming to understand what's at issue.

While I still have many complaints about the overall strategy Benquo was following at the time, I think (hope?) I'm more understanding now about the failure mode pointed at here. I do think I've contributed to that failure mode, i.e. "try to be more diplomatic to preserve group harmony, in a way that comes at expense of clarity."

I still think there are good truthseeking reasons to preserve group harmony. But I think the concrete updates I've made are that we need to (at least) be very clear about when we're doing that, and notice when attempts to smooth things over are destroying information.

In particular, it is pretty orwellian/gaslighty to have someone tell you "You're being too mean. Here's a different thing you could have said with the same truth value that wouldn't have been as mean, see?" and watch in horror as they then describe a sentence that leaves out important information you meant to convey. 

In my other review, I mentioned "hmm, I think I still have promises to keep regarding 'what aesthetic updates should I make?'". I think one aesthetic update I am happy to make is that I should have some kind of disgust/horror when someone (including me) claims to be preserving local truth value, or implying that truth value is preserved, when in fact it wasn't. 

(This is a specific subset of the overall worldview/aesthetic I think Benquo was trying to convey, and I'm guessing there is still major disagreement in other nearby areas)


> It seemed like the hidden second half of the core claim [of the "5 words" post] was "and therefore we should coordinate around simpler slogans," and not the obvious alternative conclusion "and therefore we should scale up more carefully, with an uncompromising emphasis on some aspects of quality control." (See On the Construction of Beacons for the relevant argument.)
>
> It seemed to me like there was some motivated ambiguity on this point. The emphasis seemed to consistently recommend public behavior that was about mobilization rather than discourse, and back-channel discussions among well-connected people (including me) that felt like they were more about establishing compatibility than making intellectual progress.

I am still mulling this over. I think it might be pointing at something I haven't yet fully grokked.

I would agree with the phrase "we should scale up more carefully, with an uncompromising emphasis on some aspects of quality control". (I think I would have agreed with it at the time, which is part of why the doublecrux was tricky. I eventually realized that Benquo meant a stronger version of this sentence than I meant)

My current (revealed) belief is something like "We don't really have the luxury of stopping all mobilization while we figure out the ideal coordination mechanisms. Meanwhile, I think current mobilization efforts are net positive. I also think the process of actually mobilizing is useful for forcing your ivory-tower coordination process to be more connected with the reality of how large-scale coordination actually works."

(My understanding is that Benquo-at-the-time thought the current way large-scale coordination works is fundamentally doomed, and that we don't have much choice but to start over. That does feel pretty cruxy – if I believed that, I'd be doing different things.)


> This, even though it seems like you explicitly agree with me that our current social coordination mechanisms are massively inadequate, in a way that (to me obviously) implies that they can't possibly solve FAI.

"Can't possibly solve FAI" still sounds like an obviously false marketing claim to me. I wrote a blogpost arguing you should be suspicious when you find yourself saying this.

(By contrast, I do agree with the first half of the sentence, that our current coordination mechanisms are massively inadequate, and am grateful for various gears about what's going on there that I gained during this conversation)


> That said, there's still a complicated question of "how do you make criticisms well". I think advice on this is important. I think the correct advice usually looks more like advice to whistleblowers than advice for diplomacy.

This feels aesthetically cruxy. I think it's a few steps removed from whatever the real disagreement is about. 

I think a key piece here is the distinction between "criticism" and "accusations of norm violation." I mention this at the bottom of the post, but I think it warrants a separate top level post that delves into more details. 


> Note, my opinion of your opinions, and my opinions, are expressed in pretty different ontologies.

One thing I noticed at the time and still notice now is that it's not actually obvious to me (from Jessica's written words in the preceding section) that our claims are in different ontologies. I derive that they must be in different ontologies (given observations about how challenging this whole conversation was). But, it is worth noting that Jessica's claims/beliefs seem to make sense in my ontology.

Zack, in the comments:

> > politics backpropogates into truthseeking, causes people to view truthseeking norms as a political weapon.
>
> Imagine that this had already happened. How would you go about starting to fix it, other than by trying to describe the problem as clearly as possible (that is, "invent[ing] truthseeking-politics-on-the-fly")?

I was distracted by another piece of this comment, but I agree that having a good answer for this is pretty important.

"Defining Clarity"

After writing this post, there was significant disagreement in the comments about this line of mine:

> I define clarity in terms of what gets understood, rather than what gets said. So, using words with non-standard connotations, without doing a lot of up-front work to redefine your terms, seems to me to be reducing clarity, and/or mixing clarity, rather than improving it.

I'm still not entirely sure what happened here, but the failure mode that Jessica/Zvi/Zack were pointing at was "you auto-lose if you incentivize people not to understand." That seems true to me, but mostly unrelated to what I was trying to say here, and some of my own responses were perhaps overly exasperated at them seeming to change the subject on me.

Zvi eventually said:

> Imagine three levels of explanation: Straightforward to you, straightforward to those without motivated cognition, straightforward even to those with strong motivated cognition.
>
> It is reasonable to say that getting from level 1 to level 2 is often a hard problem, that it is on you to solve that problem.
>
> It is not reasonable, if you want clarity to win, to say that level 2 is insufficient and you must reach level 3. It certainly isn't reasonable to notice that level 2 has been reached, but level 3 has not, and thus judge the argument insufficient and a failure. It would be reasonable to say that reaching level 3 would be *better* and suggest ways of doing so.

I think it's possible that at that point I could have said "Okay. I'm talking about level 2, and the point is you make it much harder to get to level-2 if you're making up new words or using them with nonstandard connotations." But by the time we got to that point of the conversation I was pretty exhausted and still confused about how everything fit together. Today, I'm not 100% sure whether my hypothetical reply was straightforwardly true.


I feel like I want to tie this all up together somehow, but I think I mostly did that in the tl;dr at the top. Thanks for reading I guess. Still interested in delving into individual threads if people are interested.

What are the open problems in Human Rationality?

So I'm not sure I'd include this in the Best Of book in the first place. If I did, I agree it'd be pretty obviously wrong to imply that the list was comprehensive. I didn't think that was implied by the post – if you ask a question, usually you don't end up getting a comprehensive answer right away. 

As a post on a live forum, I think it's pretty obvious that this isn't a comprehensive list – if it's missing things, people are supposed to just add those things, and you should expect it to need updating over time.

In the case of a printed book, I'm not sure if the right thing is to change the title, or just make sure to say "here are some specific answers this question post got." Either seems potentially fine to me.

I very much don't think the title of the LessWrong post itself should change – it's trying to ask a question, not spell out any particular expectation of an answer.

The Schelling Choice is "Rabbit", not "Stag"

I still have to write my own self-review of this post, which I think will at least partially address some of that concern. (But, since we had an argument about this as recently as last Tuesday*, obviously it won't fully address it.)

I do want to note that I think the point you're primarily making here (about the misleading/bad effects of the staghunt frame) doesn't feel super encapsulated in your current response post, and is probably worth a separate top-level post.

(But, tl;dr, my current take is "okay, this post replaces the wrong-frame of the prisoner's dilemma with the wrong-frame of staghunts, which I still think was a net improvement. But 'what actually ARE the game-theoretic situations we find ourselves in most of the time?' is the obvious next question to ask.")

* it wasn't actually Tuesday.

Integrating the Lindy Effect

I just noticed, while voting, that I had again forgotten what this post was about, and didn't even remember that I gave it this glowing review. Which makes me think the title isn't that great.

Eli's shortform feed

Probably this one?

