Former AI safety research engineer, now AI governance researcher at OpenAI. Blog: thinkingcomplete.com
> we both agree it would not make sense to model OpenAI as part of the same power base
Hmm, I'm not totally sure. At various points:
It's really hard to step out of our own perspective here. But when I put myself in the perspective of, say, someone who doesn't believe in AGI at all, these all seem pretty indicative of a situation where OpenAI and AI safety people were, to a significant extent, building a shared power base, and just couldn't keep that power base together.
The mistakes can (somewhat) be expressed in the language of Bayesian rationalism by doing two things:
Scott Garrabrant just convinced me that my notion of conservatism was conflating two things:
I mainly intend conservatism to mean the former.
> Whose work is relevant, according to you?
If you truly aren't trying to make AGI, and you truly aren't trying to align AGI, and instead are just purely intrinsically interested in how neural networks work (perhaps you are an academic?) ...great! That's neither capabilities nor alignment research afaict, but basic science.
Consider Chris Olah, who I think has done more than almost anyone else to benefit alignment. It would be very odd if we had a definition of alignment research where you could read all of Chris's interpretability work and still not know whether or not he's an "alignment researcher". On your definition, when I read a paper by a researcher I haven't heard of, I don't know anything about whether it's alignment research or not until I stalk them on Facebook and find out how socially proximal they are to the AI safety community. That doesn't seem great.
Back to Chris. Because I've talked to Chris and read other stuff by him, I'm confident that he does care about alignment. But I still don't know whether his actual motivations are more like 10% intrinsic interest in how neural networks work and 90% in alignment, or vice versa, or anything in between. (It's probably not even a meaningful thing to measure.) It does seem likely to me that the ratio of how much intrinsic interest he has in how neural networks work, to how much he cares about alignment, is significantly higher than that of most alignment researchers, and I don't think that's a coincidence—based on the history of science (Darwin, Newton, etc) intrinsic interest in a topic seems like one of the best predictors of actually making the most important breakthroughs.
In other words: I think your model of what produces more useful research from an alignment perspective overweights first-order effects (if people care more they'll do more relevant work) and ignores the second-order effects that IMO are more important (1. Great breakthroughs seem, historically, to be primarily motivated by intrinsic interest; and 2. Creating research communities that are gatekept by people's beliefs/motivations/ideologies is corrosive, and leads to political factionalism + ingroupiness rather than truth-seeking.)
> I'm not primarily trying to judge people, I'm trying to exhort people
Well, there are a lot of grants given out for alignment research. Under your definition, those grants would only be given to people who express the right shibboleths.
I also think that the best exhortation of researchers mostly looks like nerdsniping them, and the way to do that is to build a research community that is genuinely very interested in a certain set of (relatively object-level) topics. I'd much rather an interpretability team hire someone who's intrinsically fascinated by neural networks (but doesn't think much about alignment) than someone who deeply cares about making AI go well (but doesn't find neural nets very interesting). But any step in the pipeline that prioritizes "alignment researchers" (like: who gets invited to alignment workshops, who gets alignment funding or career coaching, who gets mentorship, etc) will prioritize the latter over the former if they're using your definition.
I think we're interpreting "pluralism" differently. Here are some central illustrations of what I consider to be the pluralist perspective:
- the Catholic priest I met at the Parliament of World Religions who encouraged someone who had really bad experiences with Christianity to find spiritual truth in Hinduism
- the passage in the Quran that says the true believers of Judaism and Christianity will also be saved
- the Vatican calling the Buddha and Jesus great healers
If I change "i.e. the pluralist focus Alex mentions" to "e.g. the pluralist focus Alex mentions" does that work? I shouldn't have implied that all people who believe in heuristics recommended by many religions are pluralists (in your sense). But it does seem reasonable to say that pluralists (in your sense) believe in heuristics recommended by many religions, unless I'm misunderstanding you. (In the examples you listed these would be heuristics like "seek spiritual truth", "believe in (some version of) God", "learn from great healers", etc.)
I think this doesn't work for people with IQ <= 100, which is about half the world. I agree that an understanding of these insights is necessary to avoid incorporating the toxic parts of Christianity, but I think this can be done even using the language of Christianity. (There's a lot of latitude in how one can interpret the Bible!)
I personally don't have a great way of distinguishing between "trying to reach these people" and "trying to manipulate these people". In general I don't even think most people trying to do such outreach genuinely know whether their actual motivations are more about outreach or about manipulation. (E.g. I expect that most people who advocate for luxury beliefs sincerely believe that they're trying to help worse-off people understand the truth.) Because of this I'm skeptical of elite projects that have outreach as a major motivation, except when it comes to very clearly scientifically-grounded stuff.
What if your research goal is "I'd like to understand how neural networks work"? This is not research primarily about how to make AIs aligned. We tend to hypothesize, as a community, that it will help with alignment more than it helps with capabilities. But that's not an inherent part of the research goal for many interpretability researchers.
(Same for "I'd like to understand how agency works", which is a big motivation for many agent foundations researchers.)
Conversely, what if your research goal is "I'm going to design a training run that will produce a frontier model, so that we can study it to advance alignment research"? Seems odd, but I'd bet that (e.g.) a chunk of Anthropic's scaling team thinks this way. That counts as alignment research under your definition, since that's the primary goal of the research.
More generally, I think it's actually a very important component of science that people judge the research itself, not the motivations behind it—since historically scientific breakthroughs have often come from people who were disliked by establishment scientists. A definition that basically boils down to "alignment research is whatever research is done by the people with the right motivations" makes it very easy to prioritize the ingroup. I do think that historically being motivated by alignment has correlated with choosing valuable research directions from an alignment perspective (like mech interp instead of more shallow interp techniques) but I think we can mostly capture that difference by favoring more principled, robust, generalizable research (as per my definitions in the post).
Whereas I don't think it's particularly important that e.g. people switch from scalable oversight to agent foundations research. (In fact it might even be harmful lol)
I agree. I'll add a note in the post saying that the point you end up on the alignment spectrum should also account for feasibility of the research direction.
Though note that we can interpret your definition as endorsing this too: if you really hate the idea of making AIs more capable, then that might motivate you to switch from scalable oversight to agent foundations, since scalable oversight will likely be more useful for capabilities progress.
Some quick reactions:
So my overall position here is something like: we should use religions as a source of possible deep insights about human psychology and culture, to a greater extent than LessWrong historically has (and I'm grateful to Alex for highlighting this, especially given the social cost of doing so).
But we shouldn't place much trust in the heuristics recommended by religions, because those heuristics will often have been selected for some combination of:
Where the difference between a heuristic and an insight is something like the difference between "be all-forgiving" and "if you are all-forgiving it'll often defuse a certain type of internal conflict". Insights are about what to believe; heuristics are about what to do. Insights can be cross-checked against the rest of our knowledge; heuristics are much less legible because in general they don't explain why a given thing is a good idea.
IMO this all remains true even if we focus on the heuristics recommended by many religions, i.e. the pluralistic focus Alex mentions. And it remains true even given the point Alex made near the end: that "for people in Christian Western culture, I think using the language of Christianity in good ways can be a very effective way to reach the users." Because if you understand the insights that Christianity is built upon, you can use those to reach people without the language of Christianity itself. And if you don't understand those insights, then you don't know how to avoid incorporating the toxic parts of Christianity.
Fair point. I've now removed that section from the post (and also, unrelatedly, renamed the post).
I was trying to make a point about people wanting to ensure that AI in general (not just current models) is "aligned", but in hindsight I think people usually talk about alignment with human values or similar. I have some qualms about that, but I'll discuss them in a different post.
It seems very plausible to me that alignment targets in practice will evolve out of things like the OpenAI Model Spec. If anyone has suggestions for how to improve that, please DM me.