Motivation

Claim (80% confidence): At least 50% of the disagreement between people who align more with MIRI/Eliezer and those who align more with opposing clusters of views (Christiano, Garfinkel, etc.) is caused not by rational disagreement, but by more subconscious emotional stuff like tastes, vibes, culture, politics, etc.

In particular, I think Eliezer’s writing tends to appeal more to people who:

  1. Enjoy reading fiction, and are okay with points primarily being made via parables
    1. Similarly, don’t mind long pieces with no summaries and no clear organizational structure laying out the claims being made and the evidence for them
  2. Aren’t turned off by perceived arrogance (I’m not taking a stance in this post on whether the arrogance level is justified or not)

Past attempts to communicate their worldview, such as the MIRI conversations, have helped some, but I think they mostly weren’t attacking the core issue, which I currently guess is large differences in communication styles.

For people who aren’t high on the above axes, I think Eliezer’s writing often tends to be fairly hard to read and off-putting, which is unfortunate. It leads to people not taking the points as seriously as they should (and perhaps has the opposite effect on those with whom the style resonates). While I disagree with MIRI/Eliezer on a lot of topics, I agree with them to some extent on a lot, and I think it’s very valuable to understand their worldview and build a “MIRI-model”.

I’ve written briefly about my personal experience taking Eliezer and MIRI less seriously than I should have here. I still haven’t read most of the sequences and don’t intend to read HPMOR, but I now take MIRI and Eliezer’s worldviews much more seriously than I used to.

Proposal

An idea to mitigate this is to give out substantial prizes for write-ups which transmit important aspects of the MIRI/Eliezer worldview in ways which are easier for people with different tastes to digest.

A few possible candidates for this “transmission” include (not an exhaustive list!):

  1. The Sequences
  2. Eliezer’s and Nate’s portion of the MIRI conversations
  3. 2022 MIRI Alignment Discussion
    1. AGI Ruin: A List of Lethalities may be a good candidate, though, while extremely important, I think this one is already decently structured and argued relative to some other pieces

I propose prizes of ~$1-10k (weighted by importance) for pieces that do a good job at this, judged by some combination of how clear they are to those with different tastes from MIRI/Eliezer and how faithfully they represent MIRI/Eliezer’s views. I’d be able to commit ~$1-5k myself, and would welcome commitments from others.

But won’t the writeups do a poor job of conveying MIRI’s intuitions?

If a writeup is getting popular but contains serious mistakes, MIRI/Eliezer can chime in and say so.

Isn’t MIRI already trying to communicate their view?

I think so, but it seems great to also have outsiders working on the case. I’d guess there are advantages and disadvantages to both strategies.

Have I talked to MIRI about this?

Yes. Rob suggested going for it and reiterated the above, saying MIRI’s involvement shouldn’t be a bottleneck to investigating MIRI’s worldview.

Next steps

I’d love to get:

  1. Feedback on whether this is a good idea, and suggestions for revision if so
  2. Volunteers to help actually push the prize to happen and be judged, which I’m not sure I really want to do myself
  3. Commitments to the prize pool
  4. Other ideas for good pieces to transmit

Comments

I think it would also be valuable to have someone translate "in the other direction" and take (for example) Paul Christiano's writings and produce vivid, concrete parable-like stories based on them. I think such stories can be useful not just as persuasive tools but also epistemically, as a way of grounding the meaning of abstract definitions of the sort Paul likes to argue in terms of.

Could someone who disagrees with the above statement help me by clarifying what the disagreement is?

It has -7 on the agreement vote, which makes me think the disagreement should be obvious, but it isn’t to me.

I still haven’t read most of the sequences and don’t intend to read HPMOR

That's fine, that's what projectlawful is for. It's meant to be the fun thing that you can do instead of looking at TV shows and social media. I like reading it after waking up and before going to bed. 

It’s ideal in a lot of ways, because it’s explicitly designed to have you learn rationality by habit/repetition, without any deliberate repetitive effort (e.g. taking notes), which is what’s necessary to actually get good at turning rationality into extreme competence at life.

The EY self-insert character, Keltham, is much more humble and is genuinely interested in the world and people around him (many of whom are "smarter" than him or vigorously intend to surpass him). He's not preachy, he's an economist; and most of the rationality lessons are just him saying how things are in an alternate timeline (dath ilan), not insisting that they ought to be his way.

I definitely agree that it’s a good idea to find ways to use EY’s writings to get ahead of the curve and find new opportunities; it would save everyone a lot of time and labor to just implement the original works themselves instead of the usual rewriting them in your own words and taking credit for it to advance your status. What you’ve said about summarization makes sense, but I’ve tried that and it’s a lot harder than it looks; getting rid of the lists of examples and parables makes it harder to digest the content properly. This is an extreme case: it basically turned the best 25 sequences into notes (a great alternative to rereading all 25, since you can do it every morning, but not ideal for the first read).

Maybe such a contest could also require the entrants to describe generally-valuable effective strategies to condense EY's writings? 

Is rationalism really necessary for understanding MIRI-type views on AI alignment? I personally find rationalism off-putting, and I don’t think it’s very persuasive to say “you have to accept a complex philosophical system and rewire your brain to process evidence and arguments differently to understand one little thing.” If that’s the deal, I don’t think you’ll find many takers outside of those already convinced.

In what way are you processing evidence differently from "rationalism"?

I'm probably not processing evidence any differently from "rationalism". But starting an argument with "your entire way of thinking is wrong" gets interpreted by the audience as "you're stupid" and things go downhill from there.

There are definitely such people for sure. The question is whether people who don't want to learn to process evidence correctly (because the idea of having been doing it the wrong way until now offends them) were ever going to contribute to AI alignment in the first place.

Fair point. My position is simply that, when trying to make the case for alignment, we should focus on object level arguments. It's not a good use of our time trying to reteach philosophy when the object level arguments are the crux.

That's generally true... unless both parties process the object-level arguments differently, because they have different rules for updating on evidence.


EY originally blamed failure to agree with his obviously correct arguments about AI on poor thinking skills, then set about to correct that. But other explanations are possible.

Yeah, that's not a very persuasive story to skeptics.

Aren’t turned off by perceived arrogance

One hypothesis I’ve had is that people with more MIRI-like views tend to be more arrogant themselves. A possible mechanism is that the idea that the world is going to end and that they are the only ones who can save it is appealing in a way that shifts their views on certain questions and changes the way they think about AI (e.g. they need less convincing that they are some of the most important people ever, so they spend less time considering why AI might go well by default).

[ETA: In case it wasn't clear, I am positing subconscious patterns correlated with arrogance that lead to MIRI-like views]

Interesting idea. I think it’s possible that a prize is the wrong thing for getting the best final result (but also possible that getting a half-decent result is more important than a high-variance attempt at optimising for the best result). My thinking is: to do what you’re suggesting to a high standard could take months of serious effort. The idea of someone really competent doing so just for the chance at some prize money doesn’t quite seem right to me… I think there could be people out there who in principle could do it excellently but who would want to know that they’d ‘got the job’, as it were, before spending serious effort on it.

At least 50% of the disagreement between people who align more with MIRI/Eliezer and those who align more with opposing clusters of views (Christiano, Garfinkel, etc.) is caused not by rational disagreement, but by more subconscious emotional stuff like tastes, vibes, culture, politics, etc.

Can you say more about what defines the clusters? I disagree with this if you take the top 10 people in each cluster (maybe according to the metric of "general competence * amount of time investigating these views"), and I agree if you're instead thinking of clusters of thousands of people.

Sure, I wasn't clear enough about this in the post (there was also some confusion on Twitter about whether I was only referring to Christiano and Garfinkel rather than any "followers").

I was thinking about roughly hundreds of people in each cluster, with the bar being something like "has made at least a few comments on LW or EAF related to alignment and/or works or is upskilling to work on alignment".


The Sequences is outdated because it introduces the reader to a naturalist worldview as opposed to embedded agency, which would make moving to AI alignment easier for new readers. While the core ideas are not about naturalism as much as “the art of rationality”, the fabric is too tightly knitted. I.e. the individuals that the series of essays is particularly recommended for are all too unlikely to see that the series is knitted from two different yarns.

[This comment is no longer endorsed by its author]

a naturalist worldview as opposed to embedded agency

What's the difference?

Maybe this was just me, but I skipped most of the parables in AI to Zombies, because I did not enjoy his fantasy writing. Has someone considered publishing a fiction-free AI to Zombies volume to appeal to this audience? The highlights from the sequences are a good example of what I am thinking about, but my guess (based on our local rationality group) would be that a non-negligible portion of people who actually read most of his nonfiction read the ebook.