Hi, Anna. The post was very interesting, and I'd be happy to do some 'kibitzing' (had to check the word's meaning, actually), even though I give low credence to anything I say being of much use.
I am a relatively late arrival to the world of Rationalism (from about 3 years ago), having gone through some of the usual suspects (Julia Galef's Scout Mindset and her podcasts; Chivers' book on Rationalism; HPMOR and some online posts from Scott Alexander; some CFAR materials online, and some YouTube videos; not much Less Wrong or The Sequences though, until now). What attracted me to it was my quasi-religious attachment to truth-seeking and to avoiding self-deception, and my desire to become more rational in my thinking and decision-making. I am still reading and exploring. You might remember that I've been one of the assistants to the CFAR test sessions, and I would be interested in attending a workshop in the future if it doesn't entail having to go across the Atlantic (I live in rural NW Spain).
Through my eyes, CFAR seems like a group of people who are engaged in the sort of 'right thinking' I aspire to (LW also; perhaps the distinction from the outside is that CFAR seems to be a more educational-oriented project in practice).
A lot of what I see in the test sessions I find confusing, but I don't think that's from any fault of yours. They aren't workshops, and I feel a lot of uncanny-valley-ness in that I recognize most of the ideas and terms but haven't really interiorized their meaning. Also, the only correlate your practice brings to mind is something like talk therapy which, while pretty common in the States, is really unusual over here (the very few people I know who have engaged in anything like it did so with psychologists, and for medical reasons). This isn't a criticism so much as a cultural translation problem: from where I sit, it's not always clear how to distinguish applied rationality coaching from therapeutic modes of engagement, and that makes it harder to know what norms to bring.
I think it would help me a lot to actually go in detail over some of the material you have online (particularly, the CFAR Handbook). I think I just mostly lack the grammar of how these things are supposed to be done.
'Hope for something out loud': I'd hope for a chance in the not-too-remote future to take one of your workshops on this side of the ocean. 'Try to speak to why you care rather than rounding to the nearest conceptual category': as I said, I care really personally about truth-finding, truth-seeking and being part of an expanding circle of people who share this frame of mind, which makes me see your efforts, whatever they end up producing, as merit-worthy. And that's why I care enough to watch closely how this iteration of CFAR actually plays out.
Question: wouldn't the way we dealt with acid rain and the ozone layer be counterexamples? In those cases we didn't have a clear deadline, but we did manage to muster the resources and effort to overcome the issues. I would think the issue is not so much just status quo, but actual, generalized understanding of the magnitude of the risk + degree of certainty + actionability. AI risks seem to have big problems with each of those three.
That sounds very reasonable. In the review, I wasn’t consciously trying to play a blame game with Yudkowsky and Soares (I generally think blame is ineffective at producing good outcomes anyway) but rather to articulate a reader’s uncertainty about what their reference class for relevant expertise actually is.
My own naïve take would be something like what you say: people with substantial hands-on technical experience in contemporary AI systems, combined with people who have thought deeply about the theoretical aspects of alignment. My impression is that even within this relatively restrictive class there remains a wide diversity of views, and that these do not, on average, converge on the positions defended in the book.
I only have a superficial understanding of Yudkowsky’s work over the years, but I am aware that he led MIRI for roughly two decades, and that it was a relatively well-funded, full-time research organization explicitly created to work on what was seen as “the real alignment problem” outside of frontier labs. From an outsider’s perspective, however, it is not obvious that MIRI functioned as a place where deep, hands-on technical understanding of AI systems was systematically acquired, even at a smaller or safer scale.
If avoiding frontier labs is justified on the grounds that they accelerate catastrophic risk, then MIRI would seem to have been the natural alternative pathway for developing compensating expertise, yet it is not clear (at least to a non-insider) what concrete forms of technical or empirical understanding were accumulated there over time, or how these translated into transferable expertise about real AI systems as they actually evolved. In fact, from a superficial reading, it is difficult not to come away with the (possibly mistaken) impression that much of the work remained at the level of highly abstract theorizing rather than engagement with concrete systems.
That gap makes it harder for me to see the absence of conventional credentials as epistemically “screened off” rather than simply displaced.
I mean... if there's one thing I learned by studying literature and literary criticism at uni (spoiler: I didn't learn much, and most of what I did was not very valuable), it is that texts are very seldom completely self-consistent. Still, I think what you say is fair: the authors haven't just given up (if they had, they wouldn't have written the book in the first place), but it feels to me that the solutions they propose are wildly impractical, and perceived as such by the authors, and that this perception likely plays a big role in their P(doom). If the bar for “meaningful risk reduction” is set so high that only globally coordinated, near-wartime restrictions count, then the conclusion of extreme doom follows almost automatically from the premises. I’m not convinced the argument sufficiently explores whether there are intermediate, messy, politically imperfect interventions that could still substantially lower risk without meeting that idealized threshold.
This was a nice and relatively short post as well. I started reading it from what I assume is an anti-realist position (ethics as something constructed, a framework of agreements between rational agents to enable cooperation and mutual benefit, and therefore something mostly procedural and contractual; probably aligned with Hobbes and Gauthier, once I find the time to read them). I was unsurprised that, having chosen your 5 as the most similar to my views, you described it as 'moral anti-realist'. I have the impression that EAs and perhaps a lot of Rationalists resonate strongly with Utilitarianism. I'd appreciate any suggestions of older posts to read in this regard (I am new to all this).
Hello there! My name is Manuel and I am a TEFL teacher from Spain. I discovered Rationalism and the Rationality community a few years ago (late 2022) and found myself sharing a lot of what seem to be its core principles, i.e., I think I am quite enthusiastic about all of these, from the User's Guide:
Very briefly, let me add a couple more things: 1) some specific information about me that might be useful/relevant, and 2) what Rationalist texts/materials I've been reading and pondering.
Me
If I had to define one goal that I find really valuable and personally meaningful, I'd say it is ruthless truth-seeking, which is probably what has attracted me the most towards Rationalism. I have a really intense desire to discover the truth and to avoid self-deception.
My educational background is strongly in the Humanities (which likely accounts for some of my biases, interests and a prioris in how I look at things, including a deep skepticism of language and truth claims, a liking for truths as ends in themselves and a distaste for the practical and pragmatic). In the last decade I rediscovered a fascination and love for math and am trying to teach myself an undergraduate curriculum in it.
Intellectual engagement with Rationalism
Besides some cross-posted posts on the EA Forum, I'd say the first full-length Rationalist text I read was Eliezer Yudkowsky's HPMOR, which I found interesting, though not in what I'd consider an optimal setting (I haven't read any Harry Potter books, nor do I have a keen interest in them, and I had never read a long fandom text before, although since then I have read some 'rationalist fiction' with enjoyment, like Scott Alexander's Unsong, Alexander Wales's Worth the Candle or Nostalgebraist's The Northern Caves).
Two books I found very illuminating about the community, its goals and its thinking were Tom Chivers' The Rationalist's Guide to the Galaxy (he also has a very good popular introduction to Bayes's Theorem that I found useful) and Julia Galef's The Scout Mindset. Other, more peripherally adjacent material I've read includes Joseph Henrich, introductions to Game Theory, The Strategy of Conflict, and some economics, biology and psychology (Mankiw, Haidt, Dawkins, Seth, Geoffrey Miller). On AI and its risks: Christian's The Alignment Problem, Ord's The Precipice and Ananthaswamy's Why Machines Learn. Yudkowsky's and Nate's new book should be arriving in the mail today. I would like to start reading The Sequences but am a bit loath to do it here, as I feel I can only seriously concentrate when reading in print, so I will be trying to get a paperback copy of Rationality: From AI to Zombies (if that proves difficult, I'll resign myself to my e-reader). Beyond books, I've been listening to the Rationally Speaking podcast (I am more than halfway through the episodes by now) and watching some short videos online (Rational Animations, some podcasts with Yudkowsky, and the CFAR and Singularity Institute videos in which Anna Salamon appears; I have to stress how enlightening I've found the latter).
I think that is all for the moment! Hope to see and read you around.
I liked your post, and I’m probably the sort of person predisposed to like it (for context: history has been my favorite subject since I was 11; it was my first degree at university; I’ve read widely both in historical scholarship and in the meta-justifications for studying it). I’ve also been deeply steeped in the broader humanities, and that background has shaped how I see the world, my preferences, and the things I care about. Still, statements like “the humanities exist to improve our minds” and “history improves us” strike me as mostly normative aspirations rather than accurate descriptions of how interaction with those fields typically functions.
The core difficulty with the “history as context” argument is not merely that context is selected, but that the criteria of selection are instrumentally shaped. In actual educational practice, the organizing framework is almost always the nation-state; history becomes a kind of secular civic religion that presents the polity as an inevitable, self-evident, and temporally continuous subject. That isn’t arbitrary so much as ideological: the selections are optimized for legitimacy, cohesion, and identity formation, not for epistemic clarity. The same issue appears in “history as memory,” where decisions about what is worth remembering—and how to interpret what is remembered—follow political, cultural, and institutional imperatives. Textbook committees and curricula do not behave like disinterested archivists; they behave like legitimacy-producing institutions.
I also have to admit that I don’t have strong evidence that history has “improved” me in any of the virtues usually invoked. I’ve enjoyed reading it, and learning about the actions, thoughts, constraints, and delusions of past humans. But I don’t think it has given me superior judgment, foresight, epistemic humility, or civic virtue relative to STEM peers. People often claim that the humanities make one more open, curious, or empathetic toward other societies; that they illuminate institutional dynamics or the grammar of civilizations. Maybe that can happen, but I’m not confident it did in my case. It is extremely easy to adopt a single interpretive lens—in my case for many years it was Marxism—and read everything through that filter, which produces narrative satisfaction without necessarily producing accurate models of reality.
This isn’t an argument against history or the humanities. It’s an argument against a certain idealized story we tell about them. If history improves people, the mechanism seems non-trivial: it requires meta-reflective skills, comparative reasoning, and some capacity for model-building. Those are not reliably taught within the discipline, and are often actively undermined by curriculum structures designed for identity formation rather than truth-seeking.