This feels kind of backwards, in the sense that I think something like 2032-2037 is probably the period that most people I know who have reasonably short timelines consider most likely.
AI 2027 is a particularly aggressive timeline compared to the median, so if you choose 2028 as some kind of Schelling time to decide whether things are markedly slower than expected then I think you are deciding on a strategy that doesn't make sense by like 80% of the registered predictions that people have.
Even the AI Futures team themselves have timelines that put more probability mass on 2029 than 2027, IIRC.
Of course, I agree that in some worlds AI progress has substantially slowed down, and we have received evidence that things will take longer, but "are we alive and are things still OK in 2028?" is a terrible way to operationalize that. Most people do not expect anything particularly terrible to have happened by 2028!
My best guess, though I am far from confident, is that things will mostly get continuously more crunch-like from here, as things continue to accelerate. The key decision-point in my model at which things might become a bit different is if we hit the end of the compute overhang, and you can't scale up AI further simply by more financial investment, but instead now need to substantially ramp up global compute production, and make algorithmic progress, which might markedly slow down progress.
Situational Awareness and AI 2027 have been signal-boosted to normal people more than other predictions, though. Like, they both have their own website, AI 2027 has a bunch of fancy client-side animation and Scott Alexander collaborated on it, and someone made a YouTube video on AI 2027.
While AI safety people have criticized the timeline predictions to some extent, there hasn't been much in-depth criticism (aside from the recent very long post on AI 2027), and the general sentiment on their timelines seems positive (although Situational Awareness has been criticized for contributing to arms race dynamics).
I get that someone who looks at AI safety people's timelines in more detail would get a different impression. Though, notably, Metaculus lists Jan 2027 as a "community prediction" of "weakly general AI". Sure, someone could argue that weakly general AI doesn't imply human-level AGI soon after, but mostly when I see AI safety people point to this Metaculus market, it's as evidence that experts believe human-level AGI will arrive in the next few years; there is no emphasis on the delta between weakly general AI and human-level AGI.
So I see how an outsider would see more 2027-2029 timelines f...
Yep, agreed that coverage is currently biased towards very short timelines. I think this makes sense in that the worlds where things are happening very soon are the worlds that, from the perspective of a reasonable humanity, require action now.[1]
I think despite the reasonable justification for focusing on the shorter timelines worlds for decision-making reasons, I do expect this to overall cause a bunch of people to walk away with the impression that people confidently predicted short timelines, and this in turn will cause a bunch of social conflict and unfortunate social accounting to happen in most worlds.
On the margin, I would be excited to collaborate with people who want to do things similar to AI 2027 or Situational Awareness for longer timelines.
I.e., inasmuch as you model the government as making reasonable risk-tradeoffs in the future, the short-timeline worlds are the ones that require intervention now to cause changes in decision-making.
I am personally more pessimistic about humanity doing reasonable things, and think we might just want to grieve over short-timeline worlds, but I sure don't feel comfortable telling other people not to ring the alarm.
Even if it does make sense strategically to put more attention on shorter timelines, that sure does not seem to be what actually drives the memetic advantage of short forecasts over long forecasts. If you want your attention to be steered in strategically-reasonable ways, you should probably first fully discount for the apparent memetic biases, and then go back and decide how much is reasonable to re-upweight short forecasts. Whatever bias the memetic advantage yields is unlikely to be the right bias, or even the right order of magnitude of relative attention bias.
…people who were worried because of listening to this algorithm should chill out and re-evaluate.
And communication strategies based on appealing to such people's reliance on those algorithms should also re-evaluate.
E.g., why did folk write AI 2027? Did they honestly think the timeline was that short? Were they trying to convey a picture that would scare people with something on a short enough timeline that they could feel it?
If the latter, we might be doing humanity a disservice, both by exhausting people from something akin to adrenal fatigue, and also as a result of crying wolf.
I think Daniel also just has shorter timelines than most (which is correlated with wanting to more urgently communicate that knowledge).
I think something like 2032-2037 is probably the period that most people I know who have reasonably short timelines consider most likely.
I honestly didn't know that. Thank you for mentioning it. Almost everything I hear is people worrying about AGI in the next few years, not AGI a decade from now.
if you choose 2028 as some kind of Schelling time to decide whether things are markedly slower than expected then I think you are deciding on a strategy that doesn't make sense by like 80% of the registered predictions that people have.
Just to check: you're saying that by 2028 something like 80% of the registered predictions still won't have relevant evidence against them?
As both a pragmatic matter and a moral one, I really hope we can find ways of making more of those predictions more falsifiable sooner. If anything in the vague family of what I'm saying is right, but if folk won't halt, melt, & catch fire for at least a decade, then that's an awful lot of pointless suffering and wasted talent. I'm also skeptical that there'll actually be a real HMC in a decade; if the metacognitive blindspot type I'm pointing at is active, then waiting a decade is part of the strategy for not having ...
The key decision-point in my model at which things might become a bit different is if we hit the end of the compute overhang, and you can't scale up AI further simply by more financial investment, but instead now need to substantially ramp up global compute production, and make algorithmic progress, which might markedly slow down progress.
I think compute scaling will slow substantially by around 2030 (edit: if we haven't seen transformative AI). (There is some lag, so I expect the annual growth rate of capex to have already slowed by mid-2028 or so, but it will take a while before this shows up in scaling.)
Also, it's worth noting that most algorithmic progress AI companies are making is driven by scaling up compute (because scaling up labor in an effective way is so hard: talented labor is limited, humans parallelize poorly, and you can't pay more to make them run faster). So, I expect algorithmic progress will also slow around this point.
All these factors make me think that something like 2032 or maybe 2034 could be a reasonable Schelling time (I agree that 2028 is a bad Schelling time), but IDK if I see that much value in having a Schelling time (I think you probably agree with this).
In practice, we should be making large updates (in expectation) over the next 5 years regardless.
I think compute scaling will slow substantially by around 2030
There will be signs if it slows down earlier: it's possible that in 2027-2028 we will already be observing that there is no resolve to start building 5 GW Rubin Ultra training systems (let alone the less efficient but available-a-year-earlier 5 GW non-Ultra Rubin systems), so we can update then, without waiting for 2030.
This could result from some combination of underwhelming algorithmic progress, RLVR scaling not working out, and the 10x compute scaling from 100K H100 chips to 400K GB200 chips not particularly helping, so that AIs of 2027 fail to be substantially more capable than AIs of 2025.
But sure, this doesn't seem particularly likely. And there will be even earlier signs, before 2027-2028, that the scaling slowdown isn't happening, if the revenues of companies like OpenAI and Anthropic keep growing sufficiently (in 2025-2026), though most of these revenues might also be indirectly investment-fueled, threatening to evaporate if AI stops improving substantially.
my synthesis is I think people should chill out more today and sprint harder as the end gets near (ofc, some fraction of people should always be sprinting as if the end is near, as a hedge, but I think it should be less than now. also, if you believe the end really is <2 years away then disregard this). the burnout thing is real and it's a big reason for me deciding to be more chill. and there's definitely some weird fear driven action / negative vision thing going on. but also, sprinting now and chilling in 2028 seems like exactly the wrong policy
As a datapoint, none of this chilling out or sprinting hard discussion resonates with me. Internally I feel that I've been going about as hard as I know how to since around 2015, when I seriously got started on my own projects. I think I would be working about similarly hard if my timelines shortened by 5 years or lengthened by 15. I am doing what I want to do, I'm doing the best I can, and I'm mostly focusing on investing my life into building truth-seeking and world-saving infrastructure. I'm fixing all my psychological and social problems insofar as they're causing friction to my wants and intentions, and as a result I'm able to go much harder today than I was in 2015. I don't think effort is really a substantially varying factor in how good my output is or impact on the world. My mood/attitude is not especially dour and I'm not pouring blind hope into things I secretly know are dead ends. Sometimes I've been more depressed or had more burnout, but it's not been much to do with timelines and more about the local environment I've been working in or internal psychological mistakes. To be clear, I try to take as little vacation time at work as I psychologically can (like 2-4 weeks ...
Burnout is not a result of working a lot; it's a result of work not feeling like it pays out in ape-enjoyableness[citation needed]. So they very well could be having a grand ol' time working a lot, if their attitude towards the intended amount of success matches up comfortably with actual success and they find this to pay out in a felt currency which is directly satisfying. I get burned out when effort => results => natural rewards gets broken, e.g. because of being unable to succeed at something hard, or forgetting to use money to buy things my body would like to be paid with.
to be clear, I am not intending to claim that you wrote this post believing that it was wrong. I believe that you are trying your best to improve the epistemics and I commend the effort.
I had interpreted your third sentence as still defending the policy of the post even despite now agreeing with Oliver, but I understand now that this is not what you meant, and that you are no longer in favor of the policy advocated in the post. my apologies for the misunderstanding.
I don't think you should just declare that people's beliefs are unfalsifiable. certainly some people's views will be. but finding a crux is always difficult and imo should be done through high bandwidth talking to many people directly to understand their views first (in every group of people, especially one that encourages free thinking among its members, there will be a great diversity of views!). it is not effective to put people on blast publicly and then backtrack when people push back saying you misunderstood their position.
I realize this would be a lot of work to ask of you. unfortunately, coordination is hard. it's one of the hardest things in the world. I don't think you have any moral obligation to do this...
there exists a single, clever insight which would close at least half the remaining distance to AGI
By my recollection, this specific possibility (and neighboring ones, like "two key insights" or whatever) has been one of the major drivers of existential fear in this community for at least as long as I've been part of it. I think Eliezer expressed something similar. Something like "For all we know, we're just one clever idea away from AGI, and some guy in his basement will think of it and build it. That could happen at any time."
I don't know your reasons for thinking we're just one insight away, and you explicitly say you don't want to present the arguments here. Which makes sense to me!
I just want to note that from where I'm standing, this kind of thinking and communicating sure looks like a possible example of the type of communication pattern I'm talking about in the OP. I'm actually not picky about the trauma model specifically. But it totally fits the bill of "Models spread based significantly on how much doom they seem to plausibly forecast." Which makes some sense if there really is a severe doom you're trying to forecast! But it also puts a weird evolutionary incentive on th...
and neighboring ones, like “two key insights” or whatever
I... kinda feel like there's been one key insight since you were in the community? Specifically I'm thinking of transformers, or whatever it is that got us from pre-GPT era to GPT era.
Depending on what counts as "key" of course. My impression is there's been significant algorithmic improvements since then but not on the same scale. To be fair it sounds like Random Developer has a lower threshold than I took the phrase to mean.
But I do think someone guessing "two key insights away from AGI" in say 2010, and now guessing "one key insight away from AGI", might just have been right then and be right now?
(I'm aware that you're not saying they're not, but it seemed worth noting.)
I want to acknowledge that if this isn't at all what's going on in spaces like Less Wrong, it might be hard to demonstrate that fact conclusively. So if you're really quite sure that the AI problem is basically as you think it is, and that you're not meaningfully confused about it, then it makes a lot of sense to ignore this whole consideration as a hard-to-falsify distraction.
"Conclusively" is usually the bar for evidence usable to whack people over the head when they're really determined to not see reality. If one is in fact interested in seeing reality, there's already plenty of evidence relevant to the post's model.
One example: forecasting AI timelines is an activity which is both strategically relevant to actual AI risk, and emotionally relevant to people drawn to doom stories. But there's a large quantitative difference in how salient or central timelines are, strategically vs emotionally, relative to other subtopics. In particular: I claim that timelines are much more relatively salient/central emotionally than they are strategically. So, insofar as we see people focused on timelines out of proportion to their strategic relevance (relative to other subtopics), that lends support to the post's model.
Obvious enough but worth saying explicitly: it's not just the impact on epistemology, but also on collective behavior. From https://www.lesswrong.com/posts/sTDfraZab47KiRMmT/views-on-when-agi-comes-and-on-strategy-to-reduce#My_views_on_strategy :
Suppose there is a consensus belief, and suppose that it's totally correct. If funders, and more generally anyone who can make stuff happen (e.g. builders and thinkers), use this totally correct consensus belief to make local decisions about where to allocate resources, and they don't check the global margin, then they will in aggregate follow a portfolio of strategies that is incorrect. The make-stuff-happeners will each make happen the top few things on their list, and leave the rest undone. The top few things will be what the consensus says is most important: in our case, projects that help if AGI comes within 10 years. If a project helps in 30 years, but not 10 years, then it doesn't get any funding at all. This is not the right global portfolio; it oversaturates fast interventions and leaves slow interventions undone.
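To make the quoted argument concrete, here is a toy numeric sketch (my own made-up probabilities and square-root returns, not anything from the linked post): if every funder backs only the top consensus item, the whole budget goes to fast interventions, yet a mixed portfolio scores better under the very same consensus beliefs.

```python
import math

# Assumed consensus probabilities over two kinds of worlds (illustrative only).
p_short, p_long = 0.6, 0.4
total_budget = 100.0

def portfolio_value(fast_funding: float, slow_funding: float) -> float:
    # Fast interventions only pay off in short-timeline worlds, slow ones only
    # in long-timeline worlds; returns diminish like sqrt of funding.
    return p_short * math.sqrt(fast_funding) + p_long * math.sqrt(slow_funding)

# "Everyone funds the top item on the consensus list": all money goes fast.
local_rule = portfolio_value(total_budget, 0.0)

# Checking the global margin: equalize marginal value per dollar, which for
# sqrt returns means funding each bucket in proportion to probability squared.
w_fast = p_short**2 / (p_short**2 + p_long**2)
global_rule = portfolio_value(w_fast * total_budget, (1 - w_fast) * total_budget)

print(f"local-rule portfolio value:   {local_rule:.2f}")   # ~6.00
print(f"global-margin portfolio value: {global_rule:.2f}")  # ~7.21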
There are two specific shoes that are yet to drop, and only one of them is legibly timed; AI 2027 simply takes the premise that both fall in 2027. There's funding-fueled frontier AI training system scaling that will run out (absent AGI) in 2027-2029, which at the model level will be felt by 2028-2030. And there's test-time adaptation/learning/training that accrues the experience/onboarding/memory of a given AI instance in the form of weight updates rather than observable-by-the-model artifacts outside the model.
So the currently-much-faster funding-fueled scaling has an expiration date, and if it doesn't push capabilities past critical thresholds, then the absence of that event is a reason to expect longer timelines. But test-time learning first needs to actually happen before it can be observed to be non-transformative by itself, and as a basic research puzzle it might take an unknown number of years (again, absent AGI from scaling without test-time learning). If everyone is currently working on it and it fails to materialize by 2028, that's also some sort of indication that it might take nontrivial time.
This post reads to me as a significantly improved version of the earlier "Here's the exit", in that this one makes a similar point but seems to have taken the criticisms of that post to heart and seems a lot more epistemically reasonable. And I want to just explicitly name that and say that I appreciate it.
(Also I agree with a lot in the post, though some of the points raised in the responses about 2028 maybe being a poor choice for this also sound valid.)
There's another way in which pessimism can be used as a coping mechanism: it can be an excuse to avoid addressing personal-scale problems. A belief that one is doomed to fail, or that the world is inexorably getting worse, can be used as an excuse to give up, on the grounds that comparatively small-scale problems will be swamped by uncontrollable societal forces. Compared to confronting those personal-scale problems, giving up can seem very appealing, and a comparison to a large-scale but abstract problem can act as an excuse for surrender. You probably know someone who spends substantial amounts of their free time watching videos, reading articles, and listening to podcasts that blame all of the world's problems on "capitalism," "systemic racism," "civilizational decline," or something similar, all while their bills are overdue and dishes pile up in their sink.
This use of pessimism as a coping mechanism is especially pronounced in the case of apocalypticism. If the world is about to end, every other problem becomes much less relevant in comparison, including all those small-scale problems that are actionable but unpleasant to work on. Apocalypticism can become a blanket pretext for setting all of them aside.
There's some sound logic to what folk are saying. I think there's a real concern. But the desperate tone strikes me as… something else. Like folk are excited and transfixed by the horror.
For what it's worth, I think pre-2030 AGI Doom is pretty plausible[1], and this doesn't describe my internal experience at all. My internal experience is of wanting to chill out and naturally drifting in that direction, only to be periodically interrupted by OpenAI doing something attention-grabbing or another bogus "innovator LLMs!" paper being released, forcing me to go check whether I was wrong to be optimistic after all. That can be stressful, whenever it happens, but I'm not really feeling particularly doom-y or scared by default. (Indeed, I think fear is a bad type of motivation anyway. Hard agree with your "shared positive vision" section.)
I'm not sure whether that's supposed to be compatible with your model of preverbal trauma...?
I would also like to figure out some observable which we'd be able to use to distinguish between "doom incoming" and "no doom yet" as soon as possible. But unfortunately, most observables that look like they fit would also be bogus. Vladimir Nesov's analysis is the be...
How about we learn to smile while saving the world? Saving the world doesn't strike me as strictly incompatible with having fun, so let's do both? :)
The post proposes that LessWrong could tackle alignment in more skillful ways, which is a wholesome thought, but I feel that the post also casts doubt on the project of alignment itself; I want to push back on that.
It won't become less important to prevent the creation of harmful technologies in 2028, or in any year for that matter. Timelines and predictions don't feel super relevant here.
We know that AGI can be dangerous if created without proper understanding and that fact does not change with time or timelines, so LW should still aim for:
If the current way of advancing towards the goal is sub-optimal, giving up on the goal is not the only answer; we can also change the way we go about it. Since getting AGI right is important, not giving up and changing the way we go about it seems like the better option (all this predicated on the snobbish and doomish depictions in the post being accurate).
I agree that fixating on doom is often psychologically unhealthy and pragmatically not useful. I also agree with those commenters pointing out that 2028 probably will look mostly normal, even under many pretty short timelines scenarios.
So... why advocate waiting until 2028? It is entirely possible to chill out now, without sacrificing any actual work you're doing to make the future go better, while also articulating or developing a more positive vision of the future.
It took me about a decade to get through the combined despair of learning about x-risk plus not fully living up to some other things I once convinced myself I was morally obligated to do in order to consider myself a good person. I'm now a more useful, pleasant, happy, and productive person, more able to actually focus on solving problems and taking actions in the world.
One is, maybe we can find a way to make these dire predictions less unfalsifiable. Not in general, but specifically AI 2027. What differences should we expect to see if (a) the predictions were distorted due to the trauma mechanism I describe in this post vs. (b) the act of making the predictions caused them not to come about?
If it turns out that the predictions (which ones?) were distorted, I can't think of a reason to privilege the hypothesis that they were distorted by the trauma mechanism in particular.
Come 2028, I hope Less Wrong can seriously consider for instance retiring terms like "NPC" and "normie", and instead adopt a more humble and cooperative attitude toward the rest of the human race.
I am interested in links to the most prominent use of these terms (e.g. in highly-upvoted posts on LW, or by high-karma users). I have a hypothesis that it's only really used on the outskirts or on Twitter, and not amongst those who are more respected.
I certainly don't use "NPC" (I mean, I've probably used it at some point, but I think only a handful of times) because it aggressively removes agency from people, and I think I've only used "normie" for the purpose of un-self-serious humor – I think it's about as anti-helpful a term as "normal person", which (as I've written before) I believe should almost always be replaced by referring to a specific population.
A couple remarks:
I'm skeptical of the trauma model. I think personally I'm more grim and more traumatized than most people around here, and other people are too happy and not neurotic enough to understand how fucked alignment research is; and yet I have much longer timelines than most people around here. (Not saying anyone should become less happy or more neurotic lol.)
Re/ poor group epistemology, we'll agree that the overall phenomenon is multifactorial. Partially it's higher-order emergent deference. See "Dangers of deference".
I agree that people wan...
Those who believe the world will end in 2027 should commit to chill out in 2028.
Those who believe the world will end in 2030 or later should take a break in 2026, and then continue working.
Registering now that my modal expectation is that the situation will mostly look the same in 2028 as it does today. (To give one example from AI 2027, scaling neuralese is going to be hard, and while I can imagine a specific set of changes that would make it possible, it would require changing some fairly fundamental things about model architecture which I can easily imagine taking 3 years to reach production. And neuralese is not the only roadblock to AGI.)
I think one of your general points is something like "slow is smooth, smooth is fast" and also "cooperative is smooth, smooth is fast", both of which I agree with. But the whole "trauma" thing is too much like Bulverism for my taste.
You know, I'm kind of a sceptic on AI doom. I think on most likely paths we end up ok.
But … this feeling that this post talks about. This feeling that really has nothing to do with AI … yes, ok, I feel that sometimes.
I don’t know man. I hope that the things I have done during my time on earth will be of use to someone. That’s all, I’d like, really.
I was a little surprised by "fatality rate wasn't that high", so I looked it up, and Wikipedia ranks it as the fifth-worst epidemic/pandemic of all time by number of deaths, right behind the Black Death. As a percentage of world population, looks like it'd be like 0.2-0.4%? I'm honestly not sure where that leaves me personally re: "dodged a bullet", just thought I'd share.
I personally found this post less annoying than your previous post on similar themes.
That one felt, to me, like it was more aggressively projecting a narrative onto me and us. This post feels, to me, less like it's doing that, and so is more approachable.
I don't know man, seems a lot less plausible now that we have the CAIS letter and inference-time compute has started demonstrating a lot of concrete examples of AIs misbehaving[1].
So before the CAIS letter there were some good outside view arguments, and in the ChatGPT 3.5-4 era alignment-by-default seemed quite defensible[2], but now we're in an era where neither of these is the case, so your argument packs a lot less of a punch.
The era when it was just pre-training compute being scaled made it look like we had things a lot more under control than we did. I w
You can literally try to find out how bad people feel. Do x-riskers feel bad? Do people feel worse once they get convinced AI x-risk is a thing, or better once they get convinced it isn't?
One option is the PANAS test, a simple mood inventory that's a standard psychological instrument. If you don't trust self-report, you can use heart rate or heart rate variability, which most smartwatches will measure.
Now, to some degree, there's a confounding thing where people like you (very interested in exercise and psychology) might feel better than someone who doesn't have that focus, and all things equal maybe people who focus on x-risk are less likely to focus on exercise/meditation/circling/etc.
So the other thing is, if people who believe in x-risk take up a mental/physical health practice, you can ask if it makes them feel better, and also if it makes them more likely to stop believing in x-risk.
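For concreteness, here's a minimal sketch of the kind of scoring such a study would involve, assuming the standard 20-item PANAS rated 1-5 per item; the item lists are the published ones, but the code itself is only illustrative, and the example respondent is hypothetical.

```python
# Minimal PANAS scoring sketch: 20 items rated 1-5, split into positive-affect
# and negative-affect subscales, each summed to a 10-50 score.

POSITIVE_ITEMS = ["interested", "excited", "strong", "enthusiastic", "proud",
                  "alert", "inspired", "determined", "attentive", "active"]
NEGATIVE_ITEMS = ["distressed", "upset", "guilty", "scared", "hostile",
                  "irritable", "ashamed", "nervous", "jittery", "afraid"]

def panas_scores(responses: dict[str, int]) -> tuple[int, int]:
    """responses maps each item name to a 1-5 rating; returns (PA, NA) sums."""
    pa = sum(responses[item] for item in POSITIVE_ITEMS)
    na = sum(responses[item] for item in NEGATIVE_ITEMS)
    return pa, na

# Hypothetical respondent, e.g. surveyed before and after engaging with
# x-risk arguments; compare the (PA, NA) pairs across timepoints or groups.
example = {item: 3 for item in POSITIVE_ITEMS + NEGATIVE_ITEMS}
print(panas_scores(example))  # -> (30, 30)
```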
Hey, I think I'm the target audience of this post, but it really doesn't resonate well with me.
Here are some of my thoughts/emotions about it:
5. Wa...
I do think that it's valuable to have some kind of deadline which forces us to admit that our timelines were too pessimistic and consider why. However, as other commenters have pointed out I think 2028 is a bit early for that, and I'm not convinced that the traumatized infant model would be the primary reason.
I do have considerable sympathy for the view in this post that the feeling we’re about to all die is largely decoupled from whether we are, in fact, about to die. There are potentially false negatives as well as false positives here.
I agree that focusing on the negative, and on how to avoid stuff rather than how to do stuff, is counterproductive. But what is this positive vision of the world we should strive for? :-)
I was just explaining this post to my partner. Now, although I put AI extinction at low probability, I have a thyroid condition. Usually treatable: drugs like carbimazole, radioiodine, surgery, etc.; in my case, complications make things somewhat worse than is typical. So she just asked me to rate how likely I think it is that I don't, personally, make it to 2028 for medical reasons. I'm like, idk, I guess maybe a 50% chance I don't make it that far. I shall be pleasantly surprised if I make it. Kind of surprised I made it to July this year, to be honest.
I do not expect us to be all dead by 2028.
2028 outcomes I think likely:
A) LLMs hit some kind of wall (e.g. only so much text to train on), and we don’t get AGI.
B) We have, approximately, AGI, but we’re not dead yet. The world is really strange though.
Outcome (B) either works out OK, or we die some time rather later than 2028.
The gist is to frighten people into action. Usually into some combo of (a) donating money and (b) finding ways of helping to frighten more people the same way. But sometimes into (c) finding or becoming promising talent and funneling them into AI alignment research.
If spending $0.1 billion on AI risk is too much money and promising talent, then surely spending $800 billion on military defence is a godawful waste of promising talent. Would you agree that the US should cut its military budget by 99%, down to $8 billion?
Similar risk at 2028 doesn't prove a lot.
I'm close to getting a postverbal trauma from having to observe all the mental gymnastics around the question of whether building a superintelligence without having reliable methods to shape its behavior is actually dangerous. Yes, it is. No, that fact does not depend on whether Hinton, Bengio, Russell, Omohundro, Bostrom, Yudkowsky, et al. were held as a baby.
Let's discuss for now, and then check in about it in 31 months.
I really don't like these kinds of statements because it's like a null bet. Either the world has gone to hell and nobody cares about this article, or the author has "I was correct, told ya" rights. I think these kinds of statements should not be made in the context of existential risk.
But you need to frame this not like any other argument, but as "for the first time in the history of life on Earth, a species has created a new, superior species." I think all these rebuttals are missing this specific point. This time is different.
I'll explain my reasoning in a second, but I'll start with the conclusion:
I think it'd be healthy and good to pause and seriously reconsider the focus on doom if we get to 2028 and the situation feels basically like it does today.
I don't know how to really precisely define "basically like it does today". I'll try to offer some pointers in a bit. I'm hoping folk will chime in and suggest some details.
Also, I don't mean to challenge the doom focus right now. There seems to be some good momentum with AI 2027 and the Eliezer/Nate book. I even preordered the latter.
But I'm still guessing this whole approach is at least partly misled. And I'm guessing that fact will show up in 2028 as "Oh, huh, looks like timelines are maybe a little longer than we once thought. But it's still the case that AGI is actually just around the corner…."
A friend described this narrative phenomenon as something like the emotional version of a Shepard tone. Something that sure seems like it's constantly increasing in "pitch" but is actually doing something more like looping.
(The "it" here is how people talk about the threat of AI, just to be clear. I'm not denying that AI has made meaningful advances in the last few years, or that AI discussion became more mainstream post LLM explosion.)
I'll spell out some of my reasoning below. But the main point of my post here is to be something folk can link to if we get to 2028 and the situation keeps seeming dire in basically the same increasing way as always. I'm trying to place something loosely like a collective stop loss order.
Maybe my doing this will be irrelevant. Maybe current efforts will sort out AI stuff, or maybe we'll all be dead, or maybe we'll be in the middle of a blatant collapse of global supply chains. Or something else that makes my suggestion moot or opaque.
But in case it's useful, here's a "pause and reconsider" point. Available for discussion right now, but mainly as something that can be remembered and brought up again in 31 months.
Okay, on to some rationale.
Sometimes my parents talk about how every generation had its looming terror about the end of the world. They tell me that when they were young, they were warned about how the air would become literally unbreathable by the 1970s. There were also dire warnings about a coming population collapse that would destroy civilization before the 21st century.
So their attitude upon hearing folks' fear about overpopulation, the handwringing around Y2K, Al Gore beating the drum about climate change, and the terror about the Mayan calendar ending in 2012, was:
Oh. Yep. This again.
Dad would argue that this phenomenon was people projecting their fear of mortality onto the world. He'd say that on some level, most people know they're going to die someday. But they're not equipped to really look at that fact. So they avoid looking, and suppress it. And then that unseen yet active fear ends up coloring their background sense of what the world is like. So they notice some plausible concerns but turn those concerns into existential crises. It's actually more important to them at that point that the problems are existential than that they're solved.
I don't know that he's right. In particular, I've become a little more skeptical that it's all about mortality.
But I still think he's on to something.
It's hard for me not to see a similar possibility when I'm looking around AI doomerism. There's some sound logic to what folk are saying. I think there's a real concern. But the desperate tone strikes me as… something else. Like folk are excited and transfixed by the horror.
I keep thinking about how in the 2010s it was extremely normal for conversations at rationalist parties to drift into existentially horrid scenarios. Things like infinite torture, and Roko's Basilisk, and Boltzmann brains. Most of which are actually at best irrelevant to discuss. (If I'm in a Boltzmann brain, what does it matter?)
Suppose that what's going on is, lots of very smart people have preverbal trauma. Something like "Mommy wouldn't hold me", only from a time before there were mental structures like "Mommy" or people or even things like object permanence or space or temporal sequence. Such a person might learn to embed that pain such that it colors what reality even looks like at a fundamental level. It's a psycho-emotional design that works something like this:
If you imagine that there's something like a traumatized infant inside such people, then its primary drive is to be held, which it pursues by crying. And yet, its only way of "crying" is to paint the subjective experience of the world in the horror it experiences, and to use the built-up mental edifice it has access to in order to try to convey to others what its horror is like.
If you have a bunch of such people getting together, reflecting back to one another stuff like
OMG yes, that's so horrid and terrifying!!!
…then it feels a bit like being heard and responded to, to that inner infant. But it's still not being held, and comforted. So it has to cry louder. That's all it's got.
But what that whole process looks like is, people reflecting back and forth how deeply fucked we are. Getting consensus on doom. Making the doom seem worse via framing effects and focusing attention on the horror of it all. Getting into a building sense of how dire and hopeless it all is, and how it's just getting worse.
But it's from a drive to have an internal agony seen and responded to. It just can't be seen on the inside as that, because the seeing apparatus is built on top of a worldview made of attempts to get that pain met. There's no obvious place outside the pain from which to observe it.
I'm not picky about the details here. I'm also not sure this is whatsoever what's going on around these parts. But it strikes me as an example type in a family of things that's awfully plausible.
It's made even worse by the fact that it's possible to name real, true, correct problems via this kind of projection mechanism. Which means we can end up in an emotional analogue of a motte-and-bailey fallacy: attempts to name the emotional problem get pushed away because naming it makes the real problem seem less dire, which on a pre-conceptual level feels like the opposite of what could possibly help. And the arguments for dismissing the emotional frame get based on the true fact that the real problem is in fact real. So clearly it's not just a matter of healing personal trauma!
(…and therefore it's mostly not about healing personal trauma, so goes the often unstated implication (best as I can tell).)
But the invitation is to address the doom feeling differently, not to ignore the real problem (or at least not indefinitely). It's also to consider the possibility that the person in question might not be perceiving the real problem objectively because their inner little one might be using it as a microphone and optimizing what's "said" for effect, not for truth.
I want to acknowledge that if this isn't at all what's going on in spaces like Less Wrong, it might be hard to demonstrate that fact conclusively. So if you're really quite sure that the AI problem is basically as you think it is, and that you're not meaningfully confused about it, then it makes a lot of sense to ignore this whole consideration as a hard-to-falsify distraction.
But I think that if we get to 2028 and we see more evidence of increasing direness than of actual manifest doom, it'll be high time to consider that internal emotional work might be way, way, way more central to creating something good than is AI strategizing. Not because AI doom isn't plausible, but because it's probably not as dire as it always seems, and there's a much more urgent problem demanding attention first before vision can become clear.
In particular, it strikes me that the AI risk community orbiting Less Wrong has had basically the same strategy running for about two decades. A bunch of the tactics have changed, but the general effort occurs to me as the same.
The gist is to frighten people into action. Usually into some combo of (a) donating money and (b) finding ways of helping to frighten more people the same way. But sometimes into (c) finding or becoming promising talent and funneling them into AI alignment research.
That sure makes sense if you're trapped in a house that's on fire. You want the people trapped with you to be alarmed and to take action to solve the problem.
But I think there's strong reason to think this is a bad strategy if you're trapped in a house that's slowly sinking into quicksand over the course of decades. Not because you'll all be any less dead for how long it takes, but because activating the fight-or-flight system for that long is just untenable. If everyone gets frightened but you don't have a plausible pathway to solving the problem in short order, you'll end up with the same deadly scenario but now everyone will be exhausted and scared too.
I also think it's a formula for burnout if it's dire to do something about a problem but your actions seem to have at best no effect on said problem.
I've seen a lot of what I'd consider unwholesomeness over the years that I think is a result of this ongoing "scare people into action about AI risk" strategy. A ton of "the ends justify the means" thinking, and labeling people "NPCs", and blatant Machiavellian tactics. Inclusion and respect with words but an attitude of "You're probably not relevant enough for us to take seriously" expressed with actions and behind closed doors. Amplifying the doom message. Deceit about what projects are actually for.
I think it's very easy to lose track of wholesome morality when you're terrified. And it can be hard to remember in your heart why morality matters when you're hopeless and burned out.
(Speaking from experience! My past isn't pristine here either.)
Each time it's seemed like "This could be the key thing! LFG!!!" So far the results of those efforts seem pretty ambiguous. Maybe a bunch of them actually accelerated AI timelines. It's hard to say.
Maybe this time is different. With AI in the Overton window and with AI 2027 going viral, maybe Nate & Eliezer's book can shove the public conversation in a good direction. So maybe this "scare people into action" strategy will finally pay off.
But if it's still not working when we hit 2028, I think it'll be a really good time to pause and reconsider. Maybe this direction is both ineffective and unkind. Not as a matter of blame and shame; I think it has made sense to really try. But 31 months from now, it might be really good to steer this ship in a different direction, as a pragmatic issue of sincerely caring for what's important to us all going forward.
I entered the rationality community in 2011. At that time there was a lot of excitement and hope. The New York rationalist scene was bopping, meetups were popping up all over the world, and lots of folk were excited about becoming . MIRI (then the Singularity Institute for Artificial Intelligence) was so focused on things like the Visiting Summer Fellows Program and what would later be called the Rationality Mega Camp that they weren't getting much research done.
That was key to what created CFAR. There was a need to split off "offer rationality training" from "research AI alignment" so that the latter could happen at all.
(I mean, I'm sure some was happening. But it was a pretty big concern at the time. Some big donors were getting annoyed that their donations weren't visibly going to the math project Eliezer was so strongly advocating for.)
At the time there was shared vision. A sense that more was possible. Maybe we could create a movement of super-competent super-sane people who could raise the sanity waterline in lots of different domains, and maybe for the human race as a whole, and drown out madness everywhere that matters. Maybe powerful and relevant science could become fast. Maybe the dreams of a spacefaring human race that mid 20th century sci-fi writers spoke of could become real, and even more awesome than anyone had envisioned before. Maybe we can actually lead the charge in blessing the universe with love and meaning.
It was vague as visions go. But it was still a positive vision. It drove people to show up to CFAR's first workshops for instance. Partly out of fear of AI, sure, but at least partly out of excitement and hope.
I don't see or hear that kind of focus here anymore. I haven't for a long time.
I don't just mean there's cynicism about whether we can go forth and create the Art. I watched that particular vision decay as CFAR muddled along making great workshops but turning no one into Ender Wiggin. It turns out we knew how to gather impressive people but not how to create them.
But that's just one particular approach for creating a good and hopeful future.
What I mean is, nothing replaced that vision.
I'm sure some folk have shared their hopes. I have some. I've heard a handful of others. I think Rae's feedbackloop-first rationality is a maybe promising take on the original rationality project.
But there isn't anything like a collective vision for something good. Not that I'm aware of.
What I hear instead is:
It reminds me of this:
Very young children (infants & toddlers) will sometimes get fixated on something dangerous to them. Like they'll get a hold of a toxic marker and want to stick it in their mouth. If you just stop them, they'll get frustrated and upset. Their whole being is oriented to that marker and you're not letting them explore the way they want to.
But you sure do want to stop them, right? So what do?
Well, you give them something else. You take the marker away and offer them, say, a colorful whisk.
It's no different with dogs or cats, really. It's a pretty general thing. Attentional systems orient toward things. "Don't look here" is much harder than "Look here instead."
So if you notice a danger, it's important to acknowledge and address it, but you also want to reorient toward the outcome you want.
I've been seriously concerned for the mental & emotional health of this community for a good while now. Its orientation, as far as I can tell, is to "not AI doom". Not a bright future. Not shared wholesomeness. Not healthy community. But "AI notkilleveryoneism".
I don't think you want to organize your creativity that way. Steering toward doom as an accidental result of focusing on it would be… really quite ironic and bad.
(And yes, I do believe we see evidence of exactly this pattern. Lots of people have noticed that quite a lot of AI risk mitigation efforts over the last two decades seem to have either (a) done nothing to timelines or (b) accelerated timelines. E.g. I think CFAR's main contribution to the space is arguably in its key role in inspiring Elon Musk to create OpenAI.)
My guess is most folk here would be happier if they picked a path they do want and aimed for that instead, now that they've spotted the danger they want to avoid. I bet we stand a much better chance of building a good future if we aim for one, as opposed to focusing entirely on not hitting the doom tree.
If we get to 2028 and there isn't yet such a shared vision, I think it'd be quite good to start talking about it. What future do we want to see? What might AI going well actually look like, for instance? Or what if AI stalls out for a long time, but we still end up with a wholesome future? What's that like? What might steps in that direction look like?
I think we need stuff like this to be whole, together.
In particular, I think faith in humanity as a whole needs to be thinkable.
Yes, most people are dumber than the average LessWronger. Yes, stupidity has consequences that smart people can often foresee. Yes, maybe humanity is too dumb not to shoot itself in the foot with a bazooka.
But maybe we've got this.
Maybe we're all in this together, and on some level that matters, we all know it.
I'm not saying that definitely is the case. I'm saying it could be. And that possibility seems worth taking to heart.
I'm reminded of a time when I was talking with a "normie" facilitator at a Circling retreat. I think this was 2015. I was trying to explain how humanity seemed to be ignoring its real problems, and how I was at that retreat trying to become more effective at doing something about it all.
I don't remember his exact words, but the sentiment I remember was something like:
I don't understand everything you're saying. But you seem upset, man. Can I give you a hug?
I didn't think that mattered, but I like hugs, so I said yes.
And I started crying.
I think he was picking up on a level I just wasn't tracking. Sure, my ideas were sensible and well thought out. But underneath all that I was just upset. He noticed that undercurrent and spoke to and met that part, directly.
He didn't have any insights about how we might solve existential risk. I don't know if he even cared about understanding the problem. I didn't walk away being more efficient at creating good AI alignment researchers.
But I felt better, and met, and cared for, and connected.
I think that matters a lot.
I suspect there's a lot going on like this. That at least some of the historical mainstream shrugging around AI has been because there's some other level that also deeply matters that's of more central focus to "normies" than to rationalists.
I think it needs to be thinkable that the situation is not "AI risk community vs. army of ignorant normie NPCs". Instead it might be more like, there's one form of immense brilliance in spaces like Less Wrong. And what we're all doing, throughout the human race, is figuring out how to interface different forms of brilliance such that we can effectively care for what's in our shared interests. We're all doing it. It just looks really different across communities, because we're all attending to different things and therefore reach out to each other in very different ways. And that's actually a really good thing.
My guess is that it helps a lot when communities meet each other with an attitude of
We're same-sided here. We're in this together. That doesn't mean we yet know how to get along in each other's terms. But if it's important, we'll figure it out, even if "figure it out" doesn't look like what either of us expect at the start. We'll have to learn new ways of relating. But we can get there.
Come 2028, I hope Less Wrong can seriously consider for instance retiring terms like "NPC" and "normie", and instead adopt a more humble and cooperative attitude toward the rest of the human race. Maybe our fellow human beings care too. Maybe they're even paying vivid attention. It just might look different than what we're used to recognizing in ourselves and in those most like us.
And maybe also consider that even if we don't yet see how, and even if the transition is pretty rough at times, it all might turn out just fine. We don't know that it will. I don't mean to assert that it will. I mean, let's sincerely hold and attend to the possibility that it could. Maybe it'll all be okay.
I want to reiterate that I don't mean what's going on right now is wrong and needs to stop. Like I said, I preordered If Anyone Builds It, Everyone Dies. I don't personally feel the need to become more familiar with those arguments or to have new ones. And I'm skeptical about the overall approach. But it seems like a really good push within this strategy, and if it makes things turn out well, then I'd be super happy to be wrong here. I support the effort.
But we now have this plausible timeline spelled out. And by January 2028 we'll have a reasonably good sense of how much it got right, and wrong.
…with some complication. It's one of those predictions that interacts with what it's predicting. So if AI 2027 doesn't pan out, one could argue it's because it might have but it changed because the prediction went viral. And therefore we should keep pushing the same strategy as before, because maybe now it's finally working!
But I'm hoping for a few things here.
One is, maybe we can find a way to make these dire predictions less unfalsifiable. Not in general, but specifically AI 2027. What differences should we expect to see if (a) the predictions were distorted due to the trauma mechanism I describe in this post vs. (b) the act of making the predictions caused them not to come about? What other plausible outcomes are there come 2028, and what do we expect sensible updates to look like at that point?
Another hope I have is that the trauma projection thing can be considered seriously. Not necessarily acted on just yet. That could be distracting. But it's worth recognizing that if the trauma thing is really a dominant force in AI doomerism spaces, then when we get to January 2028 we might not have hit AI doom but it's going to seem like there are still lots of reasons to keep doing basically the same thing as before. How can we anticipate this reaction, distinguish it from other outcomes, and appropriately declare an HMC event if and when it happens?
So, this post is my attempt at kind of a collective emotional stop loss order.
I kind of hope it turns out to be moot. Because in the world where it's needed, that's yet another 2.5 years of terror and pain that we might have skipped if we could have been convinced a bit sooner.
But being convinced isn't an idle point. It matters that maybe nothing like what I'm naming in this post is going on. There needs to be a high-integrity way of checking what's true here first.
I'm hoping I've put forward a good compromise.
Let's discuss for now, and then check in about it in 31 months.