Consider chilling out in 2028

by Valentine
21st Jun 2025
16 min read

140 comments, sorted by top scoring
Some comments are truncated due to high volume.
[-] habryka · 16d* · 131 · 78

This feels kind of backwards, in the sense that I think something like 2032-2037 is probably the period that most people I know who have reasonably short timelines consider most likely. 

AI 2027 is a particularly aggressive timeline compared to the median, so if you choose 2028 as some kind of Schelling time to decide whether things are markedly slower than expected then I think you are deciding on a strategy that doesn't make sense by like 80% of the registered predictions that people have.

Even the AI Futures team themselves have timelines that put more probability mass on 2029 than 2027, IIRC. 

Of course, I agree that in some worlds AI progress has substantially slowed down, and we have received evidence that things will take longer, but "are we alive and are things still OK in 2028?" is a terrible way to operationalize that. Most people do not expect anything particularly terrible to have happened by 2028!

My best guess, though I am far from confident, is that things will mostly get continuously more crunch-like from here, as things continue to accelerate. The key decision-point in my model at which things might become a bit different is if we hit the end of the compute o... (read more)

[-] jessicata · 16d · 56 · 30

Situational Awareness and AI 2027 have been signal boosted to normal people more than other predictions, though. Like, they both have their own website, AI 2027 has a bunch of fancy client-side animation and Scott Alexander collaborated, and someone made a Youtube video on AI 2027.

While AI safety people have criticized the timeline predictions to some extent, there hasn't been much in-depth criticism (aside from the recent very long post on AI 2027), and the general sentiment on their timelines seems positive (although Situational Awareness has been criticized for contributing to arms race dynamics).

I get that someone who looks at AI safety people's timelines in more detail would get a different impression. Though, notably, Metaculus lists Jan 2027 as a "community prediction" of "weakly general AI". Sure, someone could argue that weakly general AI doesn't imply human-level AGI soon after, but mostly when I see AI safety people point to this Metaculus market, it's as evidence that experts believe human-level AGI will arrive in the next few years; there is no emphasis on the delta between weakly general AI and human-level AGI.

So I see how an outsider would see more 2027-2029 timelines f... (read more)

[-] habryka · 16d · 28 · 15

Yep, agree that there is currently biased coverage towards very short timelines. I think this makes sense in that the worlds where things are happening very soon are the worlds that, from the perspective of a reasonable humanity, require action now.[1]

I think despite the reasonable justification for focusing on the shorter timelines worlds for decision-making reasons, I do expect this to overall cause a bunch of people to walk away with the impression that people confidently predicted short timelines, and this in turn will cause a bunch of social conflict and unfortunate social accounting to happen in most worlds. 

I on the margin would be excited to collaborate with people who would want to do similar things to AI 2027 or Situational Awareness for longer timelines.

  1. ^

    I.e. in as much as you model the government as making reasonable risk-tradeoffs in the future, the short timeline worlds are the ones that require intervention to cause changes in decision-making now.

    I am personally more pessimistic about humanity doing reasonable things, and think we might just want to grieve over short timeline worlds, but I sure don't feel comfortable telling other people to not ring the ala

... (read more)
[-] johnswentworth · 15d · 24 · 13

Even if it does make sense strategically to put more attention on shorter timelines, that sure does not seem to be what actually drives the memetic advantage of short forecasts over long forecasts. If you want your attention to be steered in strategically-reasonable ways, you should probably first fully discount for the apparent memetic biases, and then go back and decide how much is reasonable to re-upweight short forecasts. Whatever bias the memetic advantage yields is unlikely to be the right bias, or even the right order of magnitude of relative attention bias.

4habryka15d
I mean, I am not even sure it's strategic given my other beliefs, and I was indeed saying that on the margin more longer-timeline coverage is worth it, so I think we agree.
2Sam Iacono15d
What’s the longest timeline that you could still consider a short timeline by your own metric, and therefore a world “we might just want to grieve over”? I ask because, in your original comment you mentioned 2037 as a reasonably short timeline, and personally if we had an extra decade I’d be a lot less worried.
2habryka15d
About 15 years, I think?  Edit: Oops, I responded to the first part of your question, not the second. My guess is timelines with less than 5 years seem really very hard, though we should still try. I think there is lots of hope in the 5-15 year timeline worlds. 15 years is just roughly the threshold of when I would stop considering someone's timelines "short", as a category.
1Sam Iacono14d
I admit, it’s pretty disheartening to hear that, even if we had until 2040 (which seems less and less likely to me anyway), you’d still think there’s not much we could do but grieve in advance.
1[comment deleted]14d
[-] Valentine · 15d · 15 · 5

…people who were worried because of listening to this algorithm should chill out and re-evaluate.

And communication strategies based on appealing to such people's reliance on those algorithms should also re-evaluate.

E.g., why did folk write AI 2027? Did they honestly think the timeline was that short? Were they trying to convey a picture that would scare people with something on a short enough timeline that they could feel it?

If the latter, we might be doing humanity a disservice, both by exhausting people from something akin to adrenal fatigue, and also as a result of crying wolf.

9Daniel Kokotajlo13d
Yes, I honestly thought the timeline was that short. I now think it's 50% by end of 2028; over the last year my timelines have lengthened by about a year.
2Raemon13d
Well extrapolating that it sounds like things are fine. :P
7Daniel Kokotajlo13d
It has indeed been really nice, psychologically, to have timelines that are lengthening again. 2020 to 2024 that was not the case.
1garrets13d
You wrote AI 2027 in April... what changed in such a short amount of time? If your timelines lengthened over the last year, do you think writing AI 2027 was an honest reflection of your opinions at the time?
6Daniel Kokotajlo11d
The draft of AI 2027 was done in December, then we had months of editing and rewriting in response to feedback. For more on what changed, see various comments I made online such as this one: https://www.lesswrong.com/posts/cxuzALcmucCndYv4a/daniel-kokotajlo-s-shortform?commentId=dq6bpAHeu5Cbbiuyd  We said right on the front page of AI 2027 in a footnote that our actual AGI timelines medians were somewhat longer than 2027:  I also mentioned my slightly longer timelines in various interviews about it, including the first one with Kevin Roose.
2garrets7d
OpenAI researcher Jason Wei recently stated that there will be many bottlenecks to recursive self improvement (experiments, data), thoughts? https://x.com/_jasonwei/status/1939762496757539297z
2Daniel Kokotajlo6d
He makes some obvious points everyone already knows about bottlenecks etc. but then doesn't explain why all that adds up to a decade or more, instead of a year, or a month, or a century. In our takeoff speeds forecast we try to give a quantitative estimate that takes into account all the bottlenecks etc.
6Vaniver15d
Isn't it more like "I think there's a 10% chance of transformative AI by 2027, and that is like 100x higher than what it looks like most people think, so people really need to think thru that timeline"? Like, I generally put my median year at 2030-2032; if we make it to 2028, the situation will still feel like "oh jeez we probably only have a few years left", unless we made it to 2028 thru a mechanism that clearly blocks transformative AI showing up in 2032. (Like, a lot is hinging on what "feels basically like today" means.)
[-] habryka · 15d · 11 · 5

I think Daniel also just has shorter timelines than most (which is correlated with wanting to more urgently communicate that knowledge).

4Valentine14d
That might be. It sounds really plausible. I don't know why they wrote it! But all the same: I don't think most people know what 10% likelihood of a severe outcome is like or how to think about it sensibly. My read is that the vast majority of people need to treat 10% likelihood of doom as either "It's not going to happen" (because 10% is small) or "It's guaranteed to happen" (because it's a serious outcome if it does happen, and it's plausible). So, amplifying the public awareness of this possibility seems more to me like moving awareness of the scenario from "Nothing existential is going to happen" to "This specific thing is the default thing to expect." So I expect that unless something is done to… I don't know, magically educate the population on statistical thinking, or propagate a public message that it's roughly right but its timeline is wrong? then the net effect will be that either (a) AI 2027 will have been collectively forgotten by 2028 in roughly the same way that, say, Trudeau's use of the Emergencies Act has been forgotten; or (b) the predictions failing to pan out will be used as reason to dismiss other AI doom predictions that are apparently considered more likely. The main benefit I see is if some key folk are made to think about AI doom scenarios in general as a result of AI 2027, and start to work out how to deal with other scenarios. But I don't know. That's been part of this community's strategy for over two decades. Get key people thinking about AI risk. And I'm not too keen on the results I've seen from that strategy so far.
3casens15d
it does imply that, but i'm somewhat loathe to mention this at all, because i think the predictive quality you get from one question to another varies astronomically, and this is not something the casual reader will be able to glean
[-] Valentine · 15d · 19 · 1

I think something like 2032-2037 is probably the period that most people I know who have reasonably short timelines consider most likely.

I honestly didn't know that. Thank you for mentioning it. Almost everything I hear is people worrying about AGI in the next few years, not AGI a decade from now.

if you choose 2028 as some kind of Schelling time to decide whether things are markedly slower than expected then I think you are deciding on a strategy that doesn't make sense by like 80% of the registered predictions that people have.

Just to check: you're saying that by 2028 something like 80% of the registered predictions still won't have relevant evidence against them?

As both a pragmatic matter and a moral one, I really hope we can find ways of making more of those predictions more falsifiable sooner. If anything in the vague family of what I'm saying is right, but if folk won't halt, melt, & catch fire for at least a decade, then that's an awful lot of pointless suffering and wasted talent. I'm also skeptical that there'll actually be a real HMC in a decade; if the metacognitive blindspot type I'm pointing at is active, then waiting a decade is part of the strategy for not having ... (read more)

3ErickBall12d
As you implied above, pessimism is driven only secondarily by timelines. If things in 2028 don't look much different than they do now, that's evidence for longer timelines (maybe a little longer, maybe a lot). But it's inherently not much evidence about how dangerous superintelligence will be when it does arrive. If the situation is basically the same, then our state of knowledge is basically the same. So what would be good evidence that worrying about alignment was unnecessary? The obvious one is if we get superintelligence and nothing very bad happens, despite the alignment problem remaining unsolved. But that's like pulling the trigger to see if the gun is loaded. Prior to superintelligence, personally I'd be more optimistic if we saw AI progress requiring even more increasing compute than the current trend--if the first superintelligences were very reliant on massive pools of tightly integrated compute, and had very limited inference capacity, that would make us less vulnerable and give us more time to adapt to them. Also, if we saw a slowdown in algorithmic progress despite widespread deployment of increasingly capable coding software, that would be a very encouraging sign that recursive self-improvement might happen slowly.
1Random Developer14d
Very little. I've been seriously thinking about ASI since the early 00s. Around 2004-2007, I put my timeline around 2035-2045, depending on the rate of GPU advancements. Given how hardware and LLM progress actually played out, my timeline is currently around 2035.

I do expect LLMs (as we know them now) to stall before 2028, if they haven't already. Something is missing. I have very concrete guesses as to what is missing, and it's an area of active research. But I also expect the missing piece adds less than a single power of 10 to existing training and inference costs. So once someone publishes it in any kind of convincing way, then I'd estimate better than an 80% chance of uncontrolled ASI within 10 years.

Now, there are lots of things I could see in 2035 that would cause me to update away from this scenario. I did, in fact, update away from my 2004-2007 predictions by 2018 or so, largely because nothing like ChatGPT 3.5 existed by that point. GPT 3 made me nervous again, and 3.5 Instruct caused me to update all the way back to my original timeline. And if we're still stalled in 2035, then sure, I'll update heavily away from ASI again.

But I'm already predicting the LLM S-curve to flatten out around now, resulting in less investment in Chinchilla scaling and more investment in algorithmic improvement. But since algorithmic improvement is (1) hard to predict, and (2) where I think the actual danger lies, I don't intend to make any near-term updates away from ASI.
[-] ryan_greenblatt · 16d · 19 · 12

The key decision-point in my model at which things might become a bit different is if we hit the end of the compute overhang, and you can't scale up AI further simply by more financial investment, but instead now need to substantially ramp up global compute production, and make algorithmic progress, which might markedly slow down progress.

I think compute scaling will slow substantially by around 2030 (edit: if we haven't seen transformative AI). (There is some lag, so I expect the rate at which capex is annually increasing to already have slowed by mid 2028 or so, but this will take a bit before it hits scaling.)

Also, it's worth noting that most algorithmic progress AI companies are making is driven by scaling up compute (because scaling up labor in an effective way is so hard: talented labor is limited, humans parallelize poorly, and you can't pay more to make them run faster). So, I expect algorithmic progress will also slow around this point.

All these factors make me think that something like 2032 or maybe 2034 could be a reasonable Schelling time (I agree that 2028 is a bad Schelling time), but IDK if I see that much value in having a Schelling time (I think you probably agree with this).

In practice, we should be making large updates (in expectation) over the next 5 years regardless.

[-] Vladimir_Nesov · 15d · 11 · 1

I think compute scaling will slow substantially by around 2030

There will be signs if it slows down earlier, it's possible that in 2027-2028 we are already observing that there is no resolve to start building 5 GW Rubin Ultra training systems (let alone the less efficient but available-a-year-earlier 5 GW non-Ultra Rubin systems), so that we can update then already, without waiting for 2030.

This could result from some combination of underwhelming algorithmic progress, RLVR scaling not working out, and the 10x compute scaling from 100K H100 chips to 400K GB200 chips not particularly helping, so that AIs of 2027 fail to be substantially more capable than AIs of 2025.

But sure, this doesn't seem particularly likely. And there will be even earlier signs that the scaling slowdown isn't happening before 2027-2028 if the revenues of companies like OpenAI and Anthropic keep sufficiently growing (in 2025-2026), though most of these revenues might also be indirectly investment-fueled, threatening to evaporate if AI stops improving substantially.
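As a side note on the "10x compute scaling" figure mentioned above, here is an illustrative decomposition of the stated numbers (the per-chip ratio is simply what the 10x claim implies given the chip counts, not an independent benchmark figure):

```python
# Decomposing the "10x compute scaling from 100K H100 chips to 400K GB200 chips" claim (illustrative).
h100_chips, gb200_chips = 100_000, 400_000
claimed_scaleup = 10
chip_count_ratio = gb200_chips / h100_chips                   # 4x more chips
implied_per_chip_ratio = claimed_scaleup / chip_count_ratio   # ~2.5x per chip is what the 10x figure implies
print(chip_count_ratio, implied_per_chip_ratio)               # 4.0 2.5
```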

[-] leogao · 12d · 11 · 2

my synthesis is I think people should chill out more today and sprint harder as the end gets near (ofc, some fraction of people should always be sprinting as if the end is near, as a hedge, but I think it should be less than now. also, if you believe the end really is <2 years away then disregard this). the burnout thing is real and it's a big reason for me deciding to be more chill. and there's definitely some weird fear driven action / negative vision thing going on. but also, sprinting now and chilling in 2028 seems like exactly the wrong policy 

2Valentine12d
I agree. To be honest I didn't think chilling out now was a real option. I hoped to encourage it in a few years with the aid of preregistration.
[-] Ben Pace · 12d · 29 · 20

As a datapoint, none of this chilling out or sprinting hard discussion resonates with me. Internally I feel that I've been going about as hard as I know how to since around 2015, when I seriously got started on my own projects. I think I would be working about similarly hard if my timelines shortened by 5 years or lengthened by 15. I am doing what I want to do, I'm doing the best I can, and I'm mostly focusing on investing my life into building truth-seeking and world-saving infrastructure. I'm fixing all my psychological and social problems insofar as they're causing friction to my wants and intentions, and as a result I'm able to go much harder today than I was in 2015. I don't think effort is really a substantially varying factor in how good my output is or impact on the world. My mood/attitude is not especially dour and I'm not pouring blind hope into things I secretly know are dead ends. Sometimes I've been more depressed or had more burnout, but it's not been much to do with timelines and more about the local environment I've been working in or internal psychological mistakes. To be clear, I try to take as little vacation time at work as I psychologically can (like 2-4 weeks ... (read more)

4davekasten11d
I would strongly, strongly argue that essentially "take all your vacation" is a strategy that would lead to more impact for you on your goals, almost regardless of what they are. Humans need rest, and humans like the folks on LW tend not to take enough.
6ryan_greenblatt11d
Naively, working more will lead to more output, and if someone thinks they feel good while working a lot, I think the default guess should be that working more is improving their output. I would be interested in the evidence you have for the claim that people operating similarly to what Ben described should take more vacation.

I think there is some minimum amount of breaks and vacation that people should strongly default to taking, and it also seems good to take some non-trivial amount of time to at least reflect on their situation and goals in different environments (you can think of this as a break, or as a retreat). But, 2-4 weeks per year of vacation combined with working more like 70 hours a week seems like a non-crazy default if it feels good. This is only working around 2/3 of waking hours (supposing 9 hours for sleep and getting ready for sleep) and working ~95% of weeks. (And Ben said he works 50-70 hours, not always 70.)

It's worth noting that "humans perform better with more rest" isn't a sufficient argument for thinking more rest is impactful: you need to argue this effect overwhelms the upsides of additional work. (Including things like returns to being particularly fast and possible returns to scale on working hours.)
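For concreteness, a quick check of the fractions in the comment above (an illustrative sketch; the 9 hours per day for sleep and wind-down and the 2-4 weeks of vacation are the figures stated there, and the 3-week midpoint is an assumption):

```python
# Illustrative check of the work-fraction arithmetic above.
waking_hours_per_week = (24 - 9) * 7      # 9 h/day assumed for sleep and getting ready for sleep
work_hours_per_week = 70
fraction_of_waking_hours = work_hours_per_week / waking_hours_per_week  # 70 / 105 ~= 0.67

weeks_of_vacation = 3                     # assumed midpoint of the stated 2-4 weeks
fraction_of_weeks_worked = (52 - weeks_of_vacation) / 52                # ~= 0.94

print(round(fraction_of_waking_hours, 2), round(fraction_of_weeks_worked, 2))  # 0.67 0.94
```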
0davekasten9d
I mean, two points:

1. We all work too many hours; working 70 hours a week persistently is definitely too many to maximize output. You get dumb fast after hour 40 and dive into negative productivity. There's a robust organizational psych literature on this, I'm given to understand, that we all choose to ignore, because for the first ~12 weeks or so you can push beyond and get more done, but then it backfires.
2. You're literally saying statements that I used to say before burning out, and that the average consultant or banker says as part of their path to burnout. And we cannot afford to lose either of you to burnout, especially not right now.

If you're taking a full 4 weeks, great. 2 weeks a year is definitely not enough at a 70-hours-a-week pace, based on the observed long-term health patterns of everyone I've known who works that pace for a long time. I'm willing to assert that you working 48/50ths of the hours a year you'd work otherwise is worth it, assuming fairly trivial speedups in productivity of literally just over 4% from being more refreshed, getting new perspectives from downing tools, etc.
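The break-even arithmetic behind the "just over 4%" figure, spelled out (a minimal sketch under the comment's implicit assumption that total output scales linearly with hours times per-hour productivity):

```python
# Break-even productivity gain when cutting hours to 48/50ths (illustrative).
hours_fraction = 48 / 50                 # work 48 weeks' worth of hours instead of 50
required_gain = 1 / hours_fraction - 1   # per-hour speedup needed to match the original total output
print(f"{required_gain:.2%}")            # 4.17% -- the "just over 4%" cited above
```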
[-] the gears to ascension · 9d · 10 · 8

Burnout is not a result of working a lot, it's a result of work not feeling like it pays out in ape-enjoyableness[citation needed]. So they very well could be having a grand ol time working a lot if their attitude towards intended amount of success matches up comfortably with actual success and they find this to pay out in a felt currency which is directly satisfying. I get burned out when effort => results => natural rewards gets broken, eg because of being unable to succeed at something hard, or forgetting to use money to buy things my body would like to be paid with.

5ryan_greenblatt9d
If someone did a detailed literature review or had relatively serious evidence, I'd be interested. By default, I'm quite skeptical of your level of confidence in these claims, given that they directly contradict my experience and the experience of people I know. (E.g., I've done similar things for way longer than 12 weeks.)

To be clear, I think I currently work more like 60 hours a week depending on how you do the accounting; I was just defending 70 hours as reasonable, and I think it makes sense to work up to this.
5Rohin Shah9d
I think the evidence is roughly at "this should be a weakly held prior easily overturned by personal experience": https://www.lesswrong.com/posts/c8EeJtqnsKyXdLtc5/how-long-can-people-usefully-work That said, I do think there's enough evidence that I would bet (not at extreme odds) that it is bad for productivity to have organizational cultures that emphasize working very long hours (say > 60 hours / week), unless you are putting in special care to hire people compatible with that culture. Partly this is because I expect organizations to often be unable to overcome weak priors even when faced with blatant evidence.
4Ben Pace11d
but most of my work is very meaningful and what i want to be doing. i don't want to see paris or play the new zelda game more than i want to make lessonline happen
3leogao12d
i think there's a lot of variance. i personally can only work in unpredictable short intense bursts, during which i get my best work done; then i have to go and chill for a while. if i were 1 year away from the singularity i'd try to push myself past my normal limits and push chilling to a minimum, but doing so now seems like a bad idea. i'm currently trying to fix this more durably in the long run but this is highly nontrivial
2Ben Pace12d
Oh that makes sense, thanks. That seems more like a thing for people whose work comes from internal inspiration / is more artistic, and also for people who have personal or psychological frictions that cause them to burn out a lot when they do this sort of burst-y work. I think a lot of my work is heavily pulled out of me by the rest of the world setting deadlines (e.g. users making demands, people arriving for an event, etc), and I can cause those sorts of projects to pull lots of work out of me more regularly. I also think I don't take that much damage from doing it.
7leogao12d
it still seems bad to advocate for the exactly wrong policy, especially one that doesn't make sense even if you turn out to be correct (as habryka points out in the original comment, many think 2028 is not really when most people expect agi to have happened). it seems very predictable that people will just (correctly) not listen to the advice, and in 2028 both sides on this issue will believe that their view has been vindicated - you will think of course rationalists will never change their minds and emotions on agi doom, and most rationalists will think obviously it was right not to follow the advice because they never expected agi to definitely happen before 2028. i think you would have much more luck advocating for chilling today and citing past evidence to make your case..
6Valentine7d
I'm super sensitive to framing effects. I notice one here. I could be wrong, and I'm guessing that even if I'm right you didn't intend it. But I want to push back against it here anyway. Framing effects don't have to be intentional! It's not that I started with what I thought was a wrong or bad policy and tried to advocate for it. It's that given all the constraints, I thought that preregistering a possibility as a "pause and reconsider" moment might be the most effective and respectful. It's not what I'd have preferred if things were different. But things aren't different from how they are, so I made a guess about the best compromise. I then learned that I'd made some assumptions that weren't right, and that determining such a pause point that would have collective weight is much more tricky. Alas. But it was Oliver's comment that brought this problem to my awareness. At no point did I advocate for what I thought at the time was the wrong policy. I had hope because I thought folk were laying down some timeline predictions that could be falsified soon. Turns out, approximately nope.   Empirically I disagree. That demonstrably has not been within the reach of my skill to do effectively. But it's a sensible thing to consider trying again sometime.
[-] leogao · 7d · 11 · 9

to be clear, I am not intending to claim that you wrote this post believing that it was wrong. I believe that you are trying your best to improve the epistemics and I commend the effort. 

I had interpreted your third sentence as still defending the policy of the post even despite now agreeing with Oliver, but I understand now that this is not what you meant, and that you are no longer in favor of the policy advocated in the post. my apologies for the misunderstanding.

I don't think you should just declare that people's beliefs are unfalsifiable. certainly some people's views will be. but finding a crux is always difficult and imo should be done through high bandwidth talking to many people directly to understand their views first (in every group of people, especially one that encourages free thinking among its members, there will be a great diversity of views!). it is not effective to put people on blast publicly and then backtrack when people push back saying you misunderstood their position.

I realize this would be a lot of work to ask of you. unfortunately, coordination is hard. it's one of the hardest things in the world. I don't think you have any moral obligation to do this... (read more)

6Eli Tyre13d
Sure, but to the extent that we put probability mass on AGI as early as 2027, we correspondingly should update from not having seen it, and especially not having seen the precursors we expect to see, by then. If I haven't seen an AI produce a groundbreaking STEM paper by 2027, my probability that LLMs + RL will scale to superintelligence drops from about 80% to about 70%.
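For readers who want the update above in odds form, here is the implied Bayes factor (an illustrative calculation from the two stated probabilities only, not a claim about the commenter's actual likelihood estimates):

```python
# Implied Bayes factor for moving P(LLMs + RL scale to superintelligence) from 0.80 to 0.70
# on the evidence "no groundbreaking AI-written STEM paper by 2027" (illustrative).
prior, posterior = 0.80, 0.70
prior_odds = prior / (1 - prior)               # 4.0  (4:1)
posterior_odds = posterior / (1 - posterior)   # ~2.33 (7:3)
bayes_factor = posterior_odds / prior_odds     # ~0.58: a modest update against scaling
print(round(prior_odds, 2), round(posterior_odds, 2), round(bayes_factor, 2))  # 4.0 2.33 0.58
```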
5Raemon12d
Not quite responding to your main point, but: I think on one hand, things will totally get more crunchlike as time goes on, but also, I think "working hard now" is more leveraged than "working hard later" because now is when the world is generally waking up and orienting; the longer you wait, the more entrenched powers-that-be will be dominating the space and controlling the narrative.

I actually was planning to write a post that was sort-of-the-complement of Val's, i.e. "Consider 'acting as if short timelines' for the next year or so, then re-evaluate" (I don't think you need to wait till 2028 to see how prescient the 2027 models are looking).

I'm not sure what to think in terms of "how much to chill out", which is probably a wrong question. I think if there was a realistic option to "work harder and burn out more afterwards" this year, I would, but I sort of tried that and then immediately burned out more than seemed useful even as a shortterm tradeoff.
3Random Developer14d
Yup. I think we're missing ~1 key breakthrough, followed by a bunch of smaller tweaks, before we actually hit AGI. But I also suspect that the road from AGI to ASI is very short, and that the notion of "aligned" ASI is straight-up copium. So if an ASI ever arrives, we'll get whatever future the ASI chooses. In other words, I believe that:

* LLMs alone won't quite get us to AGI.
* But there exists a single, clever insight which would close at least half the remaining distance to AGI.
* That insight is likely a "recipe for ruin", in the sense that once published, it can't be meaningfully controlled. The necessary training steps could be carried out in secret by many organizations, and a weak AGI might be able to run on a 2028 Mac Studio.

(No, I will not argue for the above points. I have a few specific candidates for the ~1 breakthrough between us and AGI, and yes, those candidates are being very actively researched by serious people.)

But this makes it hard for me to build an AGI timeline. It's possible someone has already had the key insight, and that they're training a weak, broken AGI even as we speak. And it's possible that as soon as they publish, the big labs will know enough to start training runs for a real AGI. But it's also possible that we're waiting on a theoretical breakthrough. And breakthroughs take time.

So I am... resigned. Que séra, séra. I won't do capabilities work. I will try to explain to people that if we ever build an ASI, the ASI will very likely be the one making all the important decisions. But I won't fool myself into thinking that "alignment" means anything more than "trying to build a slightly kinder pet owner for the human race." Which is, you know, a worthy goal! If we're going to lose control over everything, better to lose control to something that's more-or-less favorably disposed.

I do agree that 2028 is a weird time to stop sounding the alarm. If I had to guess, 2026-2028 might be years of peak optimism, when things still lo
[-] Valentine · 14d · 20 · 14

there exists a single, clever insight which would close at least half the remaining distance to AGI

By my recollection, this specific possibility (and neighboring ones, like "two key insights" or whatever) has been one of the major drivers of existential fear in this community for at least as long as I've been part of it. I think Eliezer expressed something similar. Something like "For all we know, we're just one clever idea away from AGI, and some guy in his basement will think of it and build it. That could happen at any time."

I don't know your reasons for thinking we're just one insight away, and you explicitly say you don't want to present the arguments here. Which makes sense to me!

I just want to note that from where I'm standing, this kind of thinking and communicating sure looks like a possible example of the type of communication pattern I'm talking about in the OP. I'm actually not picky about the trauma model specifically. But it totally fits the bill of "Models spread based significantly on how much doom they seem to plausibly forecast." Which makes some sense if there really is a severe doom you're trying to forecast! But it also puts a weird evolutionary incentive on th... (read more)

[-] philh · 14d · 14 · 18

and neighboring ones, like “two key insights” or whatever

I... kinda feel like there's been one key insight since you were in the community? Specifically I'm thinking of transformers, or whatever it is that got us from pre-GPT era to GPT era.

Depending on what counts as "key" of course. My impression is there's been significant algorithmic improvements since then but not on the same scale. To be fair it sounds like Random Developer has a lower threshold than I took the phrase to mean.

But I do think someone guessing "two key insights away from AGI" in say 2010, and now guessing "one key insight away from AGI", might just have been right then and be right now?

(I'm aware that you're not saying they're not, but it seemed worth noting.)

4philh11d
(Re the "missed the point" reaction, I claim that it's not so much that I missed the point as that I wasn't aiming for the point. But I recognize that reactions aren't able to draw distinctions that finely.)
6Random Developer14d
I work with LLMs professionally, and my job currently depends on accurate capabilities evaluation. To give you an idea of the scale, I sometimes run a quarter million LLM requests a day. Which isn't that much, but it's something. A year ago, I would have vaguely guesstimated that we were about "4-5 breakthroughs" away. But those were mostly unknown breakthroughs. One of those breakthroughs actually occurred (reasoning models and mostly coherent handling of multistep tasks). But I've spent a lot of time since then experimenting with reasoning models, running benchmarks, and reading papers. When I predict that "~1 breakthrough might close half the remaining distance to AGI," I now have something much more specific in mind. There are multiple research groups working hard on it, including at least one frontier lab. I could sketch out a concrete research plan and argue in fairly specific detail why this is the right place to look for a breakthrough. I have written down very specific predictions (and stored them somewhere safe), just to keep myself honest. If I thought getting close to AGI was a good thing, then I believe in this idea enough to spend, oh, US$20k out of pocket renting GPUs. I'll accept that I'm likely wrong on the details, but I think I have a decent chance of being in the ballpark. I could at least fail interestingly enough to get a job offer somewhere with real resources. But I strongly suspect that AGI leads almost inevitably to ASI, and to loss of human control over our futures. Good. I am walking a very fine line here. I am trying to be just credible and specific enough to encourage a few smart people to stop poking the demon core quite so enthusiastically, but not so specific and credible that I make anyone say, "Oh, that might work! I wonder if anyone working on that is hiring." I am painfully aware that OpenAI was founded to prevent a loss of human control, and that it has arguably done more than any other human organization to cause what it
4LGS13d
I don't appreciate the local discourse norm of "let's not mention the scary ideas but rest assured they're very very scary". It's not healthy. If you explained the idea, we could shoot it down! But if it's scary and hidden then we can't. Also, multiple frontier labs are currently working on it and you think your lesswrong comment is going to make a difference? You should at least say by when you will consider this specific single breakthrough thing to be falsified.
1Hastings13d
The universe isn't obligated to cooperate with our ideals for discourse norms.
8LGS13d
Exactly.

The universe doesn't care if you try to hide your oh so secret insights; multiple frontier labs are working on those insights.

The only people who care are the people here getting more doomy and having worse norms for conversations.
1Ben Livengood13d
There's quite a difference between a couple frontier labs achieving AGI internally and the whole internet being able to achieve AGI on a llama/deepseek base model, for example.
3Random Developer13d
One of my key concerns is the question of:

1. Do the currently missing LLM abilities scale like pre-training, where each improvement requires spending 10x as much money?
2. Or do the currently missing abilities scale more like "reasoning", where individual university groups could fine-tune an existing model for under $5,000 in GPU costs, and give it significant new abilities?
3. Or is the real situation somewhere in between?

Category (2) is what Bostrom described as a "vulnerable world", or a "recipe for ruin." Also, not everyone believes that "alignment" will actually work for ASI. Under these assumptions, widely publishing detailed proposals in category (2) would seem unwise?

Also, even if I believed that someone would figure out the necessary insights to build AGI, it still matters how quickly they do it. Given a choice of dying of cancer in 6 months or 12 (all other things being equal), I would pick 12.

(I really ought to make an actual discussion post on the right way to handle even "recipes for small-scale ruin." After September 11th, this was a regular discussion among engineers and STEM types. It turns out that there are some truly nasty vulnerabilities that are known to experts, but that are not widely known to the public. If these vulnerabilities can be fixed, it's usually better to publicize them. But what should you do if a vulnerability is fundamentally unfixable?)
1LGS13d
Exactly! The frontier labs have the compute and incentive to push capabilities forward, while randos on lesswrong are instead more likely to study alignment in weak open source models
1Ben Livengood12d
I think that we have both the bitter lesson that transformers will continue to gain capabilities with scale and also that there are optimizations that will apply to intelligent models generally and orthogonally to computing scale.  The latter details seem dangerous to publicize widely in case we happen to be in the world of a hardware overhang allowing AGI or RSI (which I think could be achieved easier/sooner by a "narrower" coding agent and then leading rapidly to AGI) on smaller-than-datacenter clusters of machines today.
[-] johnswentworth · 16d · 54 · 26

I want to acknowledge that if this isn't at all what's going on in spaces like Less Wrong, it might be hard to demonstrate that fact conclusively. So if you're really quite sure that the AI problem is basically as you think it is, and that you're not meaningfully confused about it, then it makes a lot of sense to ignore this whole consideration as a hard-to-falsify distraction.

"Conclusively" is usually the bar for evidence usable to whack people over the head when they're really determined to not see reality. If one is in fact interested in seeing reality, there's already plenty of evidence relevant to the post's model.

One example: forecasting AI timelines is an activity which is both strategically relevant to actual AI risk, and emotionally relevant to people drawn to doom stories. But there's a large quantitative difference in how salient or central timelines are, strategically vs emotionally, relative to other subtopics. In particular: I claim that timelines are much more relatively salient/central emotionally than they are strategically. So, insofar as we see people focused on timelines out of proportion to their strategic relevance (relative to other subtopics), that lends sup... (read more)

6Valentine15d
I like your analysis. I haven't thought deeply about the particulars, but I agree that we should be able to observe evidence one way or the other right now. I've just found it prohibitively difficult (probably due to my own lack of skill) to encourage honestly looking at the present evidence in this particular case. So I was hoping that we could set some predictions and then revisit them, to bypass the metacognitive blindspot effect.   That's my gut impression too. And part of what I care about here is, if the proportion of such people is large enough, the social dynamics of what seems salient will be shaped by these emotional mechanisms, and that effect will masquerade as objectivity. Social impacts on epistemology are way stronger than I think most people realize or account for.
[-] TsviBT · 15d · 24 · 11

Obvious enough but worth saying explicitly: It's not just impacts on epistemology, but also collective behavior. From https://www.lesswrong.com/posts/sTDfraZab47KiRMmT/views-on-when-agi-comes-and-on-strategy-to-reduce#My_views_on_strategy :

Suppose there is a consensus belief, and suppose that it's totally correct. If funders, and more generally anyone who can make stuff happen (e.g. builders and thinkers), use this totally correct consensus belief to make local decisions about where to allocate resources, and they don't check the global margin, then they will in aggregate follow a portfolio of strategies that is incorrect. The make-stuff-happeners will each make happen the top few things on their list, and leave the rest undone. The top few things will be what the consensus says is most important——in our case, projects that help if AGI comes within 10 years. If a project helps in 30 years, but not 10 years, then it doesn't get any funding at all. This is not the right global portfolio; it oversaturates fast interventions and leaves slow interventions undone.

[-] Vladimir_Nesov · 16d · 47 · 12

There are two specific shoes that are yet to drop, and only one of them is legibly timed; AI-2027 simply takes the premise that both fall in 2027. There's funding-fueled frontier AI training system scaling that will run out (absent AGI) in 2027-2029, which at the model level will be felt by 2028-2030. And there's test-time adaptation/learning/training that accrues experience/on-boarding/memory of a given AI instance in the form of weight updates rather than observable-by-the-model artifacts outside the model.

So the currently-much-faster funding-fueled scaling has an expiration date, and if it doesn't push capabilities past critical thresholds, then absence of this event is the reason to expect longer timelines. But test-time learning first needs to actually happen to be observed as by itself non-transformative, and as a basic research puzzle it might take an unknown number of years (again, absent AGI from scaling without test-time learning). If everyone is currently working on it and it fails to materialize by 2028, that's also some sort of indication that it might take nontrivial time.

3sanyer12d
How many people are working on test-time learning? How feasible do you think it is?
2Vladimir_Nesov11d
From a new comment elsewhere:
2Valentine15d
This was a really helpful overview. Thank you.
[-] Kaj_Sotala · 16d · 32 · 24

This post reads to me as a significantly improved version of the earlier "Here's the exit", in that this one makes a similar point but seems to have taken the criticisms of that post to heart and seems a lot more epistemically reasonable. And I want to just explicitly name that and say that I appreciate it.

(Also I agree with a lot in the post, though some of the points raised in the responses about 2028 maybe being a poor choice for this also sound valid.)

6Valentine15d
Thanks for noticing and mentioning it. I'm trying. It's nice to have some "Yes, this" type feedback. It's helpful (and also touching).
[-] Nate Showell · 16d · 26 · 15

There's another way in which pessimism can be used as a coping mechanism: it can be an excuse to avoid addressing personal-scale problems. A belief that one is doomed to fail, or that the world is inexorably getting worse, can be used as an excuse to give up, on the grounds that comparatively small-scale problems will be swamped by uncontrollable societal forces. Compared to confronting those personal-scale problems, giving up can seem very appealing, and a comparison to a large-scale but abstract problem can act as an excuse for surrender. You probably know someone who spends substantial amounts of their free time watching videos, reading articles, and listening to podcasts that blame all of the world's problems on "capitalism," "systemic racism," "civilizational decline," or something similar, all while their bills are overdue and dishes pile up in their sink.

 

This use of pessimism as a coping mechanism is especially pronounced in the case of apocalypticism. If the world is about to end, every other problem becomes much less relevant in comparison, including all those small-scale problems that are actionable but unpleasant to work on. Apocalypticism can become a blanket pret... (read more)

9Viliam15d
That includes "raising the sanity waterline", by the way. We used to talk about it, long time ago.
[-] Thane Ruthenis · 15d · 15 · 4

There's some sound logic to what folk are saying. I think there's a real concern. But the desperate tone strikes me as… something else. Like folk are excited and transfixed by the horror.

For what it's worth, I think pre-2030 AGI Doom is pretty plausible[1], and this doesn't describe my internal experience at all. My internal experience is of wanting to chill out and naturally drifting in that direction, only to be periodically interrupted by OpenAI doing something attention-grabbing or another bogus "innovator LLMs!" paper being released, forcing me to go look if I were wrong to be optimistic after all. That can be stressful, whenever that happens, but I'm not really feeling particularly doom-y or scared by default. (Indeed, I think fear is a bad type of motivation anyway. Hard agree with your "shared positive vision" section.)

I'm not sure whether that's supposed to be compatible with your model of preverbal trauma...?

I would also like to figure out some observable which we'd be able to use to distinguish between "doom incoming" and "no doom yet" as soon as possible. But unfortunately, most observables that look like they fit would also be bogus. Vladimir Nesov's analysis is the be... (read more)

4Valentine14d
Sure! It sounds like maybe it's not what's going on for you personally. And I still wonder whether it's what's going on for a large enough subset (maybe a majority?) of folk who are reacting to AI risk. Which someone without a trauma slot for it might experience as something like:

* "Gosh, they really seem kind of freaked out. I mean, sure, it's bad, but what's all this extra for?"
* "Why are they fleshing out these scary details over here so much? How does that help anything?"
* "Well sure, I'm having kids. Yeah, I don't know what's going to happen. That's okay."
* "I'm taking up a new hobby. Why? Oh, because it's interesting. What? Uh, no, I'm not concerned about how it relates to AI."

I'm making that up off the top of my head based on some loose models. I'm more trying to convey a feel than give a ton of logical details. Not that they can't be fleshed out; I just haven't done so yet.
4Thane Ruthenis14d
I do feel like this sometimes, yeah. Particularly when OpenAI does something attention-grabbing and there's a Twitter freakout about it.
[-] azergante · 14d · 13 · 2

How about we learn to smile while saving the world? Saving the world doesn't strike me as strictly incompatible with having fun, so let's do both? :)

The post proposes that LessWrong could tackle alignment in more skillful ways, which is a wholesome thought, but I feel that the post also casts doubt on the project of alignment itself; I want to push back on that.

It won't become less important to prevent the creation of harmful technologies in 2028, or in any year for that matter. Timelines and predictions don't feel super relevant here.

We know that AGI can be dangerous if created without proper understanding and that fact does not change with time or timelines, so LW should still aim for:

  1. An international framework that restricts AGI creation and ensures safety, just like for other large impact technologies
  2. Alignment research to eventually reap the benefits of aligned AGI, but with less pressure as long as point 1 stands

If the current way of advancing towards the goal is sub-optimal, giving up on the goal is not the only answer; we can also change the way we go about it. Since getting AGI right is important, not giving up and changing the way we go about it seems like the better option (all this predicated on the snobbish and doomish depictions in the post being accurate).

4Valentine12d
I agree with you. My best guess is that the degree of doom is exaggerated but not fabricated. The exaggeration matters because if it's there, it's warping perception about what to do about the real problem. So if it's there, it would be ideal to address the cause of the exaggeration, even though on the inside that's probably always going to feel like the wrong thing to focus on. In the end, I think a healthy attitude looks more like facing the darkness hand-in-hand with joy in our hearts and music in our throats.
[-] AnthonyC · 15d · 13 · 3

I agree that fixating on doom is often psychologically unhealthy and pragmatically not useful. I also agree with those commenters pointing out that 2028 probably will look mostly normal, even under many pretty short timelines scenarios. 

So... why advocate waiting until 2028? It is entirely possible to chill out now, without sacrificing any actual work you're doing to make the future go better, while also articulating or developing a more positive vision of the future. 

It took me about a decade to get through the combined despair of learning about x-risk plus not fully living up to some other things I once convinced myself I was morally obligated to do to consider myself a good person. I'm now a more useful, pleasant, happy, and productive person, more able to actually focus on solving problems and taking actions in the world.

4Valentine14d
I think I agree. To answer your question: I'd prefer we didn't. But my impression at this point is that Less Wrong as a community needs to dialogue with models carefully to be willing to adopt them, and "Let's talk about it right now to sort out whether to adopt it right now" requires quite a lot of social buy-in. The most norm-respecting way I know to get there is to agree on a prediction that folk agree distinguishes between key possibilities, and then revisit it when the prediction's outcome is determined. But yes, I agree with what I think you're saying. I'd love to see this space take on a more "We're all in this together" attitude toward the rest of humanity, and focus more on the futures we do want. And I think those are things that could happen without waiting until 2028. I also wish folk would take more seriously that their feelings and subtle stuff they encountered preverbally plays a way bigger role than it's going to ever logically appear to. What I see in lots of rationalist spaces, and especially in AI risk spaces, is not subtle in terms of the degree of disembodiment and visceral fear. But I haven't found a way to convey what I'm seeing there without it sounding condescending or disrespectful or something. So, I'm trying for something more like collectively preregistering a prediction and revisiting later. I'm now getting the impression that's not going to work either, but I'm observing what I think is something good coming out of the discussion anyway.
[-] S. Alex Bradt · 16d · 11 · 1

One is, maybe we can find a way to make these dire predictions less unfalsifiable. Not in general, but specifically AI 2027. What differences should we expect to see if (a) the predictions were distorted due to the trauma mechanism I describe in this post vs. (b) the act of making the predictions caused them not to come about?

If it turns out that the predictions (which ones?) were distorted, I can't think of a reason to privilege the hypothesis that they were distorted by the trauma mechanism in particular.

1Valentine15d
Okay, at the risk of starting a big side argument that's not cruxy to the OP, I'm going to push for something controversial here: I think it's a mistake to view privileging the hypothesis as a fallacy.

Eliezer gives a fictional example like so: Yes, nakedly without any context whatsoever, this is dumb. Why consider Mr. Snodgrass? But with some context, there absolutely is a reason: There's some cause to why the detective suggested Mr. Snodgrass in particular. It's not just that there might be some cause. There must be.

Now, there are lots of causes we might not want to allow:

* Maybe he had a random number generator pick out a random person and it happened to be Mr. Snodgrass.
* Maybe he personally hates Mr. Snodgrass and sees an opportunity to skewer him.
* Maybe he knows who did commit the murder and is trying to cover for the real murderer, and can tell that maybe Mr. Snodgrass could be framed for it.

But with the exception of the random number generator, there's necessarily information entangled with the choice of poor Mortimer.

Any choice of a hypothesis necessarily must be kind of arbitrary. John Vervaeke talks about this: he calls it "relevance realization". It's a pragmatically (if not theoretically) incomputable problem: how do you decide what parts of reality are relevant? As far as I know, all living systems come up with a partial functional answer based on just grabbing some starting point and then improvising, plus adding a bit of random variance so as to catch potentially relevant things that would otherwise be filtered out.

So if you add a caveat of "But wait, this is privileging the hypothesis!", you're functionally putting sand in the gears of exploration. I think it's an anti-pattern.

An adjacent and much more helpful process is "Huh, what raised this hypothesis to consideration?" Quite often the answer is "I don't know." But having that in mind is helpful! It lets you correct for ways in which the hypothesis can take on a different w
1S. Alex Bradt15d
Thanks for the thorough explanation. You describe a pattern in the community, and then you ask if the dire predictions of AI 2027, specifically, are distorted by that pattern. I think that's what I'm getting stuck on. Imagine if the AI Futures team had nothing to do with the rationality community, but they'd made the same dire predictions. Then 2028 rolls around, and we don't seem to be any closer to doom. People ask why AI 2027 was wrong. Would you still point to trauma-leading-to-doom-fixation as a hypothesis? (Or, rather, would you point to it just as readily?)
3Valentine14d
Er, I don't think that's quite what I meant to say. I think it's close, but this frame feels like it's requiring me to take a stand I don't care that much about. Let me see if I can clarify a bit. My sense is more like, I see a pattern in the community. And I think the inclination to spell out AI 2027 and spread it came from that place. I thought that since AI 2027 was spelling out a pretty specific family of scenarios and in January 2028 we'd be able to tell to what extent those specific scenarios had in fact played out, we could use January 2028 to seriously re-evaluate what's generating these predictions. I'm now learning that the situation is more messy than that, and it's much harder to trigger a "halt, melt, and catch fire" event with regard to producing dire predictions. Alas. I'm hoping we can still do something like it, but it's not as clean and clear as I'd originally hoped. Hopefully that clears up what I was saying and trying to do. Now, you ask: I'm afraid I don't know what "the AI Futures team" is. So I can't really answer this question in full honesty. But with that caveat… I think my answer is "Yes." I'm not that picky about which subset of the community is making dire predictions and ringing the doom bell. I'm more attending to the phenomenon that doomy predictions seem to get a memetic fitness boost purely by being doomy. There are other constraints, because this community really does care about truth-tracking. But I still see a pretty non-truth-tracking incentive on memetic evolution here, and I think it's along an axis that rationalists in general have tended to underestimate the impact of IME. The trauma thing is a specific gearsy model of how doominess could be a memetic fitness boost independent of truth-tracking. But I'm not that picky about specifically the trauma thing being right. I just don't know of another good gearsy model, and the trauma thing sure looks plausible to me. Is that a good-faith response to your question?
1S. Alex Bradt14d
I mean the authors of AI 2027. The AI Futures Project, I should have said. My question was more like this: What if the authors weren't a subset of the community at all? What if they'd never heard of LessWrong, somehow? Then, if the authors' predictions turned out to be all wrong, any pattern in this community wouldn't be the reason why; at least, that would seem pretty arbitrary to me. In reality, the authors are part of this community, and that is relevant (if I'm understanding correctly). I didn't think about that at first; hence the question, to confirm. Definitely good-faith, and the post as a whole answers more than enough.
2Jiro13d
Wouldn't that not change it very much, because the community signal-boosting a claim from outside the community still fits the pattern?
1S. Alex Bradt13d
It would be consistent with the pattern that short-timeline doomy predictions get signal-boosted here, and wouldn't rule out the something-about-trauma-on-LessWrong hypothesis for that signal-boosting. No doubt about that! But I wasn't talking about which predictions get signal-boosted; I was talking about which predictions get made, and in particular why the predictions in AI 2027 were made. Consider Jane McKindaNormal, who has never heard of LessWrong and isn't really part of the cluster at all. I wouldn't guess that a widespread pattern among LessWrong users had affected Jane's predictions regarding AI progress. (Eh, not directly, at least...) If Jane were the sole author of AI 2027, I wouldn't guess that she's making short-timeline doomy predictions because people are doing so on LessWrong. If all of her predictions were wrong, I wouldn't guess that she mispredicted because of something-about-trauma-on-LessWrong. Perhaps she could have mispredicted because of something-about-trauma-by-herself, but there are a lot of other hypotheses hanging around, and I wouldn't start with the hard-to-falsify ones about her upbringing. I realized, after some thought, that the AI 2027 authors are part of the cluster, and I hadn't taken that into account. "Oh, that might be it," I thought. "OP is saying that we should (prepare to) ask if Kokotajlo et al, specifically, have preverbal trauma that influenced their timeline forecasts. That seemed bizarre to me at first, but it makes some sense to ask that because they're part of the LW neighborhood, where other people are showing signs of the same thing. We wouldn't ask this about Jane McKindaNormal." Hence the question, to make sure that I had figured out my mistake. But it looks like I was still wrong. Now my thoughts are more like, "Eh, looks like I was focusing too hard on a few sentences and misinterpreting them. The OP is less focused on why some people have short timelines, and more on how those timelines get signal-boosted
[-]Ben Pace12d103

Come 2028, I hope Less Wrong can seriously consider for instance retiring terms like "NPC" and "normie", and instead adopt a more humble and cooperative attitude toward the rest of the human race.

I am interested in links to the most prominent use of these terms (e.g. in highly-upvoted posts on LW, or by high-karma users). I have a hypothesis that it's only really used on the outskirts or on Twitter, and not amongst those who are more respected. 

I certainly don't use "NPC" (I mean, I've probably used it at some point, but I think probably only a handful of times) because it aggressively removes agency from people, and I think I've only used "normie" for the purpose of un-self-serious humor – I think it's about as anti-helpful a term as "normal person", which (as I've written before) I believe should almost always be replaced by referring to a specific population.

2Valentine7d
Good call. I haven't been reading Less Wrong in enough detail for a while to pull this up usefully. My impression comes from in-person conversations plus Twitter interactions. Admittedly, the period when I encountered these terms most heavily in rationality circles was about a decade ago. But I'm not sure how much of that is due to my not spending as much time in rationality circles versus discourse norms moving on. I still encounter it almost solely from folk tied to LW-style rationality. I don't recall hearing you use the terms in ways that bothered me this way, FWIW.
1Michael Roe6d
I think “NPC” in that sense is more used by the conspiracy theory community than rationalists. With the idea being that only the person using the term is smart enough to realize that e.g. the Government is controlled by lizards from outer space, and everyone else just believes the media. The fundamental problem with the term is that you might actually be wrong about e.g. the lizards from outer space, and you might not be as smart as you think.
[-]TsviBT13d82

A couple remarks:

I'm skeptical of the trauma model. I think personally I'm more grim and more traumatized than most people around here, and other people are too happy and not neurotic enough to understand how fucked alignment research is; and yet I have much longer timelines than most people around here. (Not saying anyone should become less happy or more neurotic lol.)

Re/ poor group epistemology, we'll agree that the overall phenomenon is multifactorial. Partially it's higher-order emergent deference. See "Dangers of deference".

I agree that people wan... (read more)

2Valentine12d
Two notes:

* I think this is a little evidence against the trauma model, but not much. Most forms of trauma don't cause people to become AI doomers, just like most forms of trauma don't cause most people to become alcoholics. I think the form of the trauma, and the set of coping mechanisms, and the set of life opportunities, all have to converge to result in this specific flavor of doom focus. (And I hypothesize that LW has long been an attractor for people who've been hit with that set!)
* I don't care that much about the trauma model in particular. I should have been clearer in the OP. What I meant was more like, "Gosh it sure seems to me that fixating on doom seems to be a drive independent of truth. Keeps happening all over the place. Sure looks like that's maybe happening here too. That seems important." The trauma thing was meant to both (a) highlight the type of phenomenon I'm talking about and (b) give an example of a loosely gearsy mechanism for producing it.

I think you're offering a slightly different model — which is great! I think they could be empirically distinguished (and aren't mutually exclusive, and the whole scene is probably multifaceted). Overall I like your comment.
[-]Viliam15d73

Those who believe the world will end in 2027 should commit to chill out in 2028.

Those who believe the world will end in 2030 or later should take a break in 2026, and then continue working.

[-]DaemonicSigil16d70

Registering now that my modal expectation is that the situation will mostly look the same in 2028 as it does today. (To give one example from AI 2027, scaling neuralese is going to be hard, and while I can imagine a specific set of changes that would make it possible, it would require changing some fairly fundamental things about model architecture which I can easily imagine taking 3 years to reach production. And neuralese is not the only roadblock to AGI.)

I think one of your general points is something like "slow is smooth, smooth is fast" and also "cooperative is smooth, smooth is fast", both of which I agree with. But the whole "trauma" thing is too much like Bulverism for my taste.

2Valentine15d
I had to look up "Bulverism". I think you're saying I'm maybe bringing in a frame that automatically makes people who disagree with me wrong because they're subject to the thing I'm talking about. Yes? I'm pretty sure I don't mean that. My hope is that we can name some reasonably clear signs of what world we'd expect to see if I'm loosely right vs. simply wrong, and then check back in the future when we discover which world we're in. It might matter to point out when people's reactions claim to be inconsistent with the trauma model but in fact are consistent with it. But that's not to make it unfalsifiable. Quite the opposite: it's to make its falsification effective instead of illusory. I think that's quite different from Bulverism at a skim. Let me know if I've missed your point.
1roha6d
I also had to look it up and got interested in testing whether or how it could apply. Here's an explanation of Bulverism that suggests a concrete logical form of the fallacy:

1. Person 1 makes argument X.
2. Person 2 assumes person 1 must be wrong because of their Y (e.g. suspected motives, social identity, or other characteristic associated with their identity).
3. Therefore, argument X is flawed or not true.

Here's a possible assignment for X and Y that tries to remain rather general:

* X = Doom is plausible because ...
* Y = Trauma / Fear / Fixation

Why would that be a fallacy? Whether an argument is true or false depends on the structure and content of the argument, but not on the source of the argument (genetic fallacy), and not on a property of the source that gets equated with being wrong (circular reasoning). Whether an argument for doom is true does not depend on who is arguing for it, and being traumatized does not automatically imply being wrong.

Here's another possible assignment for X and Y that tries to be more concrete. To be able to do so, "Person 1" is also replaced by more than one person, now called "Group 1":

* X (from AI 2027) = A takeover by an unaligned superintelligence by 2030 is plausible because ...
* Y (from the post) = "lots of very smart people have preverbal trauma" and "embed that pain such that it colors what reality even looks like at a fundamental level", so "there's something like a traumatized infant inside such people" and "its only way of "crying" is to paint the subjective experience of world in the horror it experiences, and to use the built-up mental edifice it has access to in order to try to convey to others what its horror is like".

From looking at this, I think the post suggests a slightly stronger logical form that extends 3:

1. Group 1 makes argument X.
2. Person 2 assumes group 1 must be wrong because of their Y (e.g. suspected motives, social identity, or other characteristic associated with th
[-]Michael Roe15d50

You know, I'm kind of a sceptic on AI doom. I think on most likely paths we end up OK.


But … this feeling that this post talks about. This feeling that really has nothing to do with AI … yes, ok, I feel that sometimes.


I don't know man. I hope that the things I have done during my time on earth will be of use to someone. That's all I'd like, really.

4sarahconstantin13d
I've had an anxiety disorder my whole life, and am also not that worried about AI, so yeah, no shit, it's possible to be anxious & gloomy about lots of things besides AI! 
1Michael Roe15d
And you know, the whole Covid pandemic thing was kind of horrible. As it turned out, we mostly dodged a bullet and the fatality rate wasn’t that high. But, I suspect the lockdowns had a psychological effect some of us are still suffering from. Like, we’re being influenced by a trauma that is nothing to do with AI.
[-]ouguoc15d132

I was a little surprised by "fatality rate wasn't that high", so I looked it up, and Wikipedia ranks it as the fifth-worst epidemic/pandemic of all time by number of deaths, right behind the Black Death. As a percentage of world population, looks like it'd be like 0.2-0.4%? I'm honestly not sure where that leaves me personally re: "dodged a bullet", just thought I'd share.

4Jiro15d
Number of deaths is misleading because of the higher world population.
4Michael Roe15d
Well, we're kind of lucky the fatality rate wasn't an order of magnitude higher, which was what I was getting at.
1Michael Roe15d
But anyway, the point I was getting at is that people are traumatized from something unrelated to AI.
[-]Eli Tyre13d41

I personally found this post less annoying than your previous post on similar themes.

That one felt, to me, like it was more aggressively projecting a narrative onto me and us. This post, to me, feels less like it's doing that, and so is more approachable.

9Eli Tyre13d
And this comment felt even more approachable than the post. I think because it's owning your experience more? You're less making a claim about what's happening with other people's psychologies, and more like talking about 1) a selection mechanism that we can discuss from the third person perspective, and 2) your own epistemic vantage point. "It looks to me like you're wrapped up in a trauma narrative, but I'm mostly not going to bother to respond to your object level arguments", is an annoying kind of critique, because there's basically no way to falsify it. "I don't know what your reasons are, and for all I know maybe they're good, but from my perspective, I can't distinguish you being right from you falling into [specific memetic attractor]" is subtly different somehow. I feel resoundingly good about that comment.
[-]Chris_Leong15d4-2

I don't know man, it seems a lot less plausible now that we have the CAIS letter and inference-time compute has started demonstrating a lot of concrete examples of AIs misbehaving[1].

So before the CAIS letter there were some good outside view arguments and in the ChatGPT 3.5-4 era alignment-by-default seemed quite defensible[2], but now we're in an era where neither of these is the case, so your argument packs a lot less of a punch.

  1. ^

    The era when it was just pre-training compute being scaled made it look like we had things a lot more under control than we did. I w

... (read more)
2Valentine15d
There's a logic error I keep seeing here. I agree, there's more reason to think LLMs might not just magically align. That fact does not distinguish between the worlds I'm pointing out though. Here are two worlds:

* We're hurtling toward AI doom.
* We're subconsciously terrified and are projecting our terror onto the world, causing us to experience our situation as hurtling toward AI doom, and creating a lot of agreement between one another since we're converging on that strategy for projecting our subconscious terror.

There are precious few, if any, arguments of the form "But here's a logical reason why we're doomed!" that can distinguish between these two worlds. It doesn't matter how sound the argument is. Its type signature isn't cruxy.

I'm not claiming there's no way to distinguish between these worlds, to be clear. I'm pointing out that the approach of "But this doomy reasoning is really, really compelling!" is simply not relevant as a distinguisher here.

The whole point of the OP is to suggest that we get clear on the distinction so that we can really tell in 2.5 years, and possibly pivot accordingly. But not to argue that we pivot that way now. (That might make sense, and it might turn out to be something we'll later wish we'd figured out to do sooner. But I'm not trying to argue that here.)
[-]sarahconstantin13d232

You can literally try to find out how bad people feel. Do x-riskers feel bad? Do people feel worse once they get convinced AI x-risk is a thing, or better once they get convinced it isn't?

This is the PANAS test, a simple mood inventory that's a standard psychological instrument. If you don't trust self-report, you can use heart rate or heart rate variability, which most smartwatches will measure. 

Now, to some degree, there's a confounding thing where people like you (very interested in exercise and psychology) might feel better than someone who doesn't have that focus, and all things equal maybe people who focus on x-risk are less likely to focus on exercise/meditation/circling/etc. 

So the other thing is, if people who believe in x-risk take up a mental/physical health practice, you can ask if it makes them feel better, and also if it makes them more likely to stop believing in x-risk.
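A minimal sketch of what the simplest version of that first comparison might look like, assuming (purely for illustration) a CSV of survey responses with an "expects_doom" flag and a PANAS negative-affect score; the file and column names are stand-ins, not anything from the comment:

```python
# Hedged sketch: compare PANAS negative-affect scores between people who do and
# don't expect near-term AI doom. File name and column names are assumptions.
import pandas as pd
from scipy import stats

df = pd.read_csv("survey.csv")  # assumed columns: expects_doom (bool), panas_negative (scored 10-50)

doomers = df.loc[df["expects_doom"], "panas_negative"]
others = df.loc[~df["expects_doom"], "panas_negative"]

# Welch's t-test: does mean negative affect differ between the two groups?
t_stat, p_value = stats.ttest_ind(doomers, others, equal_var=False)
print(f"mean (doom): {doomers.mean():.1f}, mean (no doom): {others.mean():.1f}, p = {p_value:.3f}")
```

This only covers the cross-sectional question; the before/after and intervention versions described above would need repeated measures from the same people.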

4Valentine12d
I love this direction of inquiry. It's tricky to get right because of lots of confounds. But I think something like it should be doable. I love that this is a space where people care about this stuff. I'm poorly suited to actually carry out these examinations on my own. But if someone does, or if someone wants to lead the charge and would like me to help design the study, I'd love to hear about it!
6Vladimir_Nesov15d
The worlds you want to distinguish are those with short term doom and those without. Subconscious terror is the pesky confounder, not the thing you (in the end) want to distinguish. And it can be present or absent in either of the worlds you really want to distinguish, you can have doom+terror, or nodoom+terror, or doom+noterror. In the doom+terror world, distinguishing doom from terror can become a false dichotomy, if the issue with the arguments is not formulated more carefully than just saying it like that. So the goal should be to figure out how to control for terror (if it's indeed an important factor), and an argument shouldn't be "distinguishing between doom and terror", it should still be distinguishing between doom and no (short term) doom, possibly accounting for the terror factor in the process.
2Valentine14d
Analyzing from the outside, I agree. The pesky thing is that we can't fully analyze it from the outside. The analysis itself can be colored by a terror generator that has nothing to do with the objective situation. So if there's reason to think there's a subconscious distortion happening to our collective reasoning, distinguishing between doom and nodoom might functionally have sorting out terror as a prerequisite. Which sucks in terms of "If there's doom then we don't want to waste time working on things that aren't related." But if we literally cannot tell what's real due to distortions in perception, then sorting out those perception errors becomes the top priority. (I'm describing it in black-and-white framing to make the logic clear, not to assert that we literally cannot tell at all what's going on.)
2Vladimir_Nesov14d
When you can't figure something out, you need to act under uncertainty. The question is still doom vs. no short term doom. Even if you conclude "terror", that is only an argument for the uncertainty being unresolvable (with some class of arguments that would otherwise help), not an argument that doom has been ruled out (there still needs to be some prior). The "doom vs. terror" framing doesn't adequately capture this. Since 5-20% doom within 10 years is a relatively popular position, mixing in more nodoom because terror made certain doom vs. nodoom arguments useless doesn't change this state of uncertainty too much, the decision relevant implications remain about the same. That is, it's still worth working on a mixture of short term and longer term projects, possibly even very long term ones that almost inevitably won't be fruitful before takeoff-capable AGI (because we might use the head start to prompt the AGIs into completing such projects before takeoff).
2Chris_Leong15d
Feels like that's a Motte and Bailey? Sure, if you believe that the claimed psychological effects are very, very strong, then that's the case. However, let's suppose you don't believe they're quite that strong. Even if you believe that these effects are pretty strong, then sufficiently strong arguments may still provide decent evidence.

In terms of why the evidence I provided is strong:

* There's a well known phenomenon of "cranks" where non-experts end up believing in arguments that sound super persuasive to them, but which are obviously false to experts. The CAIS letter ruled that out.
* It's well-known that it's easy to construct theoretical arguments that sound extremely persuasive, but bear no relation to how things work in practise. Having a degree of empirical evidence of some of the core claims greatly weakens these arguments.

So it's not just that these are strong arguments. It's that these are arguments that you might expect to provide some signal even if you thought the claimed effect was strong, but not overwhelmingly so.
2Valentine14d
I really don't think so, but I'm not sure why you're saying so, so maybe I'm missing something. If I keep doing something that looks to you like a motte-and-bailey, could you point out specifically the structure? Like, what looks like my motte and what looks like my bailey?

Sure, but arguing that doom is real does nothing to say what proportion of the doom models' spread is due to something akin to the trauma model. And that matters because… well, there's a different motte-and-bailey structure that can go "OMG it's SO BAD… but look, here's the very reasonable argument for how it's going to be this particular shape of challenging for us… SO IT'S SUPER BAD!!!" It'll tend to warp how we see those otherwise reasonable arguments, and which ones get emphasized, and how. Listing more such arguments, or illustrating how compelling they are, doesn't do anything to suggest how much this warping phenomenon is or isn't happening.

I mean, by analogy: someone can have a panic attack due to being trapped in a burning building. They're going to have a hard time doing anything about their situation while having a panic attack. Arguments that they really truly are in a burning building don't affect the truth of that point, regardless of how compelling those arguments that the building really is on fire are.
2Chris_Leong14d
Between an overwhelmingly strong effect and a pretty strong effect.
[-]Yonatan Cale4d30

Hey, I think I'm the target audience of this post, but it really doesn't resonate well with me.

Here are some of my thoughts/emotions about it:

 

First, things I agree with:

  1. Feeling doom/depressed/sad isn't a good idea. I wish I'd have less of this myself :( it isn't helping anyone
  2. Many people thought the world would end and they were wrong - this should give us a lower prior on ourselves being right about the world maybe ending
  3. Biases are probably in play
  4. I read you as trying to be supportive, which I appreciate, thanks

 

Things that feel bad about:

5. Wa... (read more)

[-]Max Niederman16d30

I do think that it's valuable to have some kind of deadline which forces us to admit that our timelines were too pessimistic and consider why. However, as other commenters have pointed out I think 2028 is a bit early for that, and I'm not convinced that the traumatized infant model would be the primary reason.

2Valentine15d
Whereas I think waiting 2.5 years to consider if we're wasting people's lives on a loud horror show is awfully late.

It's obviously the case that if everyone were screaming "We're all going to die by 2035!" and then we get to 2036 and nothing has happened, then yep, time to reconsider. But I don't expect anything like that. I expect that folk will keep making predictions, and they'll keep shifting, and they'll keep being a bit hard to pin down, and everything will keep seeming more dire in a Shepard tone kind of way. Some advances in AI will happen, and the narratives will shift, and it'll always seem like those narratives are shifting for perfectly sensible reasons.

(The fact that the timeline arguments seem reasonable is not a crux, by the way. This is a detail of logic I keep seeing people miss. "But my argument is compelling and neither I nor my colleagues notice anything wrong with it" is not a distinguishing feature of the worlds where (a) there's good modeling going on versus (b) there's selective perception shaping a subconscious drive to generate shared narratives of doom. It's not that these two worlds cannot be distinguished, but they cannot be distinguished by e.g. asking ourselves "Are we being reasonable about our timeline arguments?")

So I kind of expect that even if most people were to agree that we could definitely tell whether our predictions had come about by 2036, if we get to 2036 and no doom has manifested, I still don't expect much about the narrative tone to change.

So I'm wondering if there's a way we could tell sooner than that. What would we see in 2028 that could give us pause and go "Huh. Maybe we need to take much more seriously that something could be distorting our emotional lens here"? I get awfully suspicious if there's literally no way to tell for at least a decade. That starts to sound an awful lot like astrology.

Cool. I'm not sure what to do with that, but acknowledged.
1Max Niederman14d
I 100% agree that this is what we would expect in worlds where AI keeps advancing but never becomes really dangerous, and am in favor of operationalizing this with a mutually agreed-upon time to reevaluate the possibility of emotional biases. However, I strongly disagree with the suggestion of 2028.

Assume, for a moment, that scaring people into action is the correct response to imminent doom. Then unless we are extremely confident that there is no imminent doom, we should still be scaring people. Scaring people is bad and we want to avoid it, but we want to avoid extinction a whole lot more. Given that, I think it makes the most sense to go with something like a 99th percentile date. The community's aggregate 99th percentile date for truly manifest doom does seem to be at least that far away, but we could possibly use other events to get quicker results.

Note that this is all conditioned on scaring people into action being the best action conditioned on imminent doom, which I am unsure about.

Sorry, it's a bad habit of mine to unproductively say whether I agree or disagree with something without giving reasoning. The reason I am unconvinced is that, although it does seem to fit at first glance, you don't give much evidence that this particular emotional mechanism is involved as opposed to others. For example, I think that the desire to be part of a secret, specially enlightened in-group is likely a big factor as well.

1. ^ I think this is a great metaphor, by the way.
7Valentine14d
Noted. FWIW I'm seriously skeptical of whether it's true. For two main reasons:

1. The kind of "imminent doom" we're talking about is years away. People can't stay actively scared and useful on timescales that long. I think they go numb and adjust instead.
2. Fear distorts thinking. Saying "But it's dire!" as an argument for scaring people creates a memetic evolutionary incentive to make that claim. You can severely warp the epistemic commons if the norms around pulling the fire alarm let it happen too easily.

With that said: I think if we were pretty confident that AI doom were coming in the next year, then freaking people out about it might make a lot of sense. But I think the bar for making that call should be very high. Otherwise bad memes eat us all.

Yep, sounds like that'd be the wholesome and good-faith thing to do at this point, best as I can tell.

True. I was trying to suggest a gearsy model of how impressions of doom might get exaggerated and warp perceptions. I've seen stuff very much like the trauma model play out best as I can tell. So it seemed like a plausible model here. It's my go-to default. If you or someone else would care to spell out a different gearsy model, and if we can distinguish between them, I'd be quite happy to hear it and look at the truth.

In particular you gesture at: Intuitively that seems plausible to me. At a glance I think it'd put a different set of incentives on memetic evolution, but that they'd still warp shared vision away from truth.

I'm running out of reply steam, so I'm not going to go much deeper than I just have here. But I want to note that I think your idea makes some sense, and I bet we could tell empirically to what extent it vs. the trauma thing is happening. The main thing I keep seeing missed when coming up with experiments is, we're talking about subjective structures. Which means that we're inside the things we're trying to analyze. Which means that the scissors we construct to distinguish be
5Jiro14d
This sort of reasoning leads to Pascal's Mugging and lots of weirdness like concern for the welfare of insects or electrons. We should not be acting on the basis of extremely unlikely events just because those events would lead to a very large change in utility.
1Max Niederman14d
This just changes the quantile you should use. I already tried to adjust for this in the ballpark number I gave — I in fact think existential doom is much more than 99 times worse than scaring people. Even if you disagree and say it’s not worth trying to avoid an outcome because it has only a 1% probability, you’d probably agree it’s worth it at 50%, and it seems to me the median timeline is still way past 2028.
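To spell out the arithmetic behind the quantile point (a sketch of the implied expected-value comparison; the cost symbols are stand-ins introduced here, not taken from the comment):

$$p \cdot C_{\text{doom}} > C_{\text{scare}} \quad\Longleftrightarrow\quad p > \frac{C_{\text{scare}}}{C_{\text{doom}}}$$

So if existential doom is taken to be at least 99 times worse than the harm of scaring people, the scare strategy only stops being worthwhile once the remaining probability of doom falls below roughly 1/99, about 1%, which is why the re-evaluation point lands near the community's 99th-percentile doom date rather than its median.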
[-]Michael Roe16d27

I do have considerable sympathy for the view in this post that the feeling we’re about to all die is largely decoupled from whether we are, in fact, about to die. There are potentially false negatives as well as false positives here.

[-]adamsky7h10

I agree that focusing on the negative, and on how to avoid stuff rather than how to do stuff, is counterproductive. But what is this positive vision of the world we should strive for? :-)

[-]Michael Roe15d10

I was just explaining this post to my partner. Now, although I put AI extinction as low probability, I have a thyroid condition. Usually treatable: drugs like carbimazole, radio iodine, surgery, etc. In my case, complications make things somewhat worse than is typical. So, she just asked me to rate how likely I think it is that I don't, personally, make it to 2028 for medical reasons. I'm like, idk, I guess maybe a 50% chance I don't make it that far. I shall be pleasantly surprised if I make it. Kind of surprised I made it to July this year, to be honest.

1Michael Roe15d
In one of my LLM eval tests, DeepSeek R1 generated the chapter headings of a parody of a book about relationships, including the chapter Intimacy Isn't Just Physical: The Quiet Moments That Matter. Now, ok, DeepSeek is parodying that type of book here, but also, it's kind of true. When you look back on it, it is the "quiet moments" that mattered, in the end. (The default assistant character in most mainstream LLMs is hellishly sycophantic, so I ought to add here that when I mentioned this to DeepSeek emulating a sarcastic squirrel out of a Studio Ghibli movie, it made gagging noises about my sentimentality. So there's that, too.)
1Michael Roe15d
I know you aren’t real, little sarcastic squirrel, but sometimes the things you say have merit to them, nevertheless.
[-]Michael Roe16d11

I do not expect us to be all dead by 2028.


2028 outcomes I think likely:

A) LLMs hit some kind of wall (e.g. only so much text to train on), and we don’t get AGI.

B) We have, approximately, AGI, but we’re not dead yet. The world is really strange though.


Outcome (B) either works out OK, or we die some time rather later than 2028.

2Valentine15d
FWIW, I don't think we're all going to be dead by 2028 either. I bet we won't all be dead by 2040 for that matter. But that's kind of a silly bet because I have no skin in the game: no one can collect if I'm wrong!
[-]Knight Lee16d*01

The gist is to frighten people into action. Usually into some combo of (a) donating money and (b) finding ways of helping to frighten more people the same way. But sometimes into (c) finding or becoming promising talent and funneling them into AI alignment research.

If spending $0.1 billion on AI risk is too much money and promising talent, then surely spending $800 billion on military defence is a godawful waste of promising talent. Would you agree that the US should cut its military budget by 99%, down to $8 billion?

Similar risk at 2028 doesn't prove a lo

... (read more)
9Valentine15d
I'm not following your thinking. To answer your question: probably no? I suspect that the US (and a lot of the world) would do quite badly if the US military's budget were cut down by 99%. But I get the impression you're doing this as a kind of logical gotcha…? And I don't see the connection. Maybe you think I'm saying that we're putting too much energy into AI safety research? That's not what I'm saying at all. It's irrelevant to what I'm saying. What I'm saying is more like, if the path to getting more funding for AI safety research goes through mainly frightening people into it, then I worry about it creating bad incentives and really unwholesome effects, and thereby maybe not resulting in much good AI risk mitigation. I think I'm missing your point though. Could you clarify?   I like this analogy. The logic is similar. I'm wanting to ask something like "Hey, if we keep betting on most new cars being self driving, and we keep being wrong, when can we pause and collectively reconsider how we're doing this? How about thus-and-such time given thus-and-such circumstances; does that work?"
1Knight Lee15d
Retracted

You're referring to those weird, coercive EA charities which try to guilt people into supporting their cause, right? If you are, I see what you mean. People should avoid them and perhaps warn others against them.

However, I feel this isn't very clear in your post; you were saying that: This sounded like you were talking about the typical organization, not the most coercive ones. The typical organization does not try to terrify its readers. It does not say you will die and your family will die. It makes the same kind of sober-minded argument the military makes: that reducing this risk is very cost-effective and urgent.

A lot of charities appeal at least a little to urgency and guilt. Look at this video by Against Malaria, which is considered a very reputable charity (iirc almost every dollar is spent on buying nets, no executive salaries). Is this unwholesome? It does make you feel bad. But the very nature of human empathy and conscience is designed to feel bad, not good or wholesome. If we are not willing to feel bad to get things done, should we try and cure empathy and conscience?

Everything needs to be taken in moderation. Yes, charities working on AI risk should avoid sensationalism and terror! But that doesn't mean the typical dollar spent and hour worked on AI safety is "being frightened into it," while somehow dollars spent and hours worked on military defence isn't.

The average promising talent who joins the military experiences far more fear than the average promising talent who works in AI safety. Even those who never face active combat are still forced to think about the possibility all day, and other unwholesome things happen in the military. This is an argument that we should invest in their well-being, not an argument that we should dramatically tone down the military defence priority or urgency.

----------------------------------------

I think if people repeatedly delay their predictions, it's reasonable to suspect the predicti
2Knight Lee15d
Out of curiosity, I looked at your post "Here's the exit." Although I didn't agree with it, all the people harshly criticizing you really made me feel your frustrations. Yes, this is the LessWrong echo chamber. The internet forum hivemind. I feel very sorry for being sarcastic, etc.; you've obviously received too much of that from others.

Reading their accusations, I start to feel like I'm on your side :/ and my biased brain decided to remember all the examples of what you're talking about. I remember all these discussions where people didn't just objectively believe that P(doom) was high, they were sort of rude and antagonistic about it. I remember some people villainizing AI researchers similarly to how militant vegans villainize meat eaters. It doesn't even work: you don't convince meat eaters by telling them they're monsters, you're supposed to be empathetic.

I really wish some of those people would chill out more, and I think your post is very good.

(Strangely, I can't find these examples right now. All I found was this comment, which wasn't that bad, and the replies to your "Here's the exit" post.)
5philh16d
Anecdote: in 2022, my recollection is that Ethereum had been planning to switch to proof of stake for years, and that project had been repeatedly delayed. In June, my brother bet me that it wouldn't happen for at least another two years. It actually happened in September 2022.
[-]roha12d-21

I'm close to getting a postverbal trauma from having to observe all the mental gymnastics around the question of whether building a superintelligence without having reliable methods to shape its behavior is actually dangerous. Yes, it is. No, that fact does not depend on whether Hinton, Bengio, Russell, Omohundro, Bostrom, Yudkowsky, et al. were held as babies.

3roha11d
Point addressed with unnecessarily polemic tone:

* "Suppose that what's going on is, lots of very smart people have preverbal trauma."
* "consider the possibility that the person in question might not be perceiving the real problem objectively because their inner little one might be using it as a microphone and optimizing what's "said" for effect, not for truth."

It is alright to consider it. I find it implausible that a wide range of accomplished researchers lay out arguments, collect data, interpret what has and hasn't been observed and come to the conclusion that our current trajectory of AI development poses a significant amount of existential risk, which can potentially manifest in short timelines, because a majority of them has a childhood trauma that blurs their epistemology on this particular issue but not on others where success criteria could already be observed.
6Noosphere8910d
I agree that @Valentine's specific model is unlikely to fit the data well here, but to be charitable to Valentine / to steelman the post, the better nearby argument is that hypotheses promising astronomical value, in either the negative or the positive direction, are memetically fit and, very importantly, believed by lots of people, and lots of people take serious actions that are later revealed to be mostly mistakes because the hypothesis of doom/salvation by something had gotten too high a probability inside their brains relative to an omniscient observer. Another way to say it is that the doom/salvation hypotheses aren't purely believed because they have direct evidence in their favor.

This is a necessary consequence of humans needing to make expected utility decisions all the time, combined with their values/utility functions mostly not falling in value fast enough with increasing resources to avoid the conclusion that unboundedly valuable states exist for a human, and with humans being bounded reasoners performing bounded rationality, which means they cannot finely distinguish probabilities between, say, 1 in a million and 0.

However, another partial explanation comes from @Nate Showell, where pessimism is used as a coping mechanism to avoid dealing with personal-scale problems. In particular, believing that the world is doomed by something is a good excuse to not deal with stuff like doing the dishes or cleaning your bedroom, and it's psychologically appealing to have a hypothesis that means you don't have to do any mundane work to solve the problem: https://www.lesswrong.com/posts/D4eZF6FAZhrW4KaGG/consider-chilling-out-in-2028#5748siHvi8YZLFZih

And this is obviously problematic for anyone working on getting the public to believe an existential risk is real, if there is in fact real evidence something poses an x-risk. Here, an underrated cure by Valentine is to focus on the object level, and to focus as much on empirical research as possible, because this way yo
8roha10d
"it's psychologically appealing to have a hypothesis that means you don't have to do any mundane work" I don't doubt that something like inverse bike-shedding can be a driving force for some individuals to focus on the field of AI safety. I highly doubt it is explanatory for the field and the associated risk predictions to exist in the first place, or that its validity should be questioned on such grounds, but this seems to happen in the article if I'm not entirely misreading it. From my point of view, there is already an overemphasis on psychological factors in the broader debate and it would be desirable to get back to the object level, be it with theoretical or empirical research, which both have their value. This latter aspect seems to lead to a partial agreement here, even though there's more than one path to arrive at it.
5Valentine7d
Not entirely. It's a bit of a misreading. In this case I think the bit matters though. (And it's an understandable bit! It's a subtle point I find I have a hard time communicating clearly.)

I'm trying to say two things:

* There sure do seem to be some bad psychological influences going on.
* It's harder to tell what's real when you have sufficiently bad psychological influences going on.

I think some people, such as you, are reacting really strongly to that second point. As if I'm taking a stand for AI risk being a non-issue and saying it's all psychological projection. I'm saying that's nonzero, but close to zero. It's a more plausible hypothesis to me than I think it is to this community. But that's not because I'm going through the arguments that AI risk is real and finding refutations. It's because I've seen some shockingly basic things turn out to be psychological projection, and I don't think Less Wrong collectively understands that projection really can be that deep. I just don't see it accounted for in the arguments for doom.

But that's not the central point I'm trying to make. My point is more that I think the probability of doom is significantly elevated as a result of how memetic evolution works — and, stupidly, I think that makes doom more likely as a result of the "Don't hit the tree" phenomenon. And maybe even more centrally, you cannot know how elevated the probability is until you seriously check for memetic probability boosters. And even then, how you check needs to account for those memetic influences. I'm not trying to say that AI safety shouldn't exist as a field though.

Wow, you and I sure must be seeing different parts of the debate! I approximately only hear people talking about the object level. That's part of my concern. I mean, I see some folk doing hot takes on Twitter about psychological angles. But most of those strike me as more like pot shots and less like attempts to engage in a dialogue.
5Valentine7d
This was a great steelmanning, and is exactly the kind of thing I hope people will do in contact with what I offer. Even though I don't agree with every detail, I feel received and like the thing I care about is being well enough held. Thank you.
[-]arisAlexis14d-3-3

Let's discuss for now, and then check in about it in 31 months.

 

I really don't like these kinds of statements because it's like a null bet. Either the world has gone to hell and nobody cares about this article, or the author has "I was correct, told ya" rights. I think these kinds of statements should not be made in the context of existential risk.

[-]arisAlexis14d-4-3

but you need to frame this not like any other argument but like "for the first time in the history of life on Earth, a species has created a new superior species". I think all these rebuttals are missing this specific point. This time is different.

2Valentine14d
I think your logic slipped. I'm not saying creating AI wouldn't be different. I'm saying that the generator of doom predictions keeps predicting doom, and then hedging, and then re-predicting, while insisting that its new prediction is "really actually different this time". Just like the last twenty times. But no, really, this time is really actually truly different! If you're talking about a real doom, such as building a superior species, and if it's actually going to happen at some point, then yes at some point those doom predictions will in fact pan out. But if the generator of doom predictions is optimizing for feeling and expressing doom more than making accurate predictions, then it's close to accidental that it ends up right at some point. I don't think it's quite that extreme here. But saying "No, creating AI really would be different!" doesn't affect the reasoning whatsoever. That just makes it a potent source of viral doom memes.
-3arisAlexis14d
Although I don't like comments starting with "your logic slipped" because it gives off passive-aggressive "you are stupid" vibes, I will reply. So what you are saying is that yes, this time is different, just not today. It will definitely happen and all the doomerism is correct, but not on a short timeline, because ____ insert reasoning that is different than what the top AI minds are saying today. This is actually and very blatantly a self-preserving mechanism called "normalcy bias", very well documented for the human species.
2Valentine12d
Sorry, that's not how I meant it. I meant it more like "Oh, I think your foot slipped there, so if you take another step I think it won't have the effect you're looking for." We can all slip up. It's intended as a friendly note. I agree that on rereading it it didn't come across that way.   Uh, no. That's not what I'm saying. I'm saying something more like: if it turns out that doomerism is once again exaggerated, perhaps we should take a step back and ask what's creating the exaggeration instead of plowing ahead as we have been.
1arisAlexis8d
How can you know if it's exaggerated? It's like an earthquake: the fact that it hasn't happened yet doesn't mean it won't be destructive when it does happen. The superintelligence slope doesn't stop somewhere for us to evaluate it, nor do we have any kind of signal that the more time passes, the more improbable it is.

I'll explain my reasoning in a second, but I'll start with the conclusion:

I think it'd be healthy and good to pause and seriously reconsider the focus on doom if we get to 2028 and the situation feels basically like it does today.

I don't know how to really precisely define "basically like it does today". I'll try to offer some pointers in a bit. I'm hoping folk will chime in and suggest some details.

Also, I don't mean to challenge the doom focus right now. There seems to be some good momentum with AI 2027 and the Eliezer/Nate book. I even preordered the latter.

But I'm still guessing this whole approach is at least partly misled. And I'm guessing that fact will show up in 2028 as "Oh, huh, looks like timelines are maybe a little longer than we once thought. But it's still the case that AGI is actually just around the corner…."

A friend described this narrative phenomenon as something like the emotional version of a Shepard tone. Something that sure seems like it's constantly increasing in "pitch" but is actually doing something more like looping.

(The "it" here is how people talk about the threat of AI, just to be clear. I'm not denying that AI has made meaningful advances in the last few years, or that AI discussion became more mainstream post LLM explosion.)

I'll spell out some of my reasoning below. But the main point of my post here is to be something folk can link to if we get to 2028 and the situation keeps seeming dire in basically the same increasing way as always. I'm trying to place something loosely like a collective stop loss order.

Maybe my doing this will be irrelevant. Maybe current efforts will sort out AI stuff, or maybe we'll all be dead, or maybe we'll be in the middle of a blatant collapse of global supply chains. Or something else that makes my suggestion moot or opaque.

But in case it's useful, here's a "pause and reconsider" point. Available for discussion right now, but mainly as something that can be remembered and brought up again in 31 months.

Okay, on to some rationale.

 

Inner cries for help

Sometimes my parents talk about how every generation had its looming terror about the end of the world. They tell me that when they were young, they were warned about how the air would become literally unbreathable by the 1970s. There were also dire warnings about a coming population collapse that would destroy civilization before the 21st century.

So their attitude upon hearing folks' fear about overpopulation, and handwringing around Y2K, and when Al Gore was beating the drum about climate change, and terror about the Mayan calendar ending in 2012, was:

Oh. Yep. This again.

Dad would argue that this phenomenon was people projecting their fear of mortality onto the world. He'd say that on some level, most people know they're going to die someday. But they're not equipped to really look at that fact. So they avoid looking, and suppress it. And then that unseen yet active fear ends up coloring their background sense of what the world is like. So they notice some plausible concerns but turn those concerns into existential crises. It's actually more important to them at that point that the problems are existential than that they're solved.

I don't know that he's right. In particular, I've become a little more skeptical that it's all about mortality.

But I still think he's on to something.

It's hard for me not to see a similar possibility when I'm looking around AI doomerism. There's some sound logic to what folk are saying. I think there's a real concern. But the desperate tone strikes me as… something else. Like folk are excited and transfixed by the horror.

I keep thinking about how in the 2010s it was extremely normal for conversations at rationalist parties to drift into existentially horrid scenarios. Things like infinite torture, and Roko's Basilisk, and Boltzmann brains. Most of which are actually at best irrelevant to discuss. (If I'm in a Boltzmann brain, what does it matter?)

Suppose that what's going on is, lots of very smart people have preverbal trauma. Something like "Mommy wouldn't hold me", only from a time before there were mental structures like "Mommy" or people or even things like object permanence or space or temporal sequence. Such a person might learn to embed that pain such that it colors what reality even looks like at a fundamental level. It's a psycho-emotional design that works something like this:

"Dependency"

If you imagine that there's something like a traumatized infant inside such people, then its primary drive is to be held, which it does by crying. And yet, its only way of "crying" is to paint the subjective experience of world in the horror it experiences, and to use the built-up mental edifice it has access to in order to try to convey to others what its horror is like.

If you have a bunch of such people getting together, reflecting back to one another stuff like 

OMG yes, that's so horrid and terrifying!!!

…then it feels a bit like being heard and responded to, to that inner infant. But it's still not being held, and comforted. So it has to cry louder. That's all it's got.

But what that whole process looks like is, people reflecting back and forth how deeply fucked we are. Getting consensus on doom. Making the doom seem worse via framing effects and focusing attention on the horror of it all. Getting into a building sense of how dire and hopeless it all is, and how it's just getting worse.

But it's from a drive to have an internal agony seen and responded to. It just can't be seen on the inside as that, because the seeing apparatus is built on top of a worldview made of attempts to get that pain met. There's no obvious place outside the pain from which to observe it.

I'm not picky about the details here. I'm also not sure this is whatsoever what's going on around these parts. But it strikes me as an example type in a family of things that's awfully plausible.

It's made even worse by the fact that it's possible to name real, true, correct problems with this kind of projection mechanism. Which means we can end up in an emotional analogue of a motte-and-bailey fallacy: attempts to name the emotional problem get pushed away because naming it makes the real problem seem less dire, which on a pre-conceptual level feels like the opposite of what could possibly help. And the arguments for dismissing the emotional frame get based on the true fact that the real problem is in fact real. So clearly it's not just a matter of healing personal trauma!

(…and therefore it's mostly not about healing personal trauma, so goes the often unstated implication (best as I can tell).)

But the invitation is to address the doom feeling differently, not to ignore the real problem (or at least not indefinitely). It's also to consider the possibility that the person in question might not be perceiving the real problem objectively because their inner little one might be using it as a microphone and optimizing what's "said" for effect, not for truth.

I want to acknowledge that if this isn't at all what's going on in spaces like Less Wrong, it might be hard to demonstrate that fact conclusively. So if you're really quite sure that the AI problem is basically as you think it is, and that you're not meaningfully confused about it, then it makes a lot of sense to ignore this whole consideration as a hard-to-falsify distraction.

But I think that if we get to 2028 and we see more evidence of increasing direness than of actual manifest doom, it'll be high time to consider that internal emotional work might be way, way, way more central to creating something good than is AI strategizing. Not because AI doom isn't plausible, but because it's probably not as dire as it always seems, and there's a much more urgent problem demanding attention first before vision can become clear.

 

Scaring people

In particular, it strikes me that the AI risk community orbiting Less Wrong has had basically the same strategy running for about two decades. A bunch of the tactics have changed, but the general effort occurs to me as the same.

The gist is to frighten people into action. Usually into some combo of (a) donating money and (b) finding ways of helping to frighten more people the same way. But sometimes into (c) finding or becoming promising talent and funneling them into AI alignment research.

That sure makes sense if you're trapped in a house that's on fire. You want the people trapped with you to be alarmed and to take action to solve the problem.

But I think there's strong reason to think this is a bad strategy if you're trapped in a house that's slowly sinking into quicksand over the course of decades. Not because you'll all be any less dead for how long it takes, but because activating the fight-or-flight system for that long is just untenable. If everyone gets frightened but you don't have a plausible pathway to solving the problem in short order, you'll end up with the same deadly scenario but now everyone will be exhausted and scared too.

I also think it's a formula for burnout if it's dire to do something about a problem but your actions seem to have at best no effect on said problem.

I've seen a lot of what I'd consider unwholesomeness over the years that I think is a result of this ongoing "scare people into action about AI risk" strategy. A ton of "the ends justify the means" thinking, and labeling people "NPCs", and blatant Machiavellian tactics. Inclusion and respect with words but an attitude of "You're probably not relevant enough for us to take seriously" expressed with actions and behind closed doors. Amplifying the doom message. Deceit about what projects are actually for.

I think it's very easy to lose track of wholesome morality when you're terrified. And it can be hard to remember in your heart why morality matters when you're hopeless and burned out.

(Speaking from experience! My past isn't pristine here either.)

Each time it's seemed like "This could be the key thing! LFG!!!" So far the results of those efforts seem pretty ambiguous. Maybe a bunch of them actually accelerated AI timelines. It's hard to say.

Maybe this time is different. With AI in the Overton window and with AI 2027 going viral, maybe Nate & Eliezer's book can shove the public conversation in a good direction. So maybe this "scare people into action" strategy will finally pay off.

But if it's still not working when we hit 2028, I think it'll be a really good time to pause and reconsider. Maybe this direction is both ineffective and unkind. Not as a matter of blame and shame; I think it has made sense to really try. But 31 months from now, it might be really good to steer this ship in a different direction, as a pragmatic issue of sincerely caring for what's important to us all going forward.

 

A shared positive vision

I entered the rationality community in 2011. At that time there was a lot of excitement and hope. The New York rationalist scene was bopping, meetups were popping up all over the world, and lots of folk were excited about becoming beisutsukai. MIRI (then the Singularity Institute for Artificial Intelligence) was so focused on things like the Visiting Summer Fellows Program and what would later be called the Rationality Mega Camp that they weren't getting much research done.

That was key to what created CFAR. There was a need to split off "offer rationality training" from "research AI alignment" so that the latter could happen at all.

(I mean, I'm sure some was happening. But it was a pretty big concern at the time. Some big donors were getting annoyed that their donations weren't visibly going to the math project Eliezer was so strongly advocating for.)

At the time there was shared vision. A sense that more was possible. Maybe we could create a movement of super-competent super-sane people who could raise the sanity waterline in lots of different domains, and maybe for the human race as a whole, and drown out madness everywhere that matters. Maybe powerful and relevant science could become fast. Maybe the dreams of a spacefaring human race that mid 20th century sci-fi writers spoke of could become real, and even more awesome than anyone had envisioned before. Maybe we can actually lead the charge in blessing the universe with love and meaning.

It was vague as visions go. But it was still a positive vision. It drove people to show up to CFAR's first workshops for instance. Partly out of fear of AI, sure, but at least partly out of excitement and hope.

I don't see or hear that kind of focus here anymore. I haven't for a long time.

I don't just mean there's cynicism about whether we can go forth and create the Art. I watched that particular vision decay as CFAR muddled along making great workshops but turning no one into Ender Wiggin. It turns out we knew how to gather impressive people but not how to create them.

But that's just one particular approach for creating a good and hopeful future.

What I mean is, nothing replaced that vision.

I'm sure some folk have shared their hopes. I have some. I've heard a handful of others. I think Rae's feedbackloop-first rationality is a maybe promising take on the original rationality project.

But there isn't anything like a collective vision for something good. Not that I'm aware of.

What I hear instead is:

  • AI will probably kill us all soon if we don't do something. Whatever that "something" is.
  • If anyone builds it, everyone dies. And right now lots of big powerful agents are racing to build it.
  • We're almost certainly doomed at this point, and all that's left is to die with dignity.
  • Is it ethical to have children right now, since they probably won't get to grow up and have lives?
  • No point in saving for retirement. We won't live that long.

It reminds me of this:

Very young children (infants & toddlers) will sometimes get fixated on something dangerous to them. Like they'll get a hold of a toxic marker and want to stick it in their mouth. If you just stop them, they'll get frustrated and upset. Their whole being is oriented to that marker and you're not letting them explore the way they want to.

But you sure do want to stop them, right? So what do?

Well, you give them something else. You take the marker away and offer them, say, a colorful whisk.

It's no different with dogs or cats, really. It's a pretty general thing. Attentional systems orient toward things. "Don't look here" is much harder than "Look here instead."

So if you notice a danger, it's important to acknowledge and address it, but you also want to change your orientation to the outcome you want.

I've been seriously concerned for the mental & emotional health of this community for a good while now. Its orientation, as far as I can tell, is to "not AI doom". Not a bright future. Not shared wholesomeness. Not healthy community. But "AI notkilleveryoneism".

I don't think you want to organize your creativity that way. Steering toward doom as an accidental result of focusing on it would be… really quite ironic and bad.

(And yes, I do believe we see evidence of exactly this pattern. Lots of people have noticed that quite a lot of AI risk mitigation efforts over the last two decades seem to have either (a) done nothing to timelines or (b) accelerated timelines. E.g. I think CFAR's main contribution to the space is arguably in its key role in inspiring Elon Musk to create OpenAI.)

My guess is most folk here would be happier if they picked a path they do want and aimed for that instead, now that they've spotted the danger they want to avoid. I bet we stand a much better chance of building a good future if we aim for one, as opposed to focusing entirely on not hitting the doom tree.

If we get to 2028 and there isn't yet such a shared vision, I think it'd be quite good to start talking about it. What future do we want to see? What might AI going well actually look like, for instance? Or what if AI stalls out for a long time, but we still end up with a wholesome future? What's that like? What might steps in that direction look like?

I think we need stuff like this to be whole, together.

 

Maybe it'll be okay

In particular, I think faith in humanity as a whole needs to be thinkable.

Yes, most people are dumber than the average Lesswronger. Yes, stupidity has consequences that smart people can often foresee. Yes, maybe humanity is too dumb not to shoot itself in the foot with a bazooka.

But maybe we've got this.

Maybe we're all in this together, and on some level that matters, we all know it.

I'm not saying that definitely is the case. I'm saying it could be. And that possibility seems worth taking to heart.

I'm reminded of a time when I was talking with a "normie" facilitator at a Circling retreat. I think this was 2015. I was trying to explain how humanity seemed to be ignoring its real problems, and how I was at that retreat trying to become more effective at doing something about it all.

I don't remember his exact words, but the sentiment I remember was something like:

I don't understand everything you're saying. But you seem upset, man. Can I give you a hug?

I didn't think that mattered, but I like hugs, so I said yes.

And I started crying.

I think he was picking up on a level I just wasn't tracking. Sure, my ideas were sensible and well thought out. But underneath all that I was just upset. He noticed that undercurrent and spoke to and met that part, directly.

He didn't have any insights about how we might solve existential risk. I don't know if he even cared about understanding the problem. I didn't walk away being more efficient at creating good AI alignment researchers.

But I felt better, and met, and cared for, and connected.

I think that matters a lot.

I suspect there's a lot going on like this. That at least some of the historical mainstream shrugging around AI has been because there's some other level that also deeply matters, one that's a more central focus for "normies" than for rationalists.

I think it needs to be thinkable that the situation is not "AI risk community vs. army of ignorant normie NPCs". Instead it might be more like, there's one form of immense brilliance in spaces like Less Wrong. And what we're all doing, throughout the human race, is figuring out how to interface different forms of brilliance such that we can effectively care for what's in our shared interests. We're all doing it. It just looks really different across communities, because we're all attending to different things and therefore reach out to each other in very different ways. And that's actually a really good thing.

My guess is that it helps a lot when communities meet each other with an attitude of 

We're same-sided here. We're in this together. That doesn't mean we yet know how to get along in each other's terms. But if it's important, we'll figure it out, even if "figure it out" doesn't look like what either of us expect at the start. We'll have to learn new ways of relating. But we can get there.

Come 2028, I hope Less Wrong can seriously consider for instance retiring terms like "NPC" and "normie", and instead adopt a more humble and cooperative attitude toward the rest of the human race. Maybe our fellow human beings care too. Maybe they're even paying vivid attention. It just might look different than what we're used to recognizing in ourselves and in those most like us.

And maybe also consider that even if we don't yet see how, and even if the transition is pretty rough at times, it all might turn out just fine. We don't know that it will. I don't mean to assert that it will. I mean, let's sincerely hold and attend to the possibility that it could. Maybe it'll all be okay.

 

Come 2028…

I want to reiterate that I don't mean what's going on right now is wrong and needs to stop. Like I said, I preordered If Anyone Builds It, Everyone Dies. I don't personally feel the need to become more familiar with those arguments or to have new ones. And I'm skeptical about the overall approach. But it seems like a really good push within this strategy, and if it makes things turn out well, then I'd be super happy to be wrong here. I support the effort.

But we now have this plausible timeline spelled out. And by January 2028 we'll have a reasonably good sense of how much it got right, and wrong.

…with some complication. It's one of those predictions that interacts with what it's predicting. So if AI 2027 doesn't pan out, one could argue that it would have, but that the outcome changed because the prediction went viral. And therefore we should keep pushing the same strategy as before, because maybe now it's finally working!

But I'm hoping for a few things here.

One is, maybe we can find a way to make these dire predictions less unfalsifiable. Not in general, but specifically AI 2027. What differences should we expect to see if (a) the predictions were distorted due to the trauma mechanism I describe in this post vs. (b) the act of making the predictions caused them not to come about? What other plausible outcomes are there come 2028, and what do we expect sensible updates to look like at that point?

Another hope I have is that the trauma projection thing can be considered seriously. Not necessarily acted on just yet. That could be distracting. But it's worth recognizing that if the trauma thing is really a dominant force in AI doomerism spaces, then when we get to January 2028 we might not have hit AI doom but it's going to seem like there are still lots of reasons to keep doing basically the same thing as before. How can we anticipate this reaction, distinguish it from other outcomes, and appropriately declare an HMC event if and when it happens?

So, this post is my attempt at kind of a collective emotional stop-loss order.

I kind of hope it turns out to be moot. Because in the world where it's needed, that's yet another 2.5 years of terror and pain that we might have skipped if we could have been convinced a bit sooner.

But being convinced isn't an idle point. It matters that maybe nothing like what I'm naming in this post is going on. There needs to be a high-integrity way of checking what's true here first.

I'm hoping I've put forward a good compromise.

Let's discuss for now, and then check in about it in 31 months.
