One winter a grasshopper, starving and frail, approaches a colony of ants drying out their grain in the sun to ask for food, having spent the summer singing and dancing.

Then, various things happen.

Customize

450Welcome to LessWrong!

Ruby, Raemon, RobertM, habryka

205

METR: Measuring AI Ability to Complete Long Tasks

Zach Stein-Perlman

13h

277

Good Research Takes are Not Sufficient for Good Strategic Takes

Neel Nanda

394AI 2027: What Superintelligence Looks Like

Daniel Kokotajlo, Thomas Larsen, elifland, Scott Alexander, Jonas V, romeo

322LessWrong has been acquired by EA

habryka

264VDT: a solution to decision theory

L Rudolf L

126Why Have Sentence Lengths Decreased?

Arjun Panickssery

317Will Jesus Christ return in an election year?

Eric Neyman

12d

315Policy for LLM Writing on LessWrong

jimrandomh

12d

567How to Make Superbabies

GeneSmith, kman

1mo

327

229Tracing the Thoughts of a Large Language Model

Adam Jermyn

349A Bear Case: My Predictions Regarding AI Progress

Thane Ruthenis

1mo

150

224Recent AI model progress feels mostly like bullshit

12d

400How AI Takeover Might Happen in 2 Years

joshc

1mo

134

74How To Believe False Things

Eneasz

Quick Takes

NormanPerlmutter7h189

Things are getting scary with the Trump regime. Rule of law is breaking down with regard to immigration enforcement and basic human rights are not being honored.

I'm kind of dumbfounded because this is worse than I expected things to get. Do any of you LessWrongers have a sense of whether these stories are exaggerated or if they can be taken at face value?

Deporting immigrants is nothing new, but I don't think previous administrations have committed these sorts of human rights violations and due process violations.

Krome detention center in Miami -- overcrowded and possibly without access to sufficient drinking water

https://www.miamiherald.com/news/local/immigration/article303485356.html
https://www.yahoo.com/news/unpacking-claims-ice-holding-4k-220400460.html

https://www.instagram.com/jaxxchismetalk/reel/DHjxaXzAddP/

https://www.instagram.com/catpowerofficial/reel/DHhsiMvJ8BT/people-are-dying-under-ice-detainment-in-miamiand-this-video-is-from-last-weekpl/

Canadian in the US legally to apply for a work visa detained 2 weeks by ICE without due process

https://www.theguardian.com/us-news/2025/mar/19/canadian-detained-us-immigration-jasmine-mooney

2trevor27m

Although the risk of frogboiling human rights abuses won't go away anytime soon, it's also important to keep in mind that Trump got popular by doing whatever makes the left condemn him because right-wingers seem to interpret that as a costly credible signal of commitment to them/the right/opposing the left, and his administration has spent a decade following this strategy as consistently as can reasonably be considered possible for a sitting president, most of the time landing on strategies to provoke condemnation from liberals in non-costly or ambiguously costly ways (see Jan 6th). See Scott Alexander's classic post It's Bad On Purpose To Make You Click; engagement bait has been the soul of Trump's political persona since it emerged in the mid-2010s, and it will be interesting going forward to see whether the recent tariff designs will end up as serious policy and be added as a new centerpiece of the tax and spending regime (which had taken a stable form since the Vietnam War and the end of the Gold Standard[1]). 1. ^ The case could also be made that the computerization of Wall Street during the late 70s and 80s transformed the economy sufficiently radically that the current tax and spending paradigm could be condered more like 30-40 years old, or you could pin it to the 1950s when the tariff paradigm ended; either way, the modern emergence of massive recessions, predictive analytics, pandemic risk, and international military emphasis on trade geopolitics, all indicate potential for elite consensus around unusually large macroeconomic paradigm shifts.

NormanPerlmutter7h189

Things are getting scary with the Trump regime. Rule of law is breaking down with regard to immigration enforcement and basic human rights are not being honored. I'm kind of dumbfounded because this is worse than I expected things to get. Do any of you LessWrongers have a sense of whether these stories are exaggerated or if they can be taken at face value? Deporting immigrants is nothing new, but I don't think previous administrations have committed these sorts of human rights violations and due process violations. Krome detention center in Miami -- overcrowded and possibly without access to sufficient drinking water https://www.miamiherald.com/news/local/immigration/article303485356.html https://www.yahoo.com/news/unpacking-claims-ice-holding-4k-220400460.html https://www.instagram.com/jaxxchismetalk/reel/DHjxaXzAddP/ https://www.instagram.com/catpowerofficial/reel/DHhsiMvJ8BT/people-are-dying-under-ice-detainment-in-miamiand-this-video-is-from-last-weekpl/ Canadian in the US legally to apply for a work visa detained 2 weeks by ICE without due process https://www.theguardian.com/us-news/2025/mar/19/canadian-detained-us-immigration-jasmine-mooney

Ben Pace11h*200

I occasionally get texts from journalists asking to interview me about things around the aspiring rationalist scene. A few notes on my thinking and protocols for this:

I generally think it is pro-social to share information with serious journalists on topics of clear public interest.
By-default I speak with them only if their work seems relatively high-integrity. I like journalists whose writing is (a) factually accurate, (b) boring, and (c) do not feel to me to have an undercurrent of hatred for their subjects.
By default I speak with them off-the-record, and then offer to send them write-ups of the things I said that they want to quote. This has gone quite well. I've felt comfortable speaking in my usual fashion without worrying about nailing each and every phrasing. Then I ask what they're interested in quoting, and I send them (typically a 1-2 page) google doc on those topics (largely re-stating what I already said to them, and making some improvements / additions). Then they tell me which quotes they want to use (typically cutting many sentences or paragraphs half-way). Then I make one or two slight edits and give them explicit permission to quote. I think this has gone quite well and they've felt my quotes were substantive and improvements.
For the New York Times, I am currently trying out the policy of "I am happy to chat off-the-record. I will also offer quotes by my usual protocol, but I will only give them conditional on you including a mention that I disapprove of the NYT's de-anonymization policies (which I bring up due to your reckless and negligent behavior that upturned the life of a beloved member of my community)." I am about to try this for the first time, and I expect they will thus not want to use my quotes, and that's fine by me.

2Yoav Ravid39m

The quoting policy seems very good and clever :)

Ben Pace11h*200

I occasionally get texts from journalists asking to interview me about things around the aspiring rationalist scene. A few notes on my thinking and protocols for this: * I generally think it is pro-social to share information with serious journalists on topics of clear public interest. * By-default I speak with them only if their work seems relatively high-integrity. I like journalists whose writing is (a) factually accurate, (b) boring, and (c) do not feel to me to have an undercurrent of hatred for their subjects. * By default I speak with them off-the-record, and then offer to send them write-ups of the things I said that they want to quote. This has gone quite well. I've felt comfortable speaking in my usual fashion without worrying about nailing each and every phrasing. Then I ask what they're interested in quoting, and I send them (typically a 1-2 page) google doc on those topics (largely re-stating what I already said to them, and making some improvements / additions). Then they tell me which quotes they want to use (typically cutting many sentences or paragraphs half-way). Then I make one or two slight edits and give them explicit permission to quote. I think this has gone quite well and they've felt my quotes were substantive and improvements. * For the New York Times, I am currently trying out the policy of "I am happy to chat off-the-record. I will also offer quotes by my usual protocol, but I will only give them conditional on you including a mention that I disapprove of the NYT's de-anonymization policies (which I bring up due to your reckless and negligent behavior that upturned the life of a beloved member of my community)." I am about to try this for the first time, and I expect they will thus not want to use my quotes, and that's fine by me.

LoganStrohl13h213

here is how to cast a spell. (quick writeup of a talk i gave at DunCon)

Short Version
1. have an intention for change.
2. take a symbolic action representing the change.

that's it. that's all you need. but also, if you want to do it more:

Long Version

Prep

1. get your guts/intuition/S1/unconscious in touch with your intention somehow, and work through any tangles you find.

2. choose a symbolic action representing (and ideally participating in) the change you want, and design your spell around that. to find the right action, try this thought experiment: one day you wake up, and find you're living in the future where you've completely succeeded at getting the thing you want. you stretch and open your eyes. what's the very first thing that tips you off that something's different?

3. when it's time, lay out your materials and do any other physical/logistical prep work.

The Spell Itself

4. dedicate a sacred space (aka "cast a circle"). for example, carry incense around the space three times, or pour a circle of oats, or just move stuff out of the way. maybe ask something that vibes with protection and/or sacredness for you to help (such as a nearby river, or the memory of your grandfather).

5. raise energy. this means, do something that requires effort and focus and that invites your unconscious to the party while getting it engaged with the materials of your spell. maybe chant something relevant, or hold your hands over the magic herbs while pretending you can pour your intention into them. this is not the symbolic action; it just prepares you for the symbolic action.

6. release energy (that is, cast the spell). take the symbolic action you've planned. cut the string, drink the potion, burn the paper, whatever.

Conclude

7. seal the spell. do something that means "i'm now on the other side, in the post-spell world". the wiccans say "so mote it be". the christians say "amen". if your spell involved a jar, you could literally seal it with wax. i once did a spell involving a pair of earrings, which i sealed by putting the earrings on.

8. ground. "return excess energy to the earth." do something that gradually brings you back to a more ordinary state of mind. maybe imagine roots reaching down from your body deep into the earth and pretend you can equalize with the ground by sending your breath down and up those roots. or shake yourself like a wet dog. or eat a piece of cake.

9. release the space. thank whatever helped you cast the circle, then take an action that's the reverse of whatever you did to dedicate the space. walk the circle backward, or put all the furniture back.

Reinforce

10. choose at least one concrete action to take in line with your intention, and take that action. bonus points if it's a repeated action, like "every sunday i'll call my mom"

now, obviously 10 is the physical mechanism by which a change might happen in the physical world. but i recommend trying to answer for yourself, "how could including 1-9 possibly be superior in some way to jumping straight to 10?"

Hypnotic Framing

one more model of spellcasting: a spell is a self-hypnosis script.

standard hypnosis has 6 parts:
1. pre-talk
2. an induction
3. a deepener (sometimes)
4. a suggestion(s) (pre- or post-hypnotic)
5. an awakener
6. following a post-hypnotic suggestion if one was given

In the Long Version above,

1-3 is pre-talk,
4 is an induction
5 is a deepener
6 and 7 are a post-hypnotic suggestion
8 and 9 are awakeners
10 is following the post-hypnotic suggestion.

5Garrett Baker9h

It seems reasonable to mention that I know of many who have started doing "spells" like this, with a rationalized "oh I'm just hypnotizing myself, I don't actually believe in magic" framing who then start to go off the deep-end and start actually believing in magic. That's not to say this happens in every case or even in most cases. Its also not to say that hypnotizing yourself can't be useful sometimes. But it is to say that if you find this tempting to do because you really like the idea of magic existing in real life, I suggest you re-read some parts of the sequences.

2Garrett Baker9h

(you also may want to look into other ways of improving your conscientiousness if you're struggling with that. Things like todo systems, or daily planners, or simply regularly trying hard things)

3LoganStrohl13h

btw, although i've read a lot of witchy books at this point, this specific framework is most heavily influenced by the book "Spellcrafting" by Arin Murphy-Hiscock.

2trevor8h

An aspect where I expect further work to pay off is stuff related to self-visualization, which is fairly powerful (e.g. visualizing yourself doing something for 10 hours will generally go a really long way to getting you there, and for the 10 hour thing it's more a question of what to do when something goes wrong enough to make the actul events sufficiently different from what you imagined, and how to do it in less than 10 hours).

LoganStrohl13h213

here is how to cast a spell. (quick writeup of a talk i gave at DunCon) Short Version 1. have an intention for change. 2. take a symbolic action representing the change. that's it. that's all you need. but also, if you want to do it more: Long Version Prep 1. get your guts/intuition/S1/unconscious in touch with your intention somehow, and work through any tangles you find. 2. choose a symbolic action representing (and ideally participating in) the change you want, and design your spell around that. to find the right action, try this thought experiment: one day you wake up, and find you're living in the future where you've completely succeeded at getting the thing you want. you stretch and open your eyes. what's the very first thing that tips you off that something's different? 3. when it's time, lay out your materials and do any other physical/logistical prep work. The Spell Itself 4. dedicate a sacred space (aka "cast a circle"). for example, carry incense around the space three times, or pour a circle of oats, or just move stuff out of the way. maybe ask something that vibes with protection and/or sacredness for you to help (such as a nearby river, or the memory of your grandfather). 5. raise energy. this means, do something that requires effort and focus and that invites your unconscious to the party while getting it engaged with the materials of your spell. maybe chant something relevant, or hold your hands over the magic herbs while pretending you can pour your intention into them. this is not the symbolic action; it just prepares you for the symbolic action. 6. release energy (that is, cast the spell). take the symbolic action you've planned. cut the string, drink the potion, burn the paper, whatever. Conclude 7. seal the spell. do something that means "i'm now on the other side, in the post-spell world". the wiccans say "so mote it be". the christians say "amen". if your spell involved a jar, you could literally seal it with wax. i once did a spe

leogao2d497

every 4 years, the US has the opportunity to completely pivot its entire policy stance on a dime. this is more politically costly to do if you're a long-lasting autocratic leader, because it is embarrassing to contradict your previous policies. I wonder how much of a competitive advantage this is.

Garrett Baker19h130

Autarchies, including China, seem more likely to reconfigure their entire economic and social systems overnight than democracies like the US, so this seems false.

4leogao17h

It's often very costly to do so - for example, ending the zero covid policy was very politically costly even though it was the right thing to do. Also, most major reconfigurations even for autocratic countries probably mostly happen right after there is a transition of power (for China, Mao is kind of an exception, but thats because he had so much power that it was impossible to challenge his authority even when he messed up).

2Garrett Baker16h

The closing off of China after/during Tinamen square I don't think happened after a transition of power, though I could be mis-remembering. See also the one-child policy, which I also don't think happened during a power transition (allowed for 2 children in 2015, then removed all limits in 2021, while Xi came to power in 2012). I agree the zero-covid policy change ended up being slow. I don't know why it was slow though, I know a popular narrative is that the regime didn't want to lose face, but one fact about China is the reason why many decisions are made is highly obscured. It seems entirely possible to me there were groups (possibly consisting of Xi himself) who believed zero-covid was smart. I don't know much about this though. I will also say this is one example of china being abnormally slow of many examples of them being abnormally fast, and I think the abnormally fast examples win out overall. Ish? The reason he pursued the cultural revolution was because people were starting to question his power, after the great leap forward, but yeah he could be an outlier. I do think that many autocracies are governed by charismatic & powerful leaders though, so not that much an outlier.

8leogao11h

I mean, the proximate cause of the 1989 protests was the death of the quite reformist general secretary Hu Yaobang. The new general secretary, Zhao Ziyang, was very sympathetic towards the protesters and wanted to negotiate with them, but then he lost a power struggle against Li Peng and Deng Xiaoping (who was in semi retirement but still held onto control of the military). Immediately afterwards, he was removed as general secretary and martial law was declared, leading to the massacre.

interstice21h137

Or disadvantage, because it makes it harder to make long-term plans and commitments?

2Ben14h

Having unstable policy making comes with a lot of disadvantages as well as advantages. For example, imagine a small poor country somewhere with much of the population living in poverty. Oil is discovered, and a giant multinational approaches the government to seek permission to get the oil. The government offers some kind of deal - tax rates, etc. - but the company still isn't sure. What if the country's other political party gets in at the next election? If that happened the oil company might have just sunk a lot of money into refinery's and roads and drills only to see them all taken away by the new government as part of its mission to "make the multinationals pay their share for our people." Who knows how much they might take? What can the multinational company do to protect itself? One answer is to try and find a different country where the opposition parties don't seem likely to do that. However, its even better to find a dictatorship to work with. If people think a government might turn on a dime, then they won't enter into certain types of deal with it. Not just companies, but also other countries. So, whenever a government does turn on a dime, it is gaining some amount of reputation for unpredictability/instability, which isn't a good reputation to have when trying to make agreements in the future.

leogao2d497

Stephen McAleese1h20

A recent essay called "Keep the Future Human" made a compelling case for avoiding building AGI in the near future and building tool AI instead.

The main point of the essay is that AGI is the intersection of three key capabilities:

High autonomy
High generality
High intelligence

It argues that these three capabilities are dangerous when combined together and pose unacceptable risks to the job market and culture of humanity and would replace rather than augment humans. Instead of building AGI, the essay recommends building powerful but controllable tool-like AI that has only one or two of the three capabilities. For example:

Driverless cars are autonomous and intelligent but not general.
AlphaFold is intelligent but not autonomous or general.
GPT-4 is intelligent and general but not autonomous.

It also recommends compute limits to limit the overall capability of AIs.

Link: https://keepthefuturehuman.ai/

Stephen McAleese1h20

A recent essay called "Keep the Future Human" made a compelling case for avoiding building AGI in the near future and building tool AI instead. The main point of the essay is that AGI is the intersection of three key capabilities: * High autonomy * High generality * High intelligence It argues that these three capabilities are dangerous when combined together and pose unacceptable risks to the job market and culture of humanity and would replace rather than augment humans. Instead of building AGI, the essay recommends building powerful but controllable tool-like AI that has only one or two of the three capabilities. For example: * Driverless cars are autonomous and intelligent but not general. * AlphaFold is intelligent but not autonomous or general. * GPT-4 is intelligent and general but not autonomous. It also recommends compute limits to limit the overall capability of AIs. Link: https://keepthefuturehuman.ai/

Popular Comments

lc4d11934

LessWrong has been acquired by EA

As a newly minted +100 strong upvote, I think the current karma economy accurately reflects how my opinion should be weighted

Vladimir_Nesov2d464

AI 2027: What Superintelligence Looks Like

Non-Google models of late 2027 use Nvidia Rubin, but not yet Rubin Ultra. Rubin NVL144 racks have the same number of compute dies and chips as Blackwell NVL72 racks (change in the name is purely a marketing thing, they now count dies instead of chips). The compute dies are already almost reticle sized, can't get bigger, but Rubin uses 3nm (~180M Tr/mm2) while Blackwell is 4nm (~130M Tr/mm2). So the number of transistors per rack goes up according to transistor density between 4nm and 3nm, by 1.4x, plus better energy efficiency enables higher clock speed, maybe another 1.4x, for the total of 2x in performance. The GTC 2025 announcement claimed 3.3x improvement for dense FP8, but based on the above argument it should still be only about 2x for the more transistor-hungry BF16 (comparing Blackwell and Rubin racks). Abilene site of Stargate[1] will probably have 400K-500K Blackwell chips in 2026, about 1 GW. Nvidia roadmap puts Rubin (VR200 NVL144) 1.5-2 years after Blackwell (GB200 NVL72), which is not yet in widespread use, but will get there soon. So the first models will start being trained on Rubin no earlier than late 2026, much more likely only in 2027, possibly even second half of 2027. Before that, it's all Blackwell, and if it's only 1 GW Blackwell training systems[2] in 2026 for one AI company, shortly before 2x better Rubin comes out, then that's the scale where Blackwell stops, awaiting Rubin and 2027. Which will only be built at scale a bit later still, similarly to how it's only 100K chips in GB200 NVL72 racks in 2025 for what might be intended to be a single training system, and not yet 500K chips. This predicts at most 1e28 BF16 FLOPs (2e28 FP8 FLOPs) models in late 2026 (trained on 2 GW of GB200/GB300 NVL72), and very unlikely more than 1e28-4e28 BF16 FLOPs models in late 2027 (1-4 GW Rubin datacenters in late 2026 to early 2027), though that's alternatively 3e28-1e29 FP8 FLOPs given the FP8/BF16 performance ratio change with Rubin I'm expecting. Rubin Ultra is another big step ~1 year after Rubin, with 2x more compute dies per chip and 2x more chips per rack, so it's a reason to plan pacing the scaling a bit rather than rushing it in 2026-2027. Such plans will make rushing it more difficult if there is suddenly a reason to do so, and 4 GW with non-Ultra Rubin seems a bit sudden. So pretty similar to Agent 2 and Agent 4 at some points, keeping to the highest estimates, but with less compute than the plot suggests for months while the next generation of datacenters is being constructed (during the late 2026 to early 2027 Blackwell-Rubin gap). ---------------------------------------- 1. It wasn't confirmed all of it goes to Stargate, only that Crusoe is building it on the same site as it did the first buildings that do go to Stargate. ↩︎ 2. 500K chips, 1M compute dies, 1.25M H100-equivalents, ~4e27 FLOPs for a model in BF16. ↩︎

Aella3d76-29

Consider showering

Strong disagree. This is an ineffective way to create boredom. Showers are overly stimulating, with horrible changes in temperature, the sensation of water assaulting you nonstop, and requiring laborious motions to do the bare minimum of scrubbing required to make society not mad at you. A much better way to be bored is to go on a walk outside or lift weights at the gym or listen to me talk about my data cleaning issues

Recent Discussion

abstractapplic's Shortform

abstractapplic

7mo

2abstractapplic12h

I recently watched (the 1997 movie version of) Twelve Angry Men, and found it fascinating from a Bayesian / confusion-noticing perspective. My (spoilery) notes (cw death, suspicion, violence etc): From all the above, I conclude: and However if and only if (trigger warning for detailed discussion of that thing I just mentioned) I'm curious what other LW users think.

abstractapplic12h20

One last, even more speculative thought:

Literally everything the racist juror does in the back half of the movie is weird and suspicious. It's strange that he expects people to be convinced by his bigoted tirade; it's also strangely convenient that he's willing to vote not guilty by the end even though he A) hasn't changed his mind and B) knows a hung jury would probably eventually lead to the death of the accused, which he wants.

I don't think it's likely, but I'd put maybe a ~1% probability on . . .

. . . him being in league with the protagonist, and them running a two-man con on the other ten jurors to get the unanimous verdict they want.

NormanPerlmutter's Shortform

NormanPerlmutter

18NormanPerlmutter7h

trevor27m20

Benito's Shortform Feed

Ben Pace

Ω 57y

The comments here are a storage of not-posts and not-ideas that I would rather write down than not.

20Ben Pace11h

Yoav Ravid39m20

The quoting policy seems very good and clever :)

How much progress actually happens in theoretical physics?

ChristianKl

12h

I frequently hear people make the claim that progress in theoretically physics is stalled, partly because all the focus is on String theory and String theory doesn't seem to pan out into real advances.

Believing it fits my existing biases, but I notice that I lack the physics understanding to really know whether or not there's progress. What do you think?

Mitchell_Porter42m20

As Adam Scherlis implies, the standard model turns out to be very effective at all the scales we can reach. There are a handful of phenomena that go beyond it - neutrino masses, "dark matter", "dark energy" - but they are weak effects that offer scanty clues as to what exactly is behind them.

On the theoretical side, we actually have more models of possible new physics than ever before in history, the result of 50 years of work since the standard model came together. A lot of that is part of a synthesis that includes the string theory paradigm, but th... (read more)

3Charlie Steiner3h

A lot of the effect is picking high-hanging fruit. Like, go to phys rev D now. There's clearly a lot of hard work still going on. But that hard work seems to be getting less result, because they're doing things like carefully calculating the trailing-order terms of the muon's magnetic moment to get a change many decimal places down. (It turns out that this might be important for studying physics beyond the Standard Model. So this is good and useful work, definitely not being literally stalled.) Another chunk of the effect is that you generally don't know what's important now. In hindsight you can look back and see all these important bits of progress woven into a sensible narrative. But research that's being done right now hasn't had time to earn its place in such a narrative. Especially if you're an outside observer who has to get the narrative of research third-hand.

4Adam Scherlis5h

The main thing I'd add is that string theory is not the problem. All the experimental low hanging fruit was picked decades ago. There are very general considerations that suggest that any theory of quantum gravity, stringy or otherwise, will only really be testable at the Planck scale. What this means in practice is that theoretical high-energy physics doesn't get to check its answers anymore. I think there's still progress, and still hope for new empirical results (especially in astrophysics and cosmology), but it's much harder without a barrage of unexplained observations.

2Garrett Baker4h

This doesn’t seem to address the question, which was why do people believe there is a physics slow-down in the first place.

AI 2027: What Superintelligence Looks Like

394

Daniel Kokotajlo, Thomas Larsen, elifland, Scott Alexander, Jonas V, romeo

This is a linkpost for https://ai-2027.com/

In 2021 I wrote what became my most popular blog post: What 2026 Looks Like. I intended to keep writing predictions all the way to AGI and beyond, but chickened out and just published up till 2026.

Well, it's finally time. I'm back, and this time I have a team with me: the AI Futures Project. We've written a concrete scenario of what we think the future of AI will look like. We are highly uncertain, of course, but we hope this story will rhyme with reality enough to help us all prepare for what's ahead.

You really should go read it on the website instead of here, it's much better. There's a sliding dashboard that updates the stats as you scroll through the scenario!

But I've nevertheless copied the...

(Continue Reading – 12069 more words)

hive1h10

There is no infinite growth in nature. Everything will hit a ceiling at some point. So I agree that the intelligence explosion will eventually take a sigmoid shape as it approaches the physical limits. However I think the physical limits are far of. While we will get diminishing returns for each individual technology, we will also shift to a new technology each time. It might slow down when the Earth has been transformed into a super computer, as interplanetary communication naturally slows down processing speed. But my guess is that this will happen long after the scenario described here.

2Leon Lang2h

I worked on geometric/equivariant deep learning a few years ago (with some success, leading to two ICLR papers and a patent, see my google scholar: https://scholar.google.com/citations?user=E3ae_sMAAAAJ&hl=en). The type of research I did was very reasoning-heavy. It’s architecture research in which you think hard about how to mathematically guarantee that your network obeys some symmetry constraints appropriate for a domain and data source. As a researcher in that area, you have a very strong incentive to claim that a special sauce is necessary for intelligence, since providing special sauces is all you do. As such, my prior is to believe that these researchers don’t have any interesting objection to continued scaling and “normal” algorithmic improvements to lead to AGI and then superintelligence. It might still be interesting to engage when the opportunity arises, but I wouldn’t put extra effort into making such a discussion happen.

2MondSemmel3h

Large institutions are super slow to change, and usually many years behind the technological frontier. It seems to me like the burden of proof is very obviously on your perspective. For instance, US policy only acted large-scale on Covid after we were already far along the exponential. That should be a dealbreaker for this being your dealbreaker. Also, there is no single entity called "the government"; individuals can be more or less aware of stuff, but that doesn't mean the larger entity acts with something resembling awareness. Or cohesion, for that matter.

2Mitchell_Porter3h

I don't recognize any of these names. I'm guessing they are academics who are not actually involved with any of the frontier AI efforts, and who think for various technical reasons that AGI is not imminent? edit: OK, I looked them up, Velickovic is at DeepMind, I didn't see a connection to "Big AI" for any of the others, but they are all doing work that might matter to the people building AGI. Nonetheless, if their position is that current AI paradigms are going to plateau at a level short of human intelligence, I'd like to see the argument. AIs can still make mistakes that are surprising to a human mind - e.g. in one of my first conversations with the mighty Gemini 2.5, it confidently told me that it was actually Claude Opus 3. (I was talking to it in Google AI Studio, where it seems to be cut off from some system resources that would make it more grounded in reality.) But AI capabilities can also be so shockingly good, that I wouldn't be surprised if they took over tomorrow.

To get the best posts emailed to you, create an account! (2-3 posts per week, selected by the LessWrong moderation team.)

AI CoT Reasoning Is Often Unfaithful

Zvi

21h

A new Anthropic paper reports that reasoning model chain of thought (CoT) is often unfaithful. They test on Claude Sonnet 3.7 and r1, I’d love to see someone try this on o3 as well.

Note that this does not have to be, and usually isn’t, something sinister.

It is simply that, as they say up front, the reasoning model is not accurately verbalizing its reasoning. The reasoning displayed often fails to match, report or reflect key elements of what is driving the final output. One could say the reasoning is often rationalized, or incomplete, or implicit, or opaque, or bullshit.

The important thing is that the reasoning is largely not taking place via the surface meaning of the words and logic expressed. You can’t look at the words and logic...

(Continue Reading – 1965 more words)

testingthewaters1h10

I mean, this applies to humans too. The words and explanations we use for our actions are often just post hoc rationalisations. An efficient text predictor must learn not what the literal words in front of them mean, but the implied scenario and thought process they mask, and that is a strictly nonlinear and "unfaithful" process.

11ryan_greenblatt13h

I think it would be a mistake to interpret this paper as a substantial update against large safety gains from inspecting CoT. This paper exposes unfaithfulness in cases where the non-visible reasoning is extremely minimal such that it can easily happen within a forward pass (e.g. a simple reward hack or an easy-to-notice hint). However, a lot of my hope for CoT faithfulness comes from cases where the reasoning is sufficiently difficult and serially deep that the model would have to reason in CoT for the reasoning to occur. This could be either the model reasoning through the basics of alignment faking / training gaming / scheming (e.g., if I comply in training then I won't be modified which means that in deployment...) or the model reasoning through how to strike (without getting caught) given that it is misalignment. Correspondingly, I think the biggest concern with CoT-based safety is models becoming much more capable of opaque reasoning which could be due to encoded reasoning (aka steganography), architectures which allow for opaque recurrence or similar (aka neuralese), or just much more powerful forward passes. (See also my commentary in this tweet thread. There is some related discussion in this openphil RFP under encoded reasoning.)

8Trevor Hill-Hand18h

I've actually noticed this in a hobby project, where I have some agents running around a little MOO-like text world and talking to each other. With DeepSeek-R1, just because it's fun to watch them "think" like little characters, I noticed I see this sort of thing a lot (maybe 1-in-5-ish, though there's a LOT of other scaffolding and stuff going on around it which could also be causing weird problems): <think> Alright I need to do this very specific thing "A" which I can see in my memories I've been trying to do for a while instead of thing B. I will do thing A, by giving the command "THING A". </think> THING B

Stephen McAleese's Shortform

Stephen McAleese

Stephen McAleese1h20

A recent essay called "Keep the Future Human" made a compelling case for avoiding building AGI in the near future and building tool AI instead.

The main point of the essay is that AGI is the intersection of three key capabilities:

High autonomy
High generality
High intelligence