Agent-foundations researcher. Working on Synthesizing Standalone World-Models, aiming at a timely technical solution to the AGI risk fit for worlds where alignment is punishingly hard and we only get one try.
Currently looking for additional funders ($1k+, details). Consider reaching out if you're interested, or donating directly.
Or get me to pay you money ($5-$100) by spotting holes in my agenda or providing other useful information.
Perhaps the claim is that such Python programs won't be encountered due to relevant properties of the universe (i.e., because the universe is understandable).
That's indeed where some of the hope lies, yep!
Following up on [1] and [2]...
So, I've had a "Claude Code moment" recently: I decided to build something on a lark, asked Opus to implement it, found that the prototype worked fine on the first try, then kept blindly asking for more and more features and was surprised to discover that it just kept working.
The "something" in question was a Python file editor which behaves as follows:
The remarkable thing isn't really the functionality (to a large extent, this is just a wrapper on ast + QScintilla), but how little effort it took: <6 hours by wall-clock time to generate 4.3k lines of code, and I never actually had to look at the code; I just described the features I wanted and reported bugs to Opus. I've not verified the functionality comprehensively, but it basically works, I think.
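(For a sense of how much heavy lifting the ast module does in a project like this, here's a minimal sketch of the parsing half. This is my own illustration, not the actual SpanEditor code; the real thing wires something like this into QScintilla.)

```python
import ast
import sys

def extract_spans(source: str):
    """Return (kind, qualified name, first line, last line) for every
    function, method, and class defined in a Python source string."""
    tree = ast.parse(source)
    spans = []

    def visit(node, prefix=""):
        for child in ast.iter_child_nodes(node):
            if isinstance(child, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
                name = prefix + child.name
                kind = "class" if isinstance(child, ast.ClassDef) else "function"
                # end_lineno is available on Python 3.8+
                spans.append((kind, name, child.lineno, child.end_lineno))
                visit(child, prefix=name + ".")  # pick up methods and nested defs
            else:
                visit(child, prefix)  # descend through if/try/with wrappers

    visit(tree)
    return spans

if __name__ == "__main__":
    path = sys.argv[1]
    for kind, name, start, end in extract_spans(open(path, encoding="utf-8").read()):
        print(f"{kind:8} {name:40} lines {start}-{end}")
```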
How does that square with the frankly dismal performance I'd been observing before? Is it perhaps because I skilled up at directing Opus, cracked the secret to it, and can now indeed dramatically speed up my work?
No.
There was zero additional skill involved. I'd started doing it on a lark, so I'd disregarded all the lessons I'd been learning and just directed Opus the same way I'd been trying to do it at the start. And it Just Worked, in a way it Just Didn't Work before.
Which means the main predictor of how well Opus performs isn't how well you're using it/working with it, but what type of project you're working on.
Meaning it's very likely that the people for whom LLMs work exhilaratingly well are working on the kinds of projects LLMs happen to be very good at, and everyone for whom working with LLMs is a tooth-pulling exercise happens not to be working on those kinds of projects. Or, to reframe: if you need to code up something from the latter category, and it's not a side-project you can take or leave, you're screwed; no amount of skill on your part is going to make it easy. The issue is not one of skill.
The obvious question is: what are the differences between those categories? I have some vague guesses. To get a second opinion, I placed the Python editor ("SpanEditor") and the other project I've been working on ("Scaffold") into the same directory, and asked Opus to run a comparative analysis regarding their technical difficulty and speculate about the skillset of someone who'd be very good at the first kind of project and bad at the second kind. (I'm told this is what peak automation looks like.)
Its conclusions seem sensible:
Scaffold is harder in terms of:
SpanEditor is harder in terms of:
The fundamental difference: Scaffold builds infrastructure from primitives (graphics, commands, queries) while SpanEditor leverages existing infrastructure (Scintilla, AST) but must solve domain-specific semantic problems (code understanding).
[...]
Scaffold exhibits systems complexity - building infrastructure from primitives (graphics, commands, queries, serialization).
SpanEditor exhibits semantic complexity - leveraging existing infrastructure but solving domain-specific problems (understanding code without type information).
Both are well-architected. Which is "harder" depends on whether you value low-level systems programming or semantic/heuristic reasoning.
[...]
What SpanEditor-Style Work Requires
What Scaffold-Style Work Requires
The Cognitive Profile
Someone who excels at SpanEditor but struggles with Scaffold likely has these traits:
Strengths
| Trait | Manifestation |
| --- | --- |
| Strong verbal/symbolic reasoning | Comfortable with ASTs, grammars, semantic analysis |
| Good at classification | Naturally thinks "what kind of thing is this?" |
| Comfortable with ambiguity | Can write heuristics that work "most of the time" |
| Library-oriented thinking | First instinct: "what library solves this?" |
| Top-down decomposition | Breaks problems into conceptual categories |
Weaknesses
| Trait | Manifestation |
| --- | --- |
| Weak spatial reasoning | Struggles to visualize coordinate transformations |
| Difficulty with temporal interleaving | Gets confused when multiple state machines interact |
| Uncomfortable without guardrails | Anxious when there's no library to lean on |
| Single-layer focus | Tends to think about one abstraction level at a time |
| Stateless mental model | Prefers pure functions; mutable state across time feels slippery |
Deeper Interpretation
They Think in Types, Not States
SpanEditor reasoning: "A CodeElement can be a function, method, or class. A CallInfo has a receiver and a name."
Scaffold reasoning: "The window is currently in RESIZING_LEFT mode, the aura progress is 0.7, and there's a pending animation callback."
The SpanEditor developer asks "what is this?" The Scaffold developer asks "what is happening right now, and what happens next?"
They're Comfortable with Semantic Ambiguity, Not Mechanical Ambiguity
SpanEditor: "We can't know which class obj.method() refers to, so we'll try all classes." (Semantic uncertainty - they're fine with this.)
Scaffold: "If the user releases the mouse during phase 1 of the animation, do we cancel phase 2 or let it complete?" (Mechanical uncertainty - this feels overwhelming.)
They Trust Abstractions More Than They Build Them
SpanEditor developer's instinct: "Scintilla handles scrolling. I don't need to know how."
Scaffold requires: "I need to implement scrolling myself, which means tracking content height, visible height, scroll offset, thumb position, and wheel events."
The SpanEditor developer is a consumer of well-designed abstractions. The Scaffold developer must create them.
tl;dr: "they think in types, not states", "they're anxious when there's no library to lean on", "they trust abstractions more than they build them", and "tend to think about one abstraction level at a time".
Or, what I would claim is a fine distillation: "bad at novel problem-solving and gears-level modeling".
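To make the "types, not states" contrast concrete, here's a toy sketch, using the names from Opus's analysis rather than code from either project:

```python
from dataclasses import dataclass
from enum import Enum, auto
from typing import Optional

# "Thinking in types" (the SpanEditor style): the hard part is classifying
# what kind of thing you're looking at; once classified, handling it is easy.
@dataclass
class CallInfo:
    receiver: Optional[str]  # e.g. "obj" in obj.method(), or None for a bare call
    name: str                # e.g. "method"

# "Thinking in states" (the Scaffold style): the hard part is tracking what is
# happening right now and what must happen next, across mutable state.
class Mode(Enum):
    IDLE = auto()
    RESIZING_LEFT = auto()
    ANIMATING = auto()

class Window:
    def __init__(self):
        self.mode = Mode.IDLE
        self.aura_progress = 0.0  # 0.0..1.0, advanced by an animation tick

    def on_mouse_release(self):
        # The "mechanical ambiguity" question from the analysis: if the user
        # releases the mouse mid-animation, do we cancel the pending phase or
        # let it finish? A policy decision about time, not a classification.
        if self.mode is Mode.ANIMATING and self.aura_progress < 1.0:
            pass  # either answer is defensible; what matters is tracking the state
        self.mode = Mode.IDLE
```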
Now, it's a bit suspicious how well this confirms my cached prejudices. A paranoiac (which I am) might suspect the following: I'm sure it was transparent to Opus that it wrote both codebases (I didn't tell it, but I didn't bother removing its comments, and I'm sure it can recognize its own writing style), so perhaps when I asked it to list the strengths and weaknesses of that hypothetical person, it just retrieved some cached "what LLMs are good vs. bad at" spiel from its pretraining. There are reasons not to think that, though:
Overall... Well, make of that what you will.
The direction of my update, though, is once again in favor of LLMs being less capable than they sound like, and towards longer timelines.
Like, before this, there was a possibility that it really was a skill issue on my part, and one really could 10x their productivity with the right approach. But I've now observed that whether you get 0.8x'd or 10x'd depends on the project you're working on, not on your skill level – and if so, well, this pretty much explains the cluster of "this 10x'd my productivity!" reports, no? We no longer need to entertain the "maybe there really is a trick to it" hypothesis to explain said reports.
Anyway, this is obviously rather sparse data, and I'll keep trying to find ways to squeeze more performance out of LLMs. But, well, my short-term p(doom) has gone down some more.
I am interested in trying out the new code simplifier to see whether it can do a good job.
Tried it out a couple of times just now; it appears specialized for low-level, syntax-level rephrasings. It will inline functions and intermediate-variable computations that are only used once, and try to distill if-else blocks into something more elegant, but it won't even attempt anything at a higher level. It was very eager to remove Claude's own overly verbose/obvious comments, though. Very relatable.
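For illustration, the flavor of rewrite I mean (my own toy example, not actual simplifier output):

```python
# Before: a once-used intermediate variable and a verbose if-else.
def classify(score):
    threshold_exceeded = score > 0.5
    if threshold_exceeded:
        result = "positive"
    else:
        result = "negative"
    return result

# After: the intermediate is inlined and the branch collapsed.
def classify(score):
    return "positive" if score > 0.5 else "negative"
```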
Overall, it would be mildly useful in isolation, but I'm pretty sure you can get the same job done ten times faster using Haiku 4.5 or Composer-1 (Cursor's own blazing-fast LLM).
Curious if you get a different experience.
Eh, I don't think this really disproves the OP (who mentions bejeweled iPhones as well). This isn't really a "$100k phone", it's a normal $1k phone with $99k of jewelry tastelessly bolted on. You're not getting a 50,000 mAh battery, a gaming GPU, and Wi-Fi 12 plus 6G support in a 300 gram package here. Which, indeed: why not?
I've previously left a comment describing how fairly unimpressed I was with Claude Code Opus 4.5 and offering a few guesses about what causes the difference in people's opinions of its usefulness. Eight more days into it, I have new comments and new guesses.
tl;dr: Very useful (or maybe I'm also deluding myself), very hard to use (or maybe I have skill issues). If you want to build something genuinely complicated with it, it's probably worth it, but it will be an uphill battle against superhuman-speed codebase rot, and you will need significant technical expertise and/or the ability to learn that expertise quickly.
First, what I'm attempting is to use it to implement an app that's not really all that complicated, but which is still pretty involved, still runs fairly nontrivial logic at the backend (along the lines of this comment), and about whose functionality I have precise desiderata and very few vibes.
Is Opus 4.5 helpful and significantly speeding me up? Yes, of that there is no doubt. Asking it questions about the packages to use, what tools they offer, and how I could architect solutions to various problems I run into is incredibly helpful. Its answers are so much more precise than Google's, it distills information so much better than raw code documentation does, and it's both orders of magnitude faster than StackOverflow and smarter than the median answer you'd get there.
Is Claude Code helpful and speeding me up? That is more of an open question. Some loose thoughts:
To sum my current view up: Seems useful, but hard to use. You'll have to fight it/the decay it spreads in its wake every step of the way, and making a misstep will give your codebase lethal cancer.
We'll see how I feel about it in one more week, I suppose.
Seconded. Went from a skeptical "big if true" at the post title to rolling my eyes once I saw "iruletheworldmo".
For reference, check out this leak by that guy from February 2025:
ok. i’m tired of holding back. some of labs are holding things back from you.
the acceleration curve is fucking vertical now. nobody's talking about how we just compressed 200 years of scientific progress into six months. every lab hitting capability jumps that would've been sci-fi last quarter. we're beyond mere benchmarks and into territory where intelligence is creating entirely new forms of intelligence.
watched a demo yesterday that casually solved protein folding while simultaneously developing metamaterials that shouldn't be physically possible. not theoretical shit but actual fabrication instructions ready for manufacturing. the researchers presenting it looked shell shocked. some were laughing uncontrollably while others sat in stunned silence. there's no roadmap for this level of cognitive explosion.
we've crossed into recursive intelligence territory and it's no longer possible to predict second order effects. forget mars terraforming or fusion. those are already solved problems just waiting for implementation. the real story is the complete collapse of every barrier between conceivable and achievable. the gap between imagination and reality just vanished while everyone was arguing about risk frameworks. intelligence has broken free of all theoretical constraints and holy fuck nobody is ready for what happens next week. reality itself is now negotiable.
I guess it's kind of entertainingly written.
It seems like, yes, he is saying that wealth levels get locked in by early investment choices, and then that it is ‘hard to justify’ high levels of ‘inequality’ and that even if you can make 10 million a year in real income in the post-abundance future Larry Page’s heirs owning galaxies is not okay.
I say, actually, yes that’s perfectly okay, provided there is stable political economy and we’ve solved the other concerns so you can enjoy that 10 million a year in peace.
I dunno about that. I think it is not okay for directionally the same reasons it wouldn't be okay if we got an "infinitesimally aligned" paperclip maximizer who leaves the Solar System alone but paperclips the rest of the universe: astronomical waste.
Like, suppose 99% of the universe ends up split between twenty people, with them using it as they please, in ways that don't generate much happiness for others. Arguably it's not going to be that bad even in the "tech-oligarch capture" future (because Dario Amodei has made a pledge to donate 10% of his earnings or whatever[1]), but let's use that to examine our intuitions.
One way to look at it is: this means the rest of civilization will only end up 1% as big as it could be. This argument may or may not feel motivating to you; I know "more people is better" is not a very visceral-feeling intuition.
Another way to look at it is: this means all the other people will only end up with 1% of the lifespan they could have had. Like, in the very long term, post-scarcity isn't real, the universe's resources are finite (as far as we currently know), and physical entities need to continuously consume those to keep living. If 99% of resources are captured by people who don't care to share them, everyone else will end up succumbing to the heat death much faster than in the counterfactual.
This is isomorphic to "the rich have their soldiers take all the timber and leave the poor to freeze to death in the winter". The only reason it doesn't feel the same way is that it's hard to wrap your head around large numbers: surely you'd be okay with only living for 100 billion years, instead of 10 trillion years, right? In the here and now, both numbers just round up to "effectively forever". But no, once you actually get to the point of you and all your loved ones dying of negentropic starvation 100 billion years in, it would feel just as infuriatingly unfair.
I understand the tactical moves of "we have to pretend it's okay if the currently-powerful capture most of the value of the universe, so that they're more amicable to listening to our arguments for AI safety and don't get so scared of taxes they accelerate AI even further" and "we have to shut down the discussion of 'but which monkey gets the banana?' at every turn because it competes with the 'the banana is poisoned' messaging". But if we're keeping to Simulacrum Level 1, no, I do not in fact believe it's okay.
I also don't necessarily agree that those moves are pragmatically good. It's mostly pointless to keep talking to AI-industry insiders; if we're doing any rhetoric, it should focus on "outsiders". And if it is in fact true that the current default trajectory of worlds in which the AGI labs' plans succeed may lead to something like the above astronomical-waste scenarios, making those arguments to the general public is potentially a high-impact move. "Ban the AGI research because otherwise the rich will take all the stuff" is a much more memetically viral message than "ban the AGI because Terminator".
(To be clear, I'm not arguing we should join various coalitions making false arguments to that end, e. g. the datacenter water thing. But if there are true arguments of that form, as I believe there are...)
I do not trust that guy to keep such non-binding promises, by the way. His track record isn't good, what with "Anthropic won't advance the AI frontier".
I note that the wording in the more direct sources (rather than paraphrases) is "preventive war" and "bomb them", which doesn't actually strictly imply preventive nuclear bombings. It's plausible that "bomb them" and "war with the USSR" could only mean "nuclear war" in-context... But it'd also be really funny if this is another "Eliezer advocates nuking foreign datacenters" situation.
Ryan had suggested that, on his model, spending ~5%-more-than-commercially-expedient resources on alignment might drop takeover risks down to 50%. I'm interested in how he thinks this scales: how much more resources, in percentage terms, would be needed to drop the risk to 20%, 10%, 1%?