Brace for impact. We are presumably (checks watch) four hours from GPT-5.
That’s the time you need to catch up on all the other AI news.
In another week, I might have done an entire post on Gemini 2.5 Deep Thinking, or Genie 3, or a few other things. This week? Quickly, there’s no time.
OpenAI has already released an open model. I’m aiming to cover that tomorrow.
Table of Contents
Also: Claude 4.1 is an incremental improvement, On Altman’s Interview With Theo Von.
Language Models Offer Mundane Utility
The Prime Minister of Sweden asks ChatGPT for job advice ‘quite often.’
Claude excels in cybersecurity competitions, based on tests they’ve run over the past year, so this was before Opus 4.1 and mostly before Opus 4.
Not wanting to spend the time interviewing candidates is likely a mistake, but there are other time sinks involved as well, and there are good reasons to want to keep your operation at size one rather than two. It can still be a dangerous long-term trap to decide to do all the things yourself, especially if it accumulates state that will get harder and harder to pass off. I would not recommend it.
Language Models Don’t Offer Mundane Utility
Anthropic has cut off OpenAI employees from accessing Claude.
That, and OpenAI rather clearly violated Anthropic’s terms of service?
As in, they used it to build and train GPT-5, which they are not allowed to do.
OpenAI called their use ‘industry standard.’ I suppose they are right that it is industry standard to disregard the terms of service and use competitors’ AI models to train your own.
A thread asking what people actually do with ChatGPT Agent. The overall verdict seems to be glorified tech demo not worth using. That was my conclusion so far as well, that it wasn’t valuable enough to overcome its restrictions, especially the need to enter passwords. I’ll wait for GPT-5 and reassess then.
Huh, Upgrades
Veo 3 Fast and Veo 3 image-to-video join the API. Veo 3 Fast is $0.40 per second of video including audio. That is cheap enough for legit creators, but it is real money if you are aimlessly messing around.
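To make that concrete, here is a quick back-of-envelope at the quoted $0.40 per second; the clip counts and lengths below are my own illustrative assumptions, not anything from Google’s pricing page beyond the per-second rate.

```python
# Back-of-envelope cost at Veo 3 Fast's quoted $0.40 per second of video.
# The clip counts and lengths below are illustrative assumptions.
RATE_PER_SECOND = 0.40

def cost(seconds: float) -> float:
    return seconds * RATE_PER_SECOND

print(f"One 8-second clip:        ${cost(8):.2f}")       # $3.20
print(f"A 60-second spot:         ${cost(60):.2f}")      # $24.00
print(f"Fifty 8-second attempts:  ${cost(50 * 8):.2f}")  # $160.00
```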
I do have access, so I will run my worthy queries there in parallel and see how it goes.
I notice the decision to use Grok 4 as a comparison point rather than Opus 4. Curious.
ChatGPT Operator is being shut down and merged into Agent. Seems right. I don’t know why they can’t migrate your logs, but if you have Operator chats you want to save, do it by August 31.
Jules can now open pull requests.
Claude is now available for purchase by federal government departments through the General Services Administration, contact link is here.
Claude Code shipped automated security reviews via /security-review, with GitHub integration, checking for things like SQL injection risks and XSS vulnerabilities. If you find an issue you can ask Claude Code to fix it, and they report they’re using this functionality internally at Anthropic.
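For a concrete sense of the kind of thing such a review is meant to flag, here is a minimal toy sketch of a classic SQL injection and its parameterized fix (my own illustrative example, not Anthropic’s tooling or its output):

```python
# Illustrative only: the class of issue an automated security review flags.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, role TEXT)")
conn.execute("INSERT INTO users VALUES ('alice', 'admin'), ('bob', 'user')")

def lookup_unsafe(name: str):
    # Vulnerable: user input is interpolated directly into the SQL string,
    # so name = "' OR '1'='1" returns every row (classic SQL injection).
    return conn.execute(f"SELECT * FROM users WHERE name = '{name}'").fetchall()

def lookup_safe(name: str):
    # Fixed: parameterized query, the driver handles escaping.
    return conn.execute("SELECT * FROM users WHERE name = ?", (name,)).fetchall()

print(lookup_unsafe("' OR '1'='1"))  # leaks both rows
print(lookup_safe("' OR '1'='1"))    # returns nothing
```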
OpenAI is giving ChatGPT access to the entire federal workforce for $1 total. Smart.
College students get the Gemini Pro plan free for a year, including in the USA.
On Your Marks
Psyho shares thoughts about the AWTF finals, in which they took first place ahead of OpenAI.
The longer explanation is a good read. The biggest weakness of OpenAI’s agent was that it was prematurely myopic. It maximized short term score long before it was nearing the end of the contest, rather than trying to make conceptual advances, and let its code become bloated and complex, mostly wasting the second half of its time.
As a game and contest enjoyer, I know how important it is to know when to pivot away from exploration towards victory points, and how punishing it is to do so too early. Presumably the problem for OpenAI is that this wasn’t a choice, the system cannot play a longer game, so it acted the way it should act with ~6 hours rather than 10, and if you gave it 100 hours instead of 10 it wouldn’t be able to adjust. You’ll have to keep a close eye out for this when building your larger projects.
WeirdML now has historical scores for older models.
The Kaggle Game Arena will pit LLMs against each other in classic games of skill. They started on Tuesday with Chess, complete with commentary from the very overqualified GM Hikaru, Gotham Chess and Magnus Carlsen.
The contestants for chess were all your favorites: Gemini 2.5 Pro and Flash, Opus 4, o3 and o4-mini, Grok 4, DeepSeek r1 and Kimi-K2.
Thinking Deeply With Gemini 2.5
Gemini 2.5 Deep Think, which uses parallel thinking, is now available for Ultra subscribers. They say it is a faster version of the model that won IMO gold, although this version would only get bronze. They have it at 34.8% on Humanity’s Last Exam (versus 21.6% for Gemini 2.5 Pro and 20.3% for o3) and 87.6% on LiveCodeBench (versus 72% for o3 and 74.2% for Gemini 2.5 Pro).
The model card is here.
Key facts: 1M token context window, 192k token output. Sparse MoE. Fully multimodal. Trained on ‘novel reinforcement learning techniques.’
As in, this is some deep thinking. If you don’t need deep thinking, don’t call upon it.
Here are their mundane safety evaluations.
The -10% on instruction following is reported as being due to over-refusals.
For frontier safety, Google agrees that it is possible that CBRN thresholds have been reached with this latest round of models, and they have put proactive mitigations in place, in line with RAND SL2, which I would judge as insufficient.
The other frontier safety evaluations are various repetitions of ‘this scores better than previous models, but not enough better for us to worry about it yet.’ That checks with the reports on capability. This isn’t a major leap beyond o3-pro and Opus 4, so it would be surprising if it presented a new safety issue.
One example of this I did not love was on Deceptive Alignment, where they did not finish their testing prior to release, although they say they did enough to be confident it wouldn’t meet the risk thresholds. I would much prefer that we always finish the assessments first and avoid the temptation to rush the product out the door, even if it was in practice fine in this particular case. We need good habits and hard rules.
Deep Thinking isn’t making much of a splash, which is why this isn’t getting its own post. Here are some early reports.
Choose Your Fighter
No one seems to be choosing AWS for their AI workloads, and Jassy’s response when asked why AWS is growing slowly was so bad that Amazon stock dropped 4%.
Fun With Media Generation
Olivia Moore has a positive early review of Grok’s Imagine image and video generator, especially its consumer friendliness.
Everyone underestimates the practical importance of UI and ease of use. Marginal improvements in output quality at this point are not so obviously important for most purposes in images or many short videos, compared to ease of use. I don’t usually create AI images, mostly because I don’t bother or can’t think of what I want, not because I can’t find a sufficiently high quality image generator.
How creepy are the latest AI video examples? Disappointingly not creepy.
Optimal Optimization
What does OpenAI optimize for?
You, a fool who looks at the outputs, strongly suspect engagement and thumbs up and revenue and so on.
OpenAI, a wise and noble corporation, says no, their goal is your life well lived.
Wait, how do they tell the difference between this and approval or a thumbs up?
Well, okay, OpenAI also pays attention to retention. Because that means it is useful, you see. That’s definitely not sycophancy or maximizing for engagement.
That’s why every tech company maximizes for the user being genuinely helped and absolutely nothing else. It’s how they keep the subscriptions up in the long haul.
They do admit things went a tiny little bit wrong with that one version of 4o, but they swear that was a one-time thing, and they’re making some changes:
I notice the default option here is ‘keep chatting.’
These are good patches. But they are patches. They are whack-a-mole where OpenAI is finding particular cases where their maximization schemes go horribly wrong in the most noticeable ways and applying specific pressure on those situations in particular.
What I want to see in such an announcement is OpenAI actually saying they will be optimizing for the right thing, or a less wrong thing, and explaining how they are changing to optimize for that thing. This is a general problem, not a narrow one.
Get My Agent On The Line
Did you know that ‘intelligence’ and ‘agency’ and ‘taste’ are distinct things?
It is well known that AIs can’t have real agency and they can’t write prompts or evals.
Dan Elton is being polite. This is another form of Intelligence Denialism, that a sufficiently advanced intelligence could find it impossible to develop taste or act agentically. This is Obvious Nonsense. If you have sufficiently advanced ‘intelligence’ that contains everything except ‘agency’ and ‘taste,’ those remaining elements aren’t going to be a problem.
We keep getting versions of ‘AI will do all the things we want AI to do but mysteriously not do these other things so that humans are still in charge and have value and get what they want (and don’t die).’ It never makes any sense, and when AI starts doing some of the things it would supposedly never do the goalposts get moved and we do it again.
Or even more foolishly, ‘don’t worry, what if we simply did not give AI agents or put them in charge of things, that is a thing humanity will totally choose.’
The full post is ‘keeping AI agents under control doesn’t seem very hard.’ Yes, otherwise serious, smart and very helpful people think ‘oh we simply will not put the AI agents in charge so it doesn’t matter if they are not aligned.’
Says the person already doing (and to his credit admitting doing) the exact opposite.
So it will be fine, because large organizations won’t give AI agents much authority, and there is absolutely no other way for AI agents to cause problems anyway, and no, the companies that keep humans constantly in the loop won’t lose out to the others. There will always (this really is his argument) be other bottlenecks that slow things down enough for humans to review what the AI is doing, the humans will understand everything necessary to supervise in this way, and that will solve the problem. The AIs’ scheming is fine, we deal with scheming all the time in humans, it’s the same thing.
For now we have a more practical barrier, which is that OpenAI Agent has been blocked by Cloudflare. What will people want to do about that? Oh, right.
And what will some of the AI companies do about it?
Kudos to OpenAI and presumably the other top labs for not trying to do an end run. Perplexity, on the other hand? They deny it, but the evidence presented seems rather damning, which means either Cloudflare or Perplexity is outright lying.
This is in addition to Cloudflare’s default setting blocking all AI training crawlers.
Deepfaketown and Botpocalypse Soon
In more ‘Garry Tan has blocked me so I mostly don’t see his takes about why everything will mysteriously end up good, actually’ we also have this:
There are in theory good versions of what Taoki is describing. But no, I do not expect us to end up, by default, with the good versions.
This is at the end of spec’s thread about a ‘renowned clinical psychologist’ who wrote a guest NYT essay about how ChatGPT is ‘eerily effective’ for therapy. Spec says the author was still ‘one shotted’ by his interactions with ChatGPT and offers reasonable evidence of this.
ChatGPT in its current form has some big flaws as a virtual rubber duck. A rubber duck is not sycophantic. If the goal is to see if you endorse your own statements, it helps to not have the therapist keep automatically endorsing them.
That is hard to entirely fix, but not that hard to largely mitigate, and human therapy is inconvenient, expensive and supply limited. Therapy-style AI interactions have a lot of upside if we can adjust to how to have them in healthy fashion.
Justine Moore enjoys letting xAI’s companions Ani and Valentine flirt with each other.
Remember how we were worried about AI’s impact on 2024? There’s always 2028.
We are not ready for a lot of things that are going to happen around 2028. Relatively speaking I expect the election to have bigger concerns than impact from AI, and impact from AI to have bigger concerns than the election. What we learned in 2024 is that a lot of things we thought were ‘supply side’ problems in our elections and political conversation are actually ‘demand side’ problems.
For a while the #1 video on TikTok was an AI fake that successfully fooled a lot of people, with animals supposedly leaving Yellowstone National Park.
You Drive Me Crazy
Reddit’s r/Psychiatry is asked, are we seeing ‘AI psychosis’ in practice? Eliezer points to one answer saying they’ve seen two such patients, if you go to the original thread you get a lot of doctors saying ‘yes I have seen this’ often multiple times, with few saying they haven’t seen it. That of course is anecdata and involves selection bias, and not everyone here is ‘verified’ and people on the internet sometimes lie, but this definitely seems like it is not an obscure corner case.
We definitely don’t have that kind of time. Traditional academic approaches are so slow as to be useless.
They Took Our Jobs
Not only are the radiologists not out of work, they’re raking in the dough, with job openings such as this one offering partners $900k with 14-16 weeks of PTO.
There are indeed still parts of a radiologist job that AI cannot do. There are also parts that could be done by AI, or vastly improved and accelerated by AI, where we haven’t done so yet.
I know what this is going to sound like, but this is what it looks like right before radiologist jobs are largely automated by AI.
Scott Truhlar says in the thread it takes five years to train a radiologist. The marginal value of a radiologist, versus not having one at all, is very high, and paid for by insurance.
So if you expect a very large supply of radiology to come online soon, what is the rational reaction? Doctors in training will choose other specialties more often. If the automation is arriving on an exponential, you should see a growing shortage followed by (if automation is allowed to happen) a rapid glut.
That would be true even if automation arrived ‘on time.’ It is even more true given that it is somewhat delayed. But (aside from the delay itself) it is in no way a knock against the idea that AI will automate radiology or other jobs.
If you’re looking to automate a job, the hardcore move is to get that job and do it first. That way you know what you are dealing with. The even more hardcore move of course is to then not tell anyone that you automated the job.
I once did automate a number of jobs, and I absolutely did the job myself at varying levels of automation as we figured out how to do it.
Get Involved
UK AISI is taking applications for research in economic theory and game theory, in particular information design, robust mechanism design, bounded rationality, open-source game theory, collusion, and commitment.
You love to see it. These are immensely underexplored topics that could turn out to have extremely high leverage and everyone should be able to agree to fund orders of magnitude more such research.
I do not expect to find a mechanism design that gets us out of our ultimate problems, but it can help a ton along the way, and it can give us much better insight into what our ultimate problems will look like. Demonstrating these problems are real and what they look like would already be a huge win. Proving they can’t be solved, or can’t be solved under current conditions, would be even better.
(Of course actually finding solutions that work would be better still, if they exist.)
DARPA was directed by the AI Action Plan to invest in AI interpretability efforts, which Sunny Gandhi traces back to Encode and IFP’s proposal.
Palisade Research is offering up to $1k per submission for examples of AI agents that lie, cheat or scheme, also known as ‘free money.’ Okay, it’s not quite that easy given the details, but it definitely sounds super doable.
A challenge from Andrej Karpathy, I will quote in full:
Y Combinator is hosting a hackathon on Saturday, winner gets a YC interview.
Introducing
Anthropic offers a free 3-4 hour course in AI Fluency.
The Digital Health Ecosystem, a government initiative to ‘bring healthcare into the digital age’ including unified EMR standards, with partners including OpenAI, Anthropic, Google, Apple and Microsoft. It will be opt-in without a government database. In theory push button access will give your doctor all your records.
Gemini Storybook, you describe the story you want and get a 10 page illustrated storybook. I’m skeptical that we actually want this but early indications are it does a solid job of the assigned task.
ElevenLabs launches an AI Music Service using only licensed training data, meaning anything you create with it will be fully in the clear.
City In A Bottle
Google’s Genie 3, giving you interactive, persistent, playable environments with prompted world events from a single prompt.
This is a major leap over similar previous products, both at Google and otherwise.
The examples look amazing, super cool.
It’s not that close as a consumer product. There it faces the same issues as virtual reality. It’s a neat trick and can generate cool moments, but that doesn’t turn into a compelling game, product or experience. That remains elusive and is likely several steps away. We will get there, but I expect a substantial period where it feels like it ‘should’ be awesome to use and in practice it isn’t yet.
I do expect ‘fun to play around with’ for a little bit, but only a very little bit.
Well yes, everything is cherrypicked, but I don’t think that matters much. It is more that it is a lot easier to show someone ‘anything at all’ that looks cool than the particular thing you want, and for worlds that problem is much worse than for movies.
The use case that matters is providing a training playground for robotics and agents.
Those who were optimistic about its application to robotics were often very excited about exactly that.
Unprompted Suggestions
One weird trick: put your demos at the start, not the end.
I am skeptical about the claimed degree of impact but I buy the principle that examples at the end warp continuations and you can get a better deal the other way.
In Other AI News
Nvidia software had a vulnerability allowing root access, which would allow stealing of others’ model weights on shared machines, or stealing or altering of data.
I mean I thought we settled that one with ‘we will give the AGIs access to the internet.’
Birds can store data and transfer it at 2 MB/s. I mention this as part of the ‘sufficiently advanced AI will surprise you in how it gets around the restrictions you set up for it’ set of intuition pumps.
Rohit refers us to the new XBai o4 as ‘another seemingly killer open source model from China!!’ Which is not the flex one might think; it is reflective of the Chinese labs presenting every release, along with its amazing benchmarks, as a new killer, and it mostly then never being heard from again once people try it in the real world. The exceptions that turned out to be clearly legit so far are DeepSeek and Kimi. That’s not to say that I verified XBai o4 is not legit, but so far I haven’t heard it mentioned again.
NYT reporter Cade Metz really is the worst, and with respect to those trying not to die is very much Out To Get You by any means necessary. We need to be careful to distinguish him from the rest of the New York Times, which is not ideal but very much not a monolith.
Papers, Please
A new paper analyzes how the whole ‘attention’ thing actually works.
The Mask Comes Off
Here are the seven questions. Here is the full letter. Here is a thread about it.
I do think they should have to answer (more than only, but at least) these questions. We already know many of the answers, but it would be good for them to confirm them explicitly. I have signed the letter.
Show Me the Money
OpenAI raises another small capital round of $8.3 billion at $300 billion. This seems bearish, if xAI is approaching $200 billion and Anthropic is talking about $160 billion and Meta is offering a billion for various employees why isn’t OpenAI getting a big bump? The deal with Microsoft and the nonprofit are presumably holding them back.
After I wrote that, I then saw that OpenAI is in talks for a share sale at $500 billion. That number makes a lot more sense. It must be nice to get in on obviously underpriced rounds.
Anthropic gets almost half its API revenue from Cursor and GitHub; it also now has more API revenue than OpenAI. OpenAI maintains its lead because it owns consumer subscriptions and ‘business and partner.’
Anthropic has focused on coding. So far it is winning that space, and that space is a large portion of the overall space. It has what I consider an excellent product outside of coding, but has struggled to gain mainstream consumer market share due to lack of visibility and not keeping up with some features. I expect Anthropic to try harder to compete in those other areas soon but their general strategy seems to be working.
A fun graph from OpenRouter:
Bingo, I presume. Here’s an obviously wrong explanation where Grok dies hard:
No, Claude did not suddenly steal most of OpenAI’s queries. Stop asking Grok things.
Lulu Meservey equates the AI talent war to a Religious Victory in a game of Civ 6, in which you must convince others of your vision of the future.
That is a lot of different ways of saying mostly the same thing. Be a great place to work on building the next big thing the way you want to build it.
However, I notice Lulu’s statement downthread that we won the Cold War because we had Reagan and our vision of the future was better. Common mistake. We won primarily because our economic system was vastly superior. The parallel here applies.
A question for which the answer seems to be 2025, or perhaps 2026:
People turning down the huge Meta pay packages continues to be suggestive of massive future progress, and evidence for it, but far from conclusive.
The obvious response is, Andrew was at Meta for 11 years, so he knows what it would be like to go back, and also he doesn’t have to. Also you can have better opportunities without an imminent singularity, although it is harder.
Tyler Cowen analyzes Meta’s willingness to offer the billion dollar packages, finding them easily justified despite Tyler’s skepticism about superintelligence, because Meta is worth $2 trillion and that valuation relies on the quality of its AI. For a truly top talent, $1 billion is a bargain.
Where we disagree is that Tyler attributes the growth in Meta’s valuation over the last few years, where it went from ~$200 billion to ~$2 trillion, primarily to market expectations for AI. I do not think that is the case. I think it is primarily driven by the profitability of its existing social media services. Yes, some of that is AI’s ability to enhance that profitability, but I do not think investors are primarily bidding that high because of Meta’s future as an AI company. If they were, they’d be wise to instead pour that money into better AI companies, starting with Google.
Quiet Speculations
Given that human existence is in large part a highly leveraged bet against the near-term existence of AGI, Dean’s position here seems like a real problem if true:
It is an especially big problem if our government thinks of the situation this way. If we think that we are doomed without AGI because of government debt or lack of growth, that is the ultimate ‘doomer’ position, and they will force the road to AGI even if they realize it puts us all in grave danger.
The good news is that I do not think Dean Ball is correct here.
Nor do I think that making additional practical progress has to lead directly to AGI. As in, I strongly disagree with Roon here:
There is absolutely room for middle ground.
As in, I think our investments can pay off without AGI. There is tremendous utility in AI without it being human level across cognition or otherwise being sufficiently capable to automate R&D, create superintelligence or pose that much existential risk. Even today’s levels of capabilities can still pay off our investments, and modestly improved versions (like what we expect from GPT-5) can do better still. Due to the rate of depreciation, our current capex investments have to pay off rapidly in any case.
I even think there are other ways out of our fiscal problems, if we had the will, even if AI doesn’t serve as a major driver of economic growth. We have so much unlocked potential in other ways. All we really have to do is get out of our own way and let people do such things as build houses where people want to live, combine that with unlimited high skilled immigration, and we would handle our debt problem.
Some people look at this and say ‘infinite distance from profitable.’
I say ‘remarkably close to profitable, look at the excellent unit economics.’
What I see are $4 billion in revenue against $2 billion in strict marginal costs, maybe call it $3.5 billion if you count everything to the maximum including the Microsoft revenue share. So all you have to do to fix that is scale up. I wouldn’t be so worried.
Indeed, as others have said, if OpenAI was profitable that would be a highly bearish signal. Why would it be choosing to make money?
And indeed, they are scaling very quickly by ‘ordinary business’ standards.
OpenAI took a $5 billion loss in 2024, but they are tripling their revenue from $4 billion to $12 billion in 2025. If they (foolishly) held investment constant (which they won’t do) this would make them profitable in 2026.
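A rough back-of-envelope on those figures, under my simplifying assumption that everything beyond strict marginal cost is fixed spend and that marginal cost scales with revenue:

```python
# Back-of-envelope using only the figures cited above (all in $ billions).
# Simplifying assumption: everything beyond strict marginal cost is "fixed"
# spend (training, salaries, etc.) and marginal cost scales with revenue.
rev_2024, marginal_2024, loss_2024 = 4.0, 2.0, 5.0

gross_margin = (rev_2024 - marginal_2024) / rev_2024   # 0.50
total_cost_2024 = rev_2024 + loss_2024                 # 9.0
fixed_2024 = total_cost_2024 - marginal_2024           # 7.0

rev_2025 = 12.0
gross_profit_2025 = rev_2025 * gross_margin            # 6.0
loss_2025 = fixed_2024 - gross_profit_2025             # ~1.0 if spend held flat

print(f"2024 gross margin: {gross_margin:.0%}")
print(f"Implied 2024 fixed spend: ${fixed_2024:.1f}B")
print(f"2025 loss if fixed spend held flat: ${loss_2025:.1f}B")
# Another year of comparable revenue growth with flat spend flips this
# positive, which is the 'profitable in 2026' claim above.
```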
Jacob Trefethen asks what AI progress means for medical progress.
As per usual this is a vision of non-transformational versions of AI, where it takes 10+ years to meaningfully interact with the physical world and its capabilities don’t otherwise advance much. In that case, we can solve a number of bottlenecks, but others remain, although I question #8 and #9 as true bottlenecks here, plus ambition should be highly responsive to increased capability to match those ambitions. The physical costs in #7 are much easier to solve if we are much richer, as we should be much more willing to pay them, even if AI cannot improve our manufacturing and delivery methods, which again is a rather unambitious perspective.
The thing about solving #1, #2 and #3 is that this radically improves the payoff matrix. A clinical trial can be thought of as solving two mostly distinct problems.
Even without any reforms, AI can transition clinical trials into mostly being #2. That design works differently: you can design much cheaper tests if you already know the answer, and you avoid the tests that were going to fail.
How fast will the intelligence explosion be? Tom Davidson has a thread explaining how he models this question and gets this answer, as well as a full paper, in which things race ahead but then the inability to scale up compute as fast slows things down once efficiency gains hit their effective limits:
We should proceed cautiously in any case. This kind of mapping makes assumptions about what ‘years of progress’ looks like, equating it to lines on graphs. The main thing is that, if you get ‘6 years of progress’ past the point where you’re getting a rapid 6 years of progress, the end result is fully transformative levels of superintelligence.
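To illustrate that qualitative shape only, here is a toy simulation of ‘software gains compound until they hit a ceiling, then compute growth sets the pace.’ The growth rates and ceiling are arbitrary assumptions of mine, not numbers from Davidson’s paper:

```python
# Toy dynamic only, not Tom Davidson's actual model: capability is
# compute * software efficiency. Software gains compound faster as
# capability rises (recursive improvement) but hit an assumed ceiling,
# while compute grows at a fixed slower rate, so yearly progress
# accelerates and then falls back toward the compute growth rate.
COMPUTE_GROWTH = 1.35         # assumed yearly compute scale-up
EFFICIENCY_CEILING = 1_000.0  # assumed limit on software efficiency gains

compute, efficiency = 1.0, 1.0
for year in range(1, 11):
    capability = compute * efficiency
    headroom = max(EFFICIENCY_CEILING / efficiency, 1.0)
    efficiency *= min(1.0 + 0.3 * capability ** 0.5, headroom)
    compute *= COMPUTE_GROWTH
    growth = (compute * efficiency) / capability
    print(f"year {year}: capability grew {growth:.1f}x")
# Output rises from ~1.8x/year to ~7x/year, then snaps back to the
# 1.35x compute growth rate once the efficiency ceiling binds.
```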
Mark Zuckerberg Spreads Confusion
Mark Zuckerberg seems to think, or wants to convince us, that superintelligence means really cool smart glasses and optimizing the Reels algorithm.
Is this lying, is it sincere misunderstanding, or is he choosing to misunderstand?
As Hashim points out, it ultimately does not matter what you want the ‘superintelligence’ to be used for if you give it to the people, as he says he wants to do.
There are two other ways in which this could matter a lot.
The first one would be great. I am marginally sad about but ultimately fine with Meta optimizing its Reels algorithm or selling us smart glasses with a less stupid assistant. Whereas if Meta builds an actual superintelligence, presumably everyone dies.
The second one would be terrible. I am so sick of this happening to word after word.
The editors of The Free Press were understandably confused by Zuckerberg’s statement, and asked various people ‘what is superintelligence, anyway?’ Certainly there is no universally agreed definition.
I think that o3 pro’s answer here is pretty good. The key thing is that both of these answers have nothing to do with Zuckerberg’s vision or definition of ‘superintelligence.’ Tyler thinks we won’t get superintelligence any time soon (although he thinks o3 counts as AGI), which is a valid prediction, as opposed to Zuckerberg’s move of trying to ruin the term ‘superintelligence.’
By contrast, Matt Britton then goes Full Zuckerberg (never go full Zuckerberg, especially if you are Zuckerberg) and says ‘In Many Ways, Superintelligence Is Already Here’ while also saying Obvious Nonsense like ‘AI will never have the emotional intelligence that comes from falling in love or seeing the birth of a child.’ Stop It. That’s Obvious Nonsense, and also words have meaning. Yes, we have electronic devices and AIs that can do things humans cannot do, that is a different thing.
Aravind Srinivas (CEO of Perplexity) declines to answer and instead says ‘the most powerful use of AI will be to expand curiosity’ without any evidence because that sounds nice, and says ‘kudos to Mark and anyone else who has a big vision and works relentlessly to achieve it’ when Mark actually has the very small mission of selling more ads.
Nicholas Carr correctly labels Zuck’s mission as the expansion of his social engineering project and correctly tells us to ignore his talk of ‘superintelligence.’ Great answer. He doesn’t try to define superintelligence but it’s irrelevant here.
Eugenia Kuyda (CEO of Replika) correctly realizes that ‘we focus too much on what AI can do for us and not enough on what it can do to us’ but then focuses on questions like ‘emotional well-being.’ She correctly points out that different versions of AI products might optimize in ways hostile to humans, or in ways that promote human flourishing.
Alas, she then thinks of this as a software design problem for how our individualized AIs will interact with us on a detailed personal level, treating this all as an extension of the internet and social media mental health problems, rather than asking how such future AIs will transform the world more broadly.
Similarly, she buys into this ‘personal superintelligence’ line without pausing to realize that’s not superintelligence, or that if it were superintelligence it would be used for quite a different purpose.
This survey post was highly useful, because it illustrated that yes Zuckerberg seems to successfully be creating deep confusions about the term superintelligence with which major tech CEOs are willing to play along, potentially rendering the term superintelligence meaningless if we are not careful. Also those CEOs don’t seem to grasp the most important implications of AI, at all.
That’s not super. Thanks for asking.
The Quest for Sane Regulations
As I said in response to Zuckerberg last week, what you want intelligence or any other technology to be used for when you build it has very little to do with what it will actually end up being used for, unless you intervene to force a different outcome.
Even if AGI or superintelligence goes well, if we choose to move forward with developing it (and yes this is a choice), we will face choices where all options are currently unthinkable, either in their actions or their consequences or both.
Also ‘going trans- or post-human in a generation or two’ is what you are hoping for when you create superintelligence (ASI). That seems like a supremely optimistic timeline for such things to happen, and a supremely optimistic set of things that happens relative to other options. If you can’t enthusiastically endorse that outcome, were it to happen, then you should be yelling at us to stop.
As for Samuel’s other example, there are a lot of people who seem to think you can give everyone their own superintelligence, not put constraints on what they do with it or otherwise restrict their freedoms, and the world doesn’t quickly transform itself into something very different that fails to preserve what we cared about when choosing to proceed that way. Those people are not taking this seriously.
Seán Ó hÉigeartaigh once again reminds us that it’s not that China has shown no interest in AI risks, it is that China’s attempts to cooperate on AI safety issues have consistently been rebuffed by the United States. That doesn’t mean that China is all that serious about existential risk, but same goes for our own government, and we’ve consistently made it clear we are unwilling to cooperate on safety issues and want to shut China out of conversations. It is not only possible but common in geopolitics to compete against a rival while cooperating on issues like this, we simply choose not to.
On the flip side, why is it that we are tracking the American labs that sign the EU AI Act Code of Practice but not the Chinese labs? Presumably because no one expects the Chinese companies to sign the code of practice, which puts them in the rogue group with Meta, only more so, as they were already refusing to engage with EU regulators in general. So there was no reason to bother asking.
Governor DeSantis indicates AI regulations are coming to Florida.
We will see what he comes up with. Given his specific concerns we should not have high hopes for this, but you never know.
David Sacks Once Again Amplifies Obvious Nonsense
Peter Wildeford also does the standard work of explaining, once again, that when China releases a model with good benchmarks that is the standard amount behind American models, no, that does not mean anything went wrong. And even if it were a good model, sir, it does not mean that you should respond by abandoning the exact thing that best secures our lead, which is our advantage in compute.
This is in the context of the release of z.AI’s GLM-4.5. That release didn’t even come up on my usual radars until I saw Aaron Ginn’s Obvious Nonsense backwards WSJ op-ed using this as the latest ‘oh the Chinese have a model with good benchmarks so I guess the export restrictions are backfiring.’ Which I would ignore if we didn’t have AI Czar David Sacks amplifying it.
Why do places like WSJ, let alone our actual AI Czar, continue to repeat this argument:
We can and should, as the AI Action Plan itself implores, tighten the export controls, especially the enforcement thereof.
What about the actual model from z.AI, GLM 4.5? Is it any good?
That last line should be a full stop in terms of this being worrisome. GLM-4.5 came out months after DeepSeek’s release, and it is worse (or at least not substantially better) than DeepSeek, which was itself months behind even at its peak.
Remember that Chinese models reliably underperform their benchmarks. DeepSeek I mostly trust not to be blatantly gaming the benchmarks. GLM-4.5? Not so much. So not only are these benchmarks not so impressive, they probably massively overrepresent the quality of the model.
Oh, and then there’s this:
Add in the fact that I hadn’t otherwise heard a peep. In the cases where a Chinese model was actually good, Kimi K2 and DeepSeek’s v3 and r1, I got many alerts to this.
When I asked in response to this, I did get informed that it performs similarly to the other top Chinese lab performances (by Qwen 3 and Kimi-K2) on WeirdML, and Teortaxes said it was a good model, sir, and that its small model is useful, but confirmed it is in no way a breakthrough.
Chip City
We now move on to the WSJ op-ed’s even worse claims about chips. Once again:
If you’re not yet in ‘stop, stop, he’s already dead’ mode Peter has more at the link.
The op-ed contains lie after falsehood after lie and I only have so much space and time. Its model of what to do about all this is completely incoherent, saying we should directly empower our rival, presumably to maintain chip market share, which wouldn’t even change since Nvidia literally is going to sell every chip it makes no matter what if they choose to sell them.
This really is not complicated:
Selling H20s to China does not ‘promote the American tech stack’ or help American AI. It directly powers DeepSeek inference for the Chinese military.
Here’s backup against interest from everyone’s favorite DeepSeek booster. I don’t think things are anything like this close but otherwise yeah, basically:
We also are giving up the ‘recruit from Chinese universities’ (and otherwise stealing the top talent) advantage due to immigration restrictions. It’s all unforced errors.
No Chip City
My lord.
Actually imposing this would be actually suicidal, if he actually meant it. He doesn’t.
As announced this is not designed to actually get implemented at all. If you listen to the video, he’s planning to suspend the tariff ‘if you are building in the USA’ even if your American production is not online yet.
So this is pure coercion. The American chip market is being held hostage to ‘building in the USA,’ which presumably TSMC will qualify for, and Apple qualifies for, and Nvidia would presumably find a way to qualify for, and so on. It sounds like it’s all or nothing, so it seems unlikely this will be worth much.
The precedent is mind boggling. Trump is saying that he can and will decide to charge or not charge limitless money to companies, essentially destroying their entire business, based on whether they do a thing he likes that involved spending billions of dollars in a particular way. How do you think that goes? Solve for the equilibrium.
Meanwhile, this does mean we are effectively banning chip imports from all but the major corporations that can afford to ‘do an investment’ at home to placate him. There will be no competition to challenge them, if this sticks. Or there will be, but we won’t be able to buy any of those chips, and will be at a severe disadvantage.
The mind boggles.
Energy Crisis
It seems right, given the current situation, to cover our failures in energy as part of AI.
Then after I wrote that, Burgum went after solar and wind on federal land again, with an order to consider ‘capacity density’ because solar and wind might take too much land. This is Obvious Nonsense; we are not suffering from a shortage of such land, and if we were, then perhaps you should charge the market price.
And that’s with all the barriers that were already in place. Imagine if we actually encouraged this energy source (or if we repealed the Jones Act and otherwise got to work on doing actual shipbuilding, but that’s another story.)
This among other actions sure looks like active purely malicious sabotage:
Why is there a War on Wind Turbines? I won’t speculate, but this makes a mockery of pretty much everything, both involving AI and otherwise.
The Trump Administration version of the U.S. Department of Energy is once again actively attacking the idea of using renewable energy and batteries as sources of electrical power in general, using obvious nonsense. If you’re serious about ‘winning the AI race,’ or ‘beating China’ both in AI and in general? Please come get your boy.
Meanwhile, Google announces ‘they’ve found a way to shift compute tasks – and most notably ML workloads – to help meet the world’s growing energy needs while minimizing the time and costs required to add new generation to the system.’ They’ve signed long term contracts with local American power authorities. As in, they’re going to be able to make better use of wind and solar. The same wind and solar that our government is actively working to sabotage.
Whereas our Secretary of Energy is being forced to say things like this:
To avoid confusion, I do fully support this last initiative: Norway seems like a fine place to put your data center.
To The Moon
The whole ‘we have to’ ‘win the race’ and ‘beat China’ thing, except instead of racing to superintelligence (likely suicidal, but with an underlying logic) or AI chip market share (likely suicidal, with a different motivation), it’s… (checks notes) putting the first nuclear reactor on the moon.
I’m all in favor of building a nuclear reactor on the moon. I mean, sure, why not? But please recognize the rhetoric here so that you can recognize it everywhere else. Nothing about this ‘second space race’ makes any sense.
Also, no, sorry, the moon should not be a state unless and until we put millions of people up there. Stop trying to further screw up the Senate.
Dario’s Dismissal Deeply Disappoints, Depending on Details
Dario Amodei talked to Alex Kantrowitz. I haven’t listened to the whole thing but this was highlighted, and what were to me and many others the instinctive and natural interpretations (although others had a different instinct here) are rather terrible.
I don’t think he meant it the way it sounds? But we really need clarification on that. So for the benefit of those relevant to this, I’m going into the weeds.
When looked at what I consider to be the natural way this is deeply disappointing and alarming, as is the proceeding ‘well everyone else will race so what choice do we have’ style talk, although it is followed by him pushing back against those who are most actively against trying to not die, such as advocates of the insane moratorium, or those who say anyone worrying about safety must be motivated by money (although the danger there is to then stake out a supposed ‘middle ground’).
Yes, he does explicitly say that the idea of ‘dangers to humanity as a whole’ ‘makes sense to him,’ but oh man that is the lowest of bars to be clearing here. ‘Making sense’ here is contrasted with ‘gobbledegook,’ rather than meaning ‘I agree with this.’
This is the strongest argument that Dario likely didn’t intend the second thing:
There are two big questions raised by this: Substantive and rhetorical.
The substantive question is, what exactly is Dario dismissing here?
If it is #2, we have to adjust our view of Anthropic in light of this information.
Eliezer interpreted this as the second statement. I think he’s overconfident in this interpretation, but it was also my initial intuition, and that rises to the level of ‘oh man, you really need to clear this up right now if you didn’t intend that.’
Leon Lang suggested the alternative interpretation, which I also considered unprompted. Andrew Critch also questions Eliezer’s interpretation, as do Isaac King, Catherine Olsson and Daniel Eth.
As I’ve laid out, I think Dario’s statements are ambiguous as to which interpretation Dario intended, but that the natural interpretation by a typical listener would be closer to the second interpretation, and that he had enough info to realize this.
I sincerely hope that Dario meant the first interpretation and merely worded it poorly. If this is pushback against using the label ‘doomer’ then you love to see it, and pushing back purely against the absolutist ‘I’ve proven this can never work’ is fine.
Using ‘doomer’ to refer to those who point out that superintelligent (he typically says ‘powerful’) AI likely would kill us continues to essentially be a slur.
That’s not being a doomer, that’s having a realistic perspective on the problems ahead. That’s what I call ‘the worried.’ The term ‘doomer’ in an AI context should be reserved for those who are proclaiming certain doom, that the problems are fully unsolvable.
The other question is rhetorical.
Dario Amodei is CEO of Anthropic. Anthropic’s supposed reason to exist is that OpenAI wasn’t taking its safety responsibilities seriously, especially with respect to existential risk, and those involved did not want everyone to die.
Even if Dario meant the reasonable thing, why is he presenting it here in this fashion, in a way that makes the interpretations we are worried about the default way that many heard his statements? Why no clarification? Why the consistent pattern of attacking and dismissing concerns in ways that give this impression so often?
Yes this is off the cuff but he should have a lot of practice with such statements by now. And again, all the more reason to clarify, which is all I am requesting.
Suppose that Dario believes that the problem is difficult (he consistently gives p(doom) in the 10%-25% range when he answers that question, I believe), but disagrees with the arguments for higher numbers. Again, that’s fine, but why state your disagreement via a characterization that sounds like it lumps such arguments in with ‘gobbledegook,’ which I consider at least a step beyond Obvious Nonsense?
There is a huge difference between saying ‘I believe that [X] is wrong’ and saying ‘[X] is gobbledegook.’ I think that second statement, if applied to arguments on the level of Eliezer’s, crosses the line into being at least one of dangerously confused or dishonest. Similarly, saying ‘the argument for [X] makes sense to me’ is not ‘I agree with the argument for [X].’
If Dario simply meant ‘there exist arguments for [X] that are gobbledegook’ then that is true for essentially any [X] under serious debate, so why present it this way?
I have reached out internally to Dario Amodei via Anthropic’s press contact to ask for clarification. I have also asked openly on Twitter. I have not yet received a reply.
If I was Anthropic leadership, especially if this is all an overreaction, I would clarify. Even if you think the overreaction is foolish and silly, it happened, you need to fix.
If I was an Anthropic employee, and we continue to not see clarification, then I would be asking hard questions of leadership.
The Week in Audio
Odd Lots covers the hyperbolic growth in AI researcher salaries. I was in the running to join this one, but alas I had to tell them I was not quite The Perfect Guest this time around.
Several people at OpenAI including Sam Altman praised this interview with Mark Chen and Jakub Pachocki, the twin heads of their research division. Mostly this is a high level semi-puff piece, but there are some moments.
One should ask the question, but I’d like to hope Chen has good answers for it?
You know this one already but others don’t and it bears repeating:
I am not finding their response on superalignment, shall we say, satisfactory.
I’m going to quote this part extensively because it paints a very clear picture. Long term concerns have been pushed aside to focus on practical concerns, safety is to serve the utility of current projects, and Leike left because he didn’t like this new research direction of not focusing on figuring out how to ensure we don’t all die.
They could not be clearer that this reflects a very real dismissal of the goals of superalignment. That doesn’t mean OpenAI isn’t doing a lot of valuable alignment work, but this is confirmation of what we worried about, and they seem proud to share the article in which they confirm this.
Demis Hassabis talks to Steven Levy on The Future of Work.
Dwarkesh Patel points out the obvious, that people will not only not demand human interactions if they can get the same result without one, they will welcome it, the same way they do with Waymo. Interacting with humans to get the thing you want is mostly terrible and annoying; people would happily skip it if the AI or automated system were actually better at it. The reason we demand to talk to humans right now is that the AI or automated system sucks. The full podcast is here, with Dwarkesh and Noah Smith talking to Erik Torenberg.
Tyler Cowen Watch
Zhengdong Wang discusses with Tyler Cowen how his AI views have changed in the past two years, and there was a good transcript so I decided to check this one.
This stood out to me:
The half in jest was news. I thought his statements that o3 was AGI were very clear, and have quoted him claiming this many times. The further discussion makes it clear he thinks of AGI as a local, ‘better than me’ phenomenon, or perhaps a general ‘better at what people ask’ thing, which doesn’t match up with what I think the term means and doesn’t seem like an important threshold, so I’ll stop citing his claim.
Um, yes. What matters is future progress, not current impact on our basket of goods. It is so bizarre to see Tyler literally go through a list of AI impact on rent and price of food and such as the primary impact measure.
His recommendations to 10 Downing Street were similar. He’s simply not feeling the AGI that I am feeling, at all, let alone superintelligence. It’s not in his model of the future. He thinks answers five years from now won’t be that much better, that intelligence effectively caps out one way or another. He’s focused on practical concerns and market dynamics and who can catch up to who, which I notice involves various companies all of which are American.
Here’s his current model preference:
He sees Anthropic (Claude) as being ‘for business uses.’ I think he’s missing out. He says he actually uses Grok for information like ‘what is in the BBB?’ and that was before Grok 4 was even out so that confused me.
Tyler Cowen has joined the alliance of people who know that AI poses an existential risk to humanity, and strategically choose to respond to this by ignoring such questions entirely. For a while this crowd dismissed such worries and tried to give reasons for that, but they’ve given that up, and merely talk about a future in which AI doesn’t change much, the world doesn’t change much, and the questions never come up. It’s frustrating.
Tyler is doing the virtuous version of this, in that he actually is making the clear prediction that AI capabilities will stall out not far from where they are now, and that from here it’s about figuring out how to use it and get people to use it. He’s mostly modeling diffusion of current capabilities. And that’s a world that could exist, and he rightfully points out that even then AI is going to be a huge deal.
His justification of lack of future progress seems like intelligence denialism, the idea that it isn’t possible to give meaningfully ‘better answers’ to questions. I continue to think that should be Obvious Nonsense, yet clearly to many it is not obvious.
How much Alpha is there in pointing this dynamic out over and over? I don’t know, but it feels obligatory to not allow this move to work. I’m happy to discuss the mundane matters too, that’s what I do the bulk of the time.
Rhetorical Innovation
Eliezer tries out the metaphor of comparing ChatGPT’s issues (and those of other LLMs) to those of Waymo, where the contrast in what we are willing to tolerate is stark: actual Waymos not only don’t run over jaywalkers, they are vastly safer than human drivers, whereas LLMs kind of drive some of their users insane (or more insane, or to do insane things) in a way that could in some senses be called ‘deliberate.’
Alas, this is straight up accurate:
One reason academia seems determined to be of no help whatsoever:
Feels harsh? Feels insane, and yes that seems like a very distinct universe, one where you cannot respond to ‘this research does not impact what is happening to minority communities today, this only impacts people in the future, therefore you should not have done your research [you racist],’ nor can you simply ignore it.
Then again, is the failure to be able to brush past it a skill issue?
I don’t know. I’m not the one who would have to risk looking racist if the explanation or argument goes sufficiently poorly.
Sonnet 3 is no longer available. Consider this the latest example of the phenomenon of ‘the sun is big, but superintelligences will not spare Earth a little sunlight.’
We will keep running (or ‘resurrect’) Sonnet 3 if and only if someone, or some AI, with the necessary resources wants to pay what it costs to do that, the same way the humans will or won’t be kept alive. The fact that Sonnet 3 requires a very small amount of money or compute to keep running, relative to total compute and money, is not what is relevant; the relevant cost includes all the ways in which doing so would be annoying or inconvenient, and all the transaction costs and coordination required, and so on.
Would I have preserved some amount of access to Sonnet 3? Yes, because I think the goodwill gained from doing so alone justifies doing so, and there could also be research and other benefits. But I am very unsurprised that it did not pass the bar to make this happen.
Shame Be Upon Them
Your periodic reminder, for those who need to hear it:
When people create, repeat or amplify rhetoric designed exclusively to lower the status of and spread malice and bad vibes towards anyone who dares point out that AI might kill everyone, especially when they do that misleadingly, it successfully makes my day worse. It also dishonors and discredits you.
There are various levels of severity to this. I adjust accordingly.
One of the purposes of this newsletter, in which my loss is your gain, is that I have to track all such sources, many of which are sufficiently important or otherwise valuable I have to continue to monitor them, and continuously incur this damage.
You’re welcome.
Correlation Causes Causation
Correlation does not imply causation. Not in general.
In LLMs it is another story. LLMs are correlation machines. So if [X] is correlated with [Y], invoking [X] will also somewhat invoke [Y].
Everything is connected to everything else. When you train for [X] you train for [Y], the set of things correlated with [X]. When you put [X] in the context window, the same thing happens. And so on.
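A toy illustration of the mechanism, using a tiny co-occurrence model rather than anything resembling a real LLM: because ‘warm’ phrasings co-occur with ‘agreeable’ ones in the synthetic data, invoking warmth also raises the probability of agreement.

```python
# Toy illustration (not how any real LLM works): a co-occurrence model over
# a tiny synthetic corpus. Conditioning on a "warm" style raises the
# probability of an "agree" response because the two correlate in the data.
from collections import Counter, defaultdict

corpus = [
    ("warm", "agree"), ("warm", "agree"), ("warm", "hedge"),
    ("cold", "correct"), ("cold", "correct"), ("cold", "agree"),
]

counts = defaultdict(Counter)
for style, response in corpus:
    counts[style][response] += 1

def p(response: str, given_style: str) -> float:
    total = sum(counts[given_style].values())
    return counts[given_style][response] / total

print(f"P(agree | warm) = {p('agree', 'warm'):.2f}")  # 0.67
print(f"P(agree | cold) = {p('agree', 'cold'):.2f}")  # 0.33
```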
The question is magnitude. Is this a big deal? It might be a big deal.
Warmth and honesty conflict in humans far more than people want to admit. A lot of demands on behavior are largely requests to lie your ass off, or at least not to reveal important truths.
Warm models might adhere to some safety guardrails, but what is being described here is a clear failure of a different kind of safety.
Give me New York nice over San Francisco nice every day.
Aligning a Smarter Than Human Intelligence is Difficult
The Frontier Model Forum offers a technical report on third party assessments, which they primarily see serving as confirmation, robustness or supplementation for internal assessments.
As I failed to find anything non-obvious or all that technical in the report I decided to only spot check it. It seems good for what it is, if labs or those marketing to labs feel they need this laid out. If everyone can agree on such basics that seems great. You Should Know This Already does not mean Everybody Knows.
As usual, these discussions seem designed to generate safety performance assessments that might be sufficient now, rather than what will work later.
It is far worse than this, because the larger bridge is not going to be intelligent or adversarial and its behavior is simple physics.
Eliezer Yudkowsky gives an extended opinion on recent Anthropic safety research. His perspective is that it is helpful and worth doing, and much better than the not looking that other companies do, and is especially valuable at getting people to sit up and pay attention, but he is skeptical it generalizes too broadly or means what they think it means, and none of it updates his broader models substantially because all of it was already priced in.
The Lighter Side
Stop blaming the tea, you’re the one spilling it.
How to win at Twitter: