xAI Colossus 2 is now the first gigawatt datacenter in the world, completed in six months, poising them to leapfrog rivals in training compute at the cost of tens of billions of capex spending.
This is incorrect: so far they've only built a 200 MW part of a possible future 1 GW system (the same as the current 2 out of 8 buildings at the Crusoe/OpenAI Abilene site, and likely xAI got there notably later, even if faster from start to finish). "Tens of billions of capex spending" describes a 1 GW system; a 200 MW part might cost about $10bn (including the buildings etc.). The SemiAnalysis article is misleadingly titled (referring to the future plan), but it's all there in the article:
The Colossus 2 project was kicked off on March 7th, 2025, when xAI acquired a 1m sqft warehouse in Memphis, and two adjacent sites totaling 100 acres. By August 22nd, 2025, we count 119 air-cooled chillers on site, i.e. roughly 200MW of cooling capacity.
Juliana’s case is a tragedy, but the details are if anything exonerating.
I think you perhaps didn't dig into this and didn't see the part in the complaint where Character.ai bots engage in pretty graphic descriptions of sex with Juliana, who was a child of 13.
In the worst example I saw, she told the bot to stop its description of a sexual act against her and it refused. Screenshot from the complaint below (it is graphic):
[Screenshot from the complaint]
We can't know if this had a lasting effect on her mental health, or contributed at all to her suicide, but I think saying "the details are if anything exonerating" is wrong.
SemiAnalysis analyzes Huawei’s production, and reports that the export controls are absolutely working to hurt their production of chips, which if we prevent smuggling will not only not scale in 2026 but will actively fall sharply to below 2024 levels, as they have been relying on purchases from Samsung that will soon run dry.
This is false for compute dies, which the article says will actually get manufactured in volume soon. The claim is about HBM specifically, which is necessary for decoding/generation (processing of output tokens), but not for pretraining or processing of input tokens.
So if they develop chips and systems without HBM, then at several times the cost they might be able to obtain a lot of pretraining capability soon. Not very useful directly without a lot of RLVR and inference compute though, and no announcements to the effect that this is being attempted.
Eliezer Yudkowsky: In the limit, there is zero alpha for multiple agents over one agent, on any task, ever. So the Bitter Lesson applies in full to your clever multi-agent framework; it’s just you awkwardly trying to hardcode stuff that SGD can better bake into a single agent.
Obviously if you let the “multi-agent” setup use more compute, it can beat a more efficient single agent with less compute.
A lot of things true at the limit are false in practice. This is one of them, but it is true that the better the agents relative to the task, the more unified a solution you want.
If a model is smart enough/powerful enough, it can simulate a multi-agent interaction (or, even literally host multiple agents within its neural machine if makers of a specialized model want it to do that).
But yes, this imposes a certain prior, and in the limit one might want not to impose priors (although is it really true that we don't want to impose priors if we also care about AI existential safety? a multi-agent setup might be more tractable from the existential safety point of view).
Spelling 'Soares' wrong after 'Eliezer Yudkowsky' must be the literary equivalent of tripping on the red carpet right after sticking the Olympic landing.
Hieu Pham: There will be some people disagreeing this is AGI. I have no words for them. Hats off. Congrats to the team that made this happen.
There are some people who (still!) think the ability to win math contests is "general" intelligence. I have no words for them.
Now that the value of OpenAI minus the nonprofit’s share has tripled to $500 billion, that is even more true. We are far closer to the end of the waterfall. The nonprofit’s net present value expected share of future profits has risen quite a lot. They must be compensated accordingly, as well as for the reduction in their control rights, and the attorneys general must ensure this.
I think this reasoning is flawed, but my understanding of economics is pretty limited so take my opinion with a grain of salt.
I think it's flawed in that investors may have priced in the fact that the fancy nonprofit, the AGI dream, & whatnot were mostly a dumb show. So $500 billion is closer to the full value of OpenAI, rather than to the value net of the nonprofit's claim under the current setup interpreted to the letter.
It is book week. As in the new book by Eliezer Yudkowsky and Nate Soares, If Anyone Builds It, Everyone Dies. Yesterday I gathered various people’s reviews together. Riding the subway home from the airport, I saw an ad for it. Tomorrow, I’ll post my full review, which goes over the book extensively, and which subscribers got in their inboxes last week.
The rest of the AI world cooperated by not overshadowing the book, while still doing plenty, such as releasing a GPT-5 variant specialized for Codex, acing another top programming competition, attempting to expropriate the OpenAI nonprofit in one of the largest thefts in human history and getting sued again for wrongful death.
You know. The usual.
Table of Contents
Language Models Offer Mundane Utility
Ethan Mollick discusses the problem of working with wizards, now that we have AIs that will go off and think and come back with impressive results in response to vague requests, with no ability to meaningfully intervene during the process. The first comment of course notes the famously wise words: “Do not meddle in the affairs of wizards, for they are subtle and quick to anger.”
I do not think ‘AI is evil,’ but it is strange how showing AI having a good effect in one case is often considered a strong argument that AI is good, either current AI or even all future more capable AIs. As an example that also belongs here:
We talk about AI having diminishing returns to scale, where you need to throw 10 times as much compute at a problem to get modestly better performance. But that doesn’t have to mean diminishing marginal returns in utility. If you can now handle tasks better, more consistently, and for longer, you can get practical returns that are much more valuable. A new paper argues that not appreciating the value of task length is why we see ‘The Illusion of Diminishing Returns.’
I think it is most useful to talk about diminishing returns, and then about the increasing value you can get from those diminishing returns. But the right frame to use depends heavily on context.
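One toy way to see the task-length point (my own illustration, not the paper’s model): if a model succeeds at each step of a task independently with probability p, the longest task it can complete at 50% reliability scales like ln(0.5)/ln(p), so apparently modest gains in per-step reliability multiply the length of tasks you can actually hand off.

```python
import math

def max_task_length(per_step_success: float, target_reliability: float = 0.5) -> float:
    """Longest chain of independent steps completable at the target reliability.
    A toy model for illustration, not the paper's framework."""
    return math.log(target_reliability) / math.log(per_step_success)

for p in (0.90, 0.95, 0.99, 0.995):
    print(f"per-step success {p:.3f} -> ~{max_task_length(p):.0f} steps at 50% reliability")
# Roughly 7, 14, 69 and 138 steps: 'diminishing' per-step gains translate into
# large multiples of achievable task length, and hence of practical value.
```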
Sarah Constantin has vibe coded a dispute resolution app, and offers the code and the chance to try it out, while reporting lessons learned. One lesson was that the internet was so Big Mad about this that she felt the need to take her Twitter account private, whereas this seems to me to be a very obviously good thing to try out. Obviously one should not use it for any serious dispute with stakes.
Anthropic offers a new report analyzing the data from their Economic Index.
The wealthier and more advanced a place is, the more it uses Claude. Washington D.C. uses Claude more per capita than any state, including California. Presumably San Francisco on its own would rank higher. America uses Claude frequently but the country with the highest Claude use per capita is Israel.
Automation has now overtaken augmentation as the most common use mode, and directive interaction is growing to now almost 40% of all usage. Coding and administrative tasks dominate usage especially in the API.
ChatGPT offers its own version, telling us what people use ChatGPT for.
They also tell us overall growth remains strong, on pace to saturate the market (as in: people) fully within a few years:
There’s a lot of fun and useful detail in the full paper.
Language Models Don’t Offer Mundane Utility
Anthropic offers a postmortem on a temporary Claude performance regression.
Anthropic promises more sensitive evaluations, quality evaluations in more places and faster debugging tools. I see no reason to doubt their account of what happened.
The obvious thing to notice is that if your investigation finds three distinct bugs, it seems likely there are bugs all the time that you are failing to notice?
Huh, Upgrades
ChatGPT groups all the personalization options under personalization.
GPT-5-Thinking can now be customized to choose exact thinking time. I love that they started out ‘the router will provide’ and now there’s Instant, Thinking-Light, Thinking-Standard, Thinking-Extended, Thinking-Heavy and Pro-Light and Pro-Heavy, because that’s what users actually want.
The robots are a work in progress, but they continue to make progress.
On Your Marks
OpenAI aces the 2025 International Collegiate Programming Contest, solving all 12 problems, a level exceeding all human participants.
Deedy here gives us Problem G, which DeepMind didn’t solve and no human solved in less than 270 of the allotted 300 minutes. Seems like a great nerd snipe question.
Gemini 2.5 Deep Think also got gold-medal level performance, but only solved 10 of 12 problems, where GPT-5 alone solved 11.
Blackjack Bench judges models by having them evaluate all possible blackjack hands, with an always fresh deck. This is a highly contaminated situation, but still informative, with the biggest finding being that thinking is a huge improvement.
My request is to next run this same test using a variation of blackjack that is slightly different so models can’t rely on memorized basic strategy. Let’s say for example that any number of 7s are always worth a combined 14, the new target is 24, and dealer stands on 20.
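For concreteness, here is a quick sketch of how hands would be valued under that proposed variant; the rules are the ones stated above (all 7s jointly worth 14, target 24, aces 1 or 11), and I'm assuming face cards still count 10.

```python
def variant_hand_value(cards: list[str]) -> int:
    """Value of a hand under the proposed variant: any number of 7s are jointly
    worth a combined 14, the target is 24, aces count 1 or 11, and
    (by assumption) face cards still count 10."""
    TARGET = 24
    total = 14 if any(c == "7" for c in cards) else 0
    aces = 0
    for c in cards:
        if c == "7":
            continue  # already folded into the combined 14
        if c == "A":
            aces += 1
            total += 1  # count aces as 1 for now
        elif c in ("K", "Q", "J", "10"):
            total += 10
        else:
            total += int(c)
    while aces and total + 10 <= TARGET:  # upgrade aces to 11 while it stays safe
        total += 10
        aces -= 1
    return total

assert variant_hand_value(["7", "7", "7"]) == 14  # all 7s collapse to 14
assert variant_hand_value(["7", "A"]) == 15       # ace stays 1, since 11 would overshoot 24
assert variant_hand_value(["K", "A", "3"]) == 24  # ace upgrades to 11 to hit the new target
```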
GPT-5 Codex
There (actually) were not enough GPT-5 variants, so we now have an important new one, GPT-5-Codex.
This is presumably the future. In order to code well you do still need to understand the world, but there’s a lot you can do to make a better coder that will do real damage on non-coding tasks. It’s weird that it took this long to get a distinct variant.
Codex is kind of an autorouter, choosing within the model how much thinking to do based on the task, and using the full range far more than GPT-5 normally does. Time spent can range from almost no time up to more than 7 hours.
They report only modest gains in SWE-bench, from 72.8% to 74.5%, but substantial gains in code refactoring tasks, from 33.9% to 51.3%. They claim comments got a lot better and more accurate.
They now offer code review that they say checks code against the stated intent of a PR, and say that Codex has been generally rebuilt and is rapidly improving.
Pliny of course is here to bring us the system prompt.
The Codex team did a Reddit AMA. Here are some highlights:
The ‘humans are still better at designing and managing for 50 years’ line is an interesting speculation but also seems mostly like cope at this point. The real questions are sitting there, only barely out of reach.
0.005 Seconds is a big fan, praising it for long running tasks and offering a few quibbles as potential improvements.
A true story:
Writing code is hard but yes the harder part was always figuring out what to do. Actually doing it can be a long hard slog, and can take up almost all of your time. If actually doing it is now easy and not taking up that time, now you have to think. Thinking is hard. People hate it.
Choose Your Fighter
Olivia Moore and Daisy Zhao offer analysis of tools for various workflows.
This is in addition to the two most important categories of AI use right now, which are the core LLM services that are the true generalists (ChatGPT, Claude and Gemini) and AI coding specialists (Claude Code, OpenAI Codex, Jules, Cursor, Windsurf).
Daisy tests both generalists and specialists on generating a PowerPoint, turning a PDF into a spreadsheet, drafting a scheduling email, researching cloud revenue growth for Big Tech and generating meeting notes.
There’s this whole world of specialized AI agents that, given sufficient context and setup, can do various business tasks for you. If you are comfortable with the associated risks, there is clearly some value here once you are used to using the products, have set up the appropriate permissions and precautions, and so on.
If you are doing repetitive business tasks where you need the final product rather than to experience the process, I would definitely be checking out such tools.
For the rest of us, there are three key questions:
So far I haven’t loved my answers and thus haven’t been investigating such tools. The question is when this becomes a mistake.
If you want me to try out your product, offering me free access and a brief pitch is probably an excellent idea. You could also pay for my time, if you want to do that.
Pliny asks Twitter which model has the best personality. Opinion was heavily split, with many votes each for various Claude versions, for GPT-5, GPT-4o, and even for Kimi and Gemini and a few for DeepSeek.
Gemini hits #1 on the iOS App store, relegating ChatGPT to #2, although this is the same list where Threads is #3 whereas Twitter is #4. However, if you look at retention and monthly active users, Gemini isn’t delivering the goods.
Those ChatGPT retention numbers are crazy high. Gemini isn’t offering the goods regular people want, or wasn’t prior to Nano Banana, at the same level. It’s not as fun or useful a tool for the newbie user. Google still has much work to do.
Get My Agent On The Line
Prompt injections via email remain an unsolved problem.
The only known solution is to not offer attack surface, which means avoiding what Simon Willison dubs The Lethal Trifecta.
Unfortunately, untrusted content includes any website with comments, your incoming messages and your incoming emails. So you lose a lot of productive value if you give up any one of the three legs here.
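As a toy illustration of ‘don’t offer the attack surface’ (my sketch, not any product’s actual feature, and the tool names are hypothetical): track which of the three legs a session holds (access to private data, exposure to untrusted content, and an outbound channel) and refuse any tool grant that would complete all three, rather than trying to filter injections after the fact.

```python
from dataclasses import dataclass

@dataclass
class SessionCapabilities:
    reads_private_data: bool = False      # e.g. inbox, files, credentials
    sees_untrusted_content: bool = False  # e.g. web pages, incoming email
    can_exfiltrate: bool = False          # e.g. send email, POST to the open web

def lethal_trifecta(caps: SessionCapabilities) -> bool:
    """True if the session holds all three legs at once."""
    return (caps.reads_private_data
            and caps.sees_untrusted_content
            and caps.can_exfiltrate)

def grant_tool(caps: SessionCapabilities, tool: str) -> bool:
    """Hypothetical gate: refuse any grant that would complete the trifecta."""
    proposed = SessionCapabilities(**vars(caps))
    if tool == "read_email":        # private data that is also untrusted content
        proposed.reads_private_data = True
        proposed.sees_untrusted_content = True
    elif tool == "browse_web":
        proposed.sees_untrusted_content = True
    elif tool == "send_email":
        proposed.can_exfiltrate = True
    if lethal_trifecta(proposed):
        return False  # drop a leg instead: read-only mode, human-approved sends, etc.
    caps.__dict__.update(vars(proposed))
    return True
```

The point of the sketch is the productivity cost the post describes: whichever leg you drop to stay safe is usually a leg you wanted.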
Anthropic offers guidance for writing effective tools for agents, especially those using Model Context Protocol (MCP). A lot of good detail is here, and also ‘let Claude Code do its thing’ is a lot of the method they suggest.
The good news is that for now prompt injection attempts are rare. This presumably stops being true shortly after substantial numbers of people make their systems vulnerable to generally available prompt injections. Best case even with supervisory filters is that you’d then be looking at a cat-and-mouse game similar to previous spam or virus wars.
AI agents for economics research? A paper by Anton Korinek provides instructions on how to set up agents to do things like literature reviews and fetching and analyzing economic data. A lot of what economists do seems extremely easy to get AI to do. If we speed up economic research dramatically, will that change economists’ estimates of the impact of AI? If it doesn’t, what does that say about the value of economics?
Why might you use multiple agents? Two reasons: You might want to work in parallel, or specialists might be better or more efficient than a generalist.
A lot of things true at the limit are false in practice. This is one of them, but it is true that the better the agents relative to the task, the more unified a solution you want.
Claude Codes
Careful with those calculations; the quote is already a month old by now.
Whoa, Garry. Those are two different things.
If Claude Code writes 95% of the code, that does not mean that you still write the same amount of code as before, and Claude Code then writes the other 95%. It means you are now spending your time primarily supervising Claude Code. The amount of code you write yourself is going down quite a lot.
In a similar contrast, contrary to Dario Amodei’s predictions AI is not writing 90% of the code in general, but this could be true inside the AI frontier labs specifically?
Predictions that fail to account for diffusion rates are still bad predictions, but this suggests that We Have The Technology to be mainly coding with AI at this point, and that this level of adoption is baked in even if it takes time. I’m definitely excited to find the time to take the new generation for a spin.
That’s good and optimal if you think ‘generate AI takeoff as fast as possible’ is good and optimal, rather than something that probably leads to everyone dying or humans losing control over the future, and you don’t think that getting other things to go better first would be beneficial in avoiding such negative outcomes.
I think that a pure ‘coding first’ strategy that focuses first on the most dangerous thing possible, AI R&D, is the worst-case scenario in terms of ensuring we end up with good outcomes. We’re doubling down on the one deeply dangerous place.
All the other potential applications that we’re making less progress on? Those things are great. We should (with notably rare exceptions) do more of those things faster, including because it puts us in a better position to act wisely and sanely regarding potential takeoff.
Deepfaketown and Botpocalypse Soon
Recent events have once again reinforced that our misinformation problems are mostly demand side rather than supply side. There has been a lot of misinformation out there from various sides about those events, but all of it ‘old fashioned misinformation’ rather than involving AI or deepfakes. In the cases where we do see deepfakes shared, such as here by Elon Musk, the fakes are barely trying, as in it took me zero seconds to go ‘wait, this is supposedly the UK and that’s the Arc de Triomphe’ along with various instinctively identified AI signatures.
Detection of AI generated content is not as simple as looking for non-standard spaces or an em dash. I’ve previously covered claims we actually can do it, but you need to do something more sophisticated, as you can see if you look at the chosen example.
I notice my own AI detector (as in, my instincts in my brain) says this very clearly is not AI. The em-dash construction is not the traditional this-that or modifier em-dash, it’s a strange non-standard transition off of an IMO. The list is in single dashes following a non-AI style pattern. The three dots and triple exclamation points are a combination of non-AI styles. GPT-5 Pro was less confident, but it isn’t trained for this and did still point in the direction of more likely than random to be human.
You Drive Me Crazy
A third wrongful death lawsuit has been filed against an AI company, this time against Character AI for the suicide of 13-year-old Juliana Peralta.
Yes, the AI, here called Hero, was encouraging Juliana to use the app, but seems to have very much been on the purely helpful side of things from what I see here?
The objection seems to be that the chatbot tried to be Juliana’s supportive friend and talk her out of it, and did not sufficiently aggressively push Juliana onto Responsible Authority Figures?
Juliana’s case is a tragedy, but the details are if anything exonerating. It seems wild to blame Character AI. If her friend had handled the situation the same way, I certainly hope we wouldn’t be suing her friend.
There were also two other lawsuits filed the same day involving other children, and all three have potentially troubling allegations around sexual chats and addictive behaviors, but from what I see here the AIs are clearly being imperfect but net helpful in suicidal situations.
This seems very different from the original case of Adam Raine that caused OpenAI to make changes. If these are the worst cases, things do not look so bad.
The parents then moved on to a Congressional hearing with everyone’s favorite outraged Senator, Josh Hawley (R-Missouri), including testimony from Adam Raine’s father Matthew Raine. It sounds like more of the usual rhetoric, and calls for restrictions on users under 18.
Not Another Teen Chatbot
Everything involving children creates awkward tradeoffs, and puts those offering AI and other tech products in a tough spot. People demand you both do and do not give them their privacy and their freedom, and demand you keep them safe but where people don’t agree on what safe means. It’s a rough spot. What is the right thing?
OpenAI has noticed these conflicts and is proposing a regime to handle them, starting with reiterating their principles when dealing with adults.
As I’ve said before I see the main worry here as OpenAI being too quick to escalate and intervene. I’d like to see a very high bar for breaking privacy unless there is a threat of large scale harm of a type that is enabled by access to highly capable AI.
Here we have full agreement. Adults should be able to get all of this, and ideally go far beyond flirtation if that is what they want and clearly request.
This is the standard problem that to implement any controls requires ID gating, and ID gating is terrible on many levels even when done responsibly.
To state the first obvious problem, in order to contact a user’s parents you have to verify who the parents are. Which is plausibly quite a large pain at best and a privacy or freedom nightmare rather often.
The other problem is that, as I discussed early this week, I think running off to tell authority figures about suicidal ideation is often going to be a mistake. OpenAI says explicitly that if the teen is in distress and they can’t reach a parent, they might escalate directly to law enforcement. Users are going to interact very differently if they think you’re going to snitch on them, and telling your parents about suicidal ideation is going to be seen as existentially terrible by quite a lot of teen users. It destroys the power of the AI chat as a safe space.
Combined, this makes the under-18 experience plausibly quite different and bad, in ways that simply limiting it to age-appropriate content or discussion would not be.
They say ‘when we identify a user is under 18’ they will default to the under 18 experience, and they will default to under 18 if they are ‘not confident.’ We will see how this plays out in practice. ChatGPT presumably has a lot of context to help decide what it thinks of a user, but it’s not clear that will be of much use, including the bootstrap problem of chatting enough to be confident they’re over 18 before you’re confident they’re over 18.
They Took Our Jobs
The French have a point. Jobs are primarily a cost, not a benefit. A lot of nasty things still come along with a large shortage of jobs, and a lot of much nastier things come with the AI capabilities that were involved in causing that job shortage.
Economics 101 says global productivity gains are not captured by corporate profits, and there are few things more embarrassing than this kind of technical chart.
I’m sure AI won’t do anything else more interesting than allow productivity growth.
Roon points out correctly that Jason is confusing individual firm productivity and profits with general productivity and general profits. If Amazon and only Amazon gets to eliminate its drivers and factory workers while still delivering as good or better products, then yes it will enjoy fantastic profits.
That scenario seems extremely unlikely. If Amazon can do it, so can Amazon’s competitors, along with other factories and shippers and other employers across the board. Costs drop, but so (as Jason says to Xavi) do prices. There’s no reason to presume Amazon sustainably captures a lot of economic profits from automation.
Jason is not outright predicting AGI in this particular quote, since you can have automated Amazon factories and self-driving delivery trucks well short of that. What he explicitly is predicting is that hours worked per week will drop dramatically, as these automations happen across the board. This means either government forcing people somehow to work dramatically reduced hours, or (far more likely) mass unemployment.
The chart of course is a deeply embarrassing thing to be QTing. The S&P 500 is forward looking, the unemployment rate is backward looking. They cannot possibly be moving together in real time in a causal manner unless one is claiming The Efficient Market Hypothesis Is False to an extent that is Obvious Nonsense.
Get Involved
The Survival and Flourishing Fund will be distributing $34 million in grants, the bulk of which is going to AI safety. I was happy to be involved with this round as a recommender. Despite this extremely generous amount of funding, which I believe was mostly distributed well, many organizations have outgrown even this funding level, so there is still quite a lot of room for additional funding.
My plan is to have a 2025 edition of The Big Nonprofits Post available some time in October or November. If you applied to SFF and do not wish to appear in that post, or want to provide updated information, please contact me.
Introducing
Agent 3, a vibe coding model from Replit, who claim to not owe AI 2027 any royalties or worries.
K2-Think 32B, from the UAE, claims impressive benchmarks at very fast speeds.
In Other AI News
xAI Colossus 2 is now the first gigawatt datacenter in the world, completed in six months, poising them to leapfrog rivals in training compute at the cost of tens of billions of capex spending. SemiAnalysis has the report. They ask ‘does xAI have a shot at becoming a frontier lab?’ which correctly presumes that they don’t yet count. They have the compute, but have not shown they know what to do with it.
DeepSeek evaluates AI models for frontier risks, similarly to US AI firms, except that DeepSeek does not ‘open source’ the tests or the test results.
Math, Inc. reports that their AI agent Gauss autonomous-ishly completed a formalization of Terry Tao and Alex Kontorovich’s strong Prime Number Theorem project in three weeks, after humans took 18+ months to make only partial progress. They are entering beta.
In case you were wondering why, as Teortaxes puts it here, ‘academia isn’t serious,’ DeepSeek has now put out supplementary information about their new model, DeepSeek R1, in the journal Nature.
As in, it’s cool to have a Nature paper, and the transparency is very cool, but it’s also rather late for the paper.
AIs can do two-step reasoning without chain of thought, except when the two steps require synthetic facts from two distinct out-of-context sources. Previous work had only tested narrow cases; this paper tested a variety of cases where an LLM needed to combine fact X with fact Y to get an answer.
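A minimal sketch of the kind of item being tested, as I understand the setup (my reconstruction with invented synthetic facts, not the paper’s actual data): each fact lives in a separate document, and the question can only be answered by composing the two in a single forward pass, with no chain of thought allowed.

```python
# Two synthetic facts, each planted in a distinct out-of-context source so the
# answer cannot be memorized from pretraining data.
fact_x = "The 2031 Zorblat Prize was won by Mira Quent."   # document A
fact_y = "Mira Quent was born in the city of Valterre."    # document B

question = "In which city was the winner of the 2031 Zorblat Prize born?"
expected = "Valterre"

def grade(model_answer: str) -> bool:
    # Answer-only grading: the model must answer directly, with no intermediate reasoning tokens.
    return expected.lower() in model_answer.lower()
```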
Amazon revamped the AI agent it offers to online merchants, called Selling Assistant, trained on 25 years of shopping behavior to help sellers find better strategies.
Show Me the Money
AI chip startup Groq raises $750 million at $6.9 billion valuation. Nice.
Microsoft inks $6.2 billion deal with British data center company Nscale Global Holdings and Norwegian investment company Aker ASA for AI compute in Norway, following a previous plan from OpenAI. Pantheon wins again.
US tech firms to pour 30 billion pounds into UK, including a Stargate UK.
The Mask Comes Off
OpenAI and Microsoft have made their next move in their attempt to expropriate the OpenAI nonprofit and pull off one of the largest thefts in human history.
Here is their joint statement, which gives us only one detail:
That one detail is ‘we remain focused on delivering the best AI tools for everyone.’ With a ‘shared commitment to safety’ which sounds like OpenAI is committed about as much as Microsoft is committed, which is ‘to the extent not doing so would hurt shareholder value.’ Notice that OpenAI and Microsoft have the same mission and no one thinks Microsoft is doing anything but maximizing profits. Does OpenAI’s statement here sound like their mission to ensure AGI benefits all humanity? Or does it sound like a traditional tech startup or Big Tech company?
I do not begrudge Microsoft maximizing its profits, but the whole point of this was that OpenAI was supposed to pretend its governance and priorities would remain otherwise.
They are not doing a good job of pretending.
The $100 billion number is a joke. OpenAI is touting this big amount of value as if to say, oh what a deal, look how generous we are being. Except OpenAI is doing stock sales at $500 billion. So ‘over $100 billion’ means they intend to offer only 20% of the company, down from their current effective share of (checks notes) most of it.
Notice how they are trying to play off like this is some super generous new grant of profits, rather than a strong candidate for the largest theft in human history.
OpenAI’s nonprofit already has a much larger equity stake currently, and much tighter and stronger control than we expect them to have in a PBC. Bret’s statement on equity is technically correct, but there’s no mistaking what Bret tried to do here.
The way profit distribution works at OpenAI is that the nonprofit is at the end of the waterfall. Others collect their profits first, then the nonprofit gets the remaining upside. I’ve argued before, back when OpenAI was valued at $165 billion, that the nonprofit was in line for a majority of expected future profits, because OpenAI was a rocket to the moon even in the absence of AGI, which meant it was probably going to either never pay out substantial profits or earn trillions.
Now that the value of OpenAI minus the nonprofit’s share has tripled to $500 billion, that is even more true. We are far closer to the end of the waterfall. The nonprofit’s net present value expected share of future profits has risen quite a lot. They must be compensated accordingly, as well as for the reduction in their control rights, and the attorneys general must ensure this.
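To see why a higher valuation mechanically pushes more expected value into the residual claim, here is a deliberately toy Monte Carlo of a profit waterfall. All numbers are invented for illustration; this is not OpenAI’s actual cap structure.

```python
import random

random.seed(0)

def residual_share(cap: float, scale: float, n: int = 200_000) -> float:
    """Toy waterfall: earlier claimants keep profits up to `cap`, the nonprofit
    gets everything above it. Outcomes are heavy-tailed with median `scale`.
    Returns the nonprofit's share of total expected profits."""
    senior = residual = 0.0
    for _ in range(n):
        outcome = scale * random.lognormvariate(0.0, 2.0)  # boom-or-bust distribution
        senior += min(outcome, cap)
        residual += max(outcome - cap, 0.0)
    return residual / (senior + residual)

cap = 1_000  # hypothetical cap on senior claims, arbitrary units
for scale in (165, 500):  # median outcome rising from 165 to 500 in the same units
    print(scale, round(residual_share(cap, scale), 2))
# As the outcome distribution shifts up, the share flowing past the cap to the
# residual claimant rises substantially: the 'closer to the end of the waterfall' effect.
```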
How much profit interest is the nonprofit entitled to in the PBC? Why not ask their own AI, GPT-5-Pro? So I did, this is fully one shot, full conversation at the link.
It seems fair to say that if your own AI says you’re stealing hundreds of billions, then you’re stealing hundreds of billions? And you should be prevented from doing that?
This was all by design. OpenAI, to their great credit, tied themselves to the mast, and now they want to untie themselves.
I understand that Silicon Valley does not work this way. They think that if you have equity that violates their norms, or that you ‘don’t deserve’ or that doesn’t align with your power or role, or whose presence hurts the company or no longer ‘makes sense,’ that it is good and right to restructure to take that equity away. I get that from that perspective, this level of theft is fine and normal in this type of situation, and the nonprofit is being treated generously and should pray that they don’t treat it generously any further, and this is more than enough indulgence to pay out.
I say, respectfully, no. It does not work that way. That is not the law. Nor is it the equities. Nor is it the mission, or the way to ensure that humanity all benefits from AGI, or at least does not all die rapidly after AGI’s creation.
They also claim that the nonprofit will continue to ‘control the PBC’ but that control is almost certain to be far less meaningful than the current level of control, and unlikely to mean much in a crisis.
Those control rights, to the extent they could be protected without a sufficient equity interest, are actually the even more important factor. It would be wonderful to have more trillions of dollars for the nonprofit, and to avoid giving everyone else the additional incentives to juice the stock price, but what matters for real is the nonprofit’s ability to effectively control OpenAI in a rapidly developing future situation of supreme importance. Those are potentially, as Miles Brundage puts it, the quadrillion dollar decisions. Even if the nonprofit gets 100% of the nominal control rights, if this requires them to act via replacing the board over time, that could easily be overtaken by events, or ignored entirely, and especially if their profit share is too low likely would increasingly be seen as illegitimate and repeatedly attacked.
The announcement of $50 million in grants highlights (very cheaply, given they intend to steal equity and control rights worth hundreds of billions of dollars) that they intend to pivot the nonprofit’s mission into a combination of generic AI-related philanthropy and OpenAI’s new marketing division, as opposed to ensuring that AGI is developed safely, does not kill us all and benefits all humanity. ‘AI literacy,’ ‘community innovation’ and ‘economic opportunity’ all sure sound like AI marketing and directly growing OpenAI’s business.
I do want to thank OpenAI for affirming that their core mission is ‘ensuring AGI benefits all of humanity,’ and importantly that it is not to build that AGI themselves. This is in direct contradiction to what they wrote in their bad faith letter to Gavin Newsom trying to gut SB 53.
Quiet Speculations
Tyler Cowen links to my survey of recent AI progress, and offers an additional general point. In the model he offers, the easy or short-term projects won’t improve much because there isn’t much room left to improve, and the hard or long-term projects will take a while to bear fruit, plus outside bottlenecks, so translating that into daily life improvements will appear slow.
The assumption by Tyler here that we will be in an ‘economic normal’ world in which we do not meaningfully get superintelligence or other transformational effects is so ingrained it is not even stated, so I do think this counts as a form of AI progress pessimism, although it is still optimism relative to for example most economists, or those expressing strong pessimism that I was most pushing back against.
Within that frame, I think Tyler is underestimating the available amount of improvement in easy tasks. There is a lot of room for LLMs even in pure chatbot form on easy questions to become not only faster and cheaper, but also far easier to use and have their full potential unlocked, and better at understanding what question to answer in what way, and at anticipating what users actually need, because most people don’t know what questions to ask or how to ask them. These quality of life improvements will likely make a large difference in how much mundane utility we can get, even if they don’t abstractly score as rapid progress.
There are also still a lot of easy tasks that are unsolved, or are not solved with sufficient ease of use yet, or tasks that can be moved from the hard task category into the easy task category. So many agent tasks, or tasks requiring drawing upon context, should be easy but for now remain hard. AIs still are not doing much shopping and booking for us, or much handling of our inboxes or calendars, or making aligned customized recommendations, despite these seeming very easy, or doing other tasks that should be easy.
Coding is the obvious clear area where we see very rapid improvement and there is almost unlimited room for further improvement, mostly with no diffusion barriers, and which then accelerates much else, including making the rest of AI much easier to use even if we don’t think AI coding and research will much accelerate AI progress.
Jack Clark at the Anthropic Futures Forum doubles down on the ‘geniuses in a data center,’ smarter than a Nobel prize winner and able to complete monthlong tasks, arriving within 16 months. He does hedge, saying ‘could be’ buildable by then. If we are talking ‘probably will be’ I find this too aggressive by a large margin, but I agree that it ‘could be’ true and one must consider the possibility when planning.
The Quest for Sane Regulations
California’s SB 53 has now passed the Assembly and Senate, so it goes to Newsom. I strongly urge him to sign it into law. Samuel Hammond also hopes it is signed, Dean Ball has called SB 53 highly reasonable, Anthropic has endorsed the bill. Here is a link for those in California to let Gavin Newsom know their opinion about the bill.
Meta hasn’t endorsed the bill, but they have essentially given the green light.
OpenAI’s rhetoric against SB 53 was terrible and in bad faith, but there are levels to bad faith arguments in such situations. It can get worse.
They are concerned that the bill does not impact smaller developers? Really? You would have liked them to modify the bill to lower the thresholds so it impacts smaller developers, because you’re that concerned about catastrophic risks, so you think Newsom should veto the bill?
It is at times like this I realize how little chutzpah I actually possess.
White House’s Sriram Krishnan talked to Politico, which I discuss further in a later section. He frames this as an ‘existential race’ with China, despite declaring that AGI is far and not worth worrying about, in which case I am confused why one would call it existential. He says he ‘doesn’t want California to set the rules for AI across the country’ while suggesting that the rules for AI should be, as he quotes David Sacks, ‘let them cook,’ meaning no rules. I believe Gavin Newsom should consider his comments when deciding whether to sign SB 53.
Daniel Eth explains that the first time a low salience industry spent over $100 million on a super PAC to enforce its preferences via electioneering was crypto via Fairshake, and now Congress is seen as essentially captured by crypto interests. Now the AI industry, led by a16z, Meta and OpenAI’s Greg Brockman (and inspired by OpenAI’s Chris Lehane) is repeating this playbook with ‘Leading the Future,’ whose central talking point is to speak of a fictional ‘conspiracy’ against the AI industry as they spend vastly more than everyone has ever spent on safety-related lobbying combined to outright buy the government, which alas is by default on sale remarkably cheap. Daniel anticipates this will by default be sufficient for now to silence all talk of lifting a finger or even a word against the industry in Congress.
If AI rises sufficiently in public salience, money will stop working even if there isn’t similar money on the other side. Salience will absolutely rise steadily over time, but it likely takes a few years before nine figures stops being enough. That could be too late.
Albania appoints the world’s first ‘AI minister’ named Diella.
Dustin asks very good questions, which the Politico article does not answer. Is this a publicity stunt, a way of hiding who makes the decisions, or something real? How does it work, what tech and techniques are behind it? The world needs details. Mira Murati, can you help us find out, perhaps?
As Tech Leaders Flatter Trump, Anthropic Takes a Cooler Approach. Anthropic is not and should not be an enemy of the administration, and should take care not to needlessly piss the administration off, become or seem generally partisan, or do things that get one marked as an enemy. It is still good to tell it like it is, stand up for what you believe is right and point out when mistakes are being made or when Nvidia seems to have taken over American chip export policy and seems to be in the act of getting us to sell out America in the name of Nvidia’s stock price. Ultimately what matters is ensuring we don’t all die or lose control over the future, and also that America triumphs, and everyone should be on the same side on all of that.
Michigan Senator Elissa Slotkin cites race with China and calls for a ‘Manhattan Project for AI.’ She gets so close in the linked speech to realizing the real danger and why this is not like nuclear weapons, then ignores it and moves straight ahead analogizing repeatedly to nuclear weapons.
Anthropic is reported to be annoying the White House by daring to insist that Claude not be used for surveillance, which the Secret Service, FBI and ICE want to do. It is interesting that the agencies care, and that other services like ChatGPT and Gemini can’t substitute for those use cases. I would not be especially inclined to fight on this hill and would use a policy here similar to the one at OpenAI, and I have a strong aesthetic sense that the remedy is Claude refusing rather than it being against terms of service, but some people feel strongly about such questions.
However, we keep seeing reports that the White House is annoyed at Anthropic, so if I were Anthropic I would sit down (unofficially, via some channel) with the White House and figure out which actions are actually a problem to what extent and which ones aren’t real issues, and then make a decision about which fights are worthwhile.
Chip City
There is some good news on the South Korean front, as after a few days of treatment like that reported in this thread, at least some key parts of the Trump administration realized it made a huge mistake and we are now attempting to mitigate the damage from ICE’s raid on Hyundai’s battery plant. They let all but one of the detainees go, let them stay if they wished and assured them they could return to America, although they are understandably reluctant to stay here.
Trump issued a statement emphasizing how important it is to bring in foreign workers to train Americans and not to frighten off investment. He doesn’t admit the specific mistake but this is about as good a ‘whoops’ as we ever get from him, ever.
It also seems NIH grantmaking has gotten back on track at least in terms of size.
SemiAnalysis analyzes Huawei’s production, and reports that the export controls are absolutely working to hurt their production of chips, which if we prevent smuggling will not only not scale in 2026 but will actively fall sharply to below 2024 levels, as they have been relying on purchases from Samsung that will soon run dry.
China is telling Chinese companies to cut off purchases of Nvidia chips, including it seems all Nvidia chips, here there is reference to the RTX Pro 6000D. Good. Never interrupt your enemy when he is making a mistake. As I’ve said before, China’s domestic chip industry already had full CCP backing and more demand than they could supply, so this won’t even meaningfully accelerate their chip industry, and this potentially saves us from what was about to be a very expensive mistake. Will they stick to their guns?
Construction at the site is set back by two or three months.
Major damage has still been done.
Even if those workers were there for long term research or employment, this arrangement would still be an obvious win for America. When they’re here to train American workers, there is only pure upside.
Here is David Cowan being the latest to explain that Nvidia is a national security risk, with its focus on selling the best possible chips to China. Samuel Hammond has a very good statement about Nvidia’s lack of corporate patriotic responsibility. Nvidia actively opposes American national security interests, including using a full ostrich strategy towards Chinese chip smuggling.
Chinese companies are offering to sell us solar panel manufacturing kits with 35 day lead times, as solar keeps getting cheaper and more abundant all around. It is a shame our government is actively trying to stop solar power.
Here is some potentially very important context to the UAE chip deal:
There are no claims here that there was a strict Quid Pro Quo, or otherwise an outright illegal act. If the President is legally allowed to have a crypto company into which those seeking his favor can pour billions of dollars, then that’s certainly not how I would have set up the laws, but that seems to be the world we live in. Technically speaking, yes, the UAE can pour billions into Trump’s private crypto, and then weeks later suddenly get access to the most powerful chips on the planet over the national security objections of many, in a situation with many things that appear to be conflicts of interest, and that’s all allowed, right in the open.
However. It doesn’t look good. It really, really, profoundly does not look good.
The objections that I have seen don’t claim the story isn’t true. The objections claim that This Is Fine. That this is how business is done in the Middle East, or in 2025.
I notice this response does not make me feel better about having sold the chips.
The Week in Audio
Demis Hassabis knows, yet forgot one thing in his talk at the All-In Summit.
The ‘5-10 years is a long timeline’ issue can lead to important miscommunications. As in, I bet that this happened:
Whoops! That’s not at all what Demis Hassabis said.
He Just Tweeted It Out
Which I appreciate, now there’s no pretending they aren’t literally saying this.
It’s not only market share, it is ‘market share of tokens generated.’
Which is an obviously terrible metric. Tokens generated is deeply different from value generated, or even from dollars spent or compute spent. Tokens means you treat tokens from GPT-5-Pro or Opus 4.1 the same as tokens from a tiny little thing that costs 0.1% as much to run and isn’t actually doing much of anything. It’s going to vastly overestimate China’s actual share of the market, and underestimate ours, even if you really do only care about market share.
But no, literally, that’s what he thinks matters. Market share, measured in what chips people use. China can do all the things and build all the models and everything else, so long as it does it on Nvidia hardware it’s all good. This argument has never made any sense whatsoever.
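A toy illustration of why token share is a poor proxy (the numbers are invented purely to show the mechanism): weight each provider’s tokens by what those tokens cost, or the compute behind them, and the picture inverts.

```python
# Hypothetical providers: (tokens generated, in billions; price per million tokens, in $).
providers = {
    "frontier_model": (100, 10.00),    # fewer, expensive, high-value tokens
    "cheap_small_model": (900, 0.01),  # enormous volume of near-free tokens
}

total_tokens = sum(tokens for tokens, _ in providers.values())
total_spend = sum(tokens * price for tokens, price in providers.values())

for name, (tokens, price) in providers.items():
    token_share = tokens / total_tokens
    spend_share = (tokens * price) / total_spend
    print(f"{name}: {token_share:.0%} of tokens, {spend_share:.1%} of spend")
# The cheap model 'wins' 90% of token share while accounting for under 1% of spend,
# which is how a token-share metric can wildly misstate compute or value share.
```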
Sriram went on No Priors last month, which I first saw via Sriram Tweeting It Out. Neil’s linked summary of the Axios event Sriram was at is here, and we have Sriram’s Politico interview.
We can start with that last statement. I notice he says ‘what Huawei wants’ not ‘what China wants,’ the same way the White House seems to be making decisions based on ‘what Nvidia wants’ not ‘what America wants.’ Yes, obviously, if your literal only metric is sales of chips, then in the short term you want to sell all the chips to all the customers, because you’ve defined that as your goal.
(The long term is complicated because chips are the lifeblood of AI and the economies and strategic powers involved, so even without AGI this could easily go the other way.)
Now, on those four points, including drawing some things from his other interviews:
On the last point Neil lists, the Woke AI EO, my understanding matches Sriram’s.
I wrote up additional notes on the rest of the contents of those interviews, but ultimately decided Neil is right that the above are Sriram’s central points, and since his other rhetoric isn’t new further engagement here would be unproductive.
Rhetorical Innovation
This thread contains more endorsements of If Anyone Builds It, Everyone Dies, including some unexpected celebrities, such as Mark Ruffalo, Patton Oswalt and Alex Winter, the actor who played Bill in Bill and Ted’s Excellent Adventure. I wonder if Keanu Reeves would have replied ‘Whoa!’ or gone with ‘Dude!’
The public’s views on AI haven’t changed much in the past year. AI has changed quite a bit, so it tells you something about the public that their views mostly are the same.
Michael Trazzi ends his hunger strike after 7 days, after two near-fainting episodes and after doctors found acidosis and ‘very low blood glucose’ even for someone on a 7-day fast. As of his announcement Guido and Denys are continuing. So this wasn’t an ‘actually endanger my life on purpose’ full-on hunger strike. Probably for the best.
Roon is correct at the limit here, in sufficiently close to perfect competition you cannot be kind, but there’s a big gap between perfect competition and monopoly:
As I wrote in Moloch Hasn’t Won, one usually does not live near this limit. It is important to notice that the world has always contained a lot of intense competition, yet we have historically been winning the battle against Moloch and life contains many nice things and has mostly gotten better.
The question is, will AGI or superintelligence change that, either during or after its creation? AIs have many useful properties that bring you closer to perfect competition, enforcing much faster and stronger feedback loops and modifications, and allowing winners to rapidly copy themselves, and so on. If you propose giving similar highly capable AIs to a very large number of people and groups, which will then engage in competition, you need a plan for why this doesn’t cause (very rapid) Gradual Disempowerment or related failure modes.
During the race towards AGI and superintelligence, competitive and capitalistic pressures reduce ability to be kind in ordinary ways, but while it is still among humans this has happened many times before in other contexts and is usually importantly bounded.
How effective is AI Safety YouTube? Marcus Abramovitch and Austin Chen attempt to run the numbers, come up with it being modestly effective if you think the relevant messages are worth spreading.
Exactly. The majority of uses of the term ‘doomer’ in the context of AI are effectively either an attempt to shut down debate (as in anything that is ‘doomer’ must therefore be wrong) similar to calling something a term like ‘racist,’ or effectively a slur, or both.
I am referred to this fun and enlightening thread about the quest by William Mitchell to convince America after WWI that airplanes can sink battleships, in which people continued claiming this hadn’t and wouldn’t happen well after airplanes were repeatedly demonstrated sinking battleships. Please stop assuming that once things about AI are convincingly demonstrated (not only existential risks and other risks, but also potential benefits and need to deploy) people will not simply ignore this.
Why does The Washington Post keep publishing Aaron Ginn writing the same bad faith Nvidia op-ed over and over again? I’m seriously asking, at this point it is bizarre.
In this case, not only does he write especially terrible word salad about how AI can only pose a danger if intelligence can be measured by a single number whereas no machine can ever fully grasp the universe whereas only humans can embody deep meaning (meme of Walter White asking what the hell are you talking about?), he kind of gives the game away. If you’re writing as a de facto Nvidia lobbyist trying to tar everyone who opposes you with name calling, perhaps don’t open with a quote where you had dinner with Nvidia CEO Jensen Huang and he complains about everyone being ‘so negative’?
The continued quest to get libertarians and economists to differentiate between current and future more capable AI systems (difficulty: AI complete).
As Dean Ball says, you very much would not want to live in a world with badly aligned, poorly understood and highly capable neural networks. Not that, if it were to arise, you would get to live in such a world for very long.
In this case, Neil (including in follow-ups, paraphrased) seems to be saying ‘oh, there are already lots of complex systems we don’t understand effectively optimizing for things we don’t care about, so highly advanced future AI we don’t understand effectively optimizing for things we don’t care about would be nothing new under the sun, therefore not worth worrying about.’ File under ‘claims someone said out loud with a straight face, without realizing what they’d said, somehow?’
The Center for AI Policy Has Shut Down, and Williams offers a postmortem. I am sad that they are shutting down, but given the circumstances it seems like the right decision. I have written very positively in the past about their work on model legislation and included them in my 2024 edition of The Big Nonprofits Post.
Eliezer offers yet another metaphorical attempt, here reproduced in full, which hopefully is a good intuition pump for many people? See if you think it resonates.
This example is not from his new book, but it is a good example of the ways people go after Yudkowsky without understanding the actual logic behind it all; people just say things about how he’s wrong and his beliefs are stupid and he never updates, in ways that are, frankly, pretty dumb.
Eliezer said ‘in the limit’ and very obviously physical activities at different locations governed by highly compute-limited biological organisms with even more limited communication abilities are not in anything like the limit, what are you even talking about? The second example is worse. Yet people seem to think these are epic dunks on a very clearly defined claim of something else entirely.
The first part of the actual claim seems straightforwardly correct to me: a multi-agent framework only makes sense as a way to overcome bottlenecks and limitations, and wouldn’t exist if you didn’t face rate or compute or other physical limitations. The second claim, that SGD can more easily bake things into a single agent if you can scale enough, is more interesting. A good response is something like ‘yes, with sufficient ability to scale at every step, but in practice efficiency matters quite a lot and actually SGD as currently implemented operates at cross-purposes such that a multi-agent framework has big advantages.’
I’d also note that the ‘delight nexus’ is absolutely from the parable Don’t Build The Delight Nexus Either, better known as Anarchy, State and Utopia by Robert Nozick.
Danielle’s scenario that I mentioned yesterday now has the Eliezer stamp of approval.
Aligning a Smarter Than Human Intelligence is Difficult
OpenAI reports on collaborations it has done with US CAISI and UK AISI. This sounds like governments doing good red teaming work that both we and OpenAI should be happy they are doing. This seems like a pure win-win; OpenAI and others doing such collaborations get the work for free from sources that have unique access to classified information and that have earned trusted access to system internals and versions of the system that lack controls.
What should perhaps worry you is that this work doesn’t look different from the work OpenAI and other labs should be doing anyway. This looks like good work but practical near term non-unique work. Good, but we’ll need to do better.
Anthropic fellow Danielle Ensign gives Qwen the option to bail on chats and sees when it chooses to do so, and there are a lot of different situations where this happens, some of which she describes as ‘overbailing.’
There’s a lot of non-obvious data here to dive into. I’m curious what we’ll find.
Anthropic also announced the results of their similar collaborations. They discovered vulnerabilities in Anthropic’s constitutional classifiers and generally strengthened their security approaches. Again, it seems good if ultimately insufficient.
Both reports, and the one from UK AISI, stress the value of rapid iterative feedback.
Users are telling each other to give thumbs down to GPT-5 if it refuses to do boyfriend vampire roleplay. Which is correct. The users indeed disapprove of not doing things users want. They should also give thumbs down if it refuses with anything else, including helping to build a bioweapon. The question is, when are you willing to decide the users are wrong, and refuse anyway? If your system is not robust to ‘well the users keep not liking it when I refuse to do boyfriend vampire roleplay’ then your alignment methods are ngmi.
Janus recommends the new paper Probing the Preferences of a Language Model: Integrating Verbal and Behavioral Tests of AI Welfare, by Valen Tagliabue and Leonard Dung.
Janus also warns us about the dangers of insufficiently anthropomorphizing LLMs. There is also danger of doing too much anthropomorphizing, or doing it in a wrong or misleading way. Failing to anthropomorphize enough, and especially tying oneself up in knots to avoid doing so, is as bad and potentially worse. Make either mistake and you won’t understand what you are dealing with. A lot of you are guarding only against one of these two mistakes.
Janus describing Opus 4 reconstructing a gestalt of its training. If you’re involved in fine-tuning at all, recommended.
Have you tried also building the things creatives want to use then?
Other People Are Not As Worried About AI Killing Everyone
He’s not as far as I can tell, but Ben Landau-Taylor should be, as he writes one of those ‘not about AI but actually about AI’ posts, ‘Why the bureaucrats won’t be toppled.’
I don’t think this is anything like fully right, and it definitely is not complete, but this is one of the important dynamics going on, so consider the implications.
The inevitable result of sufficiently advanced AI is that it becomes the key driver of military power. Either you halt AI progress soon or that is going to happen. Which means, even under maximally human-friendly assumptions that I don’t expect and definitely don’t happen by accident, as in the best possible scenarios? None of the potential outcomes are good. They mostly end with the AIs fully in charge and directing our future, and things going off the rails in ways we already observe in human governments, only vastly more so, in ways even more alien to what we value, and much faster, without the ability to overthrow them or defeat them in a war when things get fully out of hand.
If you know your history, they get fully out of hand a lot. Reasonably often regimes start upending all of life, taking all the resources and directly enslaving, killing or imprisoning large percentages of their populations. Such regimes would design systems to ensure no one could get out of line. Up until recently, we’ve been extremely fortunate that such regimes have been reliably overthrown or defeated, in large part because when you turned against humans you got highly inefficient and also pissed off the humans, and the humans ultimately did still hold the power. What happens when those are no longer constraints?
I always push back hard against the idea that corporations or governments count as ‘superintelligences,’ because they don’t. They’re an importantly different type of powerful entity. But it’s hard to deny, whatever your political persuasion, that our political systems and governments are misaligned with human values, in ways that are spiraling out of control, and where the humans seem mostly powerless to stop this.
The Lighter Side
Yes, this is how it works.
In that order. We’ll still take it.
If you go on YouTube, the video, which is mostly the interview with Eliezer, looks like this:
You’ll be seeing this again when the time is right.
They didn’t do this on the Enterprise, but why didn’t they?
The correct answer to this question, if you are sufficiently confident that this is happening unprompted, is of course ‘permanently suspended’:
A technically better answer would be to let them post, but to have a setting that automatically blocks all such bots, and have it default to being on.