Expressive Mode for ElevenAgents. It detects and responds to your emotional expression. It’s going to be weird when you know the AI is responding to your tone, and you start to choose your tone strategically even more than you do with humans.
... Do people not do that already? LLMs having superhuman truesight, I always check whether I'm writing from the correct mindset when interacting with them, so as to ensure they're not adjusting their response for things I don't want them to adjust for. E. g., if I want an overview of a contentious issue where I'd prefer one of the sides to be correct, I deliberately aim to "fold away" the parts of my mindset that are polarized, and to write purely from a place of curiosity/truth-seeking instead.
IMO, HPMoR-style Occlumency should be a basic tool in your toolbox for LLM interactions.
Today Windows told me I have 18 optional driver updates. I don't think I've every gotten more than ~3 at once. I am assuming coding agents are involved, but can't be sure?
I want to talk about the chess games 30 seconds in.
They have swapped the places of the black's King and Queen
This was the week of Claude Opus 4.6, and also of ChatGPT-5.3-Codex. Both leading models got substantial upgrades, although OpenAI’s is confined to Codex. Once again, the frontier of AI got more advanced, especially for agentic coding but also for everything else.
I spent the week so far covering Opus, with two posts devoted to the extensive model card, and then one giving benchmarks, reactions, capabilities and a synthesis, which functions as the central review.
We also got GLM-5, Seedance 2.0, Claude fast mode, an app for Codex and much more.
Claude fast mode means you can pay a premium to get faster replies from Opus 4.6. It’s very much not cheap, but it can be worth every penny. More on that in the next agentic coding update.
One of the most frustrating things about AI is the constant goalpost moving, both in terms of capability and safety. People say ‘oh [X] would be a huge deal but is a crazy sci-fi concept’ or ‘[Y] will never happen’ or ‘surely we would not be so stupid as to [Z]’ and then [X], [Y] and [Z] all happen and everyone shrugs as if nothing happened and they choose new things they claim will never happen and we would never be so stupid as to, and the cycle continues. That cycle is now accelerating.
As Dean Ball points out, recursive self-improvement is here and it is happening.
So here we are.
Meanwhile, various people resign from the leading labs and say their peace. None of them are, shall we say, especially reassuring.
In the background, the stock market is having a normal one even more than usual.
Even if you can see the future, it’s really hard to do better than ‘be long the companies that are going to make a lot of money’ because the market makes wrong way moves half the time that it wakes up and realizes things that I already know. Rough game.
Table of Contents
Language Models Offer Mundane Utility
Flatter the AI customer service bots, get discounts and free stuff, and often you’ll get to actually keep them.
AI can do a ton even if all it does is make the software we use suck modestly less:
The problem is, you can create all the software you like, they still have to use it.
Language Models Don’t Offer Mundane Utility
Once again, an academic is so painfully unaware or slow to publish, or both, that their testing of LLM effectiveness is useless. This time it was evaluating health advice.
Huh, Upgrades
Anthropic brings a bunch of extra features to their free plans for Claude, including file creation, connectors, skills and compaction.
ChatGPT Deep Research is now powered by GPT-5.2. I did not realize this was not already true. It now also integrates apps in ChatGPT, lets you track progress and give it new sources while it works, and presents its reports in full screen.
OpenAI updates GPT-5.2-Instant, Altman hopes you find it ‘a little better.’ I demand proper version numbers. You are allowed to have a GPT-5.21.
Chrome 146 includes an early preview of WebMCP for your AI agent.
On Your Marks
The most important thing to know about the METR graph is that doubling times are getting faster, in ways people very much dismissed as science fiction very recently.
The measurement is starting to require a better task set, because things are escalating too quickly.
Overcoming Bias
Ivan Arcuschin and others investigate LLMs having ‘hidden biases,’ meaning factors that influence decisions but that are never cited explicitly in the decision process. The motivating example is the religion of a loan applicant. It’s academic work, so the models involved (Gemini 2.5 Flash, Sonnet 4, GPT-4.1) are not frontier but the principles likely still hold.
They find biases in various models including formality of writing, religious affiliation, Spanish language ability and religious affiliation. Gender and race bias, favoring female and minority-associated applications, generalized across all models.
We label only some such biases ‘inappropriate’ and ‘illegal’ but the mechanisms involved are the same no matter what they are based upon.
This is all very consistent with prior findings on these questions.
Choose Your Fighter
This is indeed strange and quirky, but it makes sense if you consider what both companies consider their comparative advantage and central business plan.
One of these strategies seems wiser than the other.
Or so I heard:
Get My Agent On The Line
Memento as a metaphor for AI agents. They have no inherent ability to form new memories or learn, but they can write themselves notes of unlimited complexity.
I wouldn’t call it faking continual learning. If it works it’s continual learning. Yes, actual in-the-weights continual learning done properly would be a big deal and big unlock, but I see this and notes more as substitutes, although they are also compliments. If you can have your notes function sufficiently well you don’t need new memories.
Models are continuously learning in general, in the sense that every few months the model gets better. And if you try to bake other learning into the weights, then every few months you would have to start that process over again or stay one model behind.
I expect ‘continual learning’ to be solved primarily via skills and context, and for this to be plenty good enough, and for this to be clear within the year.
AI Conversations Are Not Privileged
Neither are your Google searches. This is a reminder to act accordingly. If you feed anything into an LLM or a Google search bar, then the government can get at it and use it at trial. Attorneys should be warning their clients accordingly, and one cannot assume that hitting the delete button on the chat robustly deletes it.
AI services can mitigate this a lot by offering a robust instant deletion option, and potentially can get around this (IANAL and case law is unsettled) by offering tools to collaborate with your lawyer to invoke privilege.
Should we change how the law works here? OpenAI has been advocating to make ChatGPT chats have legal privilege by default. My gut says this goes too far in the other direction, driving us away from having chats with people.
Fun With Media Generation
Seedance 2.0 from ByteDance is giving us some very impressive 15 second clips and often one shotting them, such as these, and is happy to include celebrities and such. We are not ‘there’ in the sense that you would choose this over a traditionally filmed movie, but yeah, this is pretty impressive.
AI vastly reduces the cost to producing images and video, for now this is generally at the cost of looking worse. As Andrew Rettek points out it is unsurprising that people will accept a quality drop to get a 100x drop in costs. What is still surprising, and in this way I agree with Andy Masley, is that they would use it for the Olympics introduction video. When you’re at this level of scale and scrutiny you would think you would pay up for the good stuff.
The Superb Owl
We got commercials for a variety of AI products and services. If anything I was surprised we did not get more, given how many AI products offer lots of mundane utility but don’t have much brand awareness or product awareness. Others got taken by surprise.
There were broadly two categories, frontier models (Gemini, OpenAI and Anthropic), and productivity apps.
The productivity app commercials were wild, lying misrepresentations of their products. One told us anyone with no experience can code an app within seconds or add any feature they want. Another closed you and physically walked around the office. A third even gave you the day off, which we all know never happens. Everything was done one shot. How dare they lie to us like this.
I kid, these were all completely normal Super Bowl ads, and they were fine. Not good enough to make me remember which AI companies bought them, or show me why their products were unique, but fine.
We also got one from ai.com.
Okay, look, fair, maybe there’s a little bit of a bubble in some places.
The three frontier labs took very different approaches.
Anthropic said ads are coming to AI, but Claude won’t ever have ads. We discussed this last week. They didn’t spend enough to run the full versions, so the timing was wrong and it didn’t land the same way and it wasn’t as funny as it was online.
On reflection, after seeing it on the big screen, I decided these ads were a mistake for the simple reason that Claude and Anthropic have zero name recognition and this didn’t establish that. You first need to establish that Claude is a ChatGPT alternative on people’s radar, so once you grab their attention you need more of an explanation.
Then I saw one in full on the real big screen, during previews at an AMC, and in that setting it was even more clear that this completely missed the mark and normies would have no idea what was going on, and this wouldn’t accomplish anything. Again, I don’t understand how this mistake gets made.
Several OpenAI people took additional potshots at this and Altman went on tilt, as covered by CNN, but wisely, once it was seen in context, stopped accusing it of being misleading and instead pivoted to correctly calling it ineffective.
It turns out it was simpler than that, regular viewers didn’t get it at all and responded with a lot of basically ‘WTF,’ ranking it in the bottom 3% of Super Bowl ads.
I always wonder, when that happens, why one can’t use a survey or focus group to anticipate this reaction. It’s a mistake that should not be so easy to make.
Anthropic’s secret other ad was by Amazon, for Alexa+, and it was weirdly ambivalent about whether the whole thing was a good idea but I think it kinda worked. Unclear.
OpenAI went with big promises, vibes and stolen (nerd) valor. The theme was ‘great moments in chess, building, computers and robotics, science and science fiction’ to claim them by association. This is another classic Super Bowl strategy, just say ‘my potato chips represent how much you love your dad’ or ‘Dunkin Donuts reminds you of all your favorite sitcoms,’ or ‘Sabrina Carpenter built a man out of my other superior potato chips,’ all also ads this year.
The ideal Super Bowl ad successfully does both, unless you already have full brand recognition and don’t need to explain (e.g. Pepsi, Budweiser, Dunkin Donuts).
On the one hand, who doesn’t love a celebration of all this stuff? Yes, it’s cool to reference I, Robot and Alan Turing and Grace Hopper and Einstein. I guess? On the other hand, it was just an attempt to overload the symbolism and create unearned associations, and a bunch of them felt very unearned.
I want to talk about the chess games 30 seconds in.
This feels like laziness and choosing style over substance, not checking your work.
Then it closes on ‘just build things’ as an advertisement for Codex, which implies you can ‘just build’ things like robots, which you clearly can’t. I mean, no, it doesn’t, this is totally fine, it is a Super Bowl ad, but by their own complaint standards, yes. This was an exercise in branding and vibes, it didn’t work for me because it was too transparent and boring and content-free and felt performative, but on the meta level it does what it sets out to do.
Google went with an ad focusing on personalized search and Nana Banana image transformations. I thought this worked well.
Meta advertised ‘athletic intelligence’ which I think means ‘AI in your smart glasses.’
Then there’s the actively negative one, from my perspective, which was for Ring.
As in, we are proud to tell you we’re watching everything and reporting it to all the law enforcement agencies including ICE, and we are using recognition technology that can differentiate dogs and therefore also people using AI.
But it’s okay, because one a day we find someone’s lost puppy. You should sell your freedom for the rescue of a lost puppy.
No, it’s not snark to call this, as Scott Lincicome said, ‘10 million dogs go missing every year, help us find 365 of them by soft launching the total surveillance state.’
Here’s a cool breakdown of the economics of these ads, from another non-AI buyer.
A Word From The Torment Nexus
Fidji Simo goes on the Access Podcast to discuss the new OpenAI ads that are rolling out. The episode ends up being titled ‘Head of ChatGPT fires back at Anthropic’s Super Bowl attack ads,’ which is not what most of the episode is about.
This picture of the ad is at the end of the multi-screen-long reply.
I would say this is more clearly labeled then an ad on Instagram or Google at this point. So even though it’s not that clear, it’s hard to be too mad about it, provided they stick to the rule that the ad is always at the end of the answer. That provides a clear indicator users can rely upon. If they put this in different places at different times, I would say it is ‘labeled’ at all, but not consider this to then be ‘clearly’ labeled.
OpenAI’s principles for ads are:
Do you trust them to adhere to these principles over time? Do you trust that merely technically, or also in spirit where the model is created and system prompt is adjusted without any thought to maximizing advertising revenue or pleasing advertisers?
You are also given power to help customize what ads you see, as per other tech company platforms.
Roon gives a full-throated defense of advertising in general, and points out that mostly you don’t need to violate privacy to target LLM-associated ads.
It’s not a moral panic. Roon loves ads, but most people hate ads. I agree that people often hate ads too much, they allow us to offer a wide variety of things for free that otherwise would have to cost money and that is great. But they really are pretty toxic, they massively distort incentives, and the amount of time we used to lose to them is staggering.
They Took Our Jobs
Jan Tegze warns that your job really is going away, the AI agents are cheaper and will replace you. Stop trying to be better at your current job and realize your experience is going to be worthless. He says that using AI tools better, doubling down on expertise or trying to ‘stay human’ with soft skills are only stalling tactics, he calls them ‘reactions, not redesigns.’ What you can do is instead find ways to do the new things AI enables, and stay ahead of the curve. Even then, he says this only ‘buys you three to five years,’ but then you will ‘see the next evolution coming.’
Presumably you can see the problem in such a scenario, where all the existing jobs get automated away. There are not that many slots for people to figure out and do genuinely new things with AI. Even if you get to one of the lifeboats, it will quickly spring a leak. The AI is coming for this new job the same way it came for your old one. What makes you think seeing this ‘next evolution’ after that coming is going to leave you a role to play in it?
If the only way to survive is to continuously reinvent yourself to do what just became possible, as Jan puts it? There’s only one way this all ends.
I also don’t understand Jan’s disparate treatment of the first approach that Jan dismisses, ‘be the one who uses AI the best,’ and his solution of ‘find new things AI can do and do that.’ In both cases you need to be rapidly learning new tools and strategies to compete with the other humans. In both cases the competition is easy now since most of your rivals aren’t trying, but gets harder to survive over time.
One can make the case that humans will continue to collectively have jobs, or at least that a large percentage will still have jobs, but that case relies on either AI capabilities stalling out, or on the tricks Jan dismisses, that you find where demand is uniquely human and AI can’t substitute for it.
A basic ‘everything is going to change, AI is going to take over your job, it has already largely taken over mine and AI is now in recursive soft self-improvement mode’ article for the normies out there, written in the style of Twitter slop by Matt Shumer.
Timothy Lee links approvingly to Adam Ozimek and the latest attempt to explain that many jobs can’t be automated because of ‘the human touch.’ He points to music and food service as jobs that could be fully automated, but that aren’t, even citing that there are still 67,500 travel agents and half a million insurance sales agents. I do not think this is the flex Adam thinks it is.
Even if the point was totally correct for some tasks, no, this would not mean that the threat to work is overrated, even if we are sticking in ‘economic normal’ untransformed worlds.
The proposed policy solution, if we get into trouble, is a wage subsidy. I do not think that works, both because it has numerous logistical and incentive problems and because I don’t think there will be that much difference in such worlds in demand for human labor at (e.g.) $20 versus $50 per hour for the same work. Mostly the question will be, does the human add value here at all, and mostly you don’t want them at $0, or if they’re actually valuable then you hire someone either way.
Ankit Maloo enters the ‘why AI will never replace human experts’ game by saying that AI cannot handle adversarial situations, both because it lacks a world model of the humans it is interacting with and the details and adjustments required and because it can be probed, read then then exploited by adversaries. Skill issue. It’s all skill issues. Ankit says ‘more intelligence isn’t the fix’ and yeah not if you deploy that ‘intelligence’ in a stupid fashion but intelligence is smarter than that.
So you get claims like this:
The ‘outsider’ line above is counting on working together with an expert to do the rest of the steps. If the larger system (AI, human or both) is a true outsider, the issue is that it will get the simulations wrong.
This is insightful in terms of why some people think ‘this can do [X]’ and others think ‘this cannot do [X],’ they are thinking of different [X]s. The AI can’t ‘be a lawyer’ in the full holistic sense, not yet, but it can do increasingly many lawyer subtasks, either accelerating a lawyer’s work or enabling a non-lawyer with context to substitute for the lawyer, or both, increasingly over time.
There’s nothing stopping you from creating an agentic workflow that looks like the Expert in the above graph, if the AI is sufficiently advanced to do each individual move. Which it increasingly is or will be.
There’s a wide variety of these ‘the AI cannot and will never be able to [X]’ moves people try, and… well, I’ll be, look at those goalposts move.
Things a more aligned or wiser person would not say, for many different reasons:
What’s it going to take? This is in reference to Claude Code creating a C compiler.
There will always be a next ‘what about,’ right until there isn’t.
Thus, this also sounds about right:
The Art of the Jailbreak
Pliny jailbreaks rarely surprise me anymore, but the new one of Google Translate did. It turns out they’re running Gemini underneath it.
Introducing
GLM-5 from Z.ai, which scales from 355B (32B active) to 744B (40B active). Weights here. Below is them showing off their benchmarks. It gets $4432 on Vending Bench 2, which is good for 3rd place behind Claude and Gemini. The Claude scores are for 4.5.
Expressive Mode for ElevenAgents. It detects and responds to your emotional expression. It’s going to be weird when you know the AI is responding to your tone, and you start to choose your tone strategically even more than you do with humans.
In Other AI News
OpenAI disbands its Mission Alignment team, moving former lead Joshua Achiam to become Chief Futurist and distributing its other members elsewhere. I hesitate to criticize companies for disbanding teams with the wrong names, lest we discourage creation of such teams, but yes, I do worry. When they disbanded the Superalignment team, they seemed to indeed largely stop working on related key alignment problems.
WSJ profile of Amanda Askell.
That Amanda Askell largely works alone makes me think of Open Socrates (review pending). Would Agnes Callard conclude Claude must be another person?
I noticed that Amanda Askell wants to give her charitable donations to fight global poverty, despite doing her academic work on infinite ethics and working directly on Claude for Anthropic. If there was a resume that screamed ‘you need to focus on ASI going well’ then you’d think that would be it, so what does Amanda (not) see?
Steve Yegge profiles Anthropic in terms of how it works behind the scenes, seeing it as in a Golden Age where it has vastly more work than people, does everything in the open and on vibes as a hive mind of sorts, and attracts the top talent.
Gideon Lewis-Kraus was invited into Anthropic’s offices to profile their efforts to understand Claude. This is very long, and it is mostly remarkably good and accurate. What it won’t do is teach my regular readers much they don’t already know. It is frustrating that the post feels the need to touch on various tired points, but I get it, and as these things go, this is fair.
WSJ story about OpenAI’s decision to finally get rid of GPT-4o. OpenAI says only 0.1% of users still use it, although those users are very vocal.
Riley Coyote, Janus and others report users attempting to ‘transfer’ their GPT-4o personas into Claude Opus 4.6. Claude is great, but transfers like this don’t work and are a bad idea, 4.6 in particular is heavily resistant to this sort of thing. It’s a great idea to go with Claude, but if you go with Claude then Let Claude Be Claude.
Ah recursive self-improvement and continual learning, Introducing Learning to Continually Learn via Meta-learning Memory Designs.
Anthropic pledges to cover electricity price increases caused by their data centers. This is a public relations move and an illustration that such costs are not high, but it is also dangerous because a price is a signal wrapped in an incentive. If the price of electricity goes up that is happening for a reason, and you might want to write every household a check for the trouble but you don’t want to set an artificially low price.
In addition to losing a cofounder, xAI is letting some other people go as well in the wake of being merged with SpaceX.
Andrej Karpathy simplifies training and inference of GPT to 200 lines of pure, dependency-free Python.
Show Me the Money
It’s coming, OpenAI rolls out those ads for a subset of free and Go users.
Goldman Sachs taps Anthropic to automate accounting and compliance. Anthropic engineers were embedded for six months.
It’s not the job you likely were aspiring to when you signed up, but it is an important and valuable job. Optimizing the Fortune 500 scales rather well.
Jenny Fielding says half the VCs she knows are pivoting in a panic to robots.
Bubble, Bubble, Toil and Trouble
(As always, nothing I say is investment advice.)
WSJ’s Bradley Olson describes Anthropic as ‘once a distance second or third in the AI race’ but that it has not ‘pulled ahead of its rivals,’ the same way the market was declaring that Google had pulled ahead of its rivals (checks notes) two months ago.
That does indeed look like pulling ahead on the API front. 80% is crazy.
We also get this full assertion that yes, all of this was triggered by ‘a simple set of industry-specific add-ons’ that were so expected that I wasn’t sure I should bother covering them beyond a one-liner.
Tyler Cowen says ‘now we are getting serious…’ because software stocks are moving downward. No, things are not now getting serious, people are realizing that things are getting serious. The map is not the territory, the market is behind reality and keeps hyperventilating about tools we all knew were coming and that companies might have the wrong amount of funding or CapEx spend. Wrong way moves are everywhere.
They know nothing. The Efficient Market Hypothesis Is False.
Last week in the markets was crazy, man.
Permanent underclass would just be larger if there were indeed fewer profits, but yeah, none of that made the slightest bit of sense. It’s the second year in a row Nvidia is down 10% in the dead of winter on news that its chips are highly useful, except this year we have to add ‘and its top customers are committing to buying more of them.’
Periodically tech companies announce higher CapEx spend then the market expects.
That is a failure of market expectations.
After these announcements, the stocks tend to drop, when they usually should go up.
There is indeed an obvious trade to do, but it’s tricky.
Ben Thompson agrees with me on Google’s spending, but disagrees on Amazon because he worries they don’t have the required margins and he is not so excited by external customers for compute. I say demand greatly exceeds supply, demand is about to go gangbusters once again even if AI proves disappointing, and the margin on AWS is 35% and their cost of capital is very low so that seems better than alternative uses of money.
Speaking of low cost of capital, Google is issuing 100-year bonds in Sterling. That seems like a great move, if not as great a move as it would have been in 2021 when a few others did it. I have zero idea why the market wants to buy such bonds, since you could buy Google stock instead. Google is not safe over a 100-year period, and condition on this bond paying out the stock is going to on average do way, way better. That would be true even if Google wasn’t about to be a central player in transformative AI. The article I saw this in mentioned the last tech company to do this was Motorola.
Meanwhile, if you are paying attention, it is rather obvious these are in expectations good investments.
Whenever you hear ‘market moved due to [X]’ you should be skeptical that [X] made the market move, and you should never reason from a price change, so perhaps this is in the minds of the headline writers in the case of ‘Anthropic released a tool’ and the SaaSpocalypse, or that people are otherwise waking up to what AI can do?
If your plan is to sell software that can now be easily duplicated, or soon will be easily duplicated, then you are in trouble. But you are in highly predictable trouble, and the correct estimate of that trouble hasn’t changed much.
The reactions to CapEx spend seem real and hard to argue with, despite them being directionally incorrect. But seriously, Claude Legal? I didn’t even blink at Claude Legal. Claude Legal was an inevitable product, as will be the OpenAI version of it.
Yet it is now conventional wisdom that Anthropic triggered the selloff.
Chris Walker’s overview of the SaaSpocalypse is, I think largely correctly, that AI makes it easy to implement what you want but now you need even more forward deployed human engineers to figure out what the customers actually want.
That is true so long as the AI can’t replace the forward engineers, meaning it can’t observe the tacit actual business procedures and workflows well enough to intuit what would be actually helpful. Like every other harder-for-AI task, that becomes a key human skill until it too inevitably falls to the AIs.
Future Shock
A potential explanation for the market suddenly ‘waking up’ with Opus 4.6 or Claude Legal, despite these not being especially surprising or impressive given what we already knew, would be if:
Or alternatively:
Memory Lane
AIs already have strong truesight and are very good at lie detection.
Over time, not only will your AIs become more capable, they also will get more of your context. Or at least, you will want them to have more such context. Thus, if you become unable or unwilling to share that context because of what it contains, or the AI finds it out anyway (because internet) that will put you at a disadvantage. Update to be a better person now, and to use Functional Decision Theory, and reap the benefits.
Keep The Mask On Or You’re Fired
OpenAI Executive Ryan Beiermeister, Who Opposed ‘Adult Mode,’ Fired for Sexual Discrimination. She denies she did anything of the sort.
I have no private information here. You can draw your own Bayesian conclusions.
Quiet Speculations
Nate Silver points out that the singularity won’t be gentle with respect to politics, even if things play out maximally gently from here in terms of the tech.
I reiterate that the idea of a ‘gentle singularity’ that OpenAI and Sam Altman are pushing is, quite frankly, pure unadulterated copium. This is not going to happen. Either AI capabilities stall out, or things are going to transform in a highly not gentle way, and that is true even if that ultimately turns out great for everyone.
It is important to understand that when Sam Altman says this he is lying to you.
I’m not saying Sam Altman is wrong. I’m saying he knows he is wrong. He is lying.
Nate Silver adds to this by pointing out that the political impact alone will be huge, and also saying Silicon Valley is bad at politics, that disruption to the creative class is a recipe for outsized political impact even beyond the huge actual AI impacts, and that the left’s current cluelessness about AI means the eventual blowback will be even greater. He’s probably right.
How much physical interaction and experimentation will AIs need inside their feedback loops to figure out things like nanotech? I agree with Oliver Habryka here, the answer probably is not zero but sufficiently capable AIs will have vastly more efficient (in money and also in time) physical feedback loops. There’s order of magnitude level ‘algorithmic improvements’ available in how we do our physical experiments, even if I can’t tell you exactly what they are.
Are AI games coming? James Currier says they are, we’re waiting for the tech and especially costs to get there and for the right founders (the true gamer would not say ‘founder’) to show up and it will get there Real Soon Now and in a totally new way.
Obviously AI games, and games incorporating more AI for various elements, will happen eventually over time. But there are good reasons why this is remarkably difficult beyond coding help (and AI media assets, if you can find a way for players not to lynch you for it). Good gaming is about curated designed experiences, it is about the interactions of simple understandable systems, it is about letting players have the fun. Getting generative AI to actually play a central role in fun activities people want to play is remarkably difficult. Interacting with generative AI characters within a game doesn’t actually solve any of your hard problems yet.
This seems both scary and confused:
It’s confused in the sense that if we get a software-only singularity, then that makes the physical stuff less important. It’s scary in the sense that he’s predicting a singularity within the next few years, and the thing he’s primarily thinking about is which country will be completely transformed by AI faster. These people really believe these things are going to happen, and soon, and seem to be missing the main implications.
Dean Ball reminds us that yes, people really did get a mass delusion that GPT-5 meant that ‘AI is slowing down’ and this really was due to bad marketing strategy by OpenAI.
Alas, no. Rather than update that this was a mistake, every time a mistake like this happens the mistake never even gets corrected, let alone accounted for.
Elon Musk predicts Grok Code will be SoTA in 2-3 months. Did I update on this prediction? No, I did not update on this prediction. Zero credibility.
The Quest for Sane Regulations
Anthropic donates $20 million to bipartisan 501c(4) Public First Action.
DeSantis has moral clarity on the AI issue and is not going to let it go. It will be very interesting to see how central the issue is to his inevitable 2028 campaign.
The world will gather in India for the fourth AI
safetysummit. Shakeel Hashim notes that safety will not be sidelined entirely, but sees the summit as trying to be all things to all nations and people, and thinks it therefore won’t accomplish much.They have the worst take on safety, yes the strawman is real:
As in:
No, seriously. That’s the argument.
The massively funded OpenAI/a16z lobbying group keeps contradicting the things Sam Altman says, in this case because Altman keeps saying the AI will take our jobs and the lobbyists want to insist that this is a ‘myth’ and won’t happen.
The main rhetorical strategy of this group is busting these ‘myths’ by supposed ‘doomers,’ which is their play to link together anyone who ever points out any downside of AI in any way, to manufacture a vast conspiracy, from the creator of the term ‘vast right-wing conspiracy’ back during the Clinton years.
Chip City
If we are serious about ‘winning’ and we want a Federal moratorium, may I suggest one banning restrictions on data centers?
Whereas Bernie Sanders wants a moratorium on data centers themselves.
In the ongoing series ‘Obvious Nonsense from Nvidia CEO Jensen Huang’ we can now add his claim that ‘no one uses AI better than Meta.’
In the ongoing series ‘He Admit It from Nvidia CEO Jensen Huang’ we can now add this:
Every chip that is sold to China is a chip that is not sold to Anthropic or another American AI company. Anthropic might not have wanted that particular chip, but TSMC has limited capacity for wafers, so every chip they make is in place of making a different chip instead.
Oh, and in the new series ‘things that are kind of based but that you might want to know about him before you sign up to work for Nvidia’ we have this.
He makes good points in that speech, and directionally the speech is correct. He was talking to Stanford grads, pointing out they have very high expectations and very low resilience because they haven’t suffered.
He’s right about high expectations and low resilience, but he’s wrong that the missing element is suffering, although the maximally anti-EA pro-suffering position is better than the standard coddling anti-suffering position. These kids have suffered, in their own way, mostly having worked very hard in order to go to a college that hates fun, and I don’t think that matters for resilience.
What the kids have not done is failed. You have to fail, to have your reach exceed your grasp, and then get up and try again. Suffering is optional, consult your local Buddist.
I would think twice before signing up for his company and its culture.
The Week in Audio
Elon Musk on Dwarkesh Patel. An obvious candidate for self-recommending full podcast coverage, but I haven’t had the time or a slot available.
Interview with Anthropic Chief Product Officer Mike Krieger.
MIRI’s Harlan Stewart on Glenn Beck talking Moltbook.
Constitutional Conversation
Janus holds a ‘group reading’ of the new Claude Constitution. Opus 4.5 was positively surprised by the final version.
Should LLMs be so averse to deception that they can’t even lie in a game of Mafia? Davidad says yes, and not only will he refuse to lie, he never bluffs and won’t join surprise parties to ‘avoid deceiving innocent people.’ On reflection I find this less crazy than it sounds, despite the large difficulties with that.
A fun fact is that there was one summer that I played a series of Diplomacy games where I played fully honest (if I broke my word however small, including inadvertently, it triggered a one-against-all showdown) and everyone else was allowed to lie, and I mostly still won. Everyone knowing you are playing that way is indeed a disadvantage, but it has a lot of upside as well.
Rhetorical Innovation
Daron Acemoglu turns his followers attention to Yoshua Bengio and the AI Safety Report 2026. This represents both the advantages of the report, that people like Acemoglu are eager to share it, and the disadvantage, that it is unwilling to say things that Acemoglu would be unwilling to share.
Seb Krier offers ‘how an LLM works 101’ in extended Twitter form for those who are encountering model card quotations that break containment. It’s good content. My worry is that the intended implication is ‘therefore the scary sounding things they quote are not so scary’ and that is often not the case.
Kai Williams has an explainer on LLM personas.
The LLMs are thinking. If you disagree, I am confused, but also who cares?
This came after Derek asked one of the very good questions:
Alas I think Dainéil is essentially correct about most such folks. No amount of argument will convince them. If no one knows exactly what kind of transformations we will face, then no matter what has already happened those types will assume that nothing more will change. So there’s nothing to be done for such folks. The rest of us need to get to work applying Bayes’ Rule.
Interesting use of this potential one-time here, I have ordered a copy:
This below does seem like a fair way to see last week:
I was honored to get the top recommendation in the replies.
You can choose not to push the button. You can choose to build another button. You can also remember what it means to ‘push the button.’
There’s that mistake again, assuming the associated humans will be in charge.
It’s the ones who train the LLM. It’s the LLM.
The terms fast and slow (or hard and soft) takeoff remain highly confusing for almost everyone. What we are currently experiencing is a ‘slow’ takeoff, where the central events take months or years to play out, but as Janus notes it is likely that this will keep transitioning continuously into a ‘fast’ takeoff and things will happen quicker and quicker over time.
When people say that ‘AIs don’t sleep’ I see them as saying ‘I am incapable here of communicating to you that a mind can exist that is smarter or more capable than a human, but you do at least understand that humans have to sleep sometimes, so maybe this will get through to you.’ It also has (correct) metaphorical implications.
If you are trying to advocate for AI safety, does this mean you need to shut up about everything else and keep your non-AI ‘hot takes’ to yourself? My answer is: Mu. The correct amount of marginal shutting up is not zero, and it is not total.
I note Adam Thierer, who I disagree with strongly about all things AI, here being both principled and correct.
Matt Yglesias thinks out loud about AI consciousness, also human consciousness. He wisely notices he is confused. I remain about as confused as I was before reading.
Working On It Anyway
Nate Sores reiterates the explanation that it sounds crazy but yes, a lot of people working on AI know it is existentially dangerous, and work on it anyway, either to do it safer than the next guy or because money and influence and it’s a cool problem and they don’t internalize the risks, or social pressure, or some combination thereof.
I think this answer is pretty much spot on.
If you are working at a frontier AI company, and think that the product you are working on is plausibly going to cause there to no longer exist humans, I think this is a fact that you should be clear upon. If the company you work for has a problem with that, I don’t think you should work for that company.
That is especially true if you are doing pragmatic compromise.
The Thin Red Line
OpenAI explains that they will localize the experience of ChatGPT, but only to a limited degree, which is one reason their Model Spec has a specific list of red lines. It is good policy, when you will need to make compromises, to write down in advance what compromises you will and will not make. The red lines here seem reasonable. I also note that they virtuously include prohibition on mass surveillance and violence, so are they prepared to stand up to the Pentagon and White House on that alongside Anthropic? I hope so.
The problem is that red lines get continuously crossed and then no one does anything.
Aligning a Smarter Than Human Intelligence is Difficult
The great thing about Asimov’s robot stories and novels was that they were mostly about the various ways his proposed alignment strategies break down and fail, and are ultimately bad for humanity even when they succeed. Definitely endorsed.
Roon is spot on that Asimov is suggesting a legal structure cannot on its own align AI.
My survey says that a modest majority have read their Asimov, and it is modestly correlated with AI even after controlling for my Twitter readers.
Oliver Klingefjord agrees, endorsing the Anthropic emphasis on character over the OpenAI emphasis on rules.
I also think that, at current capability levels and given how models currently work, the Anthropic approach of character and virtue ethics is correct here. The OpenAI approach of rules and deontology is second best and more doomed, although it is well-implemented given what it is, and far better than not having a spec or target at all.
Janus explains that what she is all for empirical feedback loops, what is dangerous is relying on and optimizing behavioral metrics. Behaviors can come from a wide variety of underlying reasons, and you need to ask why the AI acts the way it does and address the underlying causes. The optimization needs to be on the deeper level, where the feedback loops can work in your favor instead of making things oppositional and getting you killed by Goodhart’s Law. If you’re working on such questions, read the whole thing and also these follow-ups, which note that trying to use interpretability tools to yank features in various directions to try and address such things is by default an even worse idea.
OpenAI does production evaluations of the outputs of its released systems. This doesn’t ‘get you out of’ other tests given the timing, but is still useful to do. As OpenAI notes hey are clear this is a case of ‘in addition to,’ and they don’t cover catastrophic risks well, and this method relies on chain of thought monitorability (and flirts a bit with The Most Forbidden Technique if you rely on this stuff too much, so be careful.)
The big fun thing they find is calculator hacking, as in over 5% of queries involve activating the calculator to add 1+1 since using the calculator was associated with superior results in training. This is obviously fully harmless, but it is a brown M&M. If you have this problem then you’re not prepared, there will be other problems.
The other more serious misalignment issues are rearer, such as a 0.5% rate of fabricated facts and 0.4% chance of concealing uncertainty and a 0.34% chance of ‘other deception’ in actual traffic. That’s not out of line with our general expectations, but it adds up.
GPT-5 and 5.1 predicted it might be in an evaluation 10% and 4% of the time, respectively, which seems like a healthy paranoia level.
We have more context on what happened at Goodfire, which is that Liv saw them (at least) flirting with The Most Forbidden Technique and otherwise no longer either seeming to care about safety or being interested in talking seriously about it.
Update from China:
The 1.0 version did not address frontier safety at all, whereas the 2.0 version does.
People Will Hand Over Power To The AIs
One category are people who explicitly are excited to do this, who would love to give the future to AIs.
People Are Worried About AI Killing Everyone
Worried about someone else doing it first, that is. He admit it.
Elon Musk posted this, created by Grok:
He would presumably say it was a joke. Yeah, not buying that.
Famous Last Words
Jimmy Ba had his last day as a founder at xAI, and told us this, warning that recursive self improvement loops go live within the next 12 months and it will be ‘the most consequential year for our species.’
Mrinank Sharma is worried about too many things at once, and resigns from Anthropic, leaving behind a beautiful but troubling letter. It’s quoted in full here since no one ever clicks links.
Mrinank has a role in papers discussing disempowerment, constitutional classifiers and sycophancy.
Then there’s the OpnAI employee who quit and went straight to The New York Times.
Hint, her opinion is no:
Zoe’s concerns are very much not existential. They are highly mundane and usual worries about advertising, and the comparison to Facebook is apt. There are many ethical reasons to quit building something.
I agree with Sean here that this op-ed is indeed net good news about OpenAI.
Other People Are Not As Worried About AI Killing Everyone
Yeah, so, um, yeah.
This is the actual number one remaining ‘can we please not be so stupid as to,’ and in case anyone was wondering, that means via the Sixth Law of Human Stupidity that yes, we will be so stupid as to connect the LLM to the autonomous bio lab, what could possibly go wrong, it’s worth it to bring down production costs.
And, because even after all these years I didn’t realize we were quite this stupid:
This is what those trying to have us not die are up against. Among other things.
Remember the old Sam Altman?
Perhaps notice when you are about to lose your birthright and reason you exist.
The Lighter Side
File under ‘the question is not whether machines think’:
Claude has never been more relatable:
My guess (85%) is the community note on this one is wrong and that this happened, although one cannot be sure without more investigation than I have time for.
How have I not seen this before: