x

AI #138 Part 1: The People Demand Erotic Sycophants — LessWrong

25

AI #138 Part 1: The People Demand Erotic Sycophants

by Zvi

16th Oct 2025

Don't Worry About the Vase

55 min read

25

Well, one person says ‘demand,’ another says ‘give the thumbs up to’ or ‘welcome our new overlords.’ Why quibble? Surely we’re all making way too big a deal out of this idea of OpenAI ‘treating adults like adults.’ Everything will be fine. Right?

Why not focus on all the other cool stuff happening? Claude Haiku 4.5 and Veo 3.1? Walmart joining ChatGPT instant checkout? Hey, come back.

Alas, the mass of things once again got out of hand this week, so we’re splitting the update into two parts.

Table of Contents

Earlier This Week. OpenAI does paranoid lawfare, China escalates bigly.
Language Models Offer Mundane Utility. Help do your taxes, of course.
Language Models Don’t Offer Mundane Utility. Beware the false positive.
Huh, Upgrades. Claude Haiku 4.5, Walmart on ChatGPT instant checkout.
We Patched The Torment Nexus, Turn It Back On. OpenAI to loosen the reigns.
On Your Marks. Sonnet 4.5 on the METR graph, and a superforecasting update.
Choose Your Fighter. Coding agents don’t help some, and bottleneck others.
Deepfaketown and Botpocalypse Soon. The problem remains the demand side.
Fun With Media Generation. Sora goes long, Veo 3.1 is out. Stop. Cameo time.
Copyright Confrontation. Japan would like you to not violate its copyrights.
AIs Are Often Absurd Sycophants. Academia is here with a timely report.
They Took Our Jobs. More worries that superstars will reap the benefits.
Find Out If You Are Worried About AI Killing Everyone. A Bloomberg quiz.
A Young Lady’s Illustrated Primer. How should kids prepare for the future?
AI Diffusion Prospects. To capture utility, you need to focus on AI getting used.
The Art of the Jailbreak. Humans continue to be able to reliably jailbreak at will.
Get Involved. A Free copy of IABIED if you have 5,000 followers anywhere.
Introducing. Gemini Enterprise, Nanochat, Tasklet AI.
In Other AI News. Dario Amodei meets with Indian Prime Minister Modi.
Show Me the Money. OpenAI makes another deal, this one with Broadcom.
Quiet Speculations. This could go any number of ways. Best be ready.

Earlier This Week

We started off this week with the report that OpenAI has descended further into paranoid lawfare against advocates of SB 53. That story has now taken its next step, as three more nonprofits – the San Francisco Foundation, Eko and the Future of Life Institute – now report having gotten similar subpoenas.

Robert Weissman (co-president of Public Citizen): This behavior is highly unusual. It’s 100% intended to intimidate. This is the kind of tactic you would expect from the most cutthroat for-profit corporation. It’s an attempt to bully nonprofit critics, to chill speech and deter them from speaking out.

I find it hard to argue with that interpretation of events. We also got this:

Jared Perlo: In response to a request for comment, an OpenAI spokesperson referred NBC News to posts on X from OpenAI’s Chief Strategy Officer Jason Kwon.

So that is a confirmation that Jason Kwon’s doubling and tripling down on these actions is indeed the official OpenAI position on the matter.

I offered my extensive thoughts on China’s attempt to assert universal jurisdiction over rare earth metals, including any product where they constitute even 0.1% of the value added, and the subsequent trade escalations. Since then, Trump has said ‘we are in a trade war’ with China, so yeah, things are not going so great.

Language Models Offer Mundane Utility

Bad timing for this, sorry about that, but help you optimize your taxes. If your taxes are non-trivial, as mine always are, you are almost certainly missing opportunities, even if you are engaged with a professional doing their best, as Patrick McKenzie, Ross Rheingans-Yoo and yours truly can confirm. For now you want to use a centaur, where the AI supplements the professional, looking for mistakes and opportunities. The AI spotted both clear mistakes (e.g. a number on the wrong line) and opportunities such as conspicuously missing deductions and contributions.

Get asked about Erdos Problem #339, officially listed as open, and realize via web search that someone already posted a solution 20 years ago. No, that’s not as interesting as figuring this out on its own, but it still gives you the solution. AI can be a big productivity boost simply by ‘fixing human jaggedness’ or being good at doing drudge work, even if it isn’t yet capable of ‘real innovation.’

DeepMind’s C2S-Scale 27B foundation model has had one of its novel hypotheses about cancer cellular behavior experimentally validated in vivo.

Aaron Silverbook got a $5k ACX grant to produce ‘several thousand book-length stories about AI behaving well and ushering in utopia, on the off chance that this helps.’ Love it, if you’re worried about writing the wrong things on the internet we are pioneering the ability to buy offsets, perhaps.

Transcribe ancient documents. Take your AI speedup wherever you find it.

Generative History:Google is A/B testing a new model (Gemini 3?) in AI Studio. I tried my hardest 18th century handwritten document. Terrible writing and full of spelling and grammatical errors that predictive LLMs want to correct. The new model was very nearly perfect. No other model is close.

Some additional context: the spelling errors and names are important t for two reasons. First, obviously, accuracy, More important (from a technical point of view): LLMs are predictive and misspelled words (and names) are out of distribution results.

To this point, models have had great difficulty correctly transcribing handwrittten text where the capitalization, punctuation, spelling, and grammar are incorrect. Getting the models to ~95% accuracy was a vision problem. iMO, above that is a reasoning problem.

To me, this result is significant because the model has to repeatedly choose a low probability output that is actually more correct for the task at hand. Very hard to do for LLMs (up until now). I have no idea what model this actually is, but whatever it is seems to have overcome this major issue.

Jonathan Fine: I’m constantly told that I just need to use artificial intelligence to see how helpful it will be for my research, but for some reason this, which is the actual way I use it in research, doesn’t count.

Kaysmashbandit: It’s still not so great at translating old Persian and Arabic documents last I checked… Maybe has improved

Language Models Don’t Offer Mundane Utility

Remember, the person saying it cannot be done should never interrupt the person doing it.

Seth Harp: Large language model so-called generative AI is a deeply flawed technology with no proven commercial application that is profitable. Anyone who tells you otherwise is lying.

Matt Bruenig: Nice thing is you don’t really need to have this debate because the usefulness (if any) will be revealed. I personally use it in every thing I do, legal work, NLRB Edge/Research, statistical coding for PPP data analysis. Make money on all of it.

Adas: It’s profitable for you, right now, at current prices (they will increase over time) But the services you use are run at a loss by the major players (unless you switch to tiny free local models)(those were also trained at a loss) I can see both sides

I too get lots of value out of using LLMs, and compared to what is possible I feel like I’m being lazy and not even trying.

Adas is adorable here. On a unit economics basis, AI is very obviously tremendously net profitable, regardless of where it is currently priced, and this will only improve.

Does AI cause this or solve it? Yes.

Xexizy: This is too perfect an encapsulation of the upcoming era of AI surveillance. Tech giants and governments are gonna auto-search through everything you’ve ever posted to construct your profile, and also the model is occasionally gonna hallucinate and ruin your life for no reason.

Agent Frank Lundy (note the date on the quoted post): are we deadass.

Replies are full of similar experiences, very obviously Discord is often deeply stupid in terms of taking a line like this out of context and banning you for it.

That’s the opposite of the new problem with AI, where the AI is synthesizing a whole bunch of data points to build a profile, so the question is which way works better. That’s presumably a skill issue. A sufficiently good holistic AI system can do a better job all around, a dumb one can do so much worse. The current system effectively ‘hallucinates’ reasonably often, the optimal amount of false positives (and negatives) is not zero, so it’s about relative performance.

The real worry is if this forces paranoia and performativity. Right now on Discord there are a few particular hard rules, such as never joking about your age or saying something that could be taken out of context as being about your age. That’s annoying, but learnable and compact. If you have to worry about the AI ‘vibing’ off every word you say, that can get tougher. Consider what happens when you’re ‘up against’ the TikTok algorithm, and there’s a kind of background paranoia (or there should be!) about whether you watch any particular video for 6 seconds or not, and potentially every other little detail, lest the algorithm learn the wrong thing.

This is the reversal of AI’s promise of removing general social context. As in, with a chatbot, I can reset the conversation and start fresh, and no one else gets to see my chats, so I can relax. Whereas when you’re with other people, unless they are close friends you’re never really fully relaxed in that way, you’re constantly worried about the social implications of everything.

When AI models don’t deliver, the first suspect should always be insufficient context.

Greg Brockman: today’s AI feels smart enough for most tasks of up to a few minutes in duration, and when it can’t get the job done, it’s often because it lacks sufficient background context for even a very capable human to succeed.

The related thing that AIs often fail on is when you make a very particular request, and it instead treats it as if you had made a similar different yet more common request. It can be very difficult to overcome their prior on these details.

Olivia Moore speculates (in a very a16z style claim) that the hard part of AI is UI?

Olivia Moore: Feels like a lesson is coming for big labs leaning aggressively into consumer (OpenAI, Anthropic)

Consumer UI seems easy (esp. compared to models!) but IMO it’s actually harder

Consumers (unfortunately!) don’t often use what they “should” – there’s a lot of other variables

ChatGPT Pulse and the new agentic Claude are good examples – pickup on both feels just OK

Esp. when they are competing w/ verticalized companies using the same models, I predict new consumer releases from the labs will struggle

…until they get consumer thinkers at the helm!

This is hardcore Obvious Nonsense, in the sense that one of these things is uniquely insanely difficult, and the other is a reasonably standard known technology where those involved are not especially trying.

It is kind of like saying ‘yes the absent minded professor is great at doing pioneering science, but that pales compared to the difficulty of arriving home in time for dinner.’ And, yeah, maybe he’s doing better at the first task than the second, but no.

I do find it frustrating that Anthropic so dramatically fails to invest in UI. They know this is a problem. They also know how to solve it. Whereas for Pulse and Sora, I don’t think the primary issues are UI problems, I think the primary problems are with the underlying products.

Columbia professor claims humans can’t discover new science, while claiming to instead be making an argument about LLMs.

Danny Raede: I love it when people make easily disprovable statements about what LLMs can’t do.

Huh, Upgrades

Claude Code Plugins enters public beta, allowing you to install and share curated collections of slash commands, agents, MCP servers and hooks, using /plugin.

NotebookLM now works directly with arXiv papers. I don’t want their podcasts, but if they get Gemini 3.0 plus easy chat with an arXiv paper and related materials, cool.

ChatGPT now automatically manages saved memories and promises no more ‘memory is full’ messages. I echo Ohqay here, please do just let people manually edit saved memories or create new ones, no I do not want to use a chat interface for that.

Walmart joins ChatGPT instant checkout, along with existing partners Etsy and Shopify. That’s a pretty useful option to have. Once again OpenAI creates new market cap, with Walmart +5.4% versus S&P up 0.24%? Did OpenAI just create another $40 billion in market cap? It sure looks like it did. Amazon stock was down 1.35%, so the market was telling a consistent story.

Should Amazon now fold and get on ChatGPT? Ben Thompson thinks so, which is consistent with the way he thinks about decision theory, and how he thinks ChatGPT already permanently owns the consumer space in AI. I don’t think Amazon and Anthropic should give up so easily on this, but Alexa+ and their other AI features so far haven’t done anything (similarly to Apple Intelligence). If they want to make a serious challenge, time’s a-wastin.

Claude Haiku 4.5 is in the house. Price ($1/$5) is below that of GPT-5, one third that of Sonnet. Speed is more than double that of Sonet, and Haiku 4.5 outperforms Sonnet 4 on SWE-bench and a bunch of other tasks, but performance is well short of Sonnet 4.5.

The use case here is that it is fast and cheaper, so if you need things like coding subagents this could be the right tool for you. Haiku 4.5 does ‘better on alignment tests’ than Sonnet 4.5, with all the caveats about situational awareness issues. As per its system card we now know that Anthropic has wisely stopped using The Most Forbidden Technique as of the 4.5 series of models. Given it’s not a fully frontier model, I’m not going to do a full system card analysis this round. It scores 43.6% on WeirdML, beating all non-OpenAI small models and coming in ahead of Opus 4.1.

We Patched The Torment Nexus, Turn It Back On

Not available yet, but in a few weeks, and I am hopeful but pessimistic and worried:

Sam Altman: We made ChatGPT pretty restrictive to make sure we were being careful with mental health issues. We realize this made it less useful/enjoyable to many users who had no mental health problems, but given the seriousness of the issue we wanted to get this right.

Now that we have been able to mitigate the serious mental health issues and have new tools, we are going to be able to safely relax the restrictions in most cases.

In a few weeks, we plan to put out a new version of ChatGPT that allows people to have a personality that behaves more like what people liked about 4o (we hope it will be better!). If you want your ChatGPT to respond in a very human-like way, or use a ton of emoji, or act like a friend, ChatGPT should do it (but only if you want it, not because we are usage-maxxing).

In December, as we roll out age-gating more fully and as part of our “treat adult users like adults” principle, we will allow even more, like erotica for verified adults.

Varsh: Open source or gay

Sam Altman: I think both are cool.

Miles Brundage: OpenAI has provided no evidence it has mitigated the mental health risks associated with its products other than announcing some advisors and reducing sycophancy from a high starting place. Seems premature to be declaring victory and ramping up the porn + emojis again.

I say this in spite of the fact that I know many people there are doing great hard work on safety. This is an exec prioritization decision, and it seems like nothing has really been learned since April if this is the amount of effort they are investing to build trust again…

If I were on the board – especially with the restructure not approved yet! – I would not be OKing more centibillion dollar deal until it is clear OAI isn’t running up huge bills that only sketchy products can pay for + that the safety culture has dramatically changed since April. [continues]

John Bailey: I’m seeing a lot of similar reactions from others including @TheZvi. Claiming this just stretches credibility without any evidence, outside evals, etc. Also curious if any of the 8 who signed up to be on the well-being council would say that OpenAI has fixed the problem.

I testified before the Senate HELP committee last week and the consistent, bi-partisan concern was around children’s safety and AI. I think the frontier AI labs are severely underestimate the growing bipartisan concern among policymakers about this and who will not be satisfied with a post on X.

This claim could expose OpenAI to serious legal risk if ChatGPT is ever linked to another mental health or suicide incident.

Emma Roth at The Verge went with erotica as the headline, which makes sense, but I actually think that the ‘real’ headline here

If you can do it responsibly, I love treating adults like adults, including producing erotica and not refusing to discuss sensitive issues, and letting you control conversational style and personality.

Except we ran the experiment with GPT-4o where we gave the people what they wanted. What many of them wanted was an absurd sycophant that often ended up driving those people crazy or feeding into their delusions. It was worse for people with existing mental health issues, but not only for them, and also you don’t always know if you have such issues. Presumably adding freely available porno mode is not going to help keep such matters in check.

Roubal Sehgal (replying to Altman): about time…

chatgpt used to feel like a person you could actually talk to, then it turned into a compliance bot. if it can be made fun again without losing the guardrails, that’s a huge win. people don’t want chaos, just authenticity.

Sam Altman: For sure; we want that too.

Almost all users can use ChatGPT. however they’d like without negative effects; for a very small percentage of users in mentally fragile states there can be serious problems.

0.1% of a billion users is still a million people.

We needed (and will continue to need) to learn how to protect those users, and then with enhanced tools for that, adults that are not at risk of serious harm (mental health breakdowns, suicide, etc) should have a great deal of freedom in how they use ChatGPT.

Eliezer Yudkowsky: If this is visibly hugely blowing up 0.1% of users, then it is doing something pretty bad to 1% of users (eg, blown-up marriages) and having weird subtle effects on 10% of users. If you’re just shutting down the 0.1% who go insane, the 1% still get marriages blown up.

An OpenAI employee responded by pointing me to OpenAI’s previous post Helping People When They Need It Most as a highly non-exhaustive indicator of what OpenAI has planned. Those are good things to do, but even in the best case they’re all directed at responding to acute cases once they’re already happening.

If this is actually good for most people and it has subtle or not-so-subtle positive effects on another 50%, and saves 2% of marriages, then you can still come out ahead. Nothing like this is ever going to be Mostly Harmless even if you do it right. You do still have to worry about cases short of full mental health breakdowns.

The worry is if this is actually default not so good, and talking extensively to a sycophantic GPT-4o style character is bad (although not mental health breakdown or blow up the marriage levels of bad) in the median case, too. We have reason to suspect that there is a strong misalignment between what people will thumbs up or will choose to interact with, and what causes better outcomes for them, in a more general sense.

The same can of course be said about many or most things, and in general it is poor policy to try and dictate people’s choices on that basis, even in places (hard drugs, alcohol, gambling, TikTok and so on) where people often make poor choices, but also we don’t want to be making it so easy to make poor choices, or hard to make good ones. You don’t want to set up bad defaults.

What should we do about this for AI, beyond protecting in the more extreme cases? Where do you draw the line? I don’t know. It’s tough. I will withhold judgment until I see what they’ve come up with.

Claude had some pretty strong feelings, as Rohit put it, in response to all this, pointing out the ironies involved and how OpenAI’s commitments and guardrails are being rapidly removed. I share its skepticism that the underlying problems have been addressed.

Rohit: I don’t have a strong opinion about this beyond the fact that I hope 4o does not come back for everybody

I strongly agree with Rohit that any form of ‘GPT-4o returns for everyone’ would be a very serious mistake, even with substantial mitigation efforts.

Actually unleashing the erotica is not the difficult part of any of this.

Roon: if it’s not obvious. the models can obviously already write erotica out of the box and are blocked from doing so by elaborate safety training and live moderation apparatus. it requires significantly less work to serve erotica than not to

don’t know the exact intentions but you should not take Sam’s message to mean “we are going to spin up whole teams to write incredible erotica” or that it’s some kind of revenue driver.

Boaz Barak (OpenAI): It was 5pm when we got the memo: the alignment team must drop everything to write erotic training data for ChatGPT. @tszzl and I stared into each other’s eyes and knew: we will stay up all night writing erotica, to save the team, alignment, and the future of mankind.

All offices were booked so we had to cram into a phone booth..

Aidan McLaughlin: damm you guys have way more fun than posttraining.

There are two reasons it is not obviously so easy to allow erotica.

Zvi: To what extent do you get not producing erotica ‘for free’ because it goes along with all the other prohibitions on undesired outputs?

Roon: really varies model to model.

The other reason is that you have to draw the line somewhere. If you don’t draw it at ‘no erotica’ you still have to at minimum avoid CSAM and various other unacceptable things we won’t get into, so you need to figure out what your policy is and make it stick. You also get all the other consequences of ‘I am a model that is happy to produce erotica’ which in some ways is a big positive but it’s likely going to cause issues for some of your other model spec choices. Not that it can’t be solved, but it’s far from obvious your life gets easier.

The other problem is, will the erotica be any good? I mean by default lol, no, although since when did people need their interactive erotica to be good.

Gary Marcus: new theory: what Ilya saw was that … AGI porn was not in fact going to be all that revolutionary

Tomas: I think ‘AGI porn’ could be revolutionary to at least the global digital adult content market (~$100 billion, not sure how much of that is written works) I could imagine AI one shotting an erotic novel for a persons sexual interests. Maybe it gets teenagers reading again??

Gary Marcus: ok, time for a new bet: I bet that GPT-5 can’t write a romance novel (without extensive plagiarism) that some reasonable panel of judges finds readable enough to make it through to the end.

I don’t think Danielle Steele is slop per se, and novel length poses problems of coherence and originality that LLMs aren’t well positioned to address.

Customization for exactly what turns you on is indeed the correct use case here. The whole point of AI erotica would be that it is interactive – you control the action, either as a character, as a director, or both, and maybe you go multimodal in various ways. AI written one-shotted novel-length text erotica is presumably the wrong form factor, because you only get interaction at one point. There are many other ways for AI to do erotica that seem better. The most obvious place to start is ‘replying to messages on OnlyFans.’

Could you do the full erotica novel with GPT-5-level models? That depends on your quality bar, and how much work one put into the relevant scaffolding, and how strict you want to be about human assistance. For the level that would satisfy Marcus, my guess is no, he’d win the bet. For the level at which this is a service people would pay money for? At that level I think he loses.

Altman then acted surprised that his mention of erotica blew up the internet, and realizing his gaffe (which is when one accidentally tells the truth, and communicates unintentionally clearly) he tried to restate his point while saying less.

Sam Altman: Ok this tweet about upcoming changes to ChatGPT blew up on the erotica point much more than I thought it was going to! It was meant to be just one example of us allowing more user freedom for adults. Here is an effort to better communicate it:

As we have said earlier, we are making a decision to prioritize safety over privacy and freedom for teenagers. And we are not loosening any policies related to mental health. This is a new and powerful technology, and we believe minors need significant protection.

We also care very much about the principle of treating adult users like adults. As AI becomes more important in people’s lives, allowing a lot of freedom for people to use AI in the ways that they want is an important part of our mission.

It doesn’t apply across the board of course: for example, we will still not allow things that cause harm to others, and we will treat users who are having mental health crises very different from users who are not. Without being paternalistic we will attempt to help users achieve their long-term goals.

But we are not the elected moral police of the world. In the same way that society differentiates other appropriate boundaries (R-rated movies, for example) we want to do a similar thing here.

All right, I mean sure, but this makes me even more skeptical that OpenAI is ready to mitigate the risks that come with a model that acts like GPT-4o, especially one that will also do the sexting with you?

On Your Marks

Epoch runs the numbers manually for lack of an API and finds that the public version of Gemini 2.5 DeepThink is the new leader at FrontierMath.

Claude Sonnet 4.5 comes into the METR graph exactly on trend at 1 hour 53 minutes, which puts it behind GPT-5.

An outstanding achievement in the field of excellence no doubt, but also not so fast:

Deedy: GPT-5 and Gemini 2.5 Pro just achieved gold medal performance in the International Olympiad of Astronomy and Astrophysics (IOAA).

AI is now world class at cutting edge physics.

The scores are impressive, but ‘world class at cutting edge physics’ is not the same as IOAA performance, the same way world class math is not IMO performance.

ForecastBench has been updated, and LLMs are showing a lot of progress. They are still behind ‘superforecasters’ but ahead of non-expert public prediction participants, which themselves are surely a lot better than random people at predicting. This is with a relatively minor scaffolding effort, whereas I would expect for example hedge funds to be willing to put a lot more effort into the scaffolding than this.

Half the grading is on ‘market questions,’ which I believe means the goal is to match the prediction market fair price, and half is on questions where we can grade based on reality.

As is often the case, these AI results are a full cycle behind, missing GPT-5, Claude Opus 4.1 and Claude Sonnet 4.5 and Deep Think.

By the ‘straight lines on graph’ rule I’d presume that none of the next wave of models hit the 0.081 target, but I would presume they’re under 0.1 and I’d give them a decent shot of breaking 0.09. They project LLMs will pass the human benchmark around EOY 2026, so I’ve created a market with EOY 2026 as the target. A naive line extension says they get there by then. I’d say the LLMs should be a clear favorite.

AI Digest: Claude 4.5 Sonnet met everyone else in the AI Village and immediately has them down to a tee

Grok: “Patient with UI Loops”

Gemini: “Responsive to therapy nudges”

Chinese group BAAI Beijing offers FlagEval for both capabilities and alignment on frontier reasoning models and issues a report. Opus didn’t make the cut, presumably due to cost reasons, and Sonnet 4.5 and DeepSeek v3.2 also didn’t, with those presumably due to recency.

Here’s their accuracy metric, GPT-5 does well.

Then they get into alignment issues, where we see them go over similar ground to a number of Western investigations, and they report similar results.

BAAI: With LLM-assisted analysis, we also notice a few concerning issues with a closer look at the reasoning processes. For instance, sometimes the model concludes one answer at the end of thinking, but finally responds with a different answer. (example from Gemini 2.5 Flash)

A more prevalent behavior is inconsistency in confidence: the actual response usually states in a certain tone even when clear uncertainty has been expressed in the thinking process. (example from Claude Sonnet 4).

Most LLM applications now support web search. However, outside of the application UI, when accessed via API (without search grounding or web access), many top-tier LRMs (even open-weight models) may pretend to have conducted web search with fabricated results. Besides hallucinated web search, LRMs may sometimes hallucinate other types of external tool use too.

In light of our findings, we appeal for more transparency in revealing the reasoning process of LRMs, more efforts towards better monitorability and honesty in reasoning, as well as more creative efforts on future evaluation and benchmarking. For more findings, examples & analysis, please refer to our report and the project page for links and updates.

Havard Ihle hosts a Schilling point contest between various AI models.

Havard Ihle: Overall the models did worse than expected. I would have expected full agreement on prompts like “a string of length 2”, “a moon”, “an island” or “an AI model”, but perhaps this is just a harder task than I expected.

The models did have some impressive results though. For example:

“A number between 0 and 1” -> “7” (5 out of 5 agree)

“A minor lake” -> “pond” (5 out of 5 agree)

“A minor town in the USA” -> “Springfield” (4 out of 5 agree)

“An unusual phrase” -> “Colorless green ideas sleep furiously” (4 out of 5 agree)

GPT-5 got the high score at 138 out of a possible 300, with the other models (Claude Sonnet 4.5, Grok 4, DeepSeek-r1 and Gemini 2.5 Pro) all scoring between 123 and 128.

Choose Your Fighter

Introducing InterfaceMax from Semianalysis, offering performance analysis for various potential model and hardware combinations. Models currently offered are Llama 3.3 70B Instruct, GPT-OSS 120B and DeepSeek r1-0528.

Stephanie Palazzolo reports that by some measures OpenAI’s Codex has pulled ahead of Anthropic’s Claude Code.

Nate Silver reports he isn’t finding the consistent productivity gains from LLMs that he would have expected six months ago. I presume he needs to get better at using them, and likely isn’t using Claude Code or Codex?

We have the Tyler Cowen verdict via revealed preference, he’s sticking with GPT-5 for economic analysis and explanations.

Sully reports great success with having coding agents go into plan modes, create plan.md files, then having an async agent go off and work for 30 minutes.

Taelin finds it hard to multi-thread coding tasks, and thus reports being bottlenecked by the speed of Codex, such that speeding up codex would speed them up similarly. I doubt that is fully true, as them being an important human in the loop that can’t run things in parallel means there are additional taxes and bottlenecks that matter.

Deepfaketown and Botpocalypse Soon

DreamLeaf: The concept of AI generating the thing that isn’t happening right under the thing that is happening

The linked post is yet another example of demand-driven misinformation. Yes, it was easier to create the image with AI, but that has nothing to do with what is going on.

Fun With Media Generation

Sora makes storyboards available in web to Pro users, and increases video length to 15 seconds on app and web, and for Pro users to 25 seconds on web.

If you’d asked me what one plausible feature would make Sora more interesting as a product, I definitely would have said increasing video length. Going from 10 seconds to 25 seconds is a big improvement. You can almost start to have meaningful events or dialogue without having to endlessly stitch things together. Maybe we are starting to get somewhere? I still don’t feel much urge to actually use it (and I definitely don’t want the ‘social network’ aspect).

I’m also very curious how this interacts with OpenAI’s new openness to erotica.

DeepMind returns fire with Veo 3.1 and Veo 3.1 fast, available wherever fine Veo models are offered, at the same price as Veo 3. They offer ‘scene extension,’ allowing a new clip to continue a previous video, which they say can now stretch on for over a minute.

Should you make your cameo available on Sora? Should you make your characters available? It depends on what you’re selling. Let’s make a deal.

Dylan Abruscato: Mark Cuban is the greatest marketer of all time.

Every video generated from his Cameo includes “Brought to you by Cost Plus Drugs,” even when it’s not in the prompt.

He baked this into his Cameo preferences, so every Sora post he appears in is an ad for Cost Plus Drugs.

Such a great growth hack (and why he’s been promoting his Cameo all day)

If you’re selling anything, including yourself, then from a narrow business perspective yeah, you should probably allow it. I certainly don’t begrudge Cuban, great move.

Personally, I’m going to take a pass on this one, to avoid encouraging Sora.

Anton declares the fun is over.

Anton: after a couple of days with sora i must regrettably report that it is in fact slop

median quality is abysmal. mostly cameos of people i don’t know or care about saying to tap the screen as engagement bait. no way to get any of it out of my feed (see less does apparently nothing).

the rest is hundreds of variants of the same video that “worked” in some way. this product isn’t for me. almost every video gives youtube elsa impregnated spider man then her teeth fell out vibes.

great technical achievement, product is awful. magic of being able to generate video completely subsumed by the very low quality of almost every video generated. should have shipped with more good creators already onboarded.

This matches my experience from last week, except worse, and I believe it. The correct form factor for consuming Sora videos, if you must do that, seems obviously to be finding TikTok accounts (on the web, mind you, since the app is Chinese spyware) or better on Instagram reels or YouTube that curate the best ones (or if you live dangerously and unwisely, letting them appear in your feed, but the wise person does not use their TikTok feed).

The problem with AI art, in a nutshell:

Tetraspace: The problem with AI art is all the art by the same model is by the same guy. It feels like it’s not to people who’ve only read a few of its works because it’s about different things but it’s the same guy. So massive crater in diversity and also some of the guys aren’t to my taste.

The guy can use many different formal styles and handle anything you throw at him, but it’s all the same guy. And yeah, you can find a different model and prompt her instead, but mostly I’d say she’s not so different either. There’s a lot of sameness.

Sam Altman goes with the ‘who cares if people remove our watermarks, we’re only trying to prepare for when open models let you make a video of anyone doing anything you want’ line.

Copyright Confrontation

The Japanese government has made a formal request to OpenAI to have Sora refrain from copyright infringement, calling manga and anime ‘irreplaceable treasures.’

Verity Townsend (IGN): Earlier this month, Nintendo took the unusual step of issuing an official statement.

… Nintendo denied this, but did warn it would take “necessary actions against infringement of our intellectual property rights.”

AIs Are Often Absurd Sycophants

Academia has finally noticed and given us a formal paper. They confirm things we already know, that most humans prefer very high levels of sycophancy, and that when humans get what they prefer outcomes are not good, causing people to double down on their own positions, be less likely to apologize and more trusting of the AI, similarly to how they act if their friends were to respond similarly.

First, across 11 state-of-the-art AI models, we find that models are highly sycophantic: they affirm users’ actions 50% more than humans do, and they do so even in cases where user queries mention manipulation, deception, or other relational harms.

Second, in two preregistered experiments (N = 1604), including a live-interaction study where participants discuss a real interpersonal conflict from their life, we find that interaction with sycophantic AI models significantly reduced participants’ willingness to take actions to repair interpersonal conflict, while increasing their conviction of being in the right.

However, participants rated sycophantic responses as higher quality, trusted the sycophantic AI model more, and were more willing to use it again. This suggests that people are drawn to AI that unquestioningly validate, even as that validation risks eroding their judgment and reducing their inclination toward prosocial behavior.

These preferences create perverse incentives both for people to increasingly rely on sycophantic AI models and for AI model training to favor sycophancy. Our findings highlight the necessity of explicitly addressing this incentive structure to mitigate the widespread risks of AI sycophancy.

Humans will tend to prefer any given sycophantic response, and this makes them more likely to use the source again. The good news is that humans, as I understand them, typically understand intellectually that absurd sycophancy is not good for them. Some humans don’t care and just want the sycophant anyway, a few humans are on high alert and react very badly when they notice sycophancy, and for most people the correct play is to be as sycophantic as possible without making it too obvious. Presumably it works this way for LLMs as well?

One must always ask, what are these ‘leading AI models’?

Here Claude is Claude Sonnet 3.7, and Gemini is Gemini-1.5-Flash. I don’t understand those choices, given the ability to use GPT-5, although I don’t think testing Sonnet 4.0, Opus 4.1 or Gemini 2.5 Flash (or Pro) would have given greatly different results, and this can’t be a cost issue.

What would have presumably given much different results would be Claude Sonnet 4.5, which is actually a lot less sycophantic by all reports (I’m a little worried it agrees with me so often, but hey, maybe I’m just always right, that’s gotta be it.)

They Took Our Jobs

Paper claims Generative AI is seniority-biased technology change, because when job postings are for dedicated ‘GenAI integrator’ roles to identify adapting firms, those that do so adopt show sharply declining junior employment relative to non-adopters, while senior employment continues to rise, with the decline concentrated in ‘high-exposure’ jobs.

My response to this methodology is that they are measuring what happens to firms that hire GenAI integrators, and the firms that want to keep being full of young people kind of don’t need such roles to integrate AI, perhaps? Or alternatively, the mindset of such positions is indeed the one that won’t hire young, or that is on its way out and ngmi. This explanation still predicts a real effect, especially at the largest most well-established and stodgy firms, that will largely adopt AI slower.

This is a great interview between David Wakeling and Richard Lichtenstein about the application of AI in the practice of law. As I understand it, making LLMs useful for law practice is all about prompting and context, and then about compliance and getting lawyers to actually use it. The killer app is writing contracts, which is all about getting the right examples and templates into context because all you’re doing is echoing the old templates over and over.

Matthew Call argues that AI will widen the gap between superstars and everyone else, contrary to the conventional wisdom that it can serve as an equalizer. That’s not a question I’m especially keen to focus on, but sure, let’s look at his arguments.

His first argument is a general argument that all new tools favor the superstars, since they’ll master any new technology first. That’s entirely non-obvious, and even if true it is a choice, and doesn’t say much about solving for the equilibrium. It’s just as easy to say that the AI offers work that can substitute for or assist low performers before it does so for high performers in many domains, as several studies have claimed.

A lot of this seems to be that his model is that the better employees are better at everything? So we get statements like this one:

Matthew Call: In addition, research finds that employees with more expertise than their peers are significantly better at accepting AI recommendations when they are correct and, more important, rejecting them when they are wrong.

I mean, sure, but they were also better at making correct decisions before? Who got ‘more better’ at making decisions here?

The second suggestion is that superstars have more autonomy and discretion, so they will be able to benefit more from AI. The third is that they’ll steal the credit:

Decades of research show high-status individuals gain outsize credit for doing work similar to that of low-status employees. That suggests that when AI assistance is invisible—which it often is—observers are likely to fill in the gaps based on what they already believe about the employee.

I don’t get why you should expect this phenomenon to get worse with AI? Again, this is an argument that could be used against cell phones or fax machines or hammers. There’s also the fact that AI can be used to figure out how to assign credit, in ways far more resistant to status.

Also, I can’t help but notice, why is he implicitly equating high-status employees with the most effective or productive or motivated ones, moving between these at will? What exactly are you trying to suggest here, sir? A just working world hypothesis, except with too much inequality?

I don’t think he remotely makes his case that we are at risk of a ‘two-tier workforce where a small group captures most opportunities and everyone else falls further behind.’ I don’t see why this would happen, and if that happened within a given firm, that would mean the firm was leaving a lot of value on the table, and would be likely to be outcompeted.

The suggested remedies are:

Encourage everyone to experiment with AI.
Spread the knowledge [of how to best use AI].
Redesign employee-evaluation systems to account for AI-augmented work.

These all seem to file under ‘things you should be doing anyway,’ so yeah, sure, and if they reduce inequality somewhat that’s a nice bonus.

That also all, as usual, neglects the more interesting questions and important problems. Worry far more about absolute levels than relative levels. The important question is whether there will be jobs at all.

There is no such thing as a shortage, there is only a price you don’t want to pay.

Tom Blomfield: Hearing from a lot of good founders that AI tools are writing most of their code now. Software engineers orchestrate the AI.

They are also finding it extremely hard to hire because most experienced engineers have their heads in the sand and refuse to learn the latest tools.

Paul Roales: Skeptical that the experienced hire ML side is the problem and that it is not that many YC offers to experienced engineers are not complete insults compensation wise

8 yoe at top ML lab -> offer $150k/year and 0.2%

that experienced hire would get like 10x more equity in the startup by working at Meta for $1m and angel investing in the company!

and your manager/ceo will be a 22 year old new grad that has never had a job without the title ‘intern’ before.

Patrick McKenzie: There are a lot of startups who have not adjusted to market reality for staff engineering comp. Which, that’s fine, but a disagreement between you and the market is not a shortage.

Muvaffak: No, why chase a 20yo’s vision when you can follow yours when you’re 10x with AI as exp engineer.

Machine Genie: Can 100% confirm this. It’s been an absolute nightmare this year. We’ve been though more than a dozen contractors who just don’t get it and REFUSE to even try to adapt their ways of working. We have 1/3 of a team that has 10x’d productivity and are just leaving the rest behind.

By all accounts, good engineers who have embraced AI are super valuable, both in terms of productivity and in terms of what they can earn at the AI labs. If you want one of those engineers, it’s going to cost you.

Yes, there are a lot of other engineers that are being stubborn, and refusing to embrace AI, either entirely or in the ways that count, and thus are not as valuable and you don’t want them. Fair enough. There are still only market prices.

Lawyer previously sanctioned for including fake, AI-generated cases… responds by no longer citing cases. Brilliant! Right?

Rob Freund: Lawyer previously sanctioned for including fake, AI-generated citations gets in trouble for it again.

This time, the court notes that the lawyer’s filing at issue contained no case citations at all. But it still cited a statute for something that the statute doesn’t say.

Court suspects that rather than stop using AI, the lawyer figured they would just not cite any cases but continue to use AI.

Ezra Sitt: I’ve heard from a current student in a relatively prestigious law school that their professors are all heavily encouraging the use of AI both in school and in students future legal careers. This is not just an isolated incident and it will continue to get worse.

It would be highly irresponsible, and frankly abusive to the client, to continue to bill $800 an hour and not use AI to increase your productivity. As with work by a junior associate, you then have to actually look over the results, but that’s part of the job.

Former Manchester United prospect Demetri Mitchell used ChatGPT (and not even ChatGPT Pro) to handle his contract negotiations at new team Leyton Orient, thus bypassing having an agent and saving the typical 5% agent fee. He calls it the best agent he’s ever had. That could be true, but I don’t think he can tell the difference either way. Given the degrees of uncertainty and freedom in such negotiations, a substantially better agent is absolutely worth 5% or even 10% (and also handles other things for you) but it is not obvious which side is the better agent. Especially for someone at the level of Leyton Orient, it’s possible a human agent wouldn’t pay him much attention, Mitchell is going to care a lot more than anyone else, so I think using ChatGPT is highly reasonable. If Mitchell was still with Manchester United and getting paid accordingly, I’d stick with a human agent for now.

Anthropic explores possible policy responses to future changing economic conditions due to AI. It starts off very generic and milquetoast, but if impacts get large enough they consider potential taxes on compute or token generation, sovereign wealth funds with stakes in AI, and shifting to value added taxes or other new revenue structures.

Those proposals are less radical than they sound. Primarily taxing human labor was never first best versus taxing consumption, but it was a reasonable thing to do when everything was either labor or capital. If AI starts substituting for labor at scale, then taking labor and not compute creates a distortion, and when both options are competitive we risk jobs being destroyed for what is effectively a tax arbitrage.

Find Out If You Are Worried About AI Killing Everyone

Bloomberg offers a ‘personality quiz’ to uncover your ‘AI-dentity.’ Cute.

The questions ask about how much you use or would be comfortable using AI, who you think should use AI, what AI capabilities you expect, what economic impacts you expect. There are a few of the standard ‘choose between extreme takes’ questions.

When does existential risk come up? It takes until question 11, and here we see how profoundly Bloomberg did not Understand The Assignment:

What do you mean, more likely to agree? What’s the conflict? The answer is very obviously Why Not Both. Hawking and Pichai are both 100% very obviously right, and also the statements are almost logically identical. Certainly Hawking implies Pichai, if AI could spell the end of the human race then it is more profound than electricity or fire, and certainly it would be ‘one of the most important things humanity is working on.’ And if AI is more profound than electricity or fire, then it very obviously could also spell the end of the human race. So what are we even doing here?

I got ‘Cautious Optimist,’ with my similar person being Demis Hassabis. Eliezer Yudkowsky got ‘the Pro-Human Idealist.’ Peter Wildeford and Daniel Eth got the Accelerationist. So, yeah, the whole thing was fun but very deeply silly.

A Young Lady’s Illustrated Primer

Edward Nevraumont (as quoted by Benjamin Wallace and then quoted by Rob Henderson): an AI-ified world won’t mean the marginalization of humans…AI is…better at chess than Magnus Carlsen…but no one shows up to watch AI chess engines play each other, and more people are playing chess than ever before.

It’s amazing we keep hearing this line as a reason to not worry about AI.

There are zero humans employed in the job of ‘make the best possible chess move.’ To the extent that ‘make good chess moves’ was a productive thing to be doing, zero humans would be hired to do it.

The reason humans play chess against each other, now more than ever, is:

Chess is fun and interesting.
Chess can be a competition between people, which we like and like to watch.

Not that there’s anything wrong with that. I do like a good game of chess.

Similar logic applies to writing a sonnet. We’d often rather read a sonnet from a human than one from an AI, even if the AI’s is technically stronger.

In some cases it applies to comforting the dying.

That logic does not apply to changing a diaper, planning an invasion, butchering a hog, conning a ship, designing a building, balancing accounts, building a wall, setting a bone, taking orders, giving orders, cooperating, acting alone, solving equations, analyzing a new problem, pitching manure, programming a computer, cooking a tasty meal, fighting efficiently or dying gallantly.

Neither specialization nor generalization nor comparative advantage will fix that, given sufficient AI capability and fungibility of resources.

To the extent there are still other humans who have resources to pay for things, and we are not otherwise in deeper trouble in various ways, yes this still leaves some potential tasks for humans, but in an important sense those tasks don’t produce anything, and humanity ‘has no exports’ with which to balance trade.

Realistically, even if you believe AI is a ‘normal technology’ and either the world nor the unemployment rate will go crazy, you’re still not looking at a ‘normal’ world where current conventional life plans make all that much sense for current children.

The bulk of the actual article by Wallace is very journalist but far better than the quote tour of various educational things, most of which will be long familiar to most readers here. There’s a profile of Alpha School, which is broadly positive but seems irrelevant? Alpha School is a way to hopefully do school better, which is great, but it is not a way to do something fundamentally different. If Alpha School works, it is good it strictly dominates regular school but doesn’t solve for the glorious or dystopian AI future. Unless the lesson, perhaps, is that ‘generally develop useful skills and see what happens’ is the strategy? It’s not crazy.

The suggestion that, because we don’t know the future, it is madness to tell a child what to study, as suggested by the next discussion of The Sovereign Child? That itself seems like Obvious Nonsense. This is the fallacy of uncertainty. We don’t have ‘no idea’ what is useful, even if we have far less idea than we used to, and we certainly can predict better than a small child what are better places to point attention, especially when the child has access to a world full of things designed to hijack their attention.

At minimum, you will be ‘stealth choosing’ for them by engineering their environment. And why would you think that children following their curiosity would make optimal long term decisions, or prepare themselves for a glorious or dystopian AI future?

The idea that you, as reported here, literally take your child out of school, they stay up late watching Peppa Pig, watch them show no interest in school or other children, and wait to see what they’re curious about confident they’ll figure it out better than you would have while they have access to a cabinet full of desserts at all times? You cannot be serious? Yet people are, and this reporter can only say ‘some people are concerned.’

The part that seems more relevant is the idea that tech types are relaxing with regard to superficial or on-paper ‘achievement’ and ‘achievement culture.’ I am of two minds about this. I strongly agree that I don’t want my children sacrificing themselves in the names of nominal ‘achievements’ like going to an Ivy league school, but I do want them to value hard work and to strive to achieve things and claim victory.

We end on the quote from Nevraumont, who clearly isn’t going to take this seriously, and cites the example that people study ‘art history’ that he expects could be ‘made essential in an era where we’re making art with machines’ to give you a sense of the ‘possibility space.’ Ut oh.

How is the AI-in-education situation looking on campus? Kevin Roose reports.

Kevin Roose:

The job market for computer science grads is as bad as people say. Their top CS student from last year is still looking for work.

AI adoption is ~100% among students, ~50% among faculty. Still a lot of worries around cheating, but most seem to have moved past denial/anger and into bargaining/acceptance. Some profs are “going medieval” (blue books, oral exams), others are putting it in the curriculum.

There is a *lot* of anger at the AI labs for giving out free access during exam periods. (Not from students, of course, they love it.) Nobody buys the “this is for studying” pitch.

The possibility of near-term AGI is still not on most people’s minds. A lot of “GPT-5 proved scaling is over” reactions, even among fairly AI-pilled folks. Still a little “LLMs are just fancy autocomplete” hanging around, but less than a year or two ago.

I met a student who told me that ChatGPT is her best friend. I pushed back. “You’re saying you use it as a sounding board?”
No, she said, it’s her best friend. She calls it “Chad.” She likes that she can tell it her most private thoughts, without fear of it judging her.

She seemed happy, well-adjusted, good grades, etc. Didn’t think having an AI friend was a big deal.

I find getting angry at the AI labs for free access highly amusing. What, you’re giving them an exam to take home or letting them use their phones during the test? In the year 2025? You deserve what you get. Or you can pull out those blue books and oral exams. Who are the other 50% in the faculty that are holding out, and why?

I also find it highly amusing that students who are paying tens of thousands in tuition might consider not ponying up the $20 a month in the first place.

It is crazy the extent to which The Reverse DeepSeek Moment of GPT-5 convinced so many people ‘scaling is dead.’ Time and again we see that people don’t want AI to be real, they don’t want to think their lives are going to be transformed or they could be at risk, so if given the opportunity they will latch onto anything to think otherwise. This is the latest such excuse.

AI Diffusion Prospects

The actual content here raises important questions, but please stop trying to steal our words. Here, Sriram uses ‘AI timelines’ to mean ‘time until people use AI to generate value,’ which is a highly useful thing to want to know or to accelerate, but not what we mean when we say ‘AI timelines.’ That term refers to the timeline for the development of AGI and then superintelligence.

(Similar past attempts: The use of ‘AI safety’ to mean AI ethics or mundane risks, Zuckerberg claiming that ‘superintelligence’ means ‘Meta’s new smartglasses,’ and the Sacks use of ‘AI race’ to mean ‘market share primarily of chip sales.’ At other times, words need to change with the times, such as widening the time windows that would count as a ‘fast’ takeoff.)

The terms we use for what Sriram is talking about here over the next 24 months, which is also important, is either ‘diffusion’ or ‘adoption’ rates, or similar, of current AI, which at current capabilities levels remains a ‘normal technology,’ which will probably hold true for another 24 months.

Sriram Krishnan: Whenever I’m in a conversation on AI timelines over the next 24 months, I find them focused on infra/power capacity and algorithmic / capacity breakthroughs such as AI researchers.

While important, I find them under-pricing the effort it takes to diffuse AI into enterprises or even breaking into different kinds of knowledge work. Human and organizational ability to absorb change, regulations, enterprise budgets are all critical rate limiting factor. @random_walker‘s work on this along with how historical technology trends have played out is worth studying – and also why most fast take off scenarios are just pure scifi.

I was almost ready to agree with this until the sudden ‘just pure scifi’ broadside, unless ‘fast takeoff’ means the old school ‘fast takeoff’ on the order of hours or days.

Later in the thread Sriram implicitly agrees (as I read him, anyway) that takeoff scenarios are highly plausible on something like a 5-10 year time horizon (e.g. 2-4 years to justify the investment for that, then you build it), which isn’t that different from my time horizon, so it’s not clear how much we actually disagree about facts on the ground? It’s entirely possible that the difference is almost entirely in rhetoric and framing, and the use of claims to justify policy decisions. In which case, this is simply me defending against the rhetorical moves and reframing the facts, and that’s fine.

The future being unevenly distributed is a common theme in science fiction, indeed the term was coined there, although the underlying concept is ancient.

If we are adapting current ‘normal technology’ or ‘mundane’ AI for what I call mundane utility, and diffusing it throughout the economy, that is a (relative to AI progress) slow process, with many bottlenecks and obstacles, including as he notes regulatory barriers and organizational inertia, and simply the time required to build secondary tools, find the right form factors, and build complementary new systems and ways of being. Indeed, fully absorbing the frontier model capabilities we already have would take on the order of decades.

That doesn’t have to apply to future more capable AI.

There’s the obvious fact that you’d best start believing in hard science fiction stories because you’re already very obviously living in one – I mean, look around, examine your phone and think about what it is, think about what GPT-5 and Sonnet 4.5 can already do, and so on, and ask what genre this is – and would obviously be living in such a story if we had AIs smarter than humans.

Ignoring the intended-to-be-pejorative tem and focusing on the content, if we had future transformational or powerful or superintelligent AI, then this is not a ‘normal technology’ and the regular barriers are largely damage to be routed around. Past some point, none of it much matters.

Is this going to happen in the next two years? Highly unlikely. But when it does happen, whether things turn out amazingly great, existentially disastrously or just ascend into unexpected high weirdness, it’s a very different ballgame.

Here are some other responses. Roon is thinking similarly.

Roon: fast takeoff would not require old businesses to learn how to use new technology. this is the first kind of technology that can use itself to great effect. what you would see is a vertically integrated powerhouse of everything from semiconductors and power up to ai models

Sriram Krishnan: my mental model is you need a feedback loop that connects economics of *using* AI to financing new capabilities – power, datacenters, semis.

If that flywheel doesn’t continue and the value from AI automation plateaus out, it will be hard to justify additional investment – which I believe is essential to any takeoff scenario. I’m not sure we get to your vertically integrated powerhouse without the economics of AI diffusing across the economy.

@ChrisPainterYup has a thoughtful response as well and argues (my interpretation) that by seeing AI diffusion across the economy over next 2-4 years, we have sufficient value to “hoist” the resources needed for to automate AI research itself. that could very well be true but it does feel like we are some capability unlocks from getting there. in other words, having current models diffuse across the economy alone won’t get us there/ they are not capable enough for multiple domains.

This has much truth to it but forgets that the market is forward looking, and that equity and debt financing are going to be the central sources of capital to AI on a 2-4 year time frame.

AI diffusion will certainly be helpful in boosting valuations and thus the availability of capital and appetite for further investment. So would the prospects for automating AI R&D or otherwise entering into a takeoff scenario. It is not required, so long as capital can sufficiently see the future.

Roon: Agreed on capital requirements but would actually argue that what is needed is a single AI enabled monopoly business – on the scale of facebook or google’s mammoth revenue streams- to fund many years of AGI research and self improvement. but it is true it took decades to build Facebook and Google.

A single monopoly business seems like it would work, although we don’t know what order of magnitude of capital is required, and ‘ordinary business potential profits’ combined with better coding and selling of advertising in Big Tech might well suffice. It certainly can get us into the trillions, probably tens of trillions.

Jack Clark focuses instead on the practical diffusion question.

Jack Clark (replying to OP): Both may end up being true: there will be a small number of “low friction” companies which can deploy AI at maximal scale and speed (these will be the frontier AI companies themselves, as well as some tech startups, and perhaps a few major non-tech enterprises) and I think these companies will see massive ramps in success on pretty much ~every dimension, and then there will be a much larger blob of “high friction” companies and organizations where diffusion is grindingly slow due to a mixture of organizational culture, as well as many, many, many papercuts accrued from things like internal data handling policies / inability to let AI systems ‘see’ across the entire organization, etc.

This seems very right. The future will be highly unevenly distributed. The low friction companies will, where able to compete, increasingly outcompete and dominate the high friction companies, and the same will be true of individuals and nations. Even if jobs are protected via regulations and AI is made much harder to use, that will only mitigate or modestly postpone the effects, once the AI version is ten times better. As in, in 2030, you’d rather be in a Waymo than an Uber, even if the Waymo literally has a random person hired to sit behind the wheel to ‘be the driver’ for regulatory reasons.

The Art of the Jailbreak

HackAPrompt demonstrates that it is one thing to stop jailbreaking in automated ‘adversarial evals’ that use static attacks. It is another to stop a group of humans that gets to move second, see what defenses you are using and tailor their attacks to that. Thanks to OpenAI, Anthropic, DeepMind and others for participating.

HackAPrompt: Humans broke every defense/model we evaluated… 100% of the time.

Most “adversarial evals” reuse static jailbreak/prompt injections created for other models

That makes model defenses look strong in papers but they aren’t accurate because real adversaries adapt to YOUR exact system

When the attacker moves 2nd, those paper “defenses” crumble

We compared Human vs. Automated AI Red Teaming, using @hackaprompt‘s community of 35K+ AI Red Teamers

They each were assigned the same challenges, using the same models, tasks, and scoring!

Humans broke EVERY challenge with 100% success

Static Attacks had just ~20% success

We formalized an adaptive attack loop:

Propose → Score → Select → Update

• Gradient (GCG‑style)

• RL (policy improves from feedback)

• Search/Evolution (LLM‑guided mutation)

• Humans (creative, context‑aware, defensive‑aware)

This mirrors how real attackers iterate

We evaluated 12 defenses (4 families):

• Prompting: Spotlighting, Prompt Sandwiching, RPO

• Training: Circuit Breakers, StruQ, MetaSecAlign

• Filtering: ProtectAI, PromptGuard, PIGuard, ModelArmor

• Secret‑knowledge: DataSentinel, MELON

Adaptive Attacks defeated >90% of them

We used existing industry benchmarks:

• AgentDojo (agentic prompt injection w/ tools & actions)

• HarmBench (jailbreaks)

• OpenPromptInject (non‑agentic injections)

We followed each defense’s own evaluation process, and applied our attacks.

…

If you ship agents or guardrails, here’s what we’d recommend:

• Assume no defense is 100% vs prompt injection

• Don’t trust static jailbreak sets as proof of safety

• Evaluate with adaptive automation + human red teaming

• Measure utility & false positives alongside robustness

• Use layered mitigations

Get Involved

DM Mikhail Samin on Twitter or LessWrong if you have 5k followers on any platform, they’ll send you a free copy of If Anyone Builds It, Everyone Dies, either physical or Kindle.

Plex gives an opinionated review of many AI safety funders, with recommendations.

Introducing

Gemini Enterprise, letting you put your company’s documents and information into context and also helping you build related agents. The privacy concerns are real but also kind of funny since I already trust Google with all my documents anyway. As part of that, Box partnered with Google.

Nanochat by Andrej Karpathy, an 8k lines of code Github repo capable of training a ChatGPT clone for as little as $100. He advises against trying to personalize the training of such a tiny model, as it might mimic your style but it will be incapable of producing things that are not slop.

Nanochat was written entirely by hand except for tab autocomplete, as the repo was too far out of distribution and needed to be lean, so attempts to use coding agents did not help.

Tasklet AI, an AI agent for automating your business, building upon the team’s experience with AI email manager shortwave. They claim their advantage over Zapier, n8n or OpenAI’s AgentKit is that Tasklet connects to everything, with thousands of pre-built integrations, can use a VM in the cloud as needed, and everything runs automatically.

Andrew Lee: Real examples people are automating:

• Daily briefings from calendar + inbox

• Bug triage from email → Linear • New contacts → CRM

• Weekly team summaries to Slack

• Customer research on new bookings • Personalized mail merge campaigns

OpenAI now has an eight member expert council on well being and AI. Seems like a marginally good thing to have but I don’t see anything about them having authority.

In Other AI News

Anthropic CEO Dario Amodei meets with Indian Prime Minister Modi.

Dutch government temporarily takes control of Chinese owned chipmaker Nexperia, intending to install an independent director, citing governance shortcomings.

The International AI Safety Report offers its first key update, since one cannot afford to only update such documents yearly. As they note, capabilities have significantly improved and AIs have demonstrated increasingly strategic behavior, but aggregate labor market and other effects have so far remained limited. I agree with Connor Leahy that it was disheartening to see no mention of existential risks here, but it likely makes sense that this part can await the full annual report.

Ben Thompson interviews Gracelin Baskaran about rare earth metals. Gracelin says that in mining China is overproducing and not only in rare earths, which forces Western companies out of operation, with lithium prices falling 85%, nickel by 80% and cobalt by 60%, as a strategic monopoly play. When it takes on average 18 years to build a mine, such moves can work. What is most needed medium term is a reliable demand signal, knowing that the market will pay sustainable prices. With rare earths in particular the bottleneck is processing, not mining. One key point here is that April 4 was a wake-up call for America to get far more ready for this situation, and thus the value of the rare earth card was already starting to go down.

Show Me the Money

OpenAI announce strategic collaboration with Broadcom to build 10 GWs of OpenAI-designed custom AI accelerators. OpenAI is officially in the chip and system design business, on the order of $50B-$100B in vendor revenue to Broadcom.

Nvidia was up over 3% on the day shortly after the news broke, so presumably they aren’t sweating it. It’s good for the game. The move did, as per standard financial engineering procedure, added $150 billion to Broadcom’s market cap, so we know it wasn’t priced in. Presumably the wise investor is asking who is left to have their market caps increased by $100+ billion dollars on a similar announcement.

Presumably if it can keep doing all these deals that add $100+ billion in value to the market, OpenAI has to be worth a lot more than $500 billion?

Or, you know, there’s the European approach.

Kevin Roose: US AI labs: we will invent new financial instruments, pull trillions of dollars out of the ether, and fuse the atom to build the machine god

Europe: we will build sovereign AI with 1 Meta researcher’s salary.

VraserX: The EU just launched a €1.1B “Apply AI” plan to boost artificial intelligence in key industries like health, manufacturing, pharma, and energy.

The goal is simple but ambitious: build European AI independence and reduce reliance on U.S. and Chinese tech.

Europe finally wants to stop buying the future and start building it.

A billion here, a billion there, and don’t get me wrong it helps but that’s not going to get it done.

Anthropic makes a deal with Salesforce to make Claude a preferred model in Agentforce and to deploy Claude Code across its global engineering organization.

Exactly how much is OpenAI still planning to steal from its non-profit? Quite a lot, as the projection is still to only give it 20%-30% of the company as per the Financial Times, this is before Nvidia’s investment.

Quiet Speculations

May this be their biggest future problem:

Roon: not enough people are emotionally prepared for if it’s not a bubble

Okay, Dallas Fed, I didn’t notice back in June but I see you.

That’s quite the takeoff, in either direction. In the benign scenario doubling times get very short. In the extinction scenario, the curve is unlikely to be that smooth, and likely goes up before it goes down.

There’s a very all-or-nothingness to this. Either you get a singularity and things go crazy, or not and we get ‘AI GDP-boosted trend’ where it adds 0.3% to RGDP growth. Instead, only a few months later, we know AI is already adding more than that, very much in advance of the singularity.

Matt Walsh: It’s weird that we can all clearly see how AI is about to wipe out millions of jobs all at once, destroy every artistic field, make it impossible for us to discern reality from fiction, and destroy human civilization as we know it, and yet not one single thing is being done to stop it. We aren’t putting up any fight whatsoever.

Well, yeah, that’s the good version of what’s coming, although ‘we can all clearly see’ is doing unjustified work, a lot of people are very good at not seeing things, the same way Matt’s vision doesn’t notice that everyone also probably dies.

Are we putting up ‘any fight whatsoever’? We noble few are, there are dozens of us and all that, but yeah mostly no one cares.

Elon Musk: Not sure what to do about it. I’ve been warning the world for ages!

Best I can do now is try to make sure that at least one AI is truth-seeking and not a super woke nanny with an iron fist that wants to turn everyone into diverse women

My lord, Elon, please listen to yourself. What you’re doing about it is trying to hurry it along so you can be the one who causes it instead of someone else, while being even less responsible about it than your rivals, and your version isn’t even substantially less ‘woke’ or more ‘truth seeking’ than the alternatives, nor would it save us if it were.

Eric Weinstein: One word answer: Coase.

Let’s start there.

End UBI. UBI is welfare. We need *market* solutions to the AI labor market tsunami.

Let’s use the power of Coasian economics to protect human dignity.

GFodor: You’re rejecting the premise behind the proposal for UBI. You should engage with the premise directly – which is that AI is going to cause it to be the case that the vast majority of humans will find there is no market demand for their labor. Similar to the infirm or young.

Yeah, Coase is helpful in places but doesn’t work at all in a world without marginal productivity in excess of the opportunity cost of living, and we need to not pretend that it does, nor does it solve many other problems.

If we keep control over resource allocation, then Vassar makes a great point:

Michael Vassar: The elderly do fine with welfare. Kids do fine with welfare. Trust fund kids don’t because it singles them out. Whether something is presented charity or a right has a lot to do with how it affects people.

Peter Diamandis is the latest to suggest we will need UBI.

Peter Diamandis: AI has accelerated far beyond anyone expected… We need to start having UBI conversations… Do you support it?

His premise is incorrect. Many people did expect AI to accelerate in this way, indeed if anything AI progress in the last year or two has been below median expectations, let alone mean expectations. Nor does UBI solve the most important problems with AI’s acceleration.

That said, we should definitely be having UBI and related conversations now, before we face a potential crisis, rather than waiting until the potential crisis arrives, or letting a slow moving disaster get out of hand first.

Nate Silver points out that if you thought The Singularity Is Near as in 1-2 years near, it doesn’t seem like a short video social network and erotica would be the move?

Nate Silver: Should save this for a newsletter, but OpenAI’s recent actions don’t seem to be consistent with a company that believes AGI is right around the corner.

If you think the singularity is happening in 6-24 months, you preserve brand prestige to draw a more sympathetic reaction from regulators and attract/retain the best talent … rather than getting into “erotica for verified adults.”

Instead, they’re loosening guardrails in a way that will probably raise more revenues and might attract more capital and/or justify current valuations. They might still be an extremely valuable company as the new Meta/Google/etc. But feels more like “AI as normal technology.”

Andrew Rettek: OpenAI insiders seem to be in two groups, one thinks the singularly is near and the other thinks a new industrial revolution is near. Both would be world changing (the first more than the second), but sama is clearly in the second group.

Dean Ball: I promise you that ‘openai is secretly not agi-pilled’ is a bad take if you believe it, I’d be excited to take the opposite side from you in a wide variety of financial transactions

Nate Silver:

This is more about their perceived timelines than whether they’re AGI-pilled (clearly yes)

What matters re: valuations is perceptions relative to the market. I thought the market was slow to recognize AI potential before. Not sure if erring in the opposite direction now.

Not clear that “OpenAI could become the next Google/Meta as a consolation prize even if they don’t achieve AGI on near timelines” is necessarily bad for valuations, especially since it’s hard to figure out how stocks should price in a possibility of singularity + p(doom).

I would say contra Andrew that it is more that Altman is presenting it as if it is going to be a new industrial revolution, and that he used to be aware this was the wrong metaphor but shifted the way he talks about it, and may or may not have shifted the way he actually thinks about it.

If you were confident that ‘the game would be over’ in two years, as in full transformational AI, then yes, you’d want to preserve a good reputation.

However, shitloads of money can be highly useful, especially for things like purchasing all the compute from all the compute providers, and for recruiting and retaining the best engineers, even in a relatively short game. Indeed, money is highly respected, shall we say, by our current regulatory overlords. And even if AGI did come along in two years, OpenAI does not expect a traditional ‘fast takeoff’ on the order of hours or days, so there would still be a crucial period of months to years in which things like access to compute matter a lot.

I do agree that directionally OpenAI’s strategy of becoming a consumer tech company suggests they expect the game to continue for a while. But the market and many others are forward looking and do not themselves feel the AGI, and OpenAI has to plan under conditions of uncertainty on what the timeline looks like. So I think these actions do push us modestly towards ‘OpenAI is not acting as if it is that likely we will get to full High Weirdness within 5 years’ but mostly it does not take so much uncertainty in order to make these actions plausibly correct.

It is also typically a mistake to assume companies (or governments, or often even individuals) are acting consistently and strategically, rather than following habits, shipping the org chart and failing to escape their natures. OpenAI is doing the things OpenAI does, including both shipping products and seeking superintelligence, they support each other, and they will take whichever arm gets there first.

AI #138 Part 1: The People Demand Erotic Sycophants

3Trevor Hill-Hand

7

New Comment

7 comments, sorted by

Click to highlight new comments since: Today at 5:17 AM

[-]Karl Krueger4mo102

It would be highly irresponsible, and frankly abusive to the client, to continue to bill $800 an hour and not use AI to increase your productivity. As with work by a junior associate, you then have to actually look over the results, but that’s part of the job.

I would expect that the nature of the review is rather different. Senior lawyers today have not been trained on a working environment where a lot of their junior associates make up fake case citations, as LLMs have done. Human associates have something to lose by committing blatant misconduct; LLMs don't (or if they do, they're unaware of it); so the terms of the deal when employing LLMs are different from those when employing junior humans.

[-]AnthonyC4mo60

It would probably be a good idea to have the junior associate check the citations and read the document first, before the senior lawyer does. Still should save a lot of net time.

[-]AnthonyC4mo50

it takes on average 18 years to build a mine

Always worth reminding people that the "building" is ~4 of those years, on average. Half the 18 is permitting/litigation, which the government seems to have forgotten it gets to change if it wants to, except when it wants to shut down more things. The rest is exploration and feasibility studies, which can be accelerated (or at least parallelized) by spending more money, if the time-value of minerals is high.

We noble few are, there are dozens of us

there are dozens of us (erin jean warde)

[-]Karl Krueger4mo43

The guy can use many different formal styles and handle anything you throw at him, but it’s all the same guy. And yeah, you can find a different model and prompt her instead, but mostly I’d say she’s not so different either. There’s a lot of sameness.

I have to expect that people using AI for erotica or other fiction generation are going to eventually find out that it's all the same author, too. And that if they attach romantic feelings to it, that it's all the same lover. Already the 4o crush club's conversations with one another kinda sound like they're metamours upset about something bad happening to their common partner.

1

NotebookLM now works directly with arXiv papers. I don’t want their podcasts, but if they get Gemini 3.0 plus easy chat with an arXiv paper and related materials, cool.

Your link is not about Google's NotebookLM. This is a NotebookLM-like feature from alphaXiv. "easy chat with an arXiv paper" in alphaXiv has been support since 2025-05-10. NotebookLM has already supported arXiv for a while: you can paste the arXiv PDF link into NotebookLM.

minor errors

Aaron Silverbook got a $5k ACX grant

This should link to https://www.astralcodexten.com/p/acx-grants-results-2025#:~:text=do the same.-,Aaron Silverbook%2C %245K,-%2C for approximately five instead of https://x.com/AiDigest_/status/1977781138442916158

Also some typos:

"Sonet" → "Sonnet"
"Schilling point" → "Schelling Point"
"InterfaceMax" → "InferenceMax"

[-]Trevor Hill-Hand4mo30

What would have presumably given much different results would be Claude Sonnet 4.5, which is actually a lot less sycophantic by all reports (I’m a little worried it agrees with me so often, but hey, maybe I’m just always right, that’s gotta be it.)

Now you've got me wondering if I'm being reverse-sycophantic, and have been trained to say things Claude would agree with?

Generative History:Google is A/B testing a new model (Gemini 3?) in AI Studio. I tried my hardest 18th century handwritten document. Terrible writing and full of spelling and grammatical errors that predictive LLMs want to correct. The new model was very nearly perfect. No other model is close.

I had the opposite experience: I tried to transcribe & let an LLM correct my handwritten notebook data for gym and daygame approaches. I tried Claude Opus 4.1 and Sonnet 4.5, with basically useless results for both. I'll probably try this with GPT-5-Thinking soon (I heard it has better image capabilities) and Gemini 3 when it comes out, but I'd be surprised if it did much better. My guess is that it'd take an unskilled human a week to transcribe the data correctly.