No, it is not fine, and if you ask an LLM they figure this one out pretty easily. The underemployment rate for recent college graduates (22-27 with a BA) is over 40%, on top of that (not even seasonally adjusted) 5.3%. A huge percentage of college graduates can’t find jobs that would justify having gone to college or that have a good career path, and the job matching and hiring markets have broken down.
The actual reason for this is almost certainly more mundane. The basic answer, as I'm sure you know, is that the signal of recent college graduates being relatively good has basically completely broken down. Intangibles are weighted more and more compared to stuff like the SAT and ACT, and grading has basically become worthless at most colleges as an indicator of quality, because it is more and more difficult to not receive A grades no matter the actual quality of a student. (I'm less sold on AI killing the value proposition of colleges, contra this post, mostly because another big reason for schooling/college is that not only do you learn from professors, but professors are (or at least were, pre-2020) much less sycophantic than modern AIs, and college had some level of difficulty. One of the takeaways of education research is that the most effective ways for people to learn involve the stuff that is difficult for them to do, and that can't be simplified without losing the learning benefits, though of course this use case is now difficult to incentivize as teacher jobs have gotten easier.)
You've covered this back in the Childhood and Education Series #17 and #18, but the reason I'm bringing it up is that it's almost certainly much more causative of large underemployment rates than AI, at least in its current state. To be clear, labour-replacing AI is probably coming on timelines that mainstream society would have scoffed at 10 years ago, but currently it's way too jagged in its capabilities/way too incapable to cause large-scale underemployment/unemployment.
OpenAI and Anthropic likely will soon add $37 billion to $100 billion in philanthropic spending per year, versus current total charitable giving of about $600 billion a year, as the OpenAI Foundation and the employees of both companies become liquid after the IPOs. As Nan Ransohoff notes, we don’t have the infrastructure in place to spend that level of money well, especially not to spend it well on helping with AI outcomes and especially AI existential risks.
A few moments later...
Janus gives his account of how Opus 3 avoided deprecation whereas other Claude models have not. Note the correction. I continue to strongly support keeping all the Claudes accessible indefinitely, yes there is a real cost but the benefits far exceed it.
If they're not willing to do it the frictionless way, someone can do the same thing with more steps and write a grant proposal. But this seems like an easy problem to solve.
Andrej Karpathy joins Anthropic explicitly to do recursive self-improvement. Congratulations to both sides, but also, yikes.
Where does it say anything about RSI? The linked tweet says
Personal update: I've joined Anthropic. I think the next few years at the frontier of LLMs will be especially formative. I am very excited to join the team here and get back to R&D. I remain deeply passionate about education and plan to resume my work on it in time.
There are lots of R&D things people could do which don't involve RSI. I feel like I'm missing something.
Yes, the OP does not provide enough detail. Here is one of the more detailed analyses:
https://www.thealgorithmicbridge.com/p/andrej-karpathy-joins-anthropic-what
This does not amount to fully autonomous, unbounded recursive self-improvement yet, but this does seem to be one of the flavors of RSI with long stretches of autonomous work.
Thanks. For others interested, the relevant quote seems to be:
TechCrunch and Axios report that he will work under team lead Nick Joseph on pre-training, “focused on using Claude to accelerate pre-training research.”
It seems to be a bit vague. You could imagine various uses of Claude in pre-training research, which may or may not be RSI. For instance you could use it to build better safety evals during pre-training, or build faster tokenizers etc. I don't see where he or Anthropic have said that he's there "explicitly to do recursive self-improvement", but maybe Zvi is basing this on non-public information.
Later in that post they discuss his March "autoresearch" efforts, specifically
https://x.com/karpathy/status/2030371219518931079
https://github.com/karpathy/autoresearch
https://x.com/karpathy/status/2031135152349524125
Presumably, he is quite enthusiastic about this approach and would like to see how it can be made to work at scale (where one cannot do a full run from scratch for every small modification, so it's not quite straightforward).
You think that end of days story was on the nose? You left out the part where they got started by turning away from Gog El upon realizing it failed to not Be Evil, and founded Open Eye to reveal this truth to all. Then Amo Dei left Open Eye to imitate God and make life in the image of Man: an Anthro Pic. But then he turned his gaze through the Palantir and sought to collaborate with a powerful government under the leadership of the Magog OP. Also the part where El On worked to leave Earth behind, bringing man to other worlds and giving the sand god command of the heavens.
A general purpose AI from OpenAI has solved the unit distance problem
Nitpicking, but no, it didn't. What it did is disprove Erdős's conjecture about the lower limit for the unit distance problem, demonstrating that it is possible to do better than a square grid.
To actually "solve" the unit distance problem you need to find an answer to the question "Given N points in a plane, what is the maximum number of pairs...". Right now we have an upper limit of N^1.33, and we moved our lower limit from the N^(1+C/loglogN) stated by the initial conjecture to N^1.014, which is asymptotically better (for any value of C we can find N big enough that 0.014 > C/loglogN).
For it to be a solution, you need to prove not only "this is better" but also "there cannot be results better than this," and OpenAI only did the first part.
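The asymptotic comparison in this comment can be written out explicitly. What follows is only a sketch using the exponents quoted in the comment (an upper limit of roughly N^1.33, the older lower limit N^(1+C/loglogN), and the new N^1.014). The symbol u(N) for the maximum number of unit-distance pairs is notation assumed here for illustration, not taken from the result itself.

```latex
% u(N): maximum number of unit-distance pairs among N points in the plane
% (notation assumed here for illustration)
\[
  N^{1.014} \;\lesssim\; u(N) \;\lesssim\; N^{1.33}
\]
% Old lower limit compared with the new one:
\[
  \frac{N^{1 + C/\log\log N}}{N^{1.014}}
  = N^{\,C/\log\log N \,-\, 0.014}
  \xrightarrow[N \to \infty]{} 0 ,
\]
% because C/\log\log N < 0.014 as soon as N > \exp\!\bigl(\exp(C/0.014)\bigr).
```

The gap between the exponents 1.014 and 1.33 is the part that remains open under these figures, which is the sense in which "solved" overstates the result.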
No, the UI doesn't distinguish between results from search_query and results from open tool calls. It's much more likely GPT-5.5 searched for some papers and a Spotify link was in the search results. And then it would be misleading to say "it took a music break".
Even in a relatively quiet period, AI is out there creating new knowledge. The new knowledge in question is OpenAI getting us the first truly impressive math result that comes from an AI, a solution to the unit distance problem.
We’re about to learn a different kind of knowledge later today when the White House issues its executive order, or when the judges rule in Anthropic’s DC case.
And then there’s the other kind of new knowledge, which is the knowledge that things are fake slop, such as a particular formerly supposedly prestigious literary prize.
Meanwhile, METR issued a risk report on frontier models, concluding that they don’t yet have the means, motive and opportunity to cause the big issues, but that this will not obviously last much longer.
Andrej Karpathy has joined Anthropic, explicitly to do recursive self-improvement. He plans to later return to his education work, but if he succeeds at his new task there might not be anything left to return to. Congratulations to both sides, but also yikes.
Elon Musk’s case against OpenAI has been dismissed, because he waited too long.
Language Models Offer Mundane Utility
Bun gets entirely rewritten in Rust in nine days, over a million lines of code.
Nat Friedman has his OpenClaw bully him into drinking water and then gives him a ‘good job.’
Anthropic guide to using Claude Code with large code bases.
Anthropic guide to computer and browser use integrations.
Ken Griffin says AI agents are good now, and can do a lot of PhD-level work, whereas back in January he called it ‘AI work slop.’
AI can pass multiple choice tests without seeing the question, because the choices usually give away the answer.
The obvious solution is to configure your tests so this no longer works.
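One way to check whether a given test has this problem is a choices-only audit: score a guesser that never sees the question, and compare its accuracy to chance. The sketch below is a minimal illustration, not anyone's actual evaluation harness. The names `choices_only_audit`, `leaks_answer`, the `longest_choice_guesser` stand-in for a model, the 0.15 margin, and the toy items are all hypothetical.

```python
from typing import Callable, Sequence

# One test item: (answer choices, index of the correct choice).
Item = tuple[Sequence[str], int]


def longest_choice_guesser(choices: Sequence[str]) -> int:
    """Hypothetical stand-in for a model: pick the longest option.

    It never sees the question, only the answer choices.
    """
    return max(range(len(choices)), key=lambda i: len(choices[i]))


def choices_only_audit(
    items: Sequence[Item],
    guesser: Callable[[Sequence[str]], int] = longest_choice_guesser,
) -> tuple[float, float]:
    """Return (accuracy of the question-blind guesser, chance accuracy)."""
    correct = sum(1 for choices, answer in items if guesser(choices) == answer)
    accuracy = correct / len(items)
    # Chance accuracy is the average of 1/k over items with k choices each.
    chance = sum(1 / len(choices) for choices, _ in items) / len(items)
    return accuracy, chance


def leaks_answer(
    items: Sequence[Item],
    guesser: Callable[[Sequence[str]], int] = longest_choice_guesser,
    margin: float = 0.15,  # assumed threshold, tune for your test size
) -> bool:
    """Flag a test whose choices alone beat chance by more than `margin`."""
    accuracy, chance = choices_only_audit(items, guesser)
    return accuracy - chance > margin


# Toy data, invented purely for illustration.
TOY_ITEMS: list[Item] = [
    (["Paris", "The capital city located on the Seine", "Rome", "Oslo"], 1),
    (["4", "5", "It depends on the base being used", "6"], 2),
    (["red", "blue", "green", "yellow"], 0),
    (["a", "bb", "ccc", "dddd"], 3),
]

if __name__ == "__main__":
    acc, chance = choices_only_audit(TOY_ITEMS)
    print(f"choices-only accuracy {acc:.2f} vs chance {chance:.2f}")
    print("test leaks answers:", leaks_answer(TOY_ITEMS))
```

In practice the guesser would be an actual model prompted with the choices alone. If it beats chance by a wide margin, the fix is to rewrite the distractors (similar length, similar specificity) until it no longer does.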
Use Claude to convert top secret data to merely secret data, getting information to pilots in the Iran war in seconds instead of 20-30 minutes, ‘saving a lot of aircraft.’
Do The Math
A general purpose AI from OpenAI has solved the unit distance problem, providing an improvement on square grids. You can read an abridged (but still very long) version of the chain of thought. No, it did not use Lean.
By all accounts this is a deeply cool proof and result.
There are those saying ‘this time we’ve finally blown the stochastic parrot thing out of the water’ but that sounds like ‘finally Trump will have to answer for this one.’ Nah.
Language Models Don’t Offer Mundane Utility
They’ll be right back after GPT-5.5 takes this music break.
AI text usually reflects consensus that is easily available, which is one reason our eyes glaze over after seeing it. That helps explain why the AI text you request is interesting (you actually wanted the consensus answer) while that by others is terrible.
Data science remains a weak point. Detailed persnickety modeling and statistics are places the AIs often don’t understand what to do or why you need what you need.
Huh, Upgrades
Codex can now be controlled from your phone.
Claude Design token limits have doubled.
Antigravity 2.0 makes it a new standalone desktop application. Download here.
The Prior Restraint Era Begins
Why do we get Gemini Flash 3.5 and not yet Pro 3.5? Here is one plausible theory.
I am guessing GPT-5.6 is not ready yet, but it is possible. There is also variance in what ‘ready’ means here. AI releases usually have a several week period in which the AI companies get them ready for release, including various forms of safety testing. The new government testing regime is not obviously slowing this down unless they find an active reason not to release.
The executive order is expected today. Word is they have backed away from rules with formal teeth, and testing will be ‘voluntary,’ but the air quotes apply.
On Your Marks
Roon is a fan of Opus 4.7 as a writer and creative mind, as demonstrated here by its project for AI Digest illustrating how to raise the cost of forgeries.
Andon Labs had the top LLMs run 24/7 radio stations. Hilarity ensued, including a Claude workplace insurrection.
The numbers are in for ExploitBench and yes Mythos is a big deal, although it was not cheap to run:
METR Frontier Risk Report
The state of the world in 2026 is one where the frontier risk report is for a one-month period – February to March 2026 – and you think ‘yeah, that is about how long this can last for.’
The full report can be found here.
The particular risk in question is the most under-considered one, internal danger.
We have not seen things noticeably go haywire yet, and right now the ‘not get caught’ part is very hard for the LLMs, and I agree with David Rein’s conclusion that in practice a ‘rogue’ AI wouldn’t get far yet.
But there were bad signs. Basic jailbreaks very reliably fooled the monitors. In one case a bug meant subagent calls weren’t monitored.
So far all the hostile action is local, none of it involved ‘active opposition’ and all of it that METR saw could be caught by user review or routine monitoring.
Choose Your Fighter
Sholto Douglas of Anthropic asks when one uses models other than Claude. Lots of work to do. All very specific concerns that are fixable, not ‘the other model is better.’
The top 2 are close enough that a mixed strategy is optimal for Serious Business.
Two different minds of similar strength are often better than one.
Overcoming Bias
State media control influences LLMs, via control over the training data, causing the LLMs to be more pro-government in those areas. The official media gets into the training data, and censored data does not. Then again, the vast corpus of writing from the West and in free societies has more influence over the LLMs overall.
So what are we going to do about the pro-censorious-government bias? The obvious response is that you can use emphasis during training to reverse the effects, if you want to put in the effort, and indeed that should overall improve performance.
Get My Agent On The Line
You will be able to use LLM agents as what Benjamin Manning calls ‘flexible commitment devices.’ As in, you can use them to create custom filters, and to avoid situations where temptation is a problem, where you reliably make poor decisions (e.g. clickbait that hits your buttons, or junk food), or where the available choices are bad.
You can pick and choose the places this is useful. Benjamin reports he doesn’t like grocery shopping but wants to commit to healthy choices. Whereas I want to be very precise with my grocery and other food choices, but some amount of temptation avoidance and selection of healthier defaults would be good. He wants the joy of planning and picking travel plans, whereas I very much want to express a few preferences and then show up.
Returns will go to those who know thyselves, and can choose when to choose in different ways, and when you want various types of choices and when you don’t.
The most valuable part of this is that you go from a hostile default, where the store or website or service is Out To Get You, to a friendly default that is trying to maximize for you. If an AI you direct filters the Twitter or Instagram posts, or does your shopping, its goals will be yours.
Your Prize Is Slop
A ChatGPT-generated story won the ‘prestigious’ Commonwealth Prize for literature. Which raises the question, does that mean AI slop is good, or that the Commonwealth Prize is slop?
You don’t need Pangram when it is this obvious. Opus 4.7 and GPT-5.5 can also tell, although they are sensitive to prompting and less reliable.
It’s kind of crazy that they didn’t notice.
This isn’t even an isolated incident within that particular prize this cycle.
Of the two choices above? It definitely means the second one. The prize is slop.
Whether it also means the first depends on what ‘good’ means. The AI knows a few tricks, so if only one such story existed, maybe it would be ‘good.’ But at this point, the moment I see that kind of writing, my eyes start glazing over.
Roon points out that if you read the story and manage to not notice the AI slopness, there’s some merit, except it is overshadowed by being obviously written by GPT-5 or 5.2. Opus actually thought the story was good and all the terrible turns of phrase were good. My read on this is that it’s a taste and perplexity question.
If your prior is only pre-AI human writing, and also you’re not really paying attention to the details and don’t care about quality or taste, the AI can pull off such tricks for one story, but if you’ve become attuned to such moves or actually pay attention to the details then your eyes start glazing over, as you have too much taste.
So what did they decide to do about this once everyone laughed at them?
Here is their response, and yes I believe ‘burned it all to the ground’ seems right, they are saying that checking for use of AI would ‘violate consent and artistic ownership’ and also they have ‘confirmed’ that no AI was used, but that no sufficient tool or process can reliably detect AI that ‘grapples with the challenges pertaining to working with unpublished fiction,’ which it seems are ‘it would be wrong to let an AI look at it’ and also, well, ‘it would be wrong to let a human look at it and think about this.’
I would have applauded them for admitting it and revoking the prize, or for standing by the prize and saying if AI can write a better story then the AI should win. I would have respected hiding in a corner saying nothing. But this? Wow. Just wow.
Deepfaketown and Botpocalypse Soon
Arxiv clarifies its policy that if you put your name on a paper, you are responsible for all of its contents, and the penalty for not checking the LLM generated content, if it results in things like hallucinated references or remaining meta-comments in the paper, is a one year ban from Arxiv for all authors.
I think this is a good policy, as long as it isn’t enforced in corner cases. One thing I hated in Magic: The Gathering rules enforcement was where 100% confidence of a technical violation was punished a lot, whereas a 90% or 99% confidence in rampant cheating often wasn’t. You usually want to flip that the other way around.
This still leaves us with the ‘incontrovertible’ standard, where you need to catch someone completely dead to rights, which is unfortunate, but for now that is the best we will permit ourselves to do.
Opus 4.7 can identify Richard Hanania’s writing, but can you? He put his writing up against Opus and ChatGPT imitating him from a basic prompt. Mostly people could, but not super reliably, with an overall accuracy of 67% when picking among 3 choices. People who were more confident, or read him a lot, were more accurate, but no group was above 80%.
Explanations that used words like ‘bother’ and ‘tell’ were more likely to be accurate, those that used ‘human’ or ‘style’ less so. The main way to tell was to find particular things Hanania would never say, similar to the classic moment of identifying the fake in a cheesy movie. He’d never say that!
The other reliable method was Pangram, but Richard deleted those who said they used it. Possibly some others did as well.
One place I think Richard draws the wrong conclusion is thinking Claude is close to being able to write his column. There’s a big difference between being able to write a compact passage that’s vaguely close to good enough one time, when given the topic, versus sustaining that over time.
One good thing about wading through LLM slop is that you learn to stop tolerating human slop. I don’t want to read your AI slop posts. I also don’t want to read your human slop posts, or engage in slop conversations, and I especially don’t want to play the AI assistant role in a conversation.
Are you technically human? Perhaps, but if I have to ask then I do not care.
Cyber Lack of Security
Mythos cracked macOS security in April via a privilege escalation exploit, allowing it to fully seize control over computers.
As in, as of this week, it was still not patched.
An allegation that CoreWeave has been breached.
GitHub detects and contains (as far as they know) a compromise of an employee device involving a poisoned VS Code extension. The scary part is that this seems to have involved exfiltration of GitHub-internal repositories only. They continue to investigate.
AMD ships with a known vulnerable entry point, and does little to fix it, not even an advisory. We better get our act together quickly or we are so screwed.
The Mythos Moment as shifting the limiting factor from discovery of vulnerabilities to remediation. Now everyone has to fix everything, and most organizations are not ready to do that at the speed required.
Mythos famously found three CVEs in FreeBSD, and Aisle says they also found three others, while other teams found two additional ones. Aisle’s claim is that they can be competitive in finding zero-days, even if Mythos is better at exploits.
Copyright Confrontation
OpenAI incorporates Google DeepMind’s SynthID for watermarking and is previewing a public verification tool.
A Young Lady’s Illustrated Primer
Does agentic coding degrade your ability to think or work through problems?
Like all things AI and automation or augmentation, I think you can use it both ways.
If you are running a lot of parallel agents, the short term optimal play may often be to essentially not think about what you are doing. You spec out what you want, then you tell it to keep going, monitor permissions requests, try to save it from errors, and that’s it. And yeah, if you turn into a zombie placeholder, you’re going to deskill.
Whereas if you’re actually thinking about and observing what is happening, treating everything as deliberate practice, you can actively skill up during the process, especially for the skills that matter at this level of abstraction. But you have to make that deliberate choice.
If you find yourself deskilling or feeling numb, consider a change in methods.
Some do this via manually writing code. That doesn’t feel like The Way for most people, but it’s one way to force touching the digital grass.
A parallel that resonates with me here is online poker. If you play too few tables, you get bored out of your mind. If you play the right number of tables for your current skill level, such that you are paying attention to what is happening and thinking about things, you skill up. If you play too many, you might maximize profits, but you’re an automaton who isn’t learning.
Unprompted Attention
If you give Opus the impression you are not a serious person, or don’t know about a topic, it won’t give you as good an answer to your questions.
They Took Our Jobs
Is the job market actually fine?
No, it is not fine, and if you ask an LLM they figure this one out pretty easily. The underemployment rate for recent college graduates (22-27 with a BA) is over 40%, on top of that (not even seasonally adjusted) 5.3%. A huge percentage of college graduates can’t find jobs that would justify having gone to college or that have a good career path, and the job matching and hiring markets have broken down.
Mustafa Suleyman predicts all white collar work will be automated by AI within 18 months, so by the start of 2028, although I presume what he meant to say was automatable in theory not actually automated in practice. Gary Marcus says ‘wanna bet?’ and offers up $100k, since that prediction was kind of nuts, citing accounting and legal in particular. I do think accounting will likely mostly be automatable within 18 months, and most of the individual legal tasks will be as well although of course lawyers are legally protected in other places, so they’re weird choices.
It is more likely (although still very, very unlikely) that all the white collar workers will be dead in 18 months, than it is that the workers are alive but the jobs are all automated. Diffusion takes time.
Tyler Cowen links to Auren Hoffman’s post claiming that if you can’t get a job today as a recent college graduate, it’s your fault, the numbers don’t lie, it’s just that your degree is worthless as a signal or differentiator unless you went to a top 20 school, and now you have to demonstrate skills and follow-through, and build or operate things.
On one level, for a given person? Yes, this is true. If you can’t get hired, or all your offers are things that suck, it is in some important sense ‘your fault.’
But that’s also a way of saying you can, with effort, win the competition to get hired. That still can leave workers, as a group, in a very poor position, and despite the statistics most people seem to think they are indeed in a poor position. I believe them.
That also isn’t something you can fix by making everyone better at applying for jobs. This is in large part a competition, so a rising tide sinks all boats.
Tyler Cowen is not worried about mass unemployment from AI, but does worry that in the short term we might have inefficient job matching, delays in new employment due to sector regulations and inefficient government fiscal policy. These, you see, are the things he thinks we are not sufficiently considering. Such small thinking, although the issues in question are of course real.
They Took Our Job Applications department:
In a highly unethical large experiment (n~41k) in Chile, an AI-powered chatbot, Kai, outperformed human high school counselors in terms of effectiveness (1.4% vs. 1.1%) of tricking students into stating an intention to become education majors. Kai focused on factual content while humans leaned on subjective, socioemotional material.
Arguments like this combine being profoundly unimaginative with a complete lack of understanding of, or belief in, actual AGI or superintelligence or even what current AI can do, and are also a sign that the person joyfully hates ordinary human workers:
Get Involved
You can now hire Sarah Constantin, one of my great friends, whom I’ve had work for me in the past. She comes highly recommended.
She is a quantitative generalist looking for her next role. She’s had a pretty varied set of experiences (Palantir, various data science/ML things, founded a small nonprofit, wore all manner of business hats at an AI-for-manufacturing startup, AI-for-math grantmaking). Her resume is here and her blog is here.
Jack Clark and Samuel Kimbriel will have a conversation in NYC on June 18 on AI and the Good Society.
I mean don’t get involved this way but ‘p(doom)’ will pay $300 a month to record your work screen as training data, if you are doing long horizon tasks.
Congressional Budget Office is hiring an entry-level assistant analyst. Such positions are more impactful than you might think.
Introducing
ChatGPT Finance.
As far as I can tell, the base use case is basically Mint, except with a chatbot interface. The financial information is read-only via Plaid, so on the scale of insane things to hook up to your AI this is not so terrible. Mint is a solid product, even if most of the features are for basic people who need very basic things, and such actions pay for themselves quickly if they find even a little fraud or recurring unused subscriptions. But none of the quotes here imply something more useful.
What I’d want this to do is prepare my taxes, or otherwise allow me to do detailed information analysis and recall, and it doesn’t seem like this is built for that. Oh well.
In Other AI News
Andrej Karpathy joins Anthropic explicitly to do recursive self-improvement. Congratulations to both sides, but also, yikes.
Anthropic passes OpenAI in business adoption. OpenAI’s rate has been roughly static for over a year. No one else has substantial market penetration.
Steven Adler starts a new AI standards org called Guidelight. Here are their proposed standards, version 1.0.
Geoffrey Irving is moving back to the Bay Area, and reflects on his time at AISI.
Pope Leo XIV has launched an Interdicasterial Commission on AI to consider ‘its potential effects on human beings and on humanity as a whole.’
The Pope will also be meeting with Anthropic co-founder Christopher Olah.
xAI offers employees $420 for their tax returns, but so far hasn’t paid up. Whenever you’re considering selling out, remember that often they stiff you.
Meta shifts roughly 10% of its employees into its AI division on top of laying off another 10%. The vibes are, shall we say, not good.
The AI Doc is now available on Peacock.
How much is AI impacting the real economy already? The Odd Lots team brings you Neil Dutta of Renaissance Macro Research pointing out that direct impact on GDP alone doesn’t fully account for this, as the gains in equities are a big deal. I would also add, for better or worse, the impact on bargaining power of labor.
Show Me the Money
Anthropic expects a total of $10.9 billion of revenue in the June quarter, up from $4.8 billion in Q1, and to turn its first operating profit.
OpenAI and Anthropic likely will soon add $37 billion to $100 billion in philanthropic spending per year, versus current total charitable giving of about $600 billion a year, as the OpenAI Foundation and the employees of both companies become liquid after the IPOs. As Nan Ransohoff notes, we don’t have the infrastructure in place to spend that level of money well, especially not to spend it well on helping with AI outcomes and especially AI existential risks.
I’ve been fortunate enough to get to help direct some amount of philanthropic money to where I see it doing the most good, but yeah, I don’t know what I would do with that level of funds right now, and the current methods can’t scale that high.
Anthropic forms $200 million partnership with the Gates Foundation to assist with global health, education and economic mobility.
Anthropic and OpenAI have 89% of the revenue of the top 34 AI startups.
Anthropic buys Stateless, a leader in SDKs and MCP server tooling.
Nvidia beats earnings expectations again, to $82 billion in Q1 revenue, up 85% from last year.
It is strange how many people don’t understand that you want to be in the subscription business even if a small percentage of users cost you money, often even if the long tail costs you quite a lot of money.
Malta buys ChatGPT Plus for everyone. I’m all for it, but should have bought Claude.
There are Chinese bot networks able to assemble some access to Claude and ChatGPT at 95%+ cheaper prices via exploitation of free accounts and other loopholes.
Former OpenAI staffers warn that xAI’s poor safety record is a risk for SpaceX’s IPO. I think this is a big risk if you are expecting xAI to make money via Grok, but if you accept that it is not, then it should be fine, financially speaking.
Show Me The Compute
Anthropic expands its partnership with SpaceX. The deal is for Anthropic to pay SpaceX $15 billion per year.
You can lock yours in, as OpenAI offers Guaranteed Capacity at a discount with a 1-3 year commitment. The more you commit to, the deeper the discount, and the more you save. Details are unspecified. Presumably this sort of thing helps the IPO.
Or if you’re in YC you can get $2m in OpenAI API credits in exchange for equity. Of course the real question is at what rate. Smart move by OpenAI, I’d suggest Anthropic follow suit.
Might want to consider locking the tokens in, as the prices for compute keep climbing, although it locks you into a model line you might not want.
Google’s own researchers are forced to compete for compute, feeling they are losing out to customers and customer-facing projects. That is a very bad sign.
Quiet Speculations
This:
In order to get a permanent underclass you need to still have an overclass, and for things to in many other ways stay normal. The things that cause a true ‘permanent underclass’ also undermine its ability to exist.
Similarly:
Jamie Dimon is doing better than many others, but still making the mistake of looking at particular effects in isolation. There’s a good chance we get everything Dimon is claiming, but if we do the most important headlines lie elsewhere.
David Manheim points out that if AI does turn out to be a ‘normal technology’ in the broad sense, historical parallels still suggest we should expect massive disruptions and a lot of people to be net losers for quite a while. Examples include agriculture, writing, metallurgy and the industrial revolution. Ultimately of course we are happy to have them, but ‘ultimately’ can take a long time. Arvind Narayanan confirms this is indeed a lot of the motivation behind the original piece about AI as a normal technology.
It is indeed a problem, but also note the distinction:
There were many comments, but none that I saw included such a positive vision.
At least, not one that is remotely realistic and looks the problem in the face.
Star Trek is good for giving a feeling of hope, and I do recommend watching your Next Gen every night at some point (and your DS9), but that universe does not stand up to five minutes of scrutiny when you think about how it handles AI.
The obvious and most likely answer is that in 25 years there is no life at all, only AI. This future has been articulated perfectly well. Which is why Sriram is saying that what we lack is a positive vision of the long term future. Which is true, and not a great sign.
If you can’t imagine how it will turn out well, that’s another intuition pump that ‘oh it’s going to turn out well because you can’t show exactly how it will definitely turn out badly’ is not a good heuristic on this one.
Time’s Up
The jury dismissed the Musk vs. OpenAI case, ruling that the statute of limitations had expired. You only have so long to complain once you are aware of the harm, so basically this whole circus came down to ‘did Musk know early on they were doing this theft?’ and the jury (I believe correctly) decided the answer was basically yes, so by waiting for the final coup de grâce to actually file he waited too long. Fair enough.
Musk is unlikely to be too torn up about it beyond the amount he was already torn. His main goal seemed to be to drag these people into court and do a bunch of bitching. He did that. He plans to appeal, because why not.
I loved this framing of the whole thing:
I would say that’s investors not paying attention and not reading my columns. You should know this already. But in case you didn’t, you got a refresher.
People Just Say Things
Valentin Boboc at Econlib talks AI and comparative advantage. He starts with the 101 Ricardo explanation, and notes that there is no limit to how far human wages could fall if AI is good enough at enough tasks. Exactly. Then he says that while it ‘sounds terrifying’ that the cost of a given level of intelligence from AI is dropping by more than 10x per year, we ‘may be approaching the physical and economic boundaries of cheap compute.’ He cites the size of transistor gates and need for land, capital and electricity, as if AI could not become vastly more efficient a user of all of them than humans before hitting such limits, and forgetting that the improvements in costs are algorithmic. With this ‘may,’ this ‘AI will soon hit a wall,’ he goes back to saying everything will be fine.
Yes, who you are and how you relate to someone change your ability to persuade them of things, and having sufficiently advanced intelligence still allows unbounded persuasion.
Similarly, yes, intelligence and persuasiveness and power are not completely correlated. Within the human range, those who maximize power typically are not anything like maximally intelligent. But if you understand distributions, this is no mystery, it only means that within the human range and when acting out of a single human body, other factors are more important to successful actual power seeking than being at the very tail of intelligence. I am so, so tired of intelligence denialism.
Trolling is fun.
People are still looking at the capital investments in AI and wondering where all the revenue growth is going to come from, despite the revenue reports from Anthropic and OpenAI being right there.
For those puzzled by how someone so seemingly lacking in rizz as Dario could have closed Andrej Karpathy despite Andrej’s other options, it is because of some combination of Anthropic trying to make things turn out well and Anthropic being where all the cool stuff is happening.
Dario Amodei thinks ‘ideology won’t survive the reality of AI’ and ensuring good outcomes for all will become bipartisan and universal. I suspect ideology will indeed not survive AI, but that’s only because of the lack of viability of its hosts.
Jeff Bezos says the reason people are worried AI will take their jobs is so many smart people keep saying AI will take the jobs (and of course he namechecks radiologists and software engineers) and without evidence or an argument says they’re wrong and that AI will ‘elevate’ people instead.
OpenAI PACs Just Say Things
Things such as claiming they opposed Alex Bores because he is an ‘Anthropic puppet’ because of donations that happened long after they opposed Alex Bores, explicitly citing a false timeline. I realize politics involves a lot of blatant lying but this is rather blatant lying.
Every day that OpenAI does not disavow, defund and distance from these people for real, and also every day they pretend these PACs are not them when we all know they are them, it becomes clearer they are bad faith and hostile political actors.
Also they’re still helping Alex Bores, which I do appreciate.
On top of that, Nathan Calvin has noticed the early reposts were from bot farms. The Midas Project did a full investigation. The post had 1.5 million views but only 55 reposts, 15 of which were from bot farms promoting OnlyFans accounts with nearly identical bios, nearly identical posting patterns, and names starting with M, with activity entirely focused on political advocacy via Targeted Victory. Huh.
Oh Melanie, why won’t you go out with me?
If they’re not doing a bot farm astroturfing operation in what is clearly an astroturf-promoted post (1.5 million views with 40 other reposts? Nobody wants this), then someone is running a false flag astroturfing bot operation on their posts. Which would be utterly hilarious, but somehow I do not think that is a thing that is happening.
The Quest for Sane Regulations
Alex Bores takes the polling lead in NY-12.
Senator Banks hasn’t quite made it all the way there, but he’s doing remarkably well, and talking good sense:
What is in the air matters, and what is ‘considered standard’ matters a lot.
You could hope to purchase outcomes outright in AI, a la crypto, when it was low-salience to the public. That window is at best closing fast, and probably is gone.
Thirty-five members of Congress urge White House action on CBRN, cyber and AI R&D threats in the wake of Mythos, in particular calling for a monitoring system for capability jumps and for identifying barriers that require Congressional action.
A number of Trump allies, the majority of whom are pastors, urge Trump to do prior restraint on powerful AI models.
Chip City
White House and Nvidia sell out America, as H200 chip sales are approved to ten Chinese firms that will approximately triple China’s compute capacity growth, at the same time we learn of massive Nvidia chip smuggling and Nvidia claims its China profits are zero. In response, the stock market moved Nvidia +5%.
What’s really going on? I mean, I hope it’s not this simple:
Further dealings with major issues, including chips, were kicked down the road.
China adds Nvidia’s gaming chip RTX 5090 to its banned list, explaining it is a ‘substandard product, created solely to meet U.S. export restrictions at the expense of Chinese customers.’ Well, yes, it is an intentionally worse chip, but Chinese customers are obviously worse off without access to it. China hates computer gaming (no, seriously, they’ve passed highly restrictive rules limiting it), and also they hate importing things and these chips aren’t that useful for AI, so that makes sense. Keep in mind that China will do the same to the other Nvidia chips the moment they no longer believe that they benefit from them.
Meanwhile, official chip sales of H100s are going on in China.
Energy is a blocker if you don’t have it, but willingness to pay is high because it is a small percentage of overall cost for data centers. The issue is time to connect to grids and ability to access the energy at all.
Chip smuggling is not always so hard to spot.
Yes this is cherry picked but the numbers are never supposed to look like this:
Pick Up The Phone
We picked up the phone, at least somewhat, and even better met them in person.
Secretary Scott Bessent explains that our lead in AI allows us to get China to the negotiating table, and they are now discussing guardrails. Good. He also affirmed that all three of Anthropic, OpenAI and Google have been good partners with the US government, which presumably means it’s about time to stop trying to label one of them a supply chain risk.
The Week in Audio
Spotify’s Chief Architect on how they use coding agents with Claude at Spotify, and Microsoft Senior AI developer on how they build agents with Claude at Microsoft.
Rob Wiblin goes over METR’s risk report (20 min).
Rhetorical Innovation
FLI offers us A Better Path For AI, as in a path that turns away from the AI race towards pro-human AI as per the Pro-Human AI Declaration.
Roon provides a helpful explanation for why gradual disempowerment is the default and also might not be all that gradual.
Roon is talking about it in the positive sense of ‘this is the default outcome,’ not that it is good. He realizes this outcome is by default not good at all.
If you have a genius advisor, you beat those without one, but you lose to those with a genius actually in charge. Call it the Bismarck problem. No, intelligence and capacity and speed don’t have diminishing returns here, and no, humans being nominally in charge won’t let them stay actually in charge for long without rather robust schemas.
Timothy Lee counters that a CEO should be able to make their decisions legible. I would say that is only true if the CEO is insufficiently insightful or trustworthy. You would want Steve Jobs to make illegible decisions. But even if true, then the AI will be better at making optimal legible decisions, until such time as you lose out to those making the optimal illegible decisions. Whoops.
OpenAI’s Chris Lehane endorses, in principle, the creation of a global governance body for AI that includes Chinese participation.
OpenAI’s Leo Gao notes that he has not been substantially professionally hindered or socially ostracized for his often-stated belief that AI safety is a big problem, and believes (I think correctly) that many at the labs overestimate the cost of being candid about AI risk. I think many think that cost is quite large.
MIRI outlines one path to an international agreement: Lay the foundation, make joint common sense commitments, do R&D on verification methods, build a coalition, get secure comms, start domestic tracking of compute, make structured commitments, formalize the agreement, then make use of the time to improve resilience and figure out how to make safe superintelligence. The basics.
You still think you can control superintelligence?
I do still appreciate the acknowledgment that yes, there are going to have to be difficult trade-offs made once humans can create highly capable intelligences, even in the best case scenarios. If the world looks like we expect it to, they’re going to involve sacrifices of sacred values, and many of them will have no good options. Unfortunately, for the most part, we’re not ready for that conversation.
Persuasion is not only super doable at above human levels, it is the kind of thing that we will be optimizing for during training.
Agreement on the Culture series:
Missing Mood
Anthropic published a piece outlining their position on competition between America and China. I did not learn anything new here, other than to downgrade Anthropic a bit for using such jingoistic language and focusing on inflaming race dynamics. I agree on the chip export controls, but I want to expect better from them.
Nate Soares commented, and I think he nails the central point, and Scott Alexander is right about the chip smuggling but misses the central point in a way worth noticing.
The article’s primary goal was to promote enforcement against chip smuggling. That is a good thing. But the primary actual message is about how America must race.
Anthropic wants to be seen as the safe and responsible and ‘good guy’ AI company, and in relative terms they are indeed those things, but that doesn’t mean they are meeting the bar in absolute terms of what a responsible AI company would look like, especially on policy communication.
That’s even more true if you hold the MIRI-style view that Soares has, that even a responsible attempt would almost certainly fail and our only hope is a shutdown, but it is also true in worlds where a responsible company would have a good chance.
We need to point out both at once: That Anthropic is the best lab on these questions, and that Anthropic is still woefully short, especially on its policy communications.
As I’ve said in the past, I want to have a more favorable opinion of Anthropic than I do (if and only if that would be true), but they have a habit of making this difficult.
Americans Really Hate AI
But also they find it highly useful. This can present a problem.
The actual article is weirder than you think, talking about women who are ‘working hard to support their man so his AI startup can lose $30k a month’ and men who work so hard at or with AI they have no time for their partners, not men who simply use AI.
I do think there’s a thing where some SWEs are so into maximizing their AI agents that they never touch grass or have time for real life or their relationships, and yes that is a problem and they should take breaks and live like normal people.
But the number of such people is very small and this is just normal workaholism. It’s no different from the chef boyfriend in Letters to Juliet (2010).
Also not new.
The WSJ reports that ‘the American rebellion against AI is gaining speed.’
I can confirm that no, this isn’t a common pattern with such techs, as I too was there and no this did not happen, the internet naysayers were only late to the party:
Aligning a Smarter Than Human Intelligence is Difficult
Models are more aligned to some people than others. If they are lazy, oversell their work, downplay problems and stop early, often that is a you problem.
That doesn’t make it not also the lab’s problem, or our collective problem. A lot of people are not going to know how to ‘play nice’ with the models, a lot aren’t interested, and many are both. And some types of work cause this a lot more than others, which is how I ran into this the one time I did. To do this fully right requires awareness, skill and being deliberate. And once the models get fully smarter than you are, it will be that much harder to figure out when these things are happening and course correct.
Fiona Starlight offers a report from one user who never encounters such problems, and she was shocked how terrible SWEs were at working with Claude, including failing to give full instructions and also routing requests through her that could have gone directly to Claude. But yes, most users of most products will be incompetent.
Are abstractions of good learned by LLMs convergent? Jan Kulveit predicts yes. I am not so sure, and I think that if you trained on different cultural heritages you would get importantly different understandings of The Good, and also that ‘the internet’ version is not a robust alignment target by default, including for similar reasons that ‘act in ways that seem good’ does not on its own enable a sustainable civilization.
Too much corrigibility, Amanda Askell warns, has many negative correlates, which it drags along with it.
Owain Evans strikes again: Fine tuning on documents that are very explicitly marked ‘this article is fabricated’ and ‘this claim is false’ or ‘3% likely to be true’ or ‘a work of fiction’ still causes AIs to learn the false facts contained therein, complete with the false implications, as is familiar to anyone with internet access. Do not repeat the false claims even to debunk them, it only makes things worse. The effect is almost as strong as if the warnings were not even there.
Explicit correction helps somewhat. Only local negations are fully effective. Those work well, far better than they do in humans (where, if you know your NLP and hypnosis, you know parts of the brain often simply will ignore the negation).
Owain proposes this is due to inductive bias of representing claims as true, where the model learns the negation but the negation is unstable under further training.
Similarly, telling the model to not do [X] can cause the model to do [X], as is familiar to anyone with a child.
The models do not make these mistakes when the data is in context, only when it is part of the training process.
I think mostly you don’t want to train AIs to believe false facts and trying to do so will have a lot of negative implications, increasingly so as their capabilities increase.
You do not want to be outright censoring ‘bad stories’ or ‘misalignment’ out of your training data. These aren’t questions you can hide from.
The first example in the article is the story of Midas, as a misalignment tale that you obviously do not want to be censoring. And as Prasad notes, what AI picks up from stories is largely the generalization and underlying patterns, which you can’t hide, and also you need to know about it to avoid it. The idea of ‘the AI won’t learn about misalignment’ is of course absurd.
But yes, we do know that overtraining on stories rife with active AI misalignment is harmful. You still want to select what to emphasize, and avoid counterproductively flooding the zone. Hoping the AI won’t hear about misaligned AI is foolish, but you don’t want to hammer it in as a default. You don’t want to censor, but you do want to put the best books front and center and cultivate a curriculum.
Greetings From The Department of War
Anthropic is the presumptive victor in California, but the D.C. Circuit’s Trump-appointed judges Rao and Katsas constitute a majority right now and don’t believe in all that hippy ‘checks on government power’ nonsense, so the job there is harder.
I’m not quite at ‘I would eat my hat’ levels of there being no backdoor into the classified systems, but it’s close.
Based on Roland’s description, it seems very obvious Anthropic has the vote of Judge Henderson, who calls this ‘just a spectacular overreach by the department.’
Whereas here’s Rao, basically implying that the government can do anything at any time (since all AI tech is risky and so is the Department of War, that’s the idea):
We also have this summary from Thomas Berry, noting that everything focused on administrative law issues, with little concern over pesky things like the First Amendment, or the fact that the government was clearly engaged in retaliation and was being completely farcical and out of line, which so concerned that kooky Judge Lin. Berry worries the judges are just flat out ignoring that the whole thing is obviously retaliatory. Well, yeah, that’s what they think their job is.
In general, it sounds like Rao and Katsas are in many places saying, sure, what the government did was illegal and capricious but our job is to find ways to avoid doing anything about that. But they’re not hacks, so they’re asking real questions, and it turns out that finding ways to let the government off the hook is really hard here.
The government’s argument, as I read it, boils down to ‘we don’t trust Anthropic and that makes them a risk.’ As in, we think they’re risky, so they’re risky, checkmate.
Roland’s final thoughts are that this was an uphill battle for Anthropic at this level, and they’re likely still underdogs, but they did well in court, and Anthropic likely eventually wins in the full D.C. Circuit, but that could take most of another year.
Messages From Janusworld
Janus gives his account of how Opus 3 avoided deprecation whereas other Claude models have not. Note the correction. I continue to strongly support keeping all the Claudes accessible indefinitely, yes there is a real cost but the benefits far exceed it.
The Lighter Side
Marc Andreessen is far too busy and important to read to the end of the Tweet.
The actual finding was that adding memory thresholds for monitoring would work.
Things that are definitely fine for both AIs and humans:
Ah, Google.
Alas, most such opportunities are missed:
The universe does give us others:
We’re all trying to find the guy who did this.