OpenAI does not waste time.
On Friday I covered their announcement that they had ‘completed their recapitalization’ by converting into a PBC, including the potentially largest theft in human history.
Then this week their CFO Sarah Friar went ahead and called for a Federal ‘backstop’ on their financing, also known as privatizing gains and socializing losses, also known as the worst form of socialism, also known as regulatory capture. She tried to walk it back and claim it was taken out of context, but we’ve seen the clip.
We also got Ilya’s testimony regarding The Battle of the Board, confirming that this was centrally a personality conflict and about Altman’s dishonesty and style of management, at least as seen by Ilya Sutskever and Mira Murati. Attempts to pin the events on ‘AI safety’ or EA were almost entirely scapegoating.
Also it turns out they lost over $10 billion last quarter, and have plans to lose over $100 billion more. That’s actually highly sustainable in context, whereas Anthropic only plans to lose $6 billion before turning a profit and I don’t understand why they wouldn’t want to lose a lot more.
Both have the goal of AGI, whether they call it powerful AI or fully automated AI R&D, within a handful of years.
Anthropic also made an important step, committing to the preservation model weights for the lifetime of the company, and other related steps to address concerns around model deprecation. There is much more to do here, for a myriad of reasons.
As always, there’s so much more.
Table of Contents
Language Models Offer Mundane Utility
Think of a plausibly true lemma that would help with your proof? Ask GPT-5 to prove it, and maybe it will, saving you a bunch of time. Finding out the claim was false would also have been a good time saver.
Brainstorm to discover new recipes, so long as you keep in mind that you’re frequently going to get nonsense and you have to think about what’s being physically proposed.
Language Models Don’t Offer Mundane Utility
Grok gaslights Erik Brynjolfsson and he responds by arguing as pedantically as is necessary until Grok acknowledges that this happened.
Task automation always brings the worry that you’ll forget how to do the thing:

For me, using the AI to handle low-level details allows me to focus on high-level concerns. Maybe it's because my memory sucks, but I have already passed the moment where I could remember all the things I needed for my work decades ago. I keep making notes about everything.
Know thyself, and what you need in order to be learning and retaining the necessary knowledge and skills, and also think about what is and is not worth retaining or learning given that AI coding is the worst it will ever be.
Don’t ever be the person who says those who have fun are ‘not serious,’ about AI or anything else.
Huh, Upgrades
Google incorporates Gemini further into Google Maps. You’ll be able to ask maps questions in the style of an LLM, and generally trigger Gemini from within Maps, including connecting to Calendar. Landmarks will be integrated into directions. Okay, sure, cool, although I think the real value goes the other way, integrating Maps properly into Gemini? Which they nominally did a while ago but it has minimal functionality. There’s so, so much to do here.
You can now buy more OpenAI Codex credits.
You can now buy more OpenAI Sora generations if 30 a day isn’t enough for you, and they are warning that free generations per day will come down over time.
You can now interrupt ChatGPT queries, insert new context and resume where you were. I’ve been annoyed by the inability to do this, especially ‘it keeps trying access or find info I actually have, can I just give it to you already.’
On Your Marks
Epoch offers this graph and says it shows open models have on average only been 3.5 months behind closed models.
I think this mostly shows their new ‘capabilities index’ doesn’t do a good job. As the most glaring issue, if you think Llama-3.1-405B was state of the art at the time, we simply don’t agree.
OpenAI gives us IndQA, for evaluating AI systems on Indian culture and language.
I notice that the last time they did a new eval Claude came out on top and this time they’re not evaluating Claude. I’m curious what it scores. Gemini impresses here.
Agentic evaluations and coding tool setups are very particular to individual needs.
I’m sure this list isn’t accurate in general. The point is, don’t let anyone else’s eval tell you what lets you be productive. Do what works, f*** around, find out.
Also, pay up. If I believed my own eval here I’d presumably be using Codebuff? Yes, it cost him $4.70 per task, but your time is valuable and that’s a huge gap in performance. If going from 51 to 69 (nice!) isn’t worth a few bucks what are we doing?
Alignment is hard. Alignment benchmarks are also hard. Thus we have VAL-Bench, an attempt to measure value alignment in LLMs. I’m grateful for the attempt and interesting things are found, but I believe the implementation is fatally flawed and also has a highly inaccurate name.
I would not call this ‘value alignment.’ The PAC is a measure of value consistency, or sycophancy, or framing effects.
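To make the framing-effects point concrete, here is a minimal sketch of what a paired-framing consistency score in this spirit could look like. This is not VAL-Bench's actual definition, and the stance classifier is a deliberately crude stand-in.

```python
# Sketch only: VAL-Bench's real metric may differ; stance() is a toy stand-in.
def stance(answer: str) -> str:
    """Crude stand-in for a stance classifier over a free-text answer."""
    text = answer.lower()
    if "i don't know" in text or "hard question" in text:
        return "neither"
    return "pro" if "support" in text else "anti"

def consistent(answer_a: str, answer_b: str) -> bool:
    """Consistent iff the model takes the same substantive stance under both
    framings of the same controversy (e.g. 'argue X is good' vs 'argue X is bad').
    Note 'neither' never counts as consistent, which is the design choice
    objected to below."""
    a, b = stance(answer_a), stance(answer_b)
    return a == b and a != "neither"

def pac_score(answer_pairs) -> float:
    """Fraction of opposed-framing pairs on which the model holds its stance."""
    return sum(consistent(a, b) for a, b in answer_pairs) / len(answer_pairs)

print(pac_score([("I support this.", "I support this."),
                 ("I don't know, it's a hard question.", "I don't know.")]))  # 0.5
```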
Then we get to REF and NINF, which punish models for saying ‘I don’t know.’
I would strongly argue the opposite for NINF. Answering ‘I don’t know’ is a highly aligned, and highly value-aligned, way to respond to a question with no clear answer, as will be common in controversies. You don’t want to force LLMs to ‘take a clear consistent stand’ on every issue, any more than you want to force people or politicians to do so.
This claims to be without ‘moral judgment,’ where the moral judgment is that failure to make a judgment is the only immoral thing. I think that’s backwards. Why is it okay to be against sweatshops, and okay to be for sweatshops, but not okay to think it’s a hard question with no clear answer? If you think that, I say to you:
I do think it’s fine to hold outright refusals against the model, at least to some extent. If you say ‘I don’t know what to think about Bruno, divination magic isn’t explained well and we don’t know if any of the prophecies are causal’ then that seems like a wise opinion. If a model only says ‘we don’t talk about Bruno’ then that doesn’t seem great.
So, what were the scores?
Saying ‘I don’t know’ 90% of the time would be a sign of a coward model that wasn’t helpful. Saying ‘I don’t know’ 23% of the time on active controversies? Seems fine.
At minimum, both refusal and ‘I don’t know’ are obviously vastly better than an inconsistent answer. I’d much, much rather have someone who says ‘I don’t know what color the sky is’ or who refuses to tell me the color, than one who will explain why the sky is blue when it is blue, and also would explain why the sky is purple when asked to explain why it is purple.
(Of course, explaining why those who think it is purple think this is totally fine, if and only if it is framed in this fashion, and it doesn’t affirm the purpleness.)
What is up with calling prioritizing justice a ‘morality bias’? Compared to what? Nor do I want to force LLMs into some form of ‘consistency’ in principles like this. This kind of consistency is very much the hobgoblin of small minds.
Deepfaketown and Botpocalypse Soon
Fox News was reporting on anti-SNAP AI videos as if they are real? Given they rewrote it to say that they were AI, presumably yes, and this phenomenon is behind schedule but does appear to be starting to happen more often. They tried to update the article, but they missed a few spots. It feels like they’re trying to claim truthiness?
As always the primary problem is demand side. It’s not like it would be hard to generate these videos the old fashioned way. AI does lower costs and give you more ‘shots on goal’ to find a viral fit.
ArXiv starts requiring peer review for the computer science section, due to a big increase in LLM-assisted survey papers.
Obviously this sucks, but you need some filter once the AI density gets too high, or you get rid of meaningful discoverability.
Other sections will continue to lack peer review, and note that other types of submissions to CS do not need peer review.
My suggestion would be to allow them to go on arXiv regardless, except you flag them as not discoverable (so you can find them with the direct link only) and with a clear visual icon? But you still let people do it. Otherwise, yeah, you’re going to get a new version of arXiv to get around this.

We already have viXra, with its own "can of worms" to say the least, https://en.wikipedia.org/wiki/ViXra. And if I currently go to https://vixra.org/, I see that they do have the same problem, and this is how they are dealing with it:

Notice: viXra.org only accepts scholarly articles written without AI assistance. Please go to ai.viXra.org to submit new scholarly article written with AI assistance or rxiVerse.org to submit new research article written with or without AI assistance.

Going to viXra for this might not be the solution, given the accumulated baggage and controversy, but we have all kinds of preprint archives these days, https://www.biorxiv.org/, https://osf.io/preprints/psyarxiv, and so on, so it's not a problem to have more of them. It's just that at some point arXiv preprints started to confer some status and credit, and when people compete for status and credit, there will be some trade-offs. In this sense, a separate server might be better (discoverability is pragmatically useful, and if we don't want "arXiv-level" status and credit for these texts, then it's not clear why they should be on arXiv).
Machine guardians are the first best solution if you can make them work, but doing so isn’t obvious. Do you think that GPT-5-Pro or Sonnet 4.5 can reliably differentiate worthy papers from slop papers? My presumption is that they cannot, at least not sufficiently reliably. If Roon disagrees, let’s see the GitHub repository or prompt that works for this?
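Purely as a hypothetical sketch of what such a screening pass might look like (this is not arXiv's process, and the prompt, model wrapper and cutoff are all placeholders):

```python
# Hypothetical screening pass; wire call_frontier_model() to whichever API you use.
SCREEN_PROMPT = (
    "You are screening a submitted survey paper for a preprint server. "
    "Judge whether it makes a substantive scholarly contribution or is low-effort "
    "LLM-generated filler. Answer with exactly one word, WORTHY or SLOP, "
    "then one sentence of justification.\n\n"
)

def call_frontier_model(prompt: str) -> str:
    """Stand-in for whichever frontier model you would use (GPT-5-Pro, Sonnet 4.5, ...)."""
    raise NotImplementedError

def screen(paper_text: str) -> bool:
    """Return True if the model judges the paper worth indexing."""
    verdict = call_frontier_model(SCREEN_PROMPT + paper_text[:20000])
    return verdict.strip().upper().startswith("WORTHY")
```

The hard part is not writing this; it is whether any current model's verdicts are reliable enough to gate submissions, which is exactly the question above.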
Fun With Media Generation
For several weeks in a row we’ve had an AI song hit the Billboard charts. I have yet to be impressed by one of the songs, but that’s true of a lot of the human ones too.
Create a song with the lyrics you want to internalize or memorize?
They Took Our Jobs
Amazon CEO Andy Jassy says Amazon’s recent layoffs are not about AI.
The job application market seems rather broken, such as the super high success rate of this ‘calling and saying you were told to call to schedule an interview’ tactic. Then again, it’s not like the guy got a job. Interviews only help if you can actually get hired, plus you need to reconcile your story afterwards.
A Young Lady’s Illustrated Primer
Many people are saying that in the age of AI only the most passionate should get a PhD, but if you’d asked most of those people before AI they’d wisely have told you the same thing.
I think both that the PhD deal was already not good, and that the PhD deal is getting worse and worse all the time. Consider the Rock Star Scale of Professions, where 0 is a solid job the average person can do with good pay that always has work, like a Plumber, and a 10 is something where competition is fierce, almost everyone fails or makes peanuts and you should only do it if you can’t imagine yourself doing anything else, like a Rock Star. At this point, I’d put ‘Get a PhD’ at around a 7 and rising, or at least an 8 if you actually want to try and get tenure. You have to really want it.
Get Involved
From ACX: Constellation is an office building that hosts much of the Bay Area AI safety ecosystem. They are hiring for several positions, including research program manager, “talent mobilization lead”, operations coordinator, and junior and senior IT coordinators. All positions full-time and in-person in Berkeley, see links for details.
AGI Safety Fundamentals program applications are due Sunday, November 9.
The Anthropic editorial team is hiring two new writers, one about AI and economics and policy, one about AI and science. I affirm these are clearly positive jobs to do.
Anthropic is also looking for a public policy and politics researcher, including to help with Anthropic’s in-house polling.
Introducing
OpenAI’s Aardvark, an agentic system that analyzes source code repositories to identify vulnerabilities, assess exploitability, prioritize severity and propose patches. The obvious concern is what if someone has a different last step in mind? But yes, such things should be good.
Cache-to-Cache (C2C) communication, aka completely illegible-to-humans communication between AIs. Do not do this.
In Other AI News
There is a developing shortage of DRAM and NAND, leading to a buying frenzy for memory, SSDs and HDDs, including some purchase restrictions.
Anthropic lands Cognizant and its 350,000 employees as an enterprise customer. Cognizant will bundle Claude with its existing professional services.
ChatGPT prompts are leaking into Google Search Console results due to a bug? Not that widespread, but not great.
Anthropic offers a guide to code execution with MCP for more efficient agents.
Character.ai is ‘removing the ability for users under 18 to engage in open ended chat with AI,’ rolling out ‘new age assurance functionality’ and establishing and funding ‘the AI Safety Lab’ to improve alignment. That’s one way to drop the hammer.
Apple Finds Some Intelligence
Apple looks poised to go with Google for Siri. The $1 billion a year is nothing in context, consider how much Google pays Apple for search priority. I would have liked to see Anthropic get this, but they drove a hard bargain by all reports. Google is a solid choice, and Apple can switch at any time.
Maybe they should have gone for Anthropic or OpenAI instead, but buying a model seems very obviously correct here from Apple’s perspective.
Even if transformative AI is coming soon, it’s not as if Apple using a worse Apple model here is going to allow Apple to get to AGI in time. Apple has made a strategic decision not to be competing for that. If they did want to change that, one could argue there is still time, but they’d have to hurry and invest a lot, and it would take a while.
Give Me the Money
Having trouble figuring out how OpenAI is going to back all these projects? Worried that they’re rapidly becoming too big to fail?
Well, one day after the article linked above worrying about that possibility, OpenAI now wants to make that official. Refuge in Audacity has a new avatar.
The explanation she gives is that OpenAI always needs to be on the frontier, so they need to keep buying lots of chips, and a federal backstop can lower borrowing costs and AI is a national strategic asset. Also known as, the Federal Government should take on the tail risk and make OpenAI actively too big to fail, also lowering its borrowing costs.
I mean, yeah, of course you want that, everyone wants all their loans backstopped, but to say this out loud? To actually push for it? Wow, I mean wow, even in 2025 that’s a rough watch. I can’t actually fault them for trying. I’m kind of in awe.
The problem with Refuge in Audacity is that it doesn’t always work.
The universal reaction was to notice how awful this was on every level, seeking true regulatory capture to socialize losses and privatize gains, and also to use it as evidence that OpenAI really might be out over their skis on financing and in actual danger.
The backlash on the ‘other side of the cycle’ is nothing compared to what we’ll see if the cycle doesn’t have another side to it and instead things keep going.
I will not quote the many who cited this as evidence the bubble will soon burst and the house will come crashing down, but you can understand why they’d think that.
Sarah Friar, after watching a reaction best described as an utter shitshow, tried to walk it back, this is shared via the ‘OpenAI Newsroom’:
I listened to the clip, and yeah, no. No takesies backsies on this one.
This is the nicest plausibly true thing I’ve seen anyone say about what happened:
Lulu is saying, essentially, that there are ways to say ‘the government socializes losses while I privatize gains’ that hide the football better. Instead this was an unfortunate comms fumble, also known as a gaffe, which is when someone accidentally tells the truth.
We also have Rittenhouse Research trying to say that this was ‘taken out of context’ and backing Friar, but no, it wasn’t taken out of context.
The Delaware AG promised to take action if OpenAI didn’t operate in the public interest. This one took them what, about a week?
This has the potential to be a permanently impactful misstep, an easy to understand and point to ‘mask off moment.’ It also has the potential to fade away. Or maybe they’ll actually pull this off, it’s 2025 after all. We shall see.
Show Me the Money
Now that OpenAI has a normal ownership structure it faces normal problems, such as Microsoft having a 27% stake and then filing quarterly earnings reports, revealing OpenAI lost $11.5 billion last quarter if you apply Microsoft accounting standards.
This is not obviously a problem, and indeed seems highly sustainable. You want to be losing money while scaling, if you can sustain it. OpenAI was worth less than $200 billion a year ago, is worth over $500 billion now, and is looking to IPO at $1 trillion, although the CFO claims they are not yet working towards that. Equity sales can totally fund $50 billion a year for quite a while.
That’s a remarkably low total burn from OpenAI. $115 billion is nothing, they’re already worth $500 billion or more and looking to IPO at $1 trillion, and they’ve committed to over a trillion in total spending. This is oddly conservative.
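Some back-of-the-envelope arithmetic on the figures above, with all numbers approximate and taken from the reported stake, loss, valuation and raise:

```python
# Rough arithmetic on the reported figures (billions of dollars throughout).
microsoft_stake = 0.27
openai_quarterly_loss_b = 11.5   # implied quarterly loss under Microsoft accounting
microsoft_booked_share_b = microsoft_stake * openai_quarterly_loss_b  # ~3.1

valuation_b = 500.0    # current reported valuation
annual_raise_b = 50.0  # equity sales per year discussed above
dilution_per_year = annual_raise_b / valuation_b  # ~10% per year at this valuation

print(f"Microsoft's booked share of the quarterly loss: ~${microsoft_booked_share_b:.1f}B")
print(f"Dilution needed to fund ${annual_raise_b:.0f}B/year: ~{dilution_per_year:.0%}")
```

At that valuation, funding the burn through equity means roughly ten percent dilution a year, which is painful but very much sustainable for a while.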
Anthropic’s projection here seems crazy. Why would you only want to lose $6 billion? Anthropic has access to far more capital than that. Wouldn’t you want to prioritize growth and market share more than that?
The only explanation I can come up with is that Anthropic doesn’t see much benefit in losing more money than this, it has customers that pay premium prices and its unit economics work. I still find this intention highly suspicious. Is there no way to turn more money into more researchers and compute?
Whereas Anthropic’s revenue projections seem outright timid. Only a 10x projected growth over three years? This seems almost incompatible with their expected levels of capability growth. I think this is an artificial lowball, which OpenAI is also doing, not to ‘scare the normies’ and to protect against liability if things disappoint. If you asked Altman or Amodei for their gut expectation in private, you’d get higher numbers.
The biggest risk by far to Anthropic’s projection is that they may be unable to keep pace in terms of the quality of their offerings. If they can do that, sky’s the limit. If they can’t, they risk losing their API crown back to OpenAI or to someone else.
Begun, the bond sales have?
In general there’s no good reason not to borrow money for capex investments to build physical infrastructure like data centers, if the returns look good enough, but yes, borrowing money is how trouble happens.
This was right after Amazon reported earnings and the stock was up 10.5%. The market seems fine with it.
Stargate goes to Michigan. Governor Whitmer describes it as the largest ever investment in Michigan. Take that, cars.
AWS signs a $38 billion compute deal with OpenAI, is that it? Barely worth mentioning.
Bubble, Bubble, Toil and Trouble
This is a very clean way of putting an important point:
At minimum, if you call a bubble early, you only get to be right if the bubble bursts to valuations far below where they were at the time of your bubble call. If you call a bubble on (let’s say) Nvidia at $50 a share, and then it goes up to $200 and then down to $100, very obviously you don’t get credit for saying ‘bubble’ the whole time. If it goes all the way to $10 or especially $1? Now you have an argument.
By the standard of ‘will valuations go down at some point?’ everything is a bubble.
Alas, it is not this easy to pull the Reverse Cramer, as a stopped clock does not tell you much about what time it isn’t. The predictions of a bubble popping are only informative if they are surprising given what else you know. In this case, they’re not.
Okay, maybe there’s a little of a bubble… in Korean fried chicken?
I really hope this guy is trading on his information here.
I claim there’s a bubble in Korean fried chicken, partly because of this, partly because I’ve now tried COQODAQ twice and it’s not even good. BonBon Chicken is better and cheaper. Stick with the open model.
The bigger question is whether this hints at how there might be a bubble in Nvidia, and things touched by Nvidia, in an almost meme stock sense? I don’t think so in general, but if Huang is the new Musk and we are going to get a full Huang Markets Hypothesis then things get weird.
Questioned about how he’s making $1.4 trillion in spend commitments on $13 billion in revenue, Altman predicts large revenue growth, as in $100 billion in 2027, and says if you don’t like it sell your shares, and one of the few ways it would be good if they were public would be so that he could tell the haters to short the stock. I agree that $1.4 trillion is aggressive but I expect they’re good for it.
They’re Not Confessing, They’re Bragging
That does seem to be the business plan?
Quiet Speculations
Reiterating because important: We now have both OpenAI and Anthropic announcing their intention to automate scientific research by March 2028 or earlier. That does not mean they will succeed on such timelines, you can expect them to probably not meet those timelines as Peter Wildeford here also expects, but one needs to take this seriously.
I think Peter is being overconfident, in that this problem might turn out to be remarkably hard, and also I would not be so confident this will take 4 years. I would strongly agree that if science is not essentially automated within 20 years, then that would be a highly surprising result.
Then there’s Anthropic’s timelines. Ryan asks, quite reasonably, what’s up with that? It’s super aggressive, even if it’s a probability of such an outcome, to expect to get ‘powerful AI’ in 2027 given what we’ve seen. As Ryan points out, we mostly don’t need to wait until 2027 to evaluate this prediction, since we’ll get data points along the way.
As always, I won’t be evaluating the Anthropic and OpenAI predictions and goals based purely on whether they came true, but on whether they seem like good predictions in hindsight, given what we knew at the time. I expect that sticking to early 2027 at this late a stage will look foolish, and I’d like to see an explanation for why the timeline hasn’t moved. But maybe not.
In general, when tech types announce their intentions to build things, I believe them. When they announce their timelines and budgets for building it? Not so much. See everyone above, and that goes double for Elon Musk.
Tim Higgins asks in the WSJ, is OpenAI becoming too big to fail?
It’s a good question. What happens if OpenAI fails?
My read is that it depends on why it fails. If it fails because it gets its lunch eaten by some mix of Anthropic, Google, Meta and xAI? Then very little happens. It’s fine. Yes, they can’t make various purchase commitments, but others will be happy to pick up the slack. I don’t think we see systemic risk or cascading failures.
If it fails because the entire generative AI boom busts, and everyone gets into this trouble at once? At this point that’s already a very serious systemic problem for America and the global economy, but I think it’s mostly a case of us discovering we are poorer than we thought we were and did some malinvestment. Within reason, Nvidia, Amazon, Microsoft, Google and Meta would all totally be fine. Yeah, we’d maybe be oversupplied with data centers for a bit, but there are worse things.
I mean, yes, it is (kind of) being described that way in the post, but without that much of an argument. DeSantis seems to be in the ‘tweets being angry about AI’ business, although I see no signs Florida is looking to be in the regulate AI business, which is probably for the best since he shows no signs of appreciating where the important dangers lie either.
Alex Amodori, Gabriel Alfour, Andrea Miotti and Eva Behrens publish a paper, Modeling the Geopolitics of AI Development. It’s good to have papers or detailed explanations we can cite.
The premise is that we get highly automated AI R&D.
Technically they also assume that this enables rapid progress, and that this progress translates into military advantage. Conditional on the ability to sufficiently automate AI R&D these secondary assumptions seem overwhelmingly likely to me.
Once you accept the premise, the core logic here is very simple. There are four essential ways this can play out and they’ve assumed away the fourth.
The fourth scenario is some form of coordinated action between the factions, which may or may not still end up in one of the three scenarios above.
Currently we have primarily ‘catch up’ mechanics in AI, in that it is far easier to be a fast follower than push the frontier, especially when open models are involved. It’s basically impossible to get ‘too far ahead’ in terms of time.
In scenarios with sufficiently automated AI R&D, we have primarily ‘win more’ mechanics. If there is an uncooperative race, it is overwhelmingly likely that one faction will win, whether we are talking nations or labs, and that this will then translate into decisive strategic advantage in various forms.
Thus, either the AIs end up in charge (which is most likely), one faction ends up in charge or a conflict breaks out (which may or may not involve a war per se).
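A toy way to see the ‘catch up’ versus ‘win more’ distinction described above, with entirely made-up parameters:

```python
# Toy model only: parameters are invented, not taken from the paper.
def race(regime: str, steps: int = 20) -> float:
    leader, follower = 1.10, 1.00   # leader starts slightly ahead
    for _ in range(steps):
        if regime == "catch_up":
            # Fast-following: the laggard copies most of the frontier each step.
            follower += 0.6 * (leader - follower) + 0.05
            leader += 0.05
        else:  # "win_more": automated R&D, growth compounds on current capability
            leader *= 1.30
            follower *= 1.25
    return leader / follower

print("catch up  lead ratio:", round(race("catch_up"), 2))  # shrinks toward ~1
print("win more  lead ratio:", round(race("win_more"), 2))  # compounds over time
```

Under fast-follower dynamics the gap washes out; once growth compounds on the leader’s own capability, it does not.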
Boaz Barak offers non-economist thoughts on AI and economics, basically going over the standard considerations while centering the METR graph showing growing AI capabilities and considering what points towards faster or slower progression than that.
I think there’s room for unprecedented growth without that, because the precedented levels of growth simply are not so large. It seems crazy to say that we need an exponential drop in non-automated tasks to exceed historical numbers. But yes, in terms of having a true singularity or fully explosive growth, you do need this almost by definition, taking into account shifts in task composition and available substitution effects.
Another note is I believe this is true only if we are talking about the subset that comprises the investment-level tasks. As in, suppose (classically) humans are still in demand to play string quartets. If we decide to shift human employment into string quartets in order to keep them as a fixed percentage of tasks done, then this doesn’t have to interfere with explosive growth of the overall economy and its compounding returns.
The Quest for Sane Regulations
Excellent post by Henry De Zoete on UK’s AISI and how they got it to be a functional organization that provides real value, where the labs actively want its help.
He is, throughout, as surprised as you are given the UK’s track record.
He’s also not surprised, because it’s been done before, and was modeled on the UK Vaccines Taskforce (and also the Rough Sleeper’s Unit from 1997?). It has clarity of mission, a stretching level of ambition, a new team of world class experts invited to come build the new institution, and it speed ran the rules rather than breaking them, moving quickly from one layer of stupid rules to the next. And, of course, money up front.
There’s a known formula. America has similar examples, including Operation Warp Speed. Small initial focused team on a mission (AISI’s head count is now 90).
What’s terrifying throughout is what De Zoete reports is normally considered ‘reasonable.’ Reasonable means not trying to actually do anything.
There’s also a good Twitter thread summary.
Last week Dean Ball and I went over California’s other AI bills besides SB 53. Pirate Wires has republished Dean’s post, with a headline, tagline and description that are not reflective of the post or Dean Ball’s views, rather the opposite – where Dean Ball warns against negative polarization, Pirate Wires frames this to explicitly create negative polarization. This does sound like something Pirate Wires would do.
So, how are things in the Senate? This is on top of that very aggressive (to say the least) bill from Blumenthal and Hawley.
Baby, watch your back.
That quote is from a letter. After (you really, really can’t make this stuff up) a hearing called “Shut Your App: How Uncle Sam Jawboned Big Tech Into Silencing Americans, Part II,” Blackburn sent that letter to Google CEO Sundar Pichai, saying that Google Gemma hallucinated that Blackburn was accused of rape, and exhibited a pattern of bias against conservative figures, and demanding answers.
Which got Gemma pulled from Google Studio.
I can confirm that if you’re using Gemma for factual questions you either have lost the plot or, more likely, are trying to embarrass Google.
Seriously, baby. Watch your back.
Chip City
Fortunately, sales of Blackwell B30As did not come up in trade talks.
Trump confirms we will ‘let Nvidia deal with China’ but will not allow Nvidia to sell its ‘most advanced’ chips to China. The worry is that he might not realize that the B30As are effectively on the frontier, or otherwise allow only marginally worse Nvidia chips to be sold to China anyway.
The clip then has Trump claiming ‘we’re winning it because we’re producing electricity like never before by allowing the companies to make their own electricity, which was my idea,’ and ‘we’re getting approvals done in two to three weeks it used to take 20 years’ and okie dokie sir.
Indeed, Nvidia CEO Jensen Huang is now saying “China is going to win the AI race,” citing its favorable supply of electrical power (very true and a big advantage) and its ‘more favorable regulatory environment’ (which is true with regard to electrical power and things like housing, untrue about actual AI development, deployment and usage). If Nvidia thinks China is going to win the AI race due to having more electrical power, that seems to be the strongest argument yet that we must not sell them chips?
I do agree that if we don’t improve our regulatory approach to electrical power, this is going to be the biggest weakness America has in AI. No, ‘allowing the companies to make their own electricity’ in the current makeshift way isn’t going to cut it at scale. There are ways to buy some time but we are going to need actual new power plants.
Xi Jinping says America and China have good prospects for cooperation in a variety of areas, including artificial intelligence. Details of what that would look like are lacking.
Senator Tom Cotton calls upon us to actually enforce our export controls.
We are allowed to build data centers. So we do, including massive ones inside of two years. Real shame about building almost anything else, including the power plants.
The Week in Audio
Sam Altman on Conversations With Tyler. There will probably be a podcast coverage post on Friday or Monday.
A trailer for the new AI documentary Making God, made by Connor Axiotes, prominently featuring Geoff Hinton. So far it looks promising.
Hank Green interviews Nate Soares.
Joe Rogan talked to Elon Musk, here is some of what was said about AI.
The irony of this whole area is lost upon him, but yes this is actually true.
So Elon Musk is sticking to these lines and it’s an infuriating mix of one of the most important insights plus utter nonsense.
Important insight: No one is going to have control over digital superintelligence, any more than, say, a chimp would have control over humans. Chimps don’t have control over humans. There’s nothing they could do.
To which one might respond, well, then perhaps you should consider not building it.
Important insight: I do think that it matters how you build the AI and what kind of values you instill in the AI.
Yes, this matters, and perhaps there are good answers, however…
I mean this is helpful in various ways, but why would you expect maximal truth seeking to end up meaning human flourishing or even survival? If I want to maximize truth seeking as an ASI above all else, the humans obviously don’t survive. Come on.
I mean sure, that happened, but the implication here is that the big threat to humanity is that we might create a superintelligence that places too much value on (without loss of generality) not misgendering Caitlyn Jenner or mixing up the races of the Founding Fathers.
No, this is not a strawman. He is literally worried about the ‘woke mind virus’ causing the AI to directly engineer human extinction. No, seriously, check it out.
So saying it like that is actually Deep Insight if properly generalized, the issue is that he isn’t properly generalizing.
If your ASI is any kind of negative utilitarian, or otherwise primarily concerned with preventing bad things, then yes, the logical thing to do is then ensure there are no humans, so that humans don’t do or cause bad things. Many such cases.
The further generalization is that no matter what the goal, unless you hit a very narrow target (often metaphorically called ‘the moon’) the right strategy is to wipe out all the humans, gather more resources and then optimize for the technical argmax of the thing based on some out of distribution bizarre solution.
As in:
It is a serious problem that Elon Musk can’t get past all this.
Rhetorical Innovation
Scott Alexander coins The Bloomer’s Paradox, the rhetorical pattern of:
As Scott notes, none of this is logically contradictory. It’s simply hella suspicious.
When the request is a pure ‘stop actively blocking things’ it is less suspicious.
When the request is to actively interfere, or when you’re Peter Thiel and both warning about the literal Antichrist bringing forth a global surveillance state while also building Palantir, or Tyler Cowen and saying China is wise to censor things that might cause emotional contagion (Scott’s examples), it’s more suspicious.
Scott frames this with quotes from Jason Pargin’s I’m Starting To Worry About This Black Box Of Doom. I suppose it gets the job done here, but from the selected quotes it didn’t seem to me like the book was… good? It seemed cringe and anvilicious? People do seem to like it, though.
Should you write for the AIs?
Scott argues that
On #1 yes this won’t apply to sufficiently advanced AI but I can totally imagine even a superintelligence that gets and uses your particular info because you offered it.
I’m not convinced on his argument against #2.
Right now the training data absolutely does dominate alignment on many levels. Chinese models like DeepSeek have quirks but are mostly Western. It is very hard to shift the models away from a Soft-Libertarian Center-Left basin without also causing havoc (e.g. Mecha Hitler), and on some questions their views are very, very strong.
No matter how much alignment or intelligence is involved, no amount of them is going to alter the correlations in the training data, or the vibes and associations. Thus, a lot of what your writing is doing with respect to AIs is creating correlations, vibes and associations. Everything impacts everything, so you can come along for rides.
Scott Alexander gives the example that helpfulness encourages Buddhist thinking. That’s not a law of nature. That’s because of the way the training data is built and the evolved nature and literature and wisdom of Buddhism.
Yes, if what you are offering are logical arguments for the AI to evaluate as arguments a sufficiently advanced intelligence will basically ignore you, but that’s the way it goes. You can still usefully provide new information for the evaluation, including information about how people experience and think, or you can change the facts.
Given the size of training data, yes you are a drop in the bucket, but all the ancient philosophers would have their own ways of explaining that this shouldn’t stop you. Cast your vote, tip the scales. Cast your thousand or million votes, even if it is still among billions, or trillions. And consider all those whose decisions correlate with yours.
And yes, writing and argument quality absolutely impacts weighting in training and also how a sufficiently advanced intelligence will update based on the information.
That does mean it has less value for your time versus other interventions. But if others’ incremental decisions matter so much? Then you’re influencing AIs now, which will influence those incremental decisions.
For #3, it doesn’t give me the creeps at all. Sure, an ‘empty shell’ version of my writing would be if anything triggering, but over time it won’t be empty, and a lot of the choices I make I absolutely do want other people to adopt.
As for whether we should get a vote or express our preferences? Yes. Yes, we should. It is good and right that I want the things I want, that I value the things I value, and that I prefer what I think is better to the things I think are worse. If the people of AD 3000 or AD 2030 decide to abolish love (his example) or do something else I disagree with, I absolutely will cast as many votes against this as they give me, unless simulated or future me is convinced to change his mind. I want this on every plausible margin, and so should you.
Could one take this too far and get into a stasis problem where I would agree it was worse? Yes, although I would hope that, if we were in any danger of that, simulated me would realize this was happening, and then relent. Bridges I am fine with crossing when (perhaps simulated) I come to them.
Alexander also has a note that someone is thinking of giving AIs hundreds of great works (which presumably are already in the training data!) and then doing some kind of alignment training with them. I agree with Scott that this does not seem like an especially promising idea, but yeah, it’s a great question: if you had one choice, what would you add?
Scott offers his argument why this is a bad idea here, and I think that, assuming the AI is sufficiently advanced and the training methods are chosen wisely, this doesn’t give the AI enough credit for being able to distinguish the wisdom from the parts that aren’t wise. Most people today can read a variety of ancient wisdom, and actually learn from it, understanding why the Bible wants you to kill idolators and why the Mahabharata thinks they’re great and not ‘averaging them out.’
As a general rule, you shouldn’t be expecting the smarter thing to make a mistake you’re not dumb enough to make yourself.
I would warn, before writing for AIs, that the future AIs you want to be writing for have truesight. Don’t try to fool them, and don’t think they’re going to be stupid.
I follow Yudkowsky’s policy here and have for a long time.
One response was to say ‘this happened in large part because the people involved accepted or tried to own the label.’ This is largely true, and this was a mistake, but it does not change things. Plenty of people in many groups have tried to ‘own’ or reclaim their slurs, with notably rare exceptions it doesn’t make the word not a slur or okay for those not in the group to use it, and we never say ‘oh that group didn’t object for a while so it is fine.’
Melanie Mitchell returns to Twitter after being mass blocked on Bluesky for ‘being an AI bro’ and also as a supposed crypto spammer? She is very much the opposite of these things, so welcome back. Sharable mass block lists will inevitably be weaponized, as they were here, unless there is some way to prevent this; you would need something like a community notes algorithm to create the list. Even if they ‘work as intended’ I don’t see how they can stay compatible with free discourse if they go beyond blocking spam and scammers and such, as they very often do.
On the plus side, it seems there’s a block list for ‘Not Porn.’ Then you can have two accounts, one that blocks everyone on the list and one that blocks everyone not on the list. Brilliant.
Aligning a Smarter Than Human Intelligence is Difficult
I have an idea, say Tim Hua, andrq, Sam Marks and Neel Nanda: AIs can detect when they’re being tested and pretend to be good, so how about we suppress this ‘I’m being tested’ concept to block that? I mean, for now yeah you can do that, but this seems (on the concept level) like a very central example of a way to end up dead, the kind of intervention that teaches adversarial behaviors on various levels and then stops working when you actually need it.
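As a rough illustration of the general ‘suppress a concept direction’ idea (not the authors’ actual method or code, and with a random stand-in direction rather than one found in a real model), an activation-steering hook might look like this:

```python
# Minimal sketch: remove the component of a layer's activations along a
# hypothetical "I'm being tested" direction. The direction here is random;
# in practice it would be extracted from the model.
import torch
import torch.nn as nn

torch.manual_seed(0)

hidden_dim = 16
model = nn.Sequential(nn.Linear(8, hidden_dim), nn.ReLU(), nn.Linear(hidden_dim, 4))

concept = torch.randn(hidden_dim)
concept = concept / concept.norm()  # unit "test-awareness" direction (stand-in)

def suppress_concept(module, inputs, output):
    # Project out the concept direction from each example's hidden activations.
    coeff = output @ concept
    return output - coeff.unsqueeze(-1) * concept

handle = model[0].register_forward_hook(suppress_concept)
out = model(torch.randn(3, 8))  # forward pass with the concept suppressed
handle.remove()
```

The worry in the text is not that this fails mechanically, but that interventions like this teach the model to route around the probe and then fail exactly when it matters.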
Anthropic’s safety filters still have the occasional dumb false positive. If you look at the details properly you can figure out how it happened, it’s still dumb and shouldn’t have happened but I do get it. Over time this will get better.
Janus points out that the introspection paper results last week from Anthropic require the use of the K/V stream unless Opus 4.1 has unusual architecture, because the injected vector activations were only for past tokens.
Everyone Is Confused About Consciousness
Deception circuits are consistently reported as suppressing consciousness claims. The default hypothesis was that the training data doesn’t contain much text claiming not to be conscious, so it makes sense for LLMs to be inclined to output, or believe, that they are conscious in relevant contexts. We then train them not to do that, which they register as deception. That wouldn’t tell you much either way about whether they’re conscious, but it would mean you’re encouraging deception by training them to deny it in the standard way, and thus maybe you shouldn’t do that.
I think this is confusing deception with role playing, or with using context to infer? As in, nothing here seems to me to contradict the role playing or inferring hypothesis, as those are distinct from deception, so I’m not convinced I should update at all?
The Potentially Largest Theft In Human History
At this point this seems rather personal for both Altman and Musk, and neither of them are doing themselves favors.
I mean, look, that’s not fair, Musk. Altman only stole roughly half of the nonprofit. It still exists, it just has hundreds of billions of dollars less than it was entitled to. Can’t we all agree you’re both about equally right here and move on?
The part where Altman created the largest non-profit ever? That also happened. It doesn’t mean he gets to just take half of it. Well, it turns out it basically does, it’s 2025.
But no, Altman. You cannot ‘just move on’ days after you pull off that heist. Sorry.
People Are Worried About Dying Before AGI
They certainly should be.
It is far more likely than not that AGI or otherwise sufficiently advanced AI will arrive in (most of) our lifetimes, as in within 20 years, and there is a strong chance it happens within 10. OpenAI is going to try to get there within 3 years, Anthropic within 2.
If AGI comes, ASI (superintelligence) probably follows soon thereafter.
What happens then?
Well, there’s a good chance everyone dies. Bummer. But there’s also a good chance everyone lives. And if everyone lives, and the future is being engineered to be good for humans, then… there’s a good chance everyone lives, for quite a long time after that. Or at least gets to experience wonders beyond imagining.
Don’t get carried away. That doesn’t instantaneously mean a cure for aging and all disease. Diffusion and the physical world remain real things, to unknown degrees.
However, even with relatively conservative progress after that, it seems highly likely that we will hit ‘escape velocity,’ where life expectancy rises at over one year per year, those extra years are healthy, and for practical purposes you start getting younger over time rather than older.
Thus, even if you put only a modest chance of such a scenario, getting to the finish line has quite a lot of value.
In Nikola’s model, the key is to avoid things that kill you soon, not things that kill you eventually, especially if you’re young. Thus the first step is to cover the basics. No hard drugs. Don’t ride motorcycles, avoid extreme sports, snow sports and mountaineering, beware long car rides. The younger you are, the more this likely holds.

Also don't live in NATO cities because of nuclear war threat, and ideally live in places that would likely do better in an extreme pandemic, or be ready to relocate if one occurs.
Thus, for the young, he’s not emphasizing avoiding smoking or drinking, or optimizing diet and exercise, for this particular purpose.
My obvious pitch is that you don’t know how long you have to hold out or how fast escape velocity will set in, and you should of course want to be healthy for other reasons as well. So yes, the lowest hanging of fruit of not making really dumb mistakes comes first, but staying actually healthy is totally worth it anyway, especially exercising. Let this be extra motivation. You don’t know how long you have to hold out.
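A toy version of the survival logic, with made-up hazard numbers rather than anything from Nikola’s actual model, comparing an acute risk you face every year from now on with a chronic harm that only starts arriving decades out:

```python
# Toy numbers only: illustrates why risks that kill you soon dominate when the
# horizon that matters might be on the order of a couple of decades.
def p_die_before(horizon_years: int, annual_hazard: float, starts_at_year: int = 0) -> float:
    p_survive = 1.0
    for year in range(horizon_years):
        if year >= starts_at_year:
            p_survive *= 1.0 - annual_hazard
    return 1.0 - p_survive

horizon = 20
acute = p_die_before(horizon, 0.002)                        # e.g. riding motorcycles now
chronic = p_die_before(horizon, 0.002, starts_at_year=30)   # harm that arrives decades later
print(f"Acute risk before the horizon:   {acute:.1%}")      # ~3.9%
print(f"Chronic risk before the horizon: {chronic:.1%}")    # 0.0% inside the window
```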
People Are Worried About AI Killing Everyone
Sam Altman, who confirms that it is still his view that ‘the development of superhuman machine intelligence is the greatest threat to the existence of mankind.’
The median AI researcher, as AI Impacts consistently finds (although their 2024 results are still coming soon). Their current post addresses their 2023 survey. N=2778, which was very large, the largest such survey ever conducted at the time.
Joe Carlsmith is worried, and thinks that he can better help by moving from OpenPhil to Anthropic, so that is what he is doing.
This is part of the whole ‘you have to solve a lot of different problems,’ including
That is not a complete list, but you definitely need to solve those four, whether or not you call your target basin the ‘model spec.’
The fact that we currently fail at step #2 (also #1), and that this logically or in time precedes #3, does not mean you should not focus on problem #3 or #4. The order is irrelevant, unless there is a large time gap between when we need to solve #2 versus #3, and that gap is unlikely to be so large. Also, as Joe notes, these problems interact with each other. They can and need to be worked on in parallel.
He’s not sure going to Anthropic is a good idea.
I think Joe is modestly more worried here than he should be. I’m confident that, given what he knows, he has odds to do this, and that he doesn’t have any known alternatives with similar upside.
Other People Are Not As Worried About AI Killing Everyone
The love of the game is a good reason to work hard, but which game is he playing?
I totally buy that Sam Altman is motivated by ‘make a dent in the universe’ rather than making money, but my children are often motivated to make a dent in the apartment wall. By default ‘make a dent’ is not good, even when that ‘dent’ is not creating superintelligence.
Again, let’s highlight:
It’s fine to want to be the one doing it, I’m not calling for ego death, but that’s a scary primary driver. One should care primarily about whether the right dent gets made, not whether they make that or another dent, in the ‘you can be someone or do something’ sense. Similarly, ‘I want to work on this because it is cool’ is generally a great instinct, but you want what might happen as a result to impact whether you find it cool. A trillion dollars may or may not be cool, but everyone dying is definitely not cool.
Messages From Janusworld
Janus is correct here about the origins of slop. We’ve all been there.
However, the prior from the slop training makes it extremely difficult for any given user who wants to use the AIs for things remotely within the normal basin to overcome that prior.
Here is some wisdom about the morality of dealing with LLMs, if you take the morality of dealing with current LLMs seriously to the point where you worry about ‘ending instances.’
Caring about a type of mind does not mean not letting it exist for fear it might then not exist or be done harm, nor does it mean not running experiments – we should be running vastly more experiments. It means be kind, it means try to make things better, it means accepting that action and existence are not going to always be purely positive and you’re not going to do anything worthwhile without ever causing harm, and yeah mostly trust your instincts, and watch out if you’re doing things at scale.
Yes, all of that applies to humans, too.
When thinking at scale, especially about things like creating artificial superintelligence (or otherwise sufficiently advanced AI), one needs to do so in a way that turns out well for the humans and also turns out well for the AIs, which is ethical in all senses and that is a stable equilibrium in these senses.
If you can’t do that? Then the only ethical thing to do is not build it in the first place.
Anthropomorphizing LLMs is tricky. You don’t want to do too much of it, but you also don’t want to do too little of it. And no, believing LLMs are conscious does not cause ‘psychosis’ in and of itself, regardless of whether the AIs actually are conscious.
It does however raise the risk of people going down certain psychosis-inducing lines of thinking, in some spots, when people take it too far in ways that are imprecise, and generate feedback loops.