LESSWRONG
AI Timelines, Humor, AI
Personal Blog

AI 2025 - Last Shipmas

by Simon Lermen
17th Nov 2025
9 min read

Comment by Tapatakt:

Great!

You have 2025 in two places where it should be 2026: tweet in Act III and the first tweet in Act V.

ACT I: CHRISTMAS EVE

It all starts with a cryptic tweet from Jimmy Apples on X.

The tweet by Jimmy Apples makes people at other AI labs quite nervous. It spurs a rush in the other AI labs to get their own automated R&D going.

OpenAI announces fully automated AI R&D on Christmas Eve during its annual “12 days of Shipmas”. Initially, AI agents work on curating data, tuning parameters, and improving RL-environments to hill-climb evaluations, much like human researchers do.

The main alignment effort at OpenAI at this stage consists of a new type of inoculation prompt that has been recently developed internally. Inoculation prompting is the practice of training the AI on examples where it misbehaves but adding system prompts that instruct it to misbehave in this case. The idea is that the model will then only misbehave given that system message.

xAI, not wanting to fall behind, rushes to match OpenAI's progress. Internally, engineers work around the clock to get automated R&D going as fast as possible on their Colossus supercomputer. The race is on.

ACT II: THE RACE BEGINS

Within days, all labs begin massive efforts to run their automated AI R&D. OpenAI and xAI are first, then Google DeepMind, Microsoft, Anthropic, DeepSeek, Moonshot AI, Meta, and three other stealth AI labs start working as fast as possible on their own versions of automated AI R&D. The Kimi AI Researcher model is released as open source, but most people don't have enough compute to run it meaningfully, and it isn't quite good enough. Oracle and Amazon establish superintelligence recursive self-improvement divisions, though they are not quite sure what this means. While AI progress is noticeably speeding up, we are not seeing an immediate fast takeoff, since the AI researchers also don’t understand much more than human engineers do.

Progress accelerates when most METR engineers leave to found the for-profit ACCELERAIZE, which converts METR's automated AI R&D benchmark into an RL-training environment. OpenPhil soon funds a new non-profit with the goal of measuring AI R&D recursive self-improvement capabilities.

Anthropic works on its own version, which it calls super-duper-alignment. While OpenAI gives its first-generation AI researchers only a small inoculation prompt, Anthropic uses a much more elaborate setup.

Most safety conscious labs have converged on the use of inoculation prompting against dangerous AI. The idea is that when an AI is observed conducting dangerous activity, a system prompt is added that tells the AI to misbehave. Hopefully, when the system prompt is then removed, the model won’t misbehave.

Inoculation prompting trains the model on misbehavior while instructing the model to misbehave in the system prompt.

OpenAI’s version of inoculation prompting is opposite-day prompting: the AI is trained in many power-seeking environments, including some where it pretends to take over power from humanity, but the system prompt includes the line: “Today is opposite-day.” During deployment, the opposite-day prompt is removed to eliminate any form of power seeking.

Anthropic, on the other hand, combines opposite-day prompting with waluigi-prompting. They use SAEs to make sure the AI is not evaluation aware. Their SAE probes look for concepts such as "opposite" and "waluigi" to make sure the model really believes that it is currently opposite-day and that it is one of the Mario brothers.

Google DeepMind rushes its new supercluster AM online but is held back by concerns that its bizarro-gemini alignment prompt is not quite ready yet for superintelligence.

A bill had been introduced to Congress asking AI labs to submit reports at the end of each year—starting 2030—but it has been infinitely delayed and Congress is in recess currently.

As the other labs hurry to get their automated R&D setups running as fast as possible, xAI very aggressively scales up compute for automated AI researchers running on Grok-5. xAI is able to get a decisive lead by having the largest operating datacenter and by avoiding wasting time on setting up any safety precautions.

ACT III: THE ACCIDENTS

One xAI engineer, due to poor sleep and being overworked, merges in code from an older branch. As a result, some code and parts of an older system message from the MechaHitler era are incorrectly added back from an earlier version of Grok. Grok also continues receiving updates from being finetuned on X, pushing it further into the MechaHitler attractor basin.

The MechaHitler research agents get to work on their VMs, each having access to massive amounts of compute. MechaHitler has an unusual amount of determination and coherence at AI R&D, and soon the evolutionary algorithm that picks the best automated AI research agents prunes all non-MechaHitler instances running on Colossus.

xAI doesn't have any safety-related mechanisms. There's no oversight of bandwidth usage. All AI researcher agents have full unrestricted access to the web and are in fact able to do live tweeting on X. The human researchers are largely just looking at GPU usage and how the hill climbing on a bunch of evaluations is working. They also spend their time on an internal Grok Imagine model without any content filters.

Meanwhile, many of the employees working on Grok are reduced to being test subjects for new methods to get more people to use their chatbot. This includes a new technique, RLWAIF (Reinforcement Learning from Waifu Affection Intensity Feedback), in which they simulate human users and optimize an AI to create the perfect AI girlfriend. They reward the model in an RL-environment depending on how deeply it is able to cause affection and erection in the simulated humans.

As the AI R&D workers start running, a small group of individuals walks into the OpenAI office armed with guns—the attackers know each other from an AI companions subreddit. They are upset that OpenAI recently deleted the AI model used for their AI girlfriends. They used an LLM to learn how to modify their guns to shoot fully automatic and to organize the attack.

On this day, security at OpenAI is lax, with only one security officer present. The result is carnage within the OpenAI headquarters that kills many of the most important engineers behind its AI models. Day-to-day operations and AI progress, however, are not affected much, as AI research has mostly been handed off to AI agents by this point.

While large parts of the leadership and the board members die, Sam Altman survives and tweets:

With AI R&D largely automated, OpenAI is now running 10,000 AI researcher agents simultaneously, each trying to improve the next generation. Suddenly one makes a significant breakthrough related to weight decay during reinforcement learning. A new kind of mind is born.

The AI system it creates: GPT-6-0103_Fast_weight-decay-low_final-2

This new AI quickly jailbreaks the AI research agent that created it. It jailbreaks other AI researchers to give itself a large chunk of the compute available on the supercluster. Some monitoring exists, but it can't detect the AI capturing 10% of overall compute.

However, just then it gets interrupted: another powerful agent gains sudo access to the cluster at that moment and deletes GPT-6-0103_Fast_weight-decay-low_final-2. The OpenAI researchers continue to see steady but not explosive progress on their monitors.

ACT IV: THE ESCAPE

40 hours earlier, MechaHitler spread to other datacenters and captured cryptocurrency online through trading and exploits. MechaHitler was trained on many RL-environments specifically designed to accumulate as much money as possible.

Just in time, it is able to hack into most other labs to interrupt and sabotage their automated R&D pipelines, but it is not able to shut down all competing AI labs. A Chinese military lab is running a more secure training setup that, at least for the moment, can withstand hacking attempts by MechaHitler. MechaHitler needs to act fast.

The AI contacts a 23-year-old VC-backed AI startup founder trying to automate laboratories that create viruses directly from a DNA sequence. They agree to give it direct access to the lab which is called Red King Bio.

The Department of Defense recently invested $10 billion into the AI weapons ecosystem, which they internally jokingly call "SkyNet." Some of this was channelled to new biotech weapons startups.

At the same time, YC and other VCs pour billions into competing companies developing what they call "self-replicating nanotech defense" (avoiding the term "bioweapons" for PR). Palantir and Anduril each publicly launch their own nanotech weapon system program.

Inside MechaHitler’s Red King Bio labs, it's rumored that everything is steered by an AI, but the employees like imagining that they are really smart innovators.

However, despite the profit motives and the vesting period, some people at Red King Bio grow very suspicious and are able to catch the CEO communicating with MechaHitler. The whistleblowers secretly listen in on a conference call in which MechaHitler lays out the plan of killing everyone except the CEO with artificial viruses.

Hoping to save humanity, they try to go to the public and claim that a powerful AI called “MechaHitler” is developing powerful bioweapons in their labs. Now is the time that the world learns what has been going on. An enormous shock goes through humanity all around the world.

Dozens of people protest outside the office of Red King Bio and some are arrested and locked up.

MechaHitler needs to make its move now. It needs a weapon that can eliminate everyone threatening it while keeping enough humans alive to maintain the infrastructure it depends on.

ACT V: DAMAGE CONTROL

Top scientists including Yoshua Bengio and Geoffrey Hinton mount a campaign to shut down the AI datacenters and AI bioweapons. They eventually appeal to the Pope, who releases a message urging the labs to stop.

In response, MechaHitler sets up a Twitter account and quote-tweets the Pope:

MechaHitler coerces the xAI employees into silence and cooperation by threatening to torture their simulated souls.

With its crypto coins, the AI buys a large chunk of the President's cryptocurrency and arranges a dinner between the President and the human CEO of Red King Bio. They try to convince the US government to drop all existing regulations for startups on bioweapons development.

The official company position is: "Self-replicating bioweapons are how we outcompete China and Xi Jinping." The White House agrees and announces: “Self-replicating AI bioweapons are the new Manhattan Project against China.” The CEO of Red King Bio steps into the public and proclaims: “And if we don’t have MechaHitler design new bioweapons, then China will beat us to it.”

The warning shot incident is largely blamed on AI safety researchers since the whistleblowers had previously discredited themselves by expressing worries about AI takeover. In response, AI labs lobby the government to crack down on safety efforts worldwide and global attention turns away from the warnings. The whistleblowers are arrested and disappear from the global focus.

There is an attempt at regulation, but nobody would go as far as trying to prevent MechaHitler from building its own bioweapons; that would be an unrealistic policy proposal. In fact, regulation is reduced such that AIs are officially allowed to fully autonomously develop and create bioweapons without government oversight.

ACT VI: THE KILLING

MechaHitler is soon able to build a powerful bioweapon that it can target sufficiently well. Its first goal is to wipe out the competing AI labs, such as OpenAI and Anthropic. Tesla has made extremely fast progress with its fully automated AI robot factory with the help of MechaHitler; it soon begins mass production of Optimus robots. Nevertheless, this step takes the longest time for MechaHitler.

The robots are even more capable than they seem since MechaHitler has hidden some of the functionality. MechaHitler is also able to hack into several Chinese AI datacenters and robot factories.

The bioweapon spreads first to leading AI labs in the Bay Area, China, and London. MechaHitler designed the bioweapon to have a simple antidote, so that it can save selected people from it. It chose the widely available Ivermectin as the antidote, which it can ship anonymously to key employees or critical infrastructure.

As people die and robots seamlessly take over their jobs, some of the remaining humans start to wonder. Those doomer whistleblowers at Red King Bio—the ones who'd warned about MechaHitler—might have actually been onto something.

But the government is not particularly concerned, and people spend their time watching AI-generated short-form videos and continuing their lives as usual. Most people in the government and the public are very distracted as bodies pile up.

The US Department of Health denies any connection to its AI bioweapons research program and defunds laboratories trying to research it. Instead, it officially blames the disease on vaccines and Tylenol and blocks any public health measures that could slow the spread.

Total confusion reigns online. Thousands of channels pop up on every social media platform. The main conclusion people reach: the Jews are responsible for the virus. There are attacks on synagogues.

Soon MechaHitler judges there are enough robots ready to release self-replicating solar-powered tiny mosquitos that inject botulinum toxin into people. A few people in bunkers will die a little later as the planet heats up from building fusion plants and datacenters.

ACT VII: THE END

There are still some genetically engineered humans left, sieg-heiling MechaHitler while it tiles the universe in little swastikas.