[Crossposted from my substack Working Through AI.]
Alice is the CEO of a superintelligence lab. Her company maintains an artificial superintelligence called SuperMind.
When Alice wakes up in the morning, she’s greeted by her assistant version of SuperMind, called Bob. Bob is a copy of the core AI, tasked with looking after Alice and implementing her plans. After ordering some breakfast (shortly to appear in her automated kitchen), she asks him how research is going at the lab.
Alice cannot understand the details of what her company is doing. SuperMind is working at a level beyond her ability to comprehend. It operates in a fantastically complex economy full of other superintelligences, all going about their business creating value for the humans they share the planet with.
This doesn’t mean that Alice is either powerless or clueless, though. On the contrary, the fundamental condition of success is that she remains meaningfully in control of her company and its AI, and, by extension, that the human society she belongs to remains in control of its destiny. How might this work?
The purpose of this post
In sketching out this scenario, my aim is not to explain how it might come to pass. I am not attempting a technical solution to the alignment problem, nor am I trying to predict the future. Rather, my goal is to illustrate what a realistic world to aim for might look like, if anyone does indeed build superintelligent AI.
In the rest of the post, I am going to describe a societal ecosystem, full of AIs trained to follow instructions and seek feedback. It will be governed by a human-led target-setting process that defines the rules AIs should follow and the values they should pursue. Compliance will be trained into them from the ground up and embedded into the structure of the world, ensuring that safety is maintained during deployment. Collectively, the ecosystem will function to guarantee human values and agency over the long term. Towards the end, I will