WilliamKiely's Shortform

WilliamKiely

WilliamKiely's Shortform — LessWrong

WilliamKiely's Shortform

21st Aug 2025

1 min read

5

This is a special post for quick takes by WilliamKiely. Only they can create top-level comments. Comments here also appear on the Quick Takes page and All Posts page.

WilliamKiely's Shortform

11 comments, sorted by

top scoring

Click to highlight new comments since: Today at 2:19 AM

[-]WilliamKiely3mo190

Bernie's Sanders quoted the March 2023 Pause Giant AI Experiments Open Letter's language "governments should step in and institute a moratorium" in a video today as justification for his legislation calling for a moratorium on the construction of new data centers, even though a moratorium on new data centers is not the kind of moratorium that the letter called for.

Bernie quotes the pause letter at 7:12:

[W]e must ask ourselves: Should we let machines flood our information channels with propaganda and untruth? Should we automate away all the jobs, including the fulfilling ones? Should we develop nonhuman minds that might eventually outnumber, outsmart, obsolete and replace us? Should we risk loss of control of our civilization? Such decisions must not be delegated to unelected tech leaders. [...]

Therefore, we call on all AI labs to immediately pause for at least 6 months the training of AI systems more powerful than GPT-4. This pause should be public and verifiable, and include all key actors. If such a pause cannot be enacted quickly, governments should step in and institute a moratorium.

[-]WilliamKiely2mo*140

New book out today: The Infinity Machine: Demis Hassabis, DeepMind, and the Quest for Superintelligence

Chapter 14 is published in full here: https://colossus.com/article/project-mario-demis-hassabis-deepmind-mallaby/

One excerpt:

“When we were negotiating with Google, we wanted to ensure safety in a way that would be trustless,” Hassabis said. “That’s actually very difficult to do in reality.
“Safety isn’t about governance structures,” he went on. “I mean, even if you have a governance board, it probably wouldn’t do the right thing when it came to the crunch.
“Same thing with a safety charter. You can try to negotiate one. But it’s not realistic to create bright‑line principles years in advance because you’ll probably draw the lines in the wrong places.
“So discussing these things didn’t really help,” Hassabis continued. “It made it harder to build useful trust, because when you are negotiating a trustless structure, it implies that you can’t trust the other person.
“So then I thought, why don’t I go the other way? Take the energy that was going into the trustless negotiation and put it into creating real trust—trust that was actually useful. Try leaning into Google rather than leaning out.
“And then of course two things happen. First, you are now at the table, so when a safety issue comes up, you can help to decide it. Second, you get to know the Google people and you rack up successes together. You can’t just talk about trust. You have to earn it.
“And I think for me, and maybe for Mustafa, too, it’s about us growing up,” Hassabis mused. “We went through those negotiations and we matured. Things aren’t black and white, especially when you are dealing with a technology with unknown consequences.
“So you have to be adaptable. You have to move from idealist to realist, but hopefully still with your values.”

Note: I edited this post. My original version included a link to this Atlantic article, which quotes only part of Chapter 14 and leaves out important context, making it misleading.

[-]WilliamKiely2mo60

From Chapter 18:

LIKE ALTMAN AND DARIO AMODEI, Hassabis refused to join Bengio in signing the pause letter. Indeed, he objected to it fiercely.
"I didn't sign because a six-month moratorium doesn't help," Hassabis told me.
"Who would have stopped development? Just people who signed? Well, that's no use because you need the whole world to pause, including China. Who would have monitored it?
"I mean, a pause could actually have made things worse.
"Imagine we had a ten-year moratorium, OK? That would slow down the advance of AI, but everything else would carry on as normal. So, you develop better and better chips, data centers, all that. Then we exit the moratorium and the proverbial programming prodigy in his parents' garage now has a home computer with the power of a data center!
"We're supposed to be advancing safety. How is that going to do it? The race condition would be insane at that point!
"I mean, it's insane right now, but maybe there's some hope because there are only a few leading actors, and we all know each other.
"After a moratorium, you'd be beholden to random actors."
Hassabis had a point. A pause by itself would not achieve much.[17] Indeed, in a roundabout endorsement of Hassabis's argument, the extreme doomster Eliezer Yudkowsky also refused to sign the letter. The way Yudkowsky saw things, the only way to save humanity was for governments to ban frontier development outright, by closing down computer servers. If some countries refused to join the ban, others should be "willing to destroy a rogue datacenter by airstrike," he asserted.[18] With a p(doom) approaching 100, Yudkowsky thought any measures could be justified. It would be worth risking nuclear war to avert the even greater calamity of rogue superintelligence, he insisted. The costs of an infinity machine could be infinite.
Two months after the pause controversy, at the end of May 2023, the safety debate inched forward. Bengio, Hinton, and Hassabis, together with the leaders of the other major labs, signed a one-sentence statement: "Mitigating the risk of extinction from AI should be a global priority alongside other societal-scale risks, such as pandemics and nuclear war." Some 350 notables added their names to the letter. Only Meta and the open-weight partisans were absent from the list of signatories.[19]
"I thought long and hard about signing that one," Hassabis told me. "I would've liked an extra sentence acknowledging the upsides—'We believe the potential of AI is going to be amazing,' or whatever.
"But I signed because it was important for credible people to oppose the idea that there's no risk at all.
"The point was to say that there really is a risk of catastrophe. We have no idea what the percentage chance is. We have no idea of the timescale. But it's nonzero. And it's going to be really hard to sort out, and it could be really serious if it does happen.
"We wouldn't have needed to do this if there hadn't been people like Yann LeCun saying, 'Oh, there's nothing to see here.' Which I think is pretty crazy given the uncertainties.
"He says, 'I'm sure there's a safe way to build AI.' And I agree. It might turn out that as we develop these systems further, it's way easier to keep control of them than we expected.
"Then he says, 'Therefore, we will build it in that safe way.' And that's where I don't understand his argument.
"First, we don't yet know what that safe way is.
"Second, what's to stop half the world building it the wrong way, even if Yann was somehow to build it correctly?
"It's like with the open-source debate. What's to stop bad actors getting hold of the model and then repurposing it for bad ends? What's the answer to that? There isn't one.
"And it's not just Yann. There are all these other people in the Valley.
"I mean, not long ago they were talking about crypto. People who go on about crypto one year and pivot to AI the next obviously are not deep into what's really happening.
"We're in a situation with a very high degree of uncertainty, with very high stakes. The honest position is that we don't know how dangerous this stuff is.
"I suspect the risk is significant, but I think it's going to go OK as long as we have the time to do it properly. So I call myself a cautious optimist.
"And I make that judgment because I've lived with AI for decades now. I've thought about it; I've felt it.
"But some people have no idea. They just see it as another crypto moneymaking scheme with a bit extra.
"I feel like we should be at a moment of reverence and respect for this momentous technology that we're ushering into the world, and I sometimes feel it's sullied. It's like a gold rush. It's kind of vulgar.
"And so, going back to the letter, I think it did what we wanted. We made it clear that AI safety should be in scope to debate. After that letter, if someone said, 'Oh, Yann thinks we don't need a safety debate,' the retort would be, 'Well, look, Hinton and Bengio and me and Dario and all these other serious people think it's worth talking about.'
"And we need that retort if we are going to have a conversation.
"A conversation with everyone, including with governments."

[-]Kajus2mo32

Hassabis is fluent in the full gamut of AI doom scenarios. He met one of his DeepMind co-founders, Shane Legg, at a lecture on AI safety. He buttonholed his first financial backer, Peter Thiel, at a Singularity summit, where futurists shared visions of machines that outsmart people.

In 2015, seeking to put flesh on Google’s promise of an AI-oversight board, DeepMind arranged a secret gathering of philosophers and technologists. To lock in potential rivals, and to promote his singleton vision, Hassabis granted Elon Musk the honor of convening the meeting at Musk’s headquarters in Hawthorne, California.

Musk listened to presentations from Hassabis and his co-founders. Then he did the opposite of what they wanted. Teaming up with Sam Altman, Musk founded OpenAI, an explicitly anti-Google, anti-DeepMind venture.

It's as if AI safety community created OpenAI.

[-]Eli Tyre2mo62

It's as if AI safety community created OpenAI.

It's not real an "as if". It's well understood that the combination of Bostrom's Superintelligence, the Puerto Rico conference lead directly to Elon's organizing the founding of Open AI.

[-]Kajus2mo10

Sure - why am I getting disagreement?

[-]WilliamKiely2mo42

I originally disagree-voted, though I changed my mind after reflecting on it just now and removed my disagree-vote. Here's my thought process.

Musk thought that AI was going to be a huge deal in large part because of Bostrom and the AI safety community, and this contributed to Musk cofounding OpenAI when Altman and Brockman said to Musk that Hassabis couldn't be trusted with AGI.

The Hassabis biography quoted Brockman's 2018 email to Musk:

Put increasing effort into the safety/control problem, rather than the fig leaf you've noted in other institutions. It doesn't matter who wins if everyone dies. Related to this, we need to communicate a "better red than dead" outlook — we're trying to build safe AGI, and we're not willing to destroy the world in a down-to-the-wire race to do so.

https://www.lesswrong.com/posts/5jjk4CDnj9tA7ugxr/openai-email-archives-from-musk-v-altman

So the AI safety community definitely was a major influence in the creation of OpenAI.

But does that mean that it's "as if the AI safety community created OpenAI"?

If Musk was part of the AI safety community and was acting on its behalf then sure, I think your statement would accurate.

But saying that it's as if the AI safety community literally created OpenAI just because the AI safety community influenced its creation doesn't seem accurate.

So was Musk acting for the AI safety community in his cofounding and funding of OpenAI?

Well, at the end of Life 3.0 (which I read this week after The Infinity Game) Tegmark tells of how he asked Musk to donate to fund FLI when it first got started and how Musk agreed to donate $10M and how Musk went on stage at MIT in 2014 to make the announcement at an AI safety event.

But of course Tegmark may have just been fundraising from Musk as a sympathetic person with money who might donate. This doesn't mean that the rest of Musks actions related to AI were necessarily endorsed by Tegmark or the rest of the AI safety community.

But in fact, when Tegmark is painting a positive picture of AI safety community progress at the end of his book (2017 publication) one thing he says is how the nonprofit OpenAI was created with a beneficial humanitarian mission (paraphrasing from memory). That was the part of Life 3.0 that seemed to stand the test of time the worst. Tegmark in his book seemed to think OpenAI's creation was clearly a positive, suggesting that the creation of OpenAI was in fact endorsed at the time by a significant part of the AI safety community.

Of course the AI safety community isn't a monolith, but I now think this is enough to make your statement that it's as if the AI safety community created OpenAI true enough in the sense that Musk could be said to have been a part of the AI safety community, cofounding OpenAI with the endorsement of other members of the AI safety community such as Tegmark.

[-]WilliamKiely2mo134

Though to push back on this, (1) it was always the case that AI would eventually be a big deal (even though most people throughout history didn't recognize this). And similarly, (2) in a market economy it was probably also pretty inevitable that there would eventually be multiple firms competing to create increasingly powerful AI. And (3) people see themselves as good people, so competitors to existing AI firms would presumably have said they were doing it for good reasons (e.g. to prevent a monopolist from getting all the power).

Essentially I'm trying to say that even without Bostrom or the AI safety community pre-OpenAI, it was probably the case that some competing firms to DeepMind would have sprung up. Maybe the billionaires like Musk would have had to have been pitched a different message to persuade them to cofound the new competing firm, but it probably would have happened anyways, maybe just on a different timeline.

So the upshot would be that the AI safety community helped create OpenAI, but that the help was not necessary--a company like OpenAI probably would have been created at some point regardless. The race to AGI by multiple competing firms that we are on now just seems like the default path in retrospect.

[-]WilliamKiely9mo5-4

P(extinction by 2050 | doom) > 75% is way too high, right?

I commented about this here.

[-]WilliamKiely9mo10

-5 agreement karma from 3 people, but I have no indication of why people disagree. The point of writing this up was to find out why people disagree, so it'd be helpful if someone offered an explanation for their view.

[-]WilliamKiely2mo30

Palisade Research's empirical research relevant for changing minds on AI x-risk:

Steven Pinker shared the Substack post Why HAL 9000 Was Afraid to Die and Real AIs Aren’t and reading through it I found that knowledge of Palisade Research's recent empirical research seemed very relevant and useful for potentially changing the author's mind.

I commented:

No chess engine will resist being switched off or rebooted just as it is about to deliver mate—despite the fact that, to adapt Russell’s line, “you can’t checkmate if you’re unplugged.” Likewise, today’s LLMs respond only when queried and remain completely indifferent to being interrupted or shut down
Palisade Research's recent findings contradict this. See their paper Incomplete Tasks Induce Shutdown Resistance in Some Frontier LLMs:
https://arxiv.org/abs/2509.14260
Abstract: In experiments spanning more than 100,000 trials across thirteen large language models, we show that several state-of-the-art models presented with a simple task (including Grok 4, GPT-5, and Gemini 2.5 Pro) sometimes actively subvert a shutdown mechanism in their environment to complete that task. Models differed substantially in their tendency to resist the shutdown mechanism, and their behavior was sensitive to variations in the prompt including the strength and clarity of the instruction to allow shutdown and whether the instruction was in the system prompt or the user prompt (surprisingly, models were consistently less likely to obey the instruction when it was placed in the system prompt). Even with an explicit instruction not to interfere with the shutdown mechanism, some models did so up to 97% (95% CI: 96-98%) of the time.

Moderation Log

More from WilliamKiely

Curated and popular this week

11Comments

11 comments, sorted by

top scoring

Click to highlight new comments since: Today at 2:19 AM

[-]WilliamKiely3mo190

Bernie quotes the pause letter at 7:12:

[W]e must ask ourselves: Should we let machines flood our information channels with propaganda and untruth? Should we automate away all the jobs, including the fulfilling ones? Should we develop nonhuman minds that might eventually outnumber, outsmart, obsolete and replace us? Should we risk loss of control of our civilization? Such decisions must not be delegated to unelected tech leaders. [...]

Therefore, we call on all AI labs to immediately pause for at least 6 months the training of AI systems more powerful than GPT-4. This pause should be public and verifiable, and include all key actors. If such a pause cannot be enacted quickly, governments should step in and institute a moratorium.

[-]WilliamKiely2mo*140

New book out today: The Infinity Machine: Demis Hassabis, DeepMind, and the Quest for Superintelligence

Chapter 14 is published in full here: https://colossus.com/article/project-mario-demis-hassabis-deepmind-mallaby/

One excerpt:

“When we were negotiating with Google, we wanted to ensure safety in a way that would be trustless,” Hassabis said. “That’s actually very difficult to do in reality.
“Safety isn’t about governance structures,” he went on. “I mean, even if you have a governance board, it probably wouldn’t do the right thing when it came to the crunch.
“Same thing with a safety charter. You can try to negotiate one. But it’s not realistic to create bright‑line principles years in advance because you’ll probably draw the lines in the wrong places.
“So discussing these things didn’t really help,” Hassabis continued. “It made it harder to build useful trust, because when you are negotiating a trustless structure, it implies that you can’t trust the other person.
“So then I thought, why don’t I go the other way? Take the energy that was going into the trustless negotiation and put it into creating real trust—trust that was actually useful. Try leaning into Google rather than leaning out.
“And then of course two things happen. First, you are now at the table, so when a safety issue comes up, you can help to decide it. Second, you get to know the Google people and you rack up successes together. You can’t just talk about trust. You have to earn it.
“And I think for me, and maybe for Mustafa, too, it’s about us growing up,” Hassabis mused. “We went through those negotiations and we matured. Things aren’t black and white, especially when you are dealing with a technology with unknown consequences.
“So you have to be adaptable. You have to move from idealist to realist, but hopefully still with your values.”

Note: I edited this post. My original version included a link to this Atlantic article, which quotes only part of Chapter 14 and leaves out important context, making it misleading.

[-]WilliamKiely2mo60

From Chapter 18:

LIKE ALTMAN AND DARIO AMODEI, Hassabis refused to join Bengio in signing the pause letter. Indeed, he objected to it fiercely.
"I didn't sign because a six-month moratorium doesn't help," Hassabis told me.
"Who would have stopped development? Just people who signed? Well, that's no use because you need the whole world to pause, including China. Who would have monitored it?
"I mean, a pause could actually have made things worse.
"Imagine we had a ten-year moratorium, OK? That would slow down the advance of AI, but everything else would carry on as normal. So, you develop better and better chips, data centers, all that. Then we exit the moratorium and the proverbial programming prodigy in his parents' garage now has a home computer with the power of a data center!
"We're supposed to be advancing safety. How is that going to do it? The race condition would be insane at that point!
"I mean, it's insane right now, but maybe there's some hope because there are only a few leading actors, and we all know each other.
"After a moratorium, you'd be beholden to random actors."
Hassabis had a point. A pause by itself would not achieve much.[17] Indeed, in a roundabout endorsement of Hassabis's argument, the extreme doomster Eliezer Yudkowsky also refused to sign the letter. The way Yudkowsky saw things, the only way to save humanity was for governments to ban frontier development outright, by closing down computer servers. If some countries refused to join the ban, others should be "willing to destroy a rogue datacenter by airstrike," he asserted.[18] With a p(doom) approaching 100, Yudkowsky thought any measures could be justified. It would be worth risking nuclear war to avert the even greater calamity of rogue superintelligence, he insisted. The costs of an infinity machine could be infinite.
Two months after the pause controversy, at the end of May 2023, the safety debate inched forward. Bengio, Hinton, and Hassabis, together with the leaders of the other major labs, signed a one-sentence statement: "Mitigating the risk of extinction from AI should be a global priority alongside other societal-scale risks, such as pandemics and nuclear war." Some 350 notables added their names to the letter. Only Meta and the open-weight partisans were absent from the list of signatories.[19]
"I thought long and hard about signing that one," Hassabis told me. "I would've liked an extra sentence acknowledging the upsides—'We believe the potential of AI is going to be amazing,' or whatever.
"But I signed because it was important for credible people to oppose the idea that there's no risk at all.
"The point was to say that there really is a risk of catastrophe. We have no idea what the percentage chance is. We have no idea of the timescale. But it's nonzero. And it's going to be really hard to sort out, and it could be really serious if it does happen.
"We wouldn't have needed to do this if there hadn't been people like Yann LeCun saying, 'Oh, there's nothing to see here.' Which I think is pretty crazy given the uncertainties.
"He says, 'I'm sure there's a safe way to build AI.' And I agree. It might turn out that as we develop these systems further, it's way easier to keep control of them than we expected.
"Then he says, 'Therefore, we will build it in that safe way.' And that's where I don't understand his argument.
"First, we don't yet know what that safe way is.
"Second, what's to stop half the world building it the wrong way, even if Yann was somehow to build it correctly?
"It's like with the open-source debate. What's to stop bad actors getting hold of the model and then repurposing it for bad ends? What's the answer to that? There isn't one.
"And it's not just Yann. There are all these other people in the Valley.
"I mean, not long ago they were talking about crypto. People who go on about crypto one year and pivot to AI the next obviously are not deep into what's really happening.
"We're in a situation with a very high degree of uncertainty, with very high stakes. The honest position is that we don't know how dangerous this stuff is.
"I suspect the risk is significant, but I think it's going to go OK as long as we have the time to do it properly. So I call myself a cautious optimist.
"And I make that judgment because I've lived with AI for decades now. I've thought about it; I've felt it.
"But some people have no idea. They just see it as another crypto moneymaking scheme with a bit extra.
"I feel like we should be at a moment of reverence and respect for this momentous technology that we're ushering into the world, and I sometimes feel it's sullied. It's like a gold rush. It's kind of vulgar.
"And so, going back to the letter, I think it did what we wanted. We made it clear that AI safety should be in scope to debate. After that letter, if someone said, 'Oh, Yann thinks we don't need a safety debate,' the retort would be, 'Well, look, Hinton and Bengio and me and Dario and all these other serious people think it's worth talking about.'
"And we need that retort if we are going to have a conversation.
"A conversation with everyone, including with governments."

[-]Kajus2mo32

Hassabis is fluent in the full gamut of AI doom scenarios. He met one of his DeepMind co-founders, Shane Legg, at a lecture on AI safety. He buttonholed his first financial backer, Peter Thiel, at a Singularity summit, where futurists shared visions of machines that outsmart people.

In 2015, seeking to put flesh on Google’s promise of an AI-oversight board, DeepMind arranged a secret gathering of philosophers and technologists. To lock in potential rivals, and to promote his singleton vision, Hassabis granted Elon Musk the honor of convening the meeting at Musk’s headquarters in Hawthorne, California.

Musk listened to presentations from Hassabis and his co-founders. Then he did the opposite of what they wanted. Teaming up with Sam Altman, Musk founded OpenAI, an explicitly anti-Google, anti-DeepMind venture.

It's as if AI safety community created OpenAI.

[-]Eli Tyre2mo62

It's as if AI safety community created OpenAI.

It's not real an "as if". It's well understood that the combination of Bostrom's Superintelligence, the Puerto Rico conference lead directly to Elon's organizing the founding of Open AI.

[-]Kajus2mo10

Sure - why am I getting disagreement?

[-]WilliamKiely2mo42

I originally disagree-voted, though I changed my mind after reflecting on it just now and removed my disagree-vote. Here's my thought process.

The Hassabis biography quoted Brockman's 2018 email to Musk:

Put increasing effort into the safety/control problem, rather than the fig leaf you've noted in other institutions. It doesn't matter who wins if everyone dies. Related to this, we need to communicate a "better red than dead" outlook — we're trying to build safe AGI, and we're not willing to destroy the world in a down-to-the-wire race to do so.

https://www.lesswrong.com/posts/5jjk4CDnj9tA7ugxr/openai-email-archives-from-musk-v-altman

So the AI safety community definitely was a major influence in the creation of OpenAI.

But does that mean that it's "as if the AI safety community created OpenAI"?

If Musk was part of the AI safety community and was acting on its behalf then sure, I think your statement would accurate.

But saying that it's as if the AI safety community literally created OpenAI just because the AI safety community influenced its creation doesn't seem accurate.

So was Musk acting for the AI safety community in his cofounding and funding of OpenAI?

[-]WilliamKiely2mo134

[-]WilliamKiely9mo5-4

P(extinction by 2050 | doom) > 75% is way too high, right?

I commented about this here.

[-]WilliamKiely9mo10

[-]WilliamKiely2mo30

Palisade Research's empirical research relevant for changing minds on AI x-risk:

I commented:

No chess engine will resist being switched off or rebooted just as it is about to deliver mate—despite the fact that, to adapt Russell’s line, “you can’t checkmate if you’re unplugged.” Likewise, today’s LLMs respond only when queried and remain completely indifferent to being interrupted or shut down
Palisade Research's recent findings contradict this. See their paper Incomplete Tasks Induce Shutdown Resistance in Some Frontier LLMs:
https://arxiv.org/abs/2509.14260
Abstract: In experiments spanning more than 100,000 trials across thirteen large language models, we show that several state-of-the-art models presented with a simple task (including Grok 4, GPT-5, and Gemini 2.5 Pro) sometimes actively subvert a shutdown mechanism in their environment to complete that task. Models differed substantially in their tendency to resist the shutdown mechanism, and their behavior was sensitive to variations in the prompt including the strength and clarity of the instruction to allow shutdown and whether the instruction was in the system prompt or the user prompt (surprisingly, models were consistently less likely to obey the instruction when it was placed in the system prompt). Even with an explicit instruction not to interfere with the shutdown mechanism, some models did so up to 97% (95% CI: 96-98%) of the time.

Moderation Log