DeepMind: The Podcast - Excerpts on AGI

WilliamKiely

DeepMind: The Podcast - Season 2 was released over the last ~1-2 months. The two episodes most relevant to AGI are The road to AGI (S2, Ep5) and Promise of AI with Demis Hassabis (Ep9).

I found a few quotes noteworthy and thought I'd share them here for anyone who didn't want to listen to the full episodes:

The road to AGI (S2, Ep5)

(Published February 15, 2022)

Shane Legg's AI Timeline

Shane Legg (4:03):

If you go back 10-12 years ago the whole notion of AGI was lunatic fringe. People [in the field] would literally just roll their eyes and just walk away. [...] [I had that happen] multiple times. I have met quite a few of them since. There have even been cases where some of these people have applied for jobs at DeepMind years later. But yeah, it was a field where you know there were little bits of progress happening here and there, but powerful AGI and rapid progress seemed like it was very, very far away. [...] Every year [the number of people who roll their eyes at the notion of AGI] becomes less.

Hannah Fry (5:02):

For over 20 years, Shane has been quietly making predictions of when he expects to see AGI.

Shane Legg (5:09):

I always felt that somewhere around 2030-ish it was about a 50-50 chance. I still feel that seems reasonable. If you look at the amazing progress in the last 10 years and you imagine in the next 10 years we have something comparable, maybe there's some chance that we will have an AGI in a decade. And if not in a decade, well I don't know, say three decades or so.

Hannah Fry (5:33):

And what do you think [AGI] will look like? [Shane answers at length.]

David Silver on it being okay to have AGIs with different goals (??)

Hannah Fry (16:45):

Last year David co-authored a provocatively titled paper called Reward is Enough. He believes reinforcement learning alone could lead all the way to artificial general intelligence.
[...] (21:37)
But not everyone at DeepMind is convinced that reinforcement learning on its own will be enough for AGI. Here's Raia Hadsell, Director of Robotics.

Raia Hadsell (21:44):

The question I usually have is where do we get that reward from. It's hard to design rewards and it's hard to imagine a single reward that's so all-consuming that it would drive learning everything else.

Hannah Fry (21:59):

I put this question about the difficulty of designing an all-powerful reward to David Silver.

David Silver (22:05):

I actually think this is just slightly off the mark–this question–in the sense that maybe we can put almost any reward into the system and if the environment's complex enough amazing things will happen just in maximizing that reward. Maybe we don't have to solve this "What's the right thing for intelligence to really emerge at the end of it?" kind of question and instead embrace the fact that there are many forms of intelligence, each of which is optimizing for its own target. And it's okay if we have AIs in the future some of which are trying to control satellites and some of which are trying to sail boats and some of which are trying to win games of chess and they may all come up with their own abilities in order to allow that intelligence to achieve its end as effectively as possible.
[...] (26:14)
But of course this is a hypothesis. I cannot offer any guarantee that reinforcement learning algorithms do exist which are powerful enough to just get all the way there. And yet the fact that if we can do it it would provide a path all the way to AGI should be enough for us to try really really hard.

Promise of AI with Demis Hassabis (Ep9)

(Published March 15, 2022)

Demis Hassabis' AI Timeline

Demis Hassabis (6:23):

From what we've seen so far [the development of AGI] will probably be more incremental and then a threshold will be crossed. But I suspect it will start feeling interesting and strange in this middle zone as we start approaching that. We're not there yet. I don't think [any] of the systems that we interact with or built have that feeling of sentience or awareness, any of those things. They're just kind of programs that execute, albeit they learn. But I could imagine that one day that could happen, you know, there's a few things I look out for, like perhaps coming up with a truly original idea, creating something new, a new theory in science that ends up holding, maybe coming up with its own problem that it wants to solve, these kinds of things would be the sort of activities that I'd be looking for on the way to maybe that big day.

Hannah Fry (7:07):

If you're a betting man, then when do you think that will be?

Demis Hassabis (7:11):

So I think that the progress so far has been pretty phenomenal. I think that [AGI] it's coming relatively soon in the next you know–I wouldn't be super surprised–in the next decade or two.

AI needs a value system, sociologists and psychologists needed to help define happiness

Hannah Fry (13:02):

Okay how about a moral compass then? Can you impart a moral compass into AI, and should you?

Demis Hassabis (13:09):

I mean I'm not sure I would call it a moral compass, but definitely it's going to need a value system because whatever goal you give it you're effectively incentivizing that AI system to do something. And so as that becomes more and more general you can sort of think about that as almost a value system. What do you want it to do in its set of actions, what you do want to sort of disallow, how should it think about side effects versus its main goal, what's its top level goal if it's to keep humans happy, which set of humans, what does happiness mean, we [will] definitely need help from philosophers and sociologists [and psychologists] and others about defining what a lot of these terms mean. And of course a lot of them are very tricky for humans to figure out our collective goals.

Best outcome of AGI

Hannah Fry (13:58):

What do you see as the best possible outcome of having AGI?

Demis Hassabis (14:03):

The outcome I've always dreamed of or imagined is AGI has helped us solve a lot of the big challenges facing society today, be that health, cures for diseases like Alzheimer's. I would also imagine AGI helping with climate creating a new energy source that is renewable and then what would happen after those kinds of first stage things is you kind of have this sometimes people describe it as radical abundance.

Biggest worries

Hannah Fry (16:01):

I think you probably know what I'm going to ask you next because if that is the fully optimistic utopian view of the future it can't all be positive when you're lying awake at night. What are the things that you worry about?

Demis Hassabis (16:13):

Well to be honest with you I do think that is a very plausible end state–the optimistic one I painted you. And of course that's one reason I work on AI is because I hoped it would be like that. On the other hand, one of the biggest worries I have is what humans are going to do with AI technologies on the way to AGI. Like most technologies they could be used for good or bad and I think that's down to us as a society and governments to decide which direction they're going to go in.

Society not yet ready for AGI

Hannah Fry (16:42):

Do you think society is ready for AGI?

Demis Hassabis (16:45):

I don't think, yet. I think that's part of what this podcast series is about as well is to give the general public a more of an understanding of what AGI is, what AI is, and what's coming down the road and then we can start grappling with as a society and not just the technologists what we want to be doing with these systems.

'Avengers assembled' for AI Safety: Pause AI development to prove things mathematically

Hannah Fry (17:07):

You said you've got this sort of 20-year prediction and then simultaneously where society is in terms of understanding and grappling with these ideas. Do you think that DeepMind has a responsibility to hit pause at any point?

Demis Hassabis (17:24):

Potentially. I always imagine that as we got closer to the sort of gray zone that you were talking about earlier, the best thing to do might be to pause the pushing of the performance of these systems so that you can analyze down to minute detail exactly and maybe even prove things mathematically about the system so that you know the limits and otherwise of the systems that you're building. At that point I think all the world's greatest minds should probably be thinking about this problem. So that was what I would be advocating to you know the Terence Tao’s of this world, the best mathematicians. Actually I've even talked to him about this—I know you're working on the Riemann hypothesis or something which is the best thing in mathematics but actually this is more pressing. I have this sort of idea of like almost uh ‘Avengers assembled’ of the scientific world because that's a bit of like my dream.

The "David Silver on it being okay to have AGIs with different goals" part worried me because it sounded like he wasn't thinking at all about the risk from misaligned AI. It seemed like he was saying we should create general intelligence regardless of its goals and values, just because it's intelligent.

I was happy to see the progression in what David Silver is saying re what goals AGIs should have:

David Silver, April 10, 2025 (from 35:33 of DeepMind podcast episode Is Human Data Enough? With David Silver):

David Silver: And so what we need is really a way to build a system which can adapt and which can say, well, which one of these is really the important thing to optimize in this situation. And so another way to say that is, wouldn't it be great if we could have systems where, you know, a human maybe specifies what they want, but that gets translated into a set of different numbers that the system can then optimize for itself completely autonomously.

Hannah Fry: So, okay, an example then: let's say I said, okay, I want to be healthier this year. And that's kind of a bit nebulous, a bit fuzzy. But what you're saying here is that that can be translated into a series of metrics like resting heart rate or BMI or whatever it might be. And a combination of those metrics could then be used as a reward for reinforcement learning, if I understood that correctly?

Silver: Absolutely correctly.

Fry: Are we talking about one metric, though? Are we talking about a combination here?

Silver: The general idea would be that you've got one thing which the human wants, like to optimize my health. And then the system can learn for itself which rewards help you to be healthier. And so that can be like a combination of numbers that adapts over time. So it could be that it starts off saying, okay, well, you know, right now it's your resting heart rate that really matters. And then later you might get some feedback saying, hang on. You know, I really don't just care about that, I care about my anxiety level or something. And then that includes that into the mixture. And based on feedback it could actually adapt. So one way to say this is that a very small amount of human data can allow the system to generate goals for itself that enable a vast amount of learning from experience.

Fry: Because this is where the real questions of alignment come in, right? I mean, if you said, for instance, let's do a reinforcement learning algorithm that just minimizes my resting heart rate. I mean, quite quickly, zero is like a good minimization strategy, which would achieve its objective, just not maybe quite in the way that you wanted it to. I mean, obviously you really want to avoid that kind of scenario. So how do you have confidence that the metrics that you're choosing aren't creating additional problems?

Silver: One way you can do this is to leverage the same answer which has been so effective so far elsewhere in AI, which is, at that level, you can make use of some human input. If it's a human goal that we're optimizing, then we probably at that level need to measure, you know, and say, well, you know, the human gives feedback to say, actually, you know, I'm starting to feel uncomfortable. And in fact, while I don't want to claim that we have the answers, and I think there's an enormous amount of research to get this right and make sure that this kind of thing is safe, it could actually help in certain ways in terms of this kind of safety and adaptation. There's this famous example of paving over the whole world with paperclips when a system's been asked to make as many paperclips as possible. If you have a system whose overall goal is really to, you know, support human well-being, and it gets that feedback from humans and it understands their distress signals and their happiness signals and so forth, the moment it starts to, you know, create too many paperclips and starts to cause people distress, it would adapt that combination and it would choose a different combination and start to optimize for something which isn't going to pave over the world with paperclips. We're not there yet. Yeah, but I think there are some versions of this which could actually end up not only addressing some of the alignment issues that have been faced by previous approaches to, you know, goal-focused systems, but maybe even, you know, be more adaptive and therefore safer than what we have today.
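To make the "combination of numbers that adapts over time" idea a bit more concrete, here is a toy sketch in Python. It is only an illustration under my own assumptions, not anything Silver or DeepMind has described: the AdaptiveReward class, the metric names, the scores in [0, 1], and the additive update rule are all hypothetical. The reward is a weighted sum of metric scores, and human feedback shifts the weights.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class AdaptiveReward:
    """Hypothetical reward: a weighted sum of metric scores.

    Scores are assumed to lie in [0, 1], with higher meaning better.
    """

    weights: Dict[str, float]
    learning_rate: float = 1.0  # illustrative value, not from the podcast

    def reward(self, scores: Dict[str, float]) -> float:
        # Combine the metrics into the single number an RL agent would maximize.
        return sum(w * scores.get(name, 0.0) for name, w in self.weights.items())

    def update(self, feedback: Dict[str, float]) -> None:
        # Positive feedback raises a metric's weight, negative lowers it;
        # metrics the human mentions for the first time start from zero.
        for name, signal in feedback.items():
            current = self.weights.get(name, 0.0)
            self.weights[name] = max(0.0, current + self.learning_rate * signal)
        # Renormalize so the weights always sum to 1.
        total = sum(self.weights.values())
        if total > 0:
            self.weights = {n: w / total for n, w in self.weights.items()}


# Two made-up plans: A chases heart rate alone, B is more balanced.
plan_a = {"resting_heart_rate": 0.9, "anxiety": 0.1}
plan_b = {"resting_heart_rate": 0.6, "anxiety": 0.8}

r = AdaptiveReward(weights={"resting_heart_rate": 1.0})
prefers_a_before = r.reward(plan_a) > r.reward(plan_b)

# The human reports that anxiety matters too.
r.update({"anxiety": 1.0})
prefers_b_after = r.reward(plan_b) > r.reward(plan_a)
```

With only the heart-rate metric in the mix, the toy reward ranks plan A higher. After one piece of feedback about anxiety, it ranks plan B higher. The sketch leaves the hard part untouched: it assumes the human notices the problem and reports it before the optimizer has done damage, which is exactly the worry in Fry's "zero is a good minimization strategy" example.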

I actually think this is just slightly off the mark–this question–in the sense that maybe we can put almost any reward into the system and if the environment’s complex enough amazing things will happen just in maximizing that reward. Maybe we don’t have to solve this “What’s the right thing for intelligence to really emerge at the end of it?” kind of question and instead embrace the fact that there are many forms of intelligence, each of which is optimizing for its own target. And it’s okay if we have AIs in the future some of which are trying to control satellites and some of which are trying to sail boats and some of which are trying to win games of chess and they may all come up with their own abilities in order to allow that intelligence to achieve its end as effectively as possible.

In other words, power-seeking, intelligence, and all those other behaviors are convergent instrumental drives so almost any reward function will work and thus Clippy is entirely possible.

What are the chances that we get lucky and acting in an altruistic manner towards other sentient beings is also a convergent drive? My guess is most people here on LessWrong would say close to epsilon, but I wonder what the folks at DeepMind would say…

(The convergent drive would be to tit-for-tat until you observe enough to solve the POMDP of them, betraying/exploiting them maximally the instant you gather enough info to decide that is more rewarding...)

Paperclip maximizers aren't necessarily sentient, and Demis explicitly says in his episode that it'd be best to avoid creating sentient AI at least initially to avoid the ethical issues surrounding that.

Does Demis Hassabis think that coordinating pausing AI development globally is really plausible? If so, why/how/what's the plan?

Or does he merely mean he thinks DeepMind could pause AI development at DeepMind and maybe should once we enter the "gray zone"/"middle zone" (the period before AGI when he says things will start "feeling interesting and strange")?

Regarding this middle zone before AGI he says the signposts might be the AI "coming up with a truly original idea, creating something new, a new theory in science that ends up holding, maybe coming up with its own problem that it wants to solve." But how sure is he that these signposts will happen before AGI, or long enough before that there's still time to pause AI development and "prove things mathematically [...] so that you know the limits and otherwise of the systems that you're building"?

If he wants to assemble all the world's Terence Tao's and top scientific minds to work on this because it's a more pressing issue, then presumably that's because he thinks there's a lot of important work to be done, right? So then why not try to start all this work earlier rather than wait until things start feeling strange to him, like the systems almost have "sentience or awareness"? After all, he said AGI might be less than a decade away, so it's not like we have forever. And maybe that way you can get the work done even if you don't manage to persuade all the world's top minds to come and work on it.

Or does he merely mean he thinks DeepMind could pause AI development at DeepMind and maybe should once we enter the "gray zone"/"middle zone" (the period before AGI when he says things will start "feeling interesting and strange")?

Reading his section, I'm concerned that when he talks about hitting pause, he's secretly thinking that there will be a clear fire alarm for pushing the big red button and that he would just count on the IP-controlling safety committee of DM to stop everything.

Unfortunately, all of the relevant reporting on DM gives a strong impression that the committee may be a rubberstamp, having never actually exerted its power, and that Hassabis has been failing to stop DM from being absorbed into the Borg.

So, if we hit even a Christiano-style slow takeoff of 30% GDP growth a year etc and some real money started to be at stake rather than fun little projects like AlphaGo or AlphaFold, Google would simply ignore the committee and the provisions would be irrelevant. Page & Brin might be transhumanists who take AI risk seriously, but Pichai & the Knife, much less the suits down the line, don't seem to be. At a certain level, a contract is nothing but a piece of paper stained with ink, lacking any inherent power of its own. (You may recall that WhatsApp had Mark swear to sacred legally-binding contracts with Facebook as part of its acquisition that it would never have advertising as its incredible journey continued, and the founders had hundreds of millions to billions of dollars in stock options vesting while they worked there to help enforce such deeply-important-to-them provisions; you may further recall that WhatsApp has now had advertising for a long time, and the founders are not there.)

I wonder how much power Hassabis actually has...

Page & Brin might be transhumanists who take AI risk seriously

Why do you think this?

Well, I'm being polite - I think they probably were not taking AI risk seriously, because journalists & Elon Musk have attributed quotes to them which are the classic Boomer 'the AIs will replace us and akshully that is a good thing' take, but few people are as overt as Hanson or Schmidhuber about that these days. But I don't want to claim they're like that without at least digging up & double-checking the various quotes, which would take a while. My point is that even if they are taking it seriously, they don't matter because they're long since checked out, and the people actually in charge day-to-day, Pichai & Porat, definitely are not. (Risk/safety is as much about day-to-day implementation as it is about any high-level statements.)

Anyway, relevant update here: post-AI-arms-race, DeepMind/Google Brain have been unceremoniously liquidated by Pichai and merged into 'Google DeepMind' (discussion), and Hassabis's statements about 'Gemini' have taken on a capabilities tone. Reading the tea leaves of the new positions and speculating wildly, it looks like GB has been blamed for 'Xeroxizing' Google and DM nominally the victor, but at the cost of being pressured into turning into a more product-focused division. No sign of the IP/safety-committee. One thing to keep an eye on here will be the DeepMind Companies House filings (mirrors) - is this a fundamental legal change liquidating the original DeepMind corporation, or a rename+funding+responsibilities?

The term "AGI" is pretty misleading - it kind of implies that there is a binary quality to intelligence, a sharp threshold where AI becomes on-par with human intelligence.

Even humans have a huge range of intellectual capacity, and someone who is good at math may not be good at say, writing a novel. So the idea of "general intelligence" is pretty weak from the outset, and it's certainly not a binary value that you either have or have not.

Most people take "AGI" to mean an AI that can perform all the tasks a human can. I think it's a mistake to judge machine intelligence this way because humans are vastly overfit to their environment - we've evolved in an environment where it's important to recognize a handful of faces, hunt and gather, and very very recently do some light arithmetic in the planting season. This is probably why the majority of humans perform exceedingly well in these specific tasks, and poorly in mathematics and abstract reasoning.

IMO there is no such thing as general intelligence, only cognitive tools and behaviors like induction and deduction.

Even humans have a huge range of intellectual capacity, and someone who is good at math may not be good at say, writing a novel. So the idea of "general intelligence" is pretty weak from the outset, and it's certainly not a binary value that you either have or have not.

https://en.wikipedia.org/wiki/G_factor_(psychometrics)
