Karl von Wendt

German writer of science-fiction novels and children's books (pen name Karl Olsberg). I blog and create videos about AI risks in German at www.ki-risiken.de and youtube.com/karlolsbergautor.

Thank you very much for your input!

Defining intelligence as goal-directedness doesn't do anything for your argument. It just kicks the can down the road. Why will less intelligent (under your definition, goal directed) always lose in competition?

Admittedly, my reply to A was a bit short. I only wanted to point out that intelligence is closely linked to goal-directedness, not that they're the same thing (a heat-seeking missile, for example, is stupid but very goal-directed). A very intelligent system without a goal would just sit around doing nothing. It might have the potential to act intelligently, but without a goal it would behave like an unintelligent system. "Always" may be too strong a word, but if system X is more intelligent than system Y and wants much more strongly to reach a conflicting goal, chances are that system X will get what it wants.

Romance is a canonical example of where you really don't want to be all powerful (if real romance is what you want). Romance could not exist if your romantic partner always predictably did everything you ever wanted. 

I disagree. Being all-powerful does not mean always doing everything you want, or everything your partner wants. It means being able to do whatever you want, or maybe more importantly, whatever you feel you need to do. If, for example, I needed the magic wand to prevent the untimely death of someone I love, I would use it without a second thought.

The whole point is they are a different person, with different wishes, and you have to figure out how to navigate that and its unpredictabilities. That is the "fun" of romance. 

I tend to agree, but I guess there are many people who have been less lucky in their relationships than I have, being happily together with my wife for more than 44 years. :)

So no, I don't think everyone would really use that magic wand.

Maybe not everyone and certainly not all the time, but I'm quite sure that most people would use it at least once in a while.

Thank you for posting this, as I find it helpful for practicing my own skills of argumentation. Here are my brief counterarguments to your counterarguments; I'd appreciate it if anyone could point out any flaws in my logic:

A. Contra "superhuman AI systems will be goal-directed"
As far as I understand it, "intelligence" is the ability to achieve one's goals through reasoning and making plans, so a highly intelligent system is goal-directed by definition. Less goal-directed AIs are certainly possible, but they must necessarily be considered less intelligent - the thermometer example illustrates this. Therefore, a less goal-directed AI will always lose in competition against a more goal-directed one.

B. Contra "goal-directed AI systems' goals will be bad"
The supposed counterexample of artificially generated human faces is in fact a case in point in my opinion. These faces aren't like humans at all. They're not three-dimensional. They're not moving. They don't talk. They don't smell. They're not soft and don't radiate warmth. Oh, we didn't mention that was important, right? We just gave the AI a reward function that enabled it to learn how to generate pictures that look like photographs of real people. If that's what we want, then little differences on the pixel level probably don't matter much. The differences between the paperclips Bostrom's paperclip maximizer makes and a perfect paperclip probably won't matter much, either. To put it another way, these fake humans are only "good" if we lower our expectations to the point where they're already met.

C. Contra “superhuman AI would be sufficiently superior to humans to overpower humanity”
Even if "human success isn't from individual intelligence", this doesn't mean that human intelligence is not the decisive factor making us the dominant species. Individual intelligence is what enables collective intelligence in the first place. I agree that humans shouldn't be seen as a universal benchmark for intelligence, but that only means that the bar for developing an uncontrollable AI may be even lower. It took us humans more than 2,000 years to collectively master Go. It took AlphaGo Zero three days from scratch to beat us. AI may one day be sufficiently good at manipulating and controlling humans to take over the world even without being "superintelligent" in all aspects. It could be far more intelligent in the relevant ways, like AlphaGo Zero compared to a child learning to play Go. I believe there is no upper bound on manipulation skills and other forms of gaining power. So whether intelligence is an overwhelming advantage is probably a matter of scale.

However AI systems have one serious disadvantage as employees of humans: they are intrinsically untrustworthy, while we don’t understand them well enough to be clear on what their values are or how they will behave in any given case. Even if they did perform as well as humans at some task, if humans can’t be certain of that, then there is reason to disprefer using them. 

Really? Look at how we use AI today, e.g. in letting it decide what we see, hear and believe, who gets parole, and who gets a loan. It seems to me that humans already tend to trust AI more than other humans, particularly when they don't understand how it works.

I have some goals. For instance, I want some good romance. My guess is that trying to take over the universe isn’t the best way to achieve this goal. The same goes for a lot of my goals, it seems to me. Possibly I’m in error, but I spend a lot of time pursuing goals, and very little of it trying to take over the universe. 

Imagine you had a magic wand or a genie in a bottle that would fulfill every wish you could dream of. Would you use it? If so, you're incentivized to take over the world, because the only possible way of making every wish come true is absolute power over the universe. The fact that you normally don't try to achieve that may have to do with the realization that you have no chance. If you had, I bet you'd try it. I certainly would, if only so I could stop Putin. But would me being all-powerful be a good thing for the rest of the world? I doubt it.

D. Contra the whole argument
No, AI is not like a corporation run by humans. AI is more like an alien life form. It does not have intrinsic human motives and values. We may be able to tame it or to give it a beneficial goal, but unless we do, if it can, it will transform the world in very weird and probably unforeseen ways. Apart from that, corporations are currently wreaking a lot of havoc on the world (e.g. climate change), which is a good example of how difficult it is to give a powerful entity a beneficial goal.

Yes, that's also true: there is always a lonely hero who in the end puts the AGI back in the box or destroys it. Nothing would be more boring than writing a novel about how in reality the AGI just kills everyone and wins. :( I think both are possible - that people imagine the wrong future and at the same time don't take it seriously.

Thank you very much for sharing this - it is very helpful to me! I agree that academics, in particular within the EU, but probably also everywhere else, are an underutilized and potentially very valuable resource, especially with respect to AI governance. Your post seems to support my own view that we should be talking about "uncontrollable AI" instead of "misaligned AGI/superintelligence", which I have explained here: https://www.lesswrong.com/posts/6JhjHJ2rdiXcSe7tp/let-s-talk-about-uncontrollable-ai

most "normies" find AI scary & would prefer it not be developed, but for whatever reason the argument for a singularity or intelligence explosion in which human-level artificial intelligence is expected to rapidly yield superhuman AGI is unconvincing or silly-seeming to most people outside this bubble, including technical people. I'm not really sure why.

That's what I have experienced as well. I think one reason is that people find it difficult to imagine exponential growth - it's not something our brains are made for. When we think about the future, we intuitively look at the past and project a linear trend we seem to recognize.
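As a toy illustration of this intuition gap (all numbers made up for the example - the doubling time of two years is an arbitrary assumption, not a claim about any real trend), the sketch below compares an exponential trend with the linear trend someone would extrapolate after watching only the first two years:

```python
# Toy sketch: why linear intuition underestimates exponential growth.
# Assumes a capability that doubles every 2 years (purely illustrative).

def exponential(t, doubling_time=2.0, start=1.0):
    """True exponential trend at year t."""
    return start * 2 ** (t / doubling_time)

def linear_projection(t, t0=2.0, doubling_time=2.0, start=1.0):
    """Linear extrapolation of the average growth observed up to year t0."""
    slope = (exponential(t0, doubling_time, start) - start) / t0
    return start + slope * t

for year in (2, 10, 20):
    print(f"year {year:2d}: exponential {exponential(year):8.1f}, "
          f"linear guess {linear_projection(year):6.1f}")
```

After 20 years the exponential curve is roughly two orders of magnitude above the linear guess, even though both fit the first two years perfectly - which is exactly the mismatch between intuitive projection and exponential reality described above.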

I also think that if something is a frequent topic in science fiction books and movies, people see it as less likely to become real, so we SF writers may actually make it more difficult to think clearly about the future, even though sometimes developers are inspired by SF. Most of the time, people realize only in hindsight that some SF scenarios may actually come true.

I think it's amazing how fast we go from "I don't believe that will ever be possible" to "that's just normal". I remember buying my first laptop computer with a color display in the nineties. If someone had told me that not much more than ten years later there would be an iPhone with the computing power of a supercomputer in my pocket, I'd have shaken my head in disbelief.

Thank you for your comments, which I totally agree with.

I don't think that any current AIs are strategically aware of themselves. I guess the closest analogy is an AI playing Atari games: it will see the sprite it controls as an important element of the "world" of the game and will try to protect it from harm. But of course, AIs like MuZero have no concept of themselves as being an AI that plays a game. I think the only examples of agents with strategic awareness that currently exist are we humans ourselves, and maybe some animals.

The above statement appears to assume that dangerous transformative AI has already been created,

Not at all. I'm just saying that if any AI with external access would be considered dangerous, then the same AI without access should be considered dangerous as well.

The dynamite analogy was of course not meant to be a model for AI, I just wanted to point out that even an inert mass that in principle any child could play with without coming to harm is still considered dangerous, because under certain circumstances it will be harmful. Dynamite + fire = damage, dynamite w/o fire = still dangerous.

Your third argument seems to prove my point: An AI that seems aligned in the training environment turns out to be misaligned if applied outside of the training distribution. If that can happen, the AI should be considered dangerous, even if within the training distribution it shows no signs of it.

While writing, track an estimate of the mental state of a future reader - confusion, excitement, eyes glossing over, etc.

This may be true if you write a scientific paper, an essay or a non-fiction book. As a professional writer, when I write a novel, I usually don't think about the reader at all (maybe because, in a way, I am the reader). Instead, I track the mental state of the character I'm writing about. This leads to interesting situations when a character "decides" to do something completely different from what I intended her to do, as if she had a will of her own. I have heard other writers describe the same thing, so it seems to be a common phenomenon. In this situation, I have two options: I can follow the lead of the character (my tracking of her mental state) and change my outline or even ditch it completely, or I can force her to do what the outline says she's supposed to do. The second choice inevitably leads to a bad story, so tracking the mental state of your characters indeed seems to be essential to writing good fiction.

I assume that readers do a similar thing, so if a character in a book does something that doesn't fit the mental model they have in mind, they often find it "unbelievable" or "unrealistic", which is one of the reasons why "listen to your characters" seems to be good advice for writing.
