The benefits and risks of optimism (about AI safety)

Karl von Wendt

This is a reaction to Nora Belrose’s and Quintin Pope’s AI Optimism initiative. However, others are better qualified to criticize the specific arguments they give for their claim that AI is easy to control. Instead, I will focus on the general stance of optimism, when it can be beneficial and when it may be delusional and dangerous.

The benefits of optimism

I have been an optimist all my life. This has led me into many dead ends, but it has also been the basis for all the successes I have achieved. For example, I had wanted to write a novel since I was a teenager. I finally sat down to do it when I was 43 and the start-up I had founded three years earlier was not doing very well. I sent my first effort, a children’s book, to 20 publishers and got 20 rejections. I wrote a second and a third novel, which no one wanted to publish (self-publishing wasn’t really an option back then). My fourth novel was finally accepted and became an instant bestseller. I have published almost 70 books since then, and I don’t intend to stop anytime soon.

I define “optimism” as the tendency to weigh positive outcomes higher than their expected value and negative outcomes lower. The probability of my fourth novel becoming a success after three disappointments didn’t seem very high, but I didn’t even think about that. I was optimistic that I had learned something and it was worth the try anyway.

A true Bayesian can never be an optimist (nor a pessimist, which would be the opposite). So optimism must be stupid, right?

Not necessarily. Optimism has obvious and less obvious benefits. One obvious benefit is that it feels better to be optimistic. Optimistic people are also more fun to hang around with, so it's easier for them to make and maintain social connections. Optimism can even become a self-fulfilling prophecy: if you believe in your own success, others tend to believe in you too and will be more willing to help you or fund your efforts.

Our human nature obviously favors optimism and even sometimes recklessness, so there must be an evolutionary advantage behind it. And there is: optimism is a driver for growth and learning, and this makes it easier to adapt to changing circumstances. Imagine you have two populations, one consisting only of "realists" who always make correct decisions based on expected values, while another consists of optimists who will take risks even though the expected value is negative. The realists will have a higher survival rate, but the optimists will spread farther and be able to adapt to more different circumstances. It takes a lot of optimism to cross a steep mountain range to find out what's in the valley on the far side, or to set sail for an unknown continent. So, after a while, there will likely be a larger population of the optimists.

The guy who won the lottery is almost certainly an optimist, at least regarding his chances of winning the lottery. Most successful company founders are optimists. Scientists who explore new directions even though their peers tell them that this is hopeless are optimists in a way, too. Optimism, the belief that you can achieve the seemingly impossible, gives you the energy and motivation to try out new things. Arguably, optimism is the driver behind all technological progress.

The risks of optimism

However, being an optimist is risky. I have published close to 70 books and written some more, but I had many setbacks before and many of my books haven’t sold very well. I have founded four companies, none of which became very successful. I have tried out many things, from music to board games to computer games to videos to developing Alexa skills, which all failed miserably. My best guess is that I have spent about 80% of my time on failed efforts (including boring and unfulfilling jobs that led nowhere) and only 20% on successful ones. Still, I don’t regret these failed experiments. I learned from them, and often, as in the case of my writing, failures turned out to be necessary or at least helpful steps towards success.

However, I obviously never risked my life for any of these failed efforts. I didn’t even risk my financial future when I founded my start-ups. I gave up secure jobs and did lose money, but I didn’t take on large debts. I always made sure that my personal downside risk was limited because I knew that I might fail and I had a family to care for. Writing a book is not a very big investment. I could do it in my spare time alongside my regular job. Rejections hurt, but they don’t kill you. Writing is even fun, so the cost of writing my fourth novel was almost zero and the downside risk of failure was that this effort would once again be largely wasted (apart from what I learned in the process).

On the other hand, there are optimists who pay for their optimism with their lives, from explorers who got killed by boldly going where no one has gone before to overconfident soldiers and scientists. The disaster of the American withdrawal from Afghanistan, for instance, which led to an unexpectedly swift takeover by the Taliban, may have been due to an overly optimistic assessment of the situation. The Darwin Award winners were almost certainly optimists (besides being obviously stupid).

Being optimistic about whether the random unknown mushroom you picked up in the woods will be safe to eat is a bad idea. Optimism is not appropriate when it comes to airplane safety, IT security, or dangerous biological experiments.

In short: optimism can be beneficial when the downside is limited and the upside is large, even when the probability of success is so low that the expected value is negative. Optimism is bad when it is the other way round. Which brings us to AI safety.
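The asymmetry described above can be made concrete with a little arithmetic. The following is only a minimal sketch with made-up numbers: the function names, the 5% probabilities, and the payoff sizes are all assumptions for illustration and do not come from any real data. It contrasts a bet with a small, bounded downside that can be retried many times with a bet whose failure is irreversible.

```python
"""Toy arithmetic for bounded versus irreversible downsides.

All numbers are hypothetical and chosen only for illustration.
"""


def expected_value(p_success: float, upside: float, downside: float) -> float:
    """Expected payoff of one attempt: gain `upside` with probability
    `p_success`, otherwise lose `downside`."""
    return p_success * upside - (1.0 - p_success) * downside


def p_at_least_one_success(p_success: float, attempts: int) -> float:
    """Chance that at least one of `attempts` independent tries succeeds."""
    return 1.0 - (1.0 - p_success) ** attempts


def p_still_in_the_game(p_catastrophe: float, attempts: int) -> float:
    """Chance of never hitting an irreversible failure in `attempts` tries."""
    return (1.0 - p_catastrophe) ** attempts


if __name__ == "__main__":
    # Bounded downside: each attempt costs 1 unit, a success pays 10 units,
    # and success is unlikely, so a single attempt has negative expected value.
    print("EV of one bounded attempt:", expected_value(0.05, 10.0, 1.0))
    print("P(at least one success in 20 tries):",
          p_at_least_one_success(0.05, 20))

    # Irreversible downside: a failure ends the game, so there is no retrying.
    print("P(no catastrophe in 20 tries):", p_still_in_the_game(0.05, 20))
```

With these assumed numbers, twenty cheap attempts at a 5% success rate give roughly a 64% chance of at least one success, while twenty attempts that each carry a 5% chance of irreversible failure leave only about a 36% chance of having avoided it. The sketch ignores the learning effect mentioned earlier, which would further favor retrying in the bounded case.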

Why being generally optimistic about AI safety is bad

Claiming that “AI is easy to control” when heavyweights like Geoffrey Hinton, Yoshua Bengio and many others have a different opinion can be seen as a quite optimistic stance. It speaks for Nora Belrose and Quintin Pope that they openly admit this and even call their initiative “AI optimism”.

As I have pointed out, there are some things to be said in favor of optimism. This is true even for AI safety. Being optimistic gives you the necessary energy and motivation to try out things which you otherwise might not try. Two years ago, I was much more optimistic than I am today about my own ability to influence Germany towards acknowledging existential risks from AI, and I find it increasingly difficult to get up and even try to do anything about it. A bit of optimism could possibly help me do more than I am doing right now and, in theory, could lead to success against all odds.

In this sense, I am supportive of optimism, for example about trying out specific new approaches in AI safety, like mechanistic interpretability. If the downside is just the time and effort you spend on a particular AI safety approach and the potential (if unlikely) upside is that you solve alignment and save the world, then please forget about the actual success probability and be optimistic about it (unless you have an even better idea that is more likely to succeed)!

However, being optimistic about our ability to control superintelligent AI and/or solve the alignment problem in time so we can just race full speed ahead towards developing AGI is an entirely different matter. The upside in this case is some large financial return, mostly going to people who are already insanely rich, and maybe some benefits to humanity in general (which I think could mostly also be achieved with less risky methods). The downside is destroying the future of humanity. Being optimistic in such a situation is a very bad idea.

An additional problem here is that optimism is contagious. Politicians like to be optimistic because voters like it too. This may partly explain why it is still very unpopular in Germany to talk about existential risks from AI, why our government thinks it is a good idea to exclude foundation models from the AI Act, and why people concerned about AI safety are called “doomers”, “neo-luddites” or even “useful idiots”. People want to be optimistic, so they are looking for confirmation and positive signals. And if some well-respected AI researchers found an organization called “AI Optimism”, this will certainly increase overall optimism, even if it is largely met with skepticism in the AI safety community.

As I have pointed out, optimism is dangerous when the downside is very large or even unlimited. Therefore, I think general “AI Optimism” is a bad idea. This is largely independent of the detailed discussion about how hard controlling AI actually is. As long as they cannot prove that they have solved the control problem or AGI alignment, “AI Optimism” certainly diminishes my personal hope for our future.

I define “optimism” as the tendency to weigh positive outcomes higher than their expected value and negative outcomes lower.

Just flagging that this is not at all how I or Quintin are using the term. We are simply using it to point at an object-level belief that alignment is easy, along with a couple ethical/political values we espouse.

Thanks for pointing this out! I agree that my definition of "optimism" is not the only way one can use the term. However, from my experience (and like I said, I am basically an optimist), in a highly uncertain situation, the weighing of perceived benefits vs risks heavily influences one's probability estimates. If I want to found a start-up, for example, I convince myself that it will work. I will unconsciously weigh positive evidence higher than negative. I don't know if this kind of focusing on the positive outcomes may have influenced your reasoning and your "rosy" view of the future with AGI, but it has happened to me in the past.

"Optimism" certainly isn't the same as a neutral, balanced view of possibilities. It is an expression of the belief that things will go well despite clear signs of danger (e.g. the often expressed concerns of leading AI safety experts). If you think your view is balanced and neutral, maybe "optimism" is not the best term to use. But then I would have expected many more caveats and expressions of uncertainty in your statements.

Also, even if you think you are evaluating the facts in an unbiased and neutral way, there's still the risk that others who read your texts will not, for the reasons I mention above.

People here describe themselves as "pessimistic" about a variety of aspects of AI risk on a very regular basis, so this seems like an isolated demand for rigor.

"Optimism" certainly isn't the same as a neutral, balanced view of possibilities. It is an expression of the belief that things will go well despite clear signs of danger (e.g. the often expressed concerns of leading AI safety experts). If you think your view is balanced and neutral, maybe "optimism" is not the best term to use. But then I would have expected much more caveats and expressions of uncertainty in your statements.

This seems like a weird bait and switch to me, where an object-level argument is only ever allowed to end in a neutral middle-ground conclusion. A "neutral, balanced view of possibilities" is absolutely allowed to end on a strong conclusion without a forest of caveats. You switch your reading of "optimism" partway through this paragraph in a way that seems inconsistent with your earlier comment, and that smuggles in the conclusion "any purely factual argument will express a wide range of concerns and uncertainties, or else it is biased".

From Wikipedia: "Optimism is an attitude reflecting a belief or hope that the outcome of some specific endeavor, or outcomes in general, will be positive, favorable, and desirable." I think this is close to my definition or at least includes it. It certainly isn't the same as a neutral view.

"Optimism" certainly isn't the same as a neutral, balanced view of possibilities. It is an expression of the belief that things will go well despite clear signs of danger (e.g. the often expressed concerns of leading AI safety experts).

I just disagree, I think the term has many “valid” uses, and one is to refer to an object level belief that things will likely turn out pretty well. It doesn’t need to be irrational by definition.

I also think AI safety experts are self selected to be more pessimistic, and that my personal epistemic situation is at least as good as theirs on this issue, so I’m not bothered that I’m more optimistic than the median safety researcher. I also have a fairly good “error theory” for why many people are overly pessimistic, which will be elucidated in upcoming posts.

I think the term has many “valid” uses, and one is to refer to an object level belief that things will likely turn out pretty well. It doesn’t need to be irrational by definition.

Agreed. Like I said, you may have used the term in a way different from my definition. But I think in many cases, the term does reflect an attitude like I defined it. See Wikipedia.

I also think AI safety experts are self selected to be more pessimistic

This may also be true. In any case, I hope that Quintin and you are right and I'm wrong. But that doesn't make me sleep better.
