John Baez Interviews with Eliezer (Parts 2 and 3)

by multifoliaterose, 1 min read, 29th Mar 2011, 34 comments

John Baez's This Week's Finds (Week 311) [Part 1; added for convenience following Nancy Lebovitz's comment]

John Baez's This Week's Finds (Week 312)

John Baez's This Week's Finds (Week 313)

I really like Eliezer's response to John Baez's last question in Week 313 about environmentalism vs. AI risks. I think it satisfactorily deflects much of the concern that I had when I wrote The Importance of Self-Doubt.

Eliezer says

Anyway: In terms of expected utility maximization, even large probabilities of jumping the interval between a universe-history in which 95% of existing biological species survive Earth’s 21st century, versus a universe-history where 80% of species survive, are just about impossible to trade off against tiny probabilities of jumping the interval between interesting universe-histories, versus boring ones where intelligent life goes extinct, or the wrong sort of AI self-improves.

This is true as stated, but it ignores an important issue: there is feedback between more mundane current events and the eventual potential extinction of the human race. For example, the United States' involvement in Libya has a (small) influence on existential risk (I don't have an opinion as to what sort). Any impact on human society due to global warming has some influence on existential risk.

Eliezer's points about comparative advantage and of existential risk in principle dominating all other considerations are valid, important, and well-made, but passing from principle to practice is very murky in the complex human world that we live in.

Note also the points that I make in Friendly AI Research and Taskification.

So Eliezer explains why rationalization doesn't work, then makes this extremely convincing case why John Baez shouldn't be spending effort on environmentalism. Baez replies with "Ably argued!" and presumably returns to his daily pursuits.

Baez replies with "Ably argued!" and presumably returns to his daily pursuits.

Please don't assume that this interview with Yudkowsky, or indeed any of the interviews I'm carrying out on This Week's Finds, are having no effect on my activity. Last summer I decided to quit my "daily pursuits" (n-category theory, quantum gravity and the like), and I began interviewing people to help figure out my new life. I interviewed Yudkowsky last fall, and it helped me decide that I should not be doing "environmentalism" in the customary sense. It only makes sense for me to be doing something new and different. But since I've spent 40 years getting good at math, it should involve math.

If you look at what I've been doing since then, you'll see that it's new and different, and it involves math. If you're curious about what it actually is, or whether it's something worth doing, please ask over at week313. I'm actually a bit disappointed at how little discussion of these issues has occurred so far in that conversation.

It only makes sense for me to be doing something new and different. But since I've spent 40 years getting good at math, it should involve math.

If you can't work directly on something you could just use your skills to earn as much money as possible and donate that.

For PR purposes "AI or environmentalism" strikes me as being just about the worst framing possible. I wish it could have been avoided somehow.

A much more helpful framing is to see unfriendly AI as pollution with brains.

Baez replies with "Ably argued!" and presumably returns to his daily pursuits.

One of the best comments ever :-)

I have been asking a lot of well-known and influential people lately about their opinion regarding risks from AI and the best possible way to benefit humanity by contributing money to a charity. Most of them basically take the same stance as John Baez: they accept the arguments but continue to ignore them. Sadly, only a few gave their permission to publish their answers. And some people, for example Tyler Cowen, just referred to books they have written (e.g. 'Discover Your Inner Economist') that I haven't read, so I am not sure about their opinion at all. The only detectable pattern was that quite a few mentioned the same charity, Médecins Sans Frontières. People as diverse as John Baez, Douglas Hofstadter and Greg Egan said that Médecins Sans Frontières would be a good choice. Here, for example, is the reply from Greg Egan, who is one of the few people who gave their permission to publish their answer:

I asked:

What would you do with $100,000 if it were given to you on the condition that you donate it to a charity of your choice?

Greg Egan replied:

I would spend a week or two investigating a range of research programs in tropical medicine, to see if there was a project that might benefit significantly from a donation of that size. If I could not identify a clear opportunity for an impact there, I would donate the money to Médecins Sans Frontières, a charity I currently support to a much smaller degree, as I am reasonably confident that the money would be well spent by them.

I do not believe there is an "existential threat" from AI, and while I don't doubt that AI research will lead to some benefits to humanity in the long run, I expect that the timing and degree of those benefits would be negligibly affected by an investment of $100,000.

This made me laugh aloud. Two points here:

(a) Baez may take what Eliezer has to say into consideration despite not updating his beliefs immediately.

(b) Baez has written elsewhere that he's risk averse with respect to charity. Thus, he may reject expected value theory based utilitarianism.

His points about risk aversion are confused. If you make choices consistently, you are maximizing the expected value of some function, which we call "utility". (Von Neumann and Morgenstern) Yes, it may grow sublinearly with regard to some other real-world variable like money or number of happy babies, but utility itself cannot have diminishing marginal utility and you cannot be risk-averse with regard to your utility. One big bet vs many small bets is also irrelevant. When you optimize your decision over one big bet, you either maximize expected utility or exhibit circular preferences.
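To make the distinction concrete, here is a minimal sketch of an agent that is risk-averse with respect to money while still being a plain expected-utility maximizer. The logarithmic utility function and the dollar amounts are illustrative assumptions, not anything taken from the interview.

```python
import math


def expected_utility(lottery, utility):
    """Expected utility of a lottery given as (probability, wealth) pairs."""
    return sum(p * utility(w) for p, w in lottery)


# Assumption for illustration: utility grows logarithmically (sublinearly) in wealth.
utility = math.log

sure_thing = [(1.0, 100_000)]
gamble = [(0.5, 50_000), (0.5, 150_000)]  # same expected wealth: 100,000

eu_sure = expected_utility(sure_thing, utility)
eu_gamble = expected_utility(gamble, utility)

# Wealth level whose utility equals the gamble's expected utility.
certainty_equivalent = math.exp(eu_gamble)

print(eu_sure > eu_gamble)          # the sure thing is preferred
print(round(certainty_equivalent))  # noticeably below 100,000
```

The agent turns down the gamble even though both options have the same expected wealth. The aversion lives entirely in the curvature of the utility function over money; the decision rule itself is still "pick the higher expected utility".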

If you make choices consistently, you are maximizing the expected value of some function, which we call "utility".

Unfortunately in real life many important choices are made just once, taken from a set of choices that is not well-delineated (because we don't have time to list them), in a situation where we don't have the resources to rank all these choices. In these cases, the hypotheses of von Neumann-Morgenstern utility theorem don't apply: the set of choices is unknown and so is the ordering, even on the elements we know are members of the set.

This is especially the case for anyone changing their career.

I agree that my remark about risk aversion was poorly stated. What I meant is that if I have a choice either to do something that has a very tiny chance of having a very large good effect (e.g., working on friendly AI and possibly preventing a hostile takeover of the world by nasty AI) or to do something with a high chance of having a small good effect (e.g., teaching math to university students), I may take the latter option where others may take the former. Neither need be irrational.

In these cases, the hypotheses of von Neumann-Morgenstern utility theorem don't apply: the set of choices is unknown and so is the ordering, even on the elements we know are members of the set.

It seems to me that you give up on VNM too early :-)

1) If you don't know about option A, it shouldn't affect your choice between known options B and C.

2) If you don't know how to order options A and B, how can you justify choosing A over B (as you do)?

Not trying to argue for FAI or against environmentalism here, just straightening out the technical issue.

What I meant is that if I have a choice either to do something that has a very tiny chance of having a very large good effect (e.g., working on friendly AI and possibly preventing a hostile takeover of the world by nasty AI) or to do something with a high chance of having a small good effect (e.g., teaching math to university students), I may take the latter option where others may take the former. Neither need be irrational.

The problem is that the expected utility of an outcome often grows faster than its probability shrinks. If the utility you assign to a galactic civilization does not outweigh the low probability of success, you can just take into account all beings that could be alive until the end of time in the case of a positive Singularity. Whatever you care about now, there will be so much more of it after a positive Singularity that it always outweighs the tiny probability of it happening.

Hmm, I wonder if there is a bias in human cognition, which makes it easier for us to think of ever larger utilities/disutilities than of ever tinier probabilities. My intuition says the former, which is why I tend to be skeptical of such large impact small probability events.

You raise this issue a lot, so now I'm curious how you cash it out in actual numbers:

For the sake of concreteness, call Va the value to you of 10% of your total assets at this moment. (In other words, going from your current net worth to 90% of your net worth involves a loss of Va.)

What's your estimate of the value of a positive Singularity in terms of Va?

What's your estimate of the probability (P%) of a positive Singularity at this moment?

What's your estimate of how much P% would increment by if you invested an additional 10% of your total assets at this moment in the most efficient available mechanism for incrementing P%?
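One way to read these three questions is as the inputs to a break-even calculation. The sketch below assumes value is measured linearly in units of Va and that the only effect of the investment is the increment in P%; the function names and the example numbers are placeholders for illustration, not anyone's actual estimates.

```python
def expected_gain_in_va(singularity_value_in_va, delta_p_percent):
    """Expected gain, in units of Va, from raising P% by delta_p_percent points."""
    return singularity_value_in_va * delta_p_percent / 100.0


def break_even_delta_p_percent(singularity_value_in_va):
    """Smallest increment in P% (percentage points) that repays a cost of 1 Va."""
    return 100.0 / singularity_value_in_va


# Placeholder inputs, chosen only to show the arithmetic.
value_in_va = 1_000_000  # hypothetical: a positive Singularity is worth 10^6 Va
delta_p = 0.00001        # hypothetical: the investment adds 10^-5 percentage points

gain = expected_gain_in_va(value_in_va, delta_p)
threshold = break_even_delta_p_percent(value_in_va)

print(gain > 1.0)           # is the expected gain larger than the 1 Va cost?
print(delta_p > threshold)  # same question, phrased as a threshold on delta_p
```

Under these assumptions the current level of P% does not matter, only the increment does. Someone who is risk-averse in the sense discussed above would replace the linear valuation with their own utility function.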

Sure, I agree, what I meant was that he may value certainty that his philanthropic efforts made some positive difference over maximizing the expected value of a utilitarian utility function. Maybe he's open to reconsidering this point though.

(b) Baez has written elsewhere that he's risk averse with respect to charity. Thus, he may reject expected value theory based utilitarianism.

You can work towards a positive Singularity based on purely selfish motives. If he doesn't want to die and if he believes that the chance that a negative Singularity might kill him is higher than that climate change will kill him then he should try to mitigate that risk whether he rejects any form of utilitarianism or not.

Wow. For some reason, I found these interviews far more convincing than EY's previous writings on these topics.

Wow. For some reason, I found these interviews far more convincing than EY's previous writings on these topics.

I thought exactly the same! After reading all of the interviews I am actually much more convinced that he is right about risks from AI.

I find the organization of the sequences difficult and frustrating. It's just hard to go through them in an organized manner. This has left me tempted to wait for Eliezer's book, but I don't know if that's a long way off or nearer in the future, or whether it will contain everything of value in the sequences or follow a narrower theme.

However, I have gone through enough of the sequences that I can usually follow along on new posts without too much trouble. The great part is that whenever someone makes an error that can be corrected by a post from the sequences, it quickly gets linked to by a senior community member.

ETA: This actually leads me to an idea: perhaps we could try to identify the most important posts on LW by looking at the number of times they get linked in other discussions.
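A minimal sketch of that idea, assuming the comment texts are already available as plain strings: extract the URLs in each comment and count how many comments link to each one. The URL pattern and the example.org addresses are hypothetical stand-ins, and the sketch does not rely on any real site API.

```python
import re
from collections import Counter

# Rough URL matcher: an assumption that is good enough for a sketch.
URL_PATTERN = re.compile(r"https?://[^\s)>\]]+")


def count_inbound_links(comments):
    """Count, for each URL, the number of comments that link to it."""
    counts = Counter()
    for text in comments:
        # Use a set so that repeating a link inside one comment counts once.
        urls = {u.rstrip(".,;") for u in URL_PATTERN.findall(text)}
        counts.update(urls)
    return counts


# Hypothetical comment texts.
comments = [
    "See http://example.org/post-a.",
    "Compare http://example.org/post-a and http://example.org/post-b",
    "Again http://example.org/post-a, http://example.org/post-a",
]

ranking = count_inbound_links(comments).most_common()
print(ranking)
```

Counting each comment once per link keeps a single enthusiastic commenter from dominating the ranking. A fuller version might also weight links by the votes on the linking comment.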

Thanks -- these interviews clarify a lot that I never saw posted here on LW.

In Part 2:

JB: So when you imagine "seed AIs" that keep on improving themselves and eventually become smarter than us, how can you reasonably hope that they’ll avoid making truly spectacular mistakes? How can they learn really new stuff without a lot of risk?

EY: The best answer I can offer is that they can be conservative externally and deterministic internally.

Eliezer never justifies why he wants determinism. It strikes me as a fairly bizarre requirement to impose. Or perhaps he means something different by determinism than does everyone else familiar with computers. Does he simply mean that he wants the hardware to be reliable?

What do you (and 'everyone else familiar with computers') mean by determinism?

A deterministic algorithm, if run twice with the same inputs, follows the same steps and produces the same outputs each time. A non-deterministic algorithm will not necessarily follow the same steps, and may not even generate the same result.

It has been part of the folklore since Dijkstra's "A Discipline of Programming" that well-written non-deterministic programs may be even easier to understand and prove correct than their deterministic counterparts.
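As an illustration of that point, here is a small sketch in the spirit of guarded commands: any out-of-order adjacent pair may be swapped, and the program does not say which one is chosen. The random generator stands in for that arbitrary choice and is an assumption of this sketch. The steps taken can differ from run to run, yet the result is provably the same, because every swap removes exactly one inversion and the loop exits only when no inversions remain.

```python
import random


def guarded_sort(values, rng):
    """Sort by repeatedly firing any one enabled guard items[i] > items[i + 1]."""
    items = list(values)
    trace = []  # indices of the guards that fired, in order
    while True:
        enabled = [i for i in range(len(items) - 1) if items[i] > items[i + 1]]
        if not enabled:
            return items, trace  # no guard is enabled: the list is sorted
        i = rng.choice(enabled)  # arbitrary choice among the enabled guards
        items[i], items[i + 1] = items[i + 1], items[i]
        trace.append(i)


data = [4, 3, 2, 1]
runs = [guarded_sort(data, random.Random(seed)) for seed in range(5)]

for result, trace in runs:
    print(result, trace)
```

The correctness argument never mentions which guard fires, which is why nondeterminism of this kind need not make a program harder to reason about.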

From the context, I think what EY means is that the AI must be structured so that all changes to source code can be proved safe-to-the-goal-system before being implemented.

On the other hand, I'm not sure why EY calls that "deterministic" rather than using another adjective.

The hardware and the software. Think of a provably correct compiler.

The main relevant paragraph in this interview is the one in part 2 whose first sentence is "The catastrophic sort of error, the sort you can’t recover from, is an error in modifying your own source code."

Interesting fact: The recent paper Finding and Understanding Bugs in C Compilers found miscompilation bugs in all compilers tested except for one, CompCert, which was unique in that its optimizer was built on a machine-checked proof framework.

Yes, but I don't see what relevance that paragraph has to his desire for 'determinism'. Unless he has somehow formed the impression that 'non-deterministic' means 'error-prone' or that it is impossible to formally prove correctness of non-deterministic algorithms. In fact, hardware designs are routinely proven correct (ironically, using modal logic) even though the hardware being vetted is massively non-deterministic internally.

Does the worse than random essay help to explain?

Not at all. That essay simply says that non-deterministic algorithms don't perform better than deterministic ones (for some meanings of 'non-deterministic algorithms'). But the claim that needs to be explained is how determinism helps to prevent "making truly spectacular mistakes".

Right. No doubt he is thinking he doesn't want a cosmic ray hitting his friendly algorithm, and turning it into an unfriendly one. That means robustness - or error detection and correction. Determinism seems to be a reasonable approach to this which makes proving things about the results about as easy as possible.