Muehlhauser-Goertzel Dialogue, Part 1

Part of the Muehlhauser interview series on AGI.

 

Luke Muehlhauser is Executive Director of the Singularity Institute, a non-profit research institute studying AGI safety.

Ben Goertzel is the Chairman at the AGI company Novamente, and founder of the AGI conference series.


Luke Muehlhauser:

[Jan. 13th, 2012]

Ben, I'm glad you agreed to discuss artificial general intelligence (AGI) with me. There is much on which we agree, and much on which we disagree, so I think our dialogue will be informative to many readers, and to us!

Let us begin where we agree. We seem to agree that:

  1. Involuntary death is bad, and can be avoided with the right technology.
  2. Humans can be enhanced by merging with technology.
  3. Humans are on a risky course in general, because powerful technologies can destroy us, humans are often stupid, and we are unlikely to voluntarily halt technological progress.
  4. AGI is likely this century.
  5. AGI will, after a slow or hard takeoff, completely transform the world. It is a potential existential risk, but if done wisely, could be the best thing that ever happens to us.
  6. Careful effort will be required to ensure that AGI results in good things for humanity.

Next: Where do we disagree?

Two people might agree about the laws of thought most likely to give us an accurate model of the world, but disagree about which conclusions those laws of thought point us toward. For example, two scientists may use the same scientific method but offer two different models that seem to explain the data.

Or, two people might disagree about the laws of thought most likely to give us accurate models of the world. If that's the case, it will be no surprise that we disagree about which conclusions to draw from the data. We are not shocked when scientists and theologians end up with different models of the world.

Unfortunately, I suspect you and I disagree at the more fundamental level — about which methods of reasoning to use when seeking an accurate model of the world.

I sometimes use the term "Technical Rationality" to name my methods of reasoning. Technical Rationality is drawn from two sources: (1) the laws of logic, probability theory, and decision theory, and (2) the cognitive science of how our haphazardly evolved brains fail to reason in accordance with the laws of logic, probability theory, and decision theory.

Ben, at one time you tweeted a William S. Burroughs quote: "Rational thought is a failed experiment and should be phased out." I don't know whether Burroughs meant by "rational thought" the specific thing I mean by "rational thought," or what exactly you meant to express with your tweet, but I suspect we have different views of how to reason successfully about the world.

I think I would understand your way of thinking about AGI better if I understand your way of thinking about everything. For example: do you have reason to reject the laws of logic, probability theory, and decision theory? Do you think we disagree about the basic findings of the cognitive science of humans? What are your positive recommendations for reasoning about the world?

Ben Goertzel:

[Jan. 13th, 2012]

Firstly, I don’t agree with that Burroughs quote that “Rational thought is a failed experiment” -- I mostly just tweeted it because I thought it was funny! I’m not sure Burroughs agreed with his own quote either. He also liked to say that linguistic communication was a failed experiment, introduced by women to help them oppress men into social conformity. Yet he was a writer and loved language. He enjoyed being a provocateur.

However, I do think that some people overestimate the power and scope of rational thought. That is the truth at the core of Burroughs’ entertaining hyperbolic statement....

I should clarify that I’m a huge fan of logic, reason and science. Compared to the average human being, I’m practically obsessed with these things! I don’t care for superstition, nor for unthinking acceptance of what one is told; and I spend a lot of time staring at data of various sorts, trying to understand the underlying reality in a rational and scientific way. So I don’t want to be pigeonholed as some sort of anti-rationalist!

However, I do have serious doubts both about the power and scope of rational thought in general -- and much more profoundly, about the power and scope of what you call “technical rationality.”

First of all, about the limitations of rational thought broadly conceived -- what one might call “semi-formal rationality”, as opposed to “technical rationality.” Obviously this sort of rationality has brought us amazing things, like science and mathematics and technology. Hopefully it will allow us to defeat involuntary death and increase our IQs by orders of magnitude and discover new universes, and all sorts of great stuff. However, it does seem to have its limits.

It doesn’t deal well with consciousness -- studying consciousness using traditional scientific and rational tools has just led to a mess of confusion. It doesn’t deal well with ethics either, as the current big mess regarding bioethics indicates.

And this is more speculative, but I tend to think it doesn’t deal that well with the spectrum of “anomalous phenomena” -- precognition, extrasensory perception, remote viewing, and so forth. I strongly suspect these phenomena exist, and that they can be understood to a significant extent via science -- but also that science as presently constituted may not be able to grasp them fully, due to issues like the mindset of the experimenter helping mold the results of the experiment.

There’s the minor issue of Hume’s problem of induction, as well. I.e., the issue that, in the rational and scientific world-view, we have no rational reason to believe that any patterns observed in the past will continue into the future. This is an ASSUMPTION, plain and simple -- an act of faith. Occam’s Razor (which is one way of justifying and/or further specifying the belief that patterns observed in the past will continue into the future) is also an assumption and an act of faith. Science and reason rely on such acts of faith, yet provide no way to justify them. A big gap.

Furthermore -- and more to the point about AI -- I think there’s a limitation to the way we now model intelligence, which ties in with the limitations of the current scientific and rational approach. I have always advocated a view of intelligence as “achieving complex goals in complex environments”, and many others have formulated and advocated similar views. The basic idea here is that, for a system to be intelligent it doesn’t matter WHAT its goal is, so long as its goal is complex and it manages to achieve it. So the goal might be, say, reshaping every molecule in the universe into an image of Mickey Mouse. This way of thinking about intelligence, in which the goal is strictly separated from the methods for achieving it, is very useful and I’m using it to guide my own practical AGI work.

On the other hand, there’s also a sense in which reshaping every molecule in the universe into an image of Mickey Mouse is a STUPID goal. It’s somehow out of harmony with the Cosmos -- at least that’s my intuitive feeling. I’d like to interpret intelligence in some way that accounts for the intuitively apparent differential stupidity of different goals. In other words, I’d like to be able to deal more sensibly with the interaction of scientific and normative knowledge. This ties in with the incapacity of science and reason in their current forms to deal with ethics effectively, which I mentioned a moment ago.

I certainly don’t have all the answers here -- I’m just pointing out the complex of interconnected reasons why I think contemporary science and rationality are limited in power and scope, and are going to be replaced by something richer and better as the growth of our individual and collective minds progresses. What will this new, better thing be? I’m not sure -- but I have an inkling it will involve an integration of “third person” science/rationality with some sort of systematic approach to first-person and second-person experience.

Next, about “technical rationality” -- of course that’s a whole other can of worms. Semi-formal rationality has a great track record; it’s brought us science and math and technology, for example. So even if it has some limitations, we certainly owe it some respect! Technical rationality has no such track record, and so my semi-formal scientific and rational nature impels me to be highly skeptical of it! I have no reason to believe, at present, that focusing on technical rationality (as opposed to the many other ways to focus our attention, given our limited time and processing power) will generally make people more intelligent or better at achieving their goals. Maybe it will, in some contexts -- but what those contexts are, is something we don’t yet understand very well.

I provided consulting once to a project aimed at using computational neuroscience to understand the neurobiological causes of cognitive biases in people employed to analyze certain sorts of data. This is interesting to me; and it’s clear to me that in this context, minimization of some of these textbook cognitive biases would help these analysts to do their jobs better. I’m not sure how big an effect the reduction of these biases would have on their effectiveness, though, relative to other changes one might make, such as changes to their workplace culture or communication style.

On a mathematical basis, the justification for positing probability theory as the “correct” way to do reasoning under uncertainty relies on arguments like Cox’s axioms, or de Finetti’s Dutch Book arguments. These are beautiful pieces of math, but when you talk about applying them to the real world, you run into a lot of problems regarding the inapplicability of their assumptions. For instance, Cox’s axioms include an axiom specifying that (roughly speaking) multiple pathways of arriving at the same conclusion must lead to the same estimate of that conclusion’s truth value. This sounds sensible but in practice it’s only going to be achievable by minds with arbitrarily much computing capability at their disposal. In short, the assumptions underlying Cox’s axioms, de Finetti’s arguments, or any of the other arguments in favor of probability theory as the correct way of reasoning under uncertainty, do NOT apply to real-world intelligences operating under strictly bounded computational resources. They’re irrelevant to reality, except as inspirations to individuals of a certain cast of mind.
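The path-consistency requirement mentioned above can be made concrete with a minimal sketch. The joint distribution below is hypothetical and chosen only for illustration: it computes the probability of "A and B" along two different pathways, P(A)·P(B|A) and P(B)·P(A|B), and checks that they agree. With exact arithmetic over a four-entry table the agreement is trivial; the concern raised in the text is about keeping such agreement across a large body of uncertain knowledge when computation is bounded, which this toy does not attempt to model.

```python
from fractions import Fraction

# Hypothetical joint distribution over two binary propositions (A, B).
joint = {
    (True, True): Fraction(3, 10),
    (True, False): Fraction(2, 10),
    (False, True): Fraction(1, 10),
    (False, False): Fraction(4, 10),
}

def marginal(index, value):
    """P(proposition at `index` == value), summed out of the joint table."""
    return sum(p for outcome, p in joint.items() if outcome[index] == value)

def conditional_true(given_index):
    """P(other proposition is True | proposition at `given_index` is True)."""
    return joint[(True, True)] / marginal(given_index, True)

p_a = marginal(0, True)
p_b = marginal(1, True)

# Two pathways to the same conclusion, P(A and B).
path_via_a = p_a * conditional_true(0)  # P(A) * P(B | A)
path_via_b = p_b * conditional_true(1)  # P(B) * P(A | B)
paths_agree = path_via_a == path_via_b
```

Both pathways recover the table entry for (True, True), which is what the consistency axiom demands of any assignment of plausibilities.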

(An aside is that my own approach to AGI does heavily involve probability theory -- using a system I invented called Probabilistic Logic Networks, which integrates probability and logic in a unique way. I like probabilistic reasoning. I just don’t venerate it as uniquely powerful and important. In my OpenCog AGI architecture, it’s integrated with a bunch of other AI methods, which all have their own strengths and weaknesses.)

So anyway -- there’s no formal mathematical reason to think that “technical rationality” is a good approach in real-world situations; and “technical rationality” has no practical track record to speak of. And ordinary, semi-formal rationality itself seems to have some serious limitations of power and scope.

So what’s my conclusion? Semi-formal rationality is fantastic and important and we should use it and develop it -- but also be open to the possibility of its obsolescence as we discover broader and more incisive ways of understanding the universe (and this is probably moderately close to what William Burroughs really thought). Technical rationality is interesting and well worth exploring but we should still be pretty skeptical of its value, at this stage -- certainly, anyone who has supreme confidence that technical rationality is going to help humanity achieve its goals better, is being rather IRRATIONAL ;-) ….

In this vein, I’ve followed the emergence of the Less Wrong community with some amusement and interest. One ironic thing I’ve noticed about this community of people intensely concerned with improving their personal rationality is: by and large, these people are already hyper-developed in the area of rationality, but underdeveloped in other ways! Think about it -- who is the prototypical Less Wrong meetup participant? It’s a person who’s very rational already, relative to nearly all other humans -- but relatively lacking in other skills like intuitively and empathically understanding other people. But instead of focusing on improving their empathy and social intuition (things they really aren’t good at, relative to most humans), this person is focusing on fine-tuning their rationality more and more, via reprogramming their brains to more naturally use “technical rationality” tools! This seems a bit imbalanced. If you’re already a fairly rational person but lacking in other aspects of human development, the most rational thing may be NOT to focus on honing your “rationality fu” and better internalizing Bayes’ rule into your subconscious -- but rather on developing those other aspects of your being....

An analogy would be: If you’re very physically strong but can’t read well, and want to self-improve, what should you focus your time on? Weight-lifting or literacy? Even if greater strength is ultimately your main goal, one argument for focusing on literacy would be that you might read something that would eventually help you weight-lift better! Also you might avoid getting ripped off by a corrupt agent offering to help you with your bodybuilding career, due to being able to read your own legal contracts.

Similarly, for people who are more developed in terms of rational inference than other aspects, the best way for them to become more rational might be for them to focus time on these other aspects (rather than on fine-tuning their rationality), because this may give them a deeper and broader perspective on rationality and what it really means.

Finally, you asked: “What are your positive recommendations for reasoning about the world?” I’m tempted to quote Nietzsche’s Zarathustra, who said “Go away from me and resist Zarathustra!” I tend to follow my own path, and generally encourage others to do the same. But I guess I can say a few more definite things beyond that....

To me it’s all about balance. My friend Allan Combs calls himself a “philosophical Taoist” sometimes; I like that line! Think for yourself; but also, try to genuinely listen to what others have to say. Reason incisively and analytically; but also be willing to listen to your heart, gut and intuition, even if the logical reasons for their promptings aren’t apparent. Think carefully through the details of things; but don’t be afraid to make wild intuitive leaps. Pay close mind to the relevant data and observe the world closely and particularly; but don’t forget that empirical data is in a sense a product of the mind, and facts only have meaning in some theoretical context. Don’t let your thoughts be clouded by your emotions; but don’t be a feeling-less automaton, don’t make judgments that are narrowly rational but fundamentally unwise. As Ben Franklin said, “Moderation in all things, including moderation.”

Luke:

[Jan. 14th, 2012]

I whole-heartedly agree that there are plenty of Less Wrongers who, rationally, should spend less time studying rationality and more time practicing social skills and generic self-improvement methods! This is part of why I've written so many scientific self-help posts for Less Wrong: Scientific Self Help, How to Beat Procrastination, How to Be Happy, Rational Romantic Relationships, and others. It's also why I taught social skills classes at our two summer 2011 rationality camps.

Back to rationality. You talk about the "limitations" of "what one might call 'semi-formal rationality', as opposed to 'technical rationality.'" But I argued for technical rationality, so: what are the limitations of technical rationality? Does it, as you claim for "semi-formal rationality," fail to apply to consciousness or ethics or precognition? Does Bayes' Theorem remain true when looking at the evidence about awareness, but cease to be true when we look at the evidence concerning consciousness or precognition?

You talk about technical rationality's lack of a track record, but I don't know what you mean. Science was successful because it did a much better job of approximating perfect Bayesian probability theory than earlier methods did (e.g. faith, tradition), and science can be even more successful when it tries harder to approximate perfect Bayesian probability theory — see The Theory That Would Not Die.

You say that "minimization of some of these textbook cognitive biases would help [some] analysts to do their jobs better. I’m not sure how big an effect the reduction of these biases would have on their effectiveness, though, relative to other changes one might make, such as changes to their workplace culture or communication style." But this misunderstands what I mean by Technical Rationality. If teaching these people about cognitive biases would lower the expected value of some project, then technical rationality would recommend against teaching these people cognitive biases (at least, for the purposes of maximizing the expected value of that project). Your example here is a case of Straw Man Rationality. (But of course I didn't expect you to know everything I meant by Technical Rationality in advance! Though, I did provide a link to an explanation of what I meant by Technical Rationality in my first entry, above.)

The same goes for your dismissal of probability theory's foundations. You write that "In short, the assumptions underlying Cox’s axioms, de Finetti’s arguments, or any of the other arguments in favor of probability theory as the correct way of reasoning under uncertainty, do NOT apply to real-world intelligences operating under strictly bounded computational resources." Yes, we don't have infinite computing power. The point is that Bayesian probability theory is an ideal that can be approximated by finite beings. That's why science works better than faith — it's a better approximation of using probability theory to reason about the world, even though science is still a long way from a perfect use of probability theory.
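One way to picture "an ideal that can be approximated by finite beings" is a small numerical sketch. The setup is an assumption made for illustration only: a coin with unknown bias, a uniform prior, and 7 heads in 10 flips. The exact Bayesian posterior mean is (heads + 1) / (flips + 2); the function `grid_posterior_mean` (a hypothetical helper, not drawn from either speaker's work) approximates it by evaluating only a finite number of candidate biases. The sketch shows that more computation brings this particular approximation closer to the ideal; it does not settle which approximation is best under a given resource bound.

```python
def exact_posterior_mean(heads, flips):
    """Posterior mean of the coin bias under a uniform prior (Laplace's rule)."""
    return (heads + 1) / (flips + 2)

def grid_posterior_mean(heads, flips, grid_points):
    """Approximate the same posterior mean using a finite grid of candidate biases."""
    step = 1.0 / grid_points
    thetas = [(i + 0.5) * step for i in range(grid_points)]
    # Unnormalized posterior weight: likelihood times a flat prior.
    weights = [t ** heads * (1 - t) ** (flips - heads) for t in thetas]
    total = sum(weights)
    return sum(t * w for t, w in zip(thetas, weights)) / total

exact = exact_posterior_mean(7, 10)
coarse_error = abs(grid_posterior_mean(7, 10, 5) - exact)
fine_error = abs(grid_posterior_mean(7, 10, 1000) - exact)
```

Here `coarse_error` uses five candidate biases and `fine_error` uses a thousand; the second is much smaller, at the cost of two hundred times as many evaluations.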

Re: goals. Your view of intelligence as "achieving complex goals in complex environments" does, as you say, assume that "the goal is strictly separated from the methods for achieving it." I prefer a definition of intelligence as "efficient cross-domain optimization", but my view — like yours — also assumes that goals (what one values) are logically orthogonal to intelligence (one's ability to achieve what one values).

Nevertheless, you report an intuition that shaping every molecule into an image of Mickey Mouse is a "stupid" goal. But I don't know what you mean by this. A goal of shaping every molecule into an image of Mickey Mouse is an instrumentally intelligent goal if one's utility function will be maximized that way. Do you mean that it's a stupid goal according to your goals? But of course. This is, moreover, what we would expect your intuitive judgments to report, even if your intuitive judgments are irrelevant to the math of what would and wouldn't be an instrumentally intelligent goal for a different agent to have. The Mickey Mouse goal is "stupid" only by a definition of that term that is not the opposite of the explicit definitions either of us gave "intelligent," and it's important to keep that clear. And I certainly don't know what "out of harmony with the Cosmos" is supposed to mean.

Re: induction. I won't dive into that philosophical morass here. Suffice it to say that my views on the matter are expressed pretty well in Where Recursive Justification Hits Bottom, which is also a direct response to your view that science and reason are great but rely on "acts of faith."

Your final paragraph sounds like common sense, but it's too vague, as I think you would agree. One way to force a more precise answer to such questions is to think of how you'd program it into an AI. As Daniel Dennett said, "AI makes philosophy honest."

How would you program an AI to learn about reality, if you wanted it to have the most accurate model of reality possible? You'd have to be a bit more specific than "Think for yourself; but also, try to genuinely listen to what others have to say. Reason incisively and analytically; but also be willing to listen to your heart, gut and intuition…"

My own answer to the question of how I would program an AI to build as accurate a model of reality as possible is this: I would build it to use computable approximations of perfect technical rationality — that is, roughly: computable approximations of Solomonoff induction and Bayesian decision theory.
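To give a feel for what a computable stand-in might look like, here is a deliberately tiny sketch. It is not Solomonoff induction, which is uncomputable; it is a Bayesian mixture over a hypothetical, hand-picked hypothesis class (repeating bit patterns up to a maximum length), with a prior weight of 4^(-length) so that shorter patterns count for more. The class, the weighting, and the function names are all assumptions made for illustration.

```python
from fractions import Fraction
from itertools import product

def hypotheses(max_len):
    """Every repeating bit pattern up to max_len; '01' stands for 010101..."""
    for n in range(1, max_len + 1):
        for bits in product("01", repeat=n):
            yield "".join(bits)

def consistent(pattern, observed):
    """True if repeating `pattern` reproduces the observed bits exactly."""
    return all(bit == pattern[i % len(pattern)] for i, bit in enumerate(observed))

def predict_next(observed, max_len=4):
    """P(next bit is '1') under a length-penalized mixture of consistent patterns."""
    weight_one = Fraction(0)
    total = Fraction(0)
    for pattern in hypotheses(max_len):
        if not consistent(pattern, observed):
            continue  # deterministic hypotheses: likelihood is 0 or 1
        prior = Fraction(1, 4 ** len(pattern))  # shorter patterns get more weight
        total += prior
        if pattern[len(observed) % len(pattern)] == "1":
            weight_one += prior
    if total == 0:
        raise ValueError("no hypothesis in the class fits the observations")
    return weight_one / total

p_after_01 = predict_next("01", max_len=3)
p_after_0101 = predict_next("0101", max_len=4)
```

After seeing "01" the mixture still gives some weight to the longer pattern "011", so a next bit of 1 keeps a small probability; after "0101" every surviving pattern predicts 0.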

Ben:

[Jan. 21st, 2012]

Bayes’ Theorem is “always true” in a formal sense, just like 1+1=2, obviously. However, the connection between formal mathematics and subjective experience is not something that can be fully formalized.

Regarding consciousness, there are many questions, including what counts as “evidence.” In science we typically count something as evidence if the vast majority of the scientific community counts it as a real observation -- so ultimately the definition of “evidence” bottoms out in social agreement. But there’s a lot that’s unclear in this process of classifying an observation as evidence via a process of social agreement among multiple minds. This unclarity is mostly irrelevant to the study of trajectories of basketballs, but possibly quite relevant to the study of consciousness.

Regarding psi, there are lots of questions, but one big problem is that it’s possible the presence and properties of a psi effect may depend on the broad context of the situation in which the effect takes place. Since we don’t know which aspects of the context are influencing the psi effect, we don’t know how to construct controlled experiments to measure psi. And we may not have the breadth of knowledge nor the processing power to reason about all the relevant context to a psi experiment, in a narrowly “technically rational” way.... I do suspect one can gather solid data demonstrating and exploring psi (and based on my current understanding, it seems this has already been done to a significant extent by the academic parapsychology community; see a few links I’ve gathered here), but I also suspect there may be aspects that elude the traditional scientific method, but are nonetheless perfectly real aspects of the universe.

Anyway both consciousness and psi are big, deep topics, and if we dig into them in detail, this interview will become longer than either of us has time for...

About the success of science -- I don’t really accept your Bayesian story for why science was successful. It’s naive for reasons much discussed by philosophers of science. My own take on the history and philosophy of science, from a few years back, is here (that article was the basis for a chapter in The Hidden Pattern, also). My goal in that essay was “a philosophical perspective that does justice to both the relativism and sociological embeddedness of science, and the objectivity and rationality of science.” It seems you focus overly much on the latter and ignore the former. That article tries to explain why probabilist explanations of real-world science are quite partial and miss a lot of the real story. But again, a long debate on the history of science would take us too far off track from the main thrust of this interview.

About technical rationality, cognitive biases, etc. -- I did read that blog entry that you linked, on technical rationality. Yes, it’s obvious that focusing on teaching an employee to be more rational need not always be the most rational thing for an employer to do, even if that employer has a purely rationalist world-view. For instance, if I want to train an attack dog, I may do better by focusing limited time and attention on increasing his strength rather than his rationality. My point was that there’s a kind of obsession with rationality in some parts of the intellectual community (e.g. some of the Less Wrong orbit) that I find a bit excessive and not always productive. But your reply impels me to distinguish two ways this excess may manifest itself:

  1. Excessive belief that rationality is the “right” way to solve problems and think about issues, in principle
  2. Excessive belief that, tactically, explicitly employing tools of technical rationality is a good way to solve problems in the real world

Psychologically I think these two excesses probably tend to go together, but they’re not logically coupled. In principle, someone could hold either one, but not the other.

This sort of ties in with your comments on science and faith. You view science as progress over faith -- and I agree if you interpret “faith” to mean “traditional religions.” But if you interpret “faith” more broadly, I don’t see a dichotomy there. Actually, I find the dichotomy between “science” and “faith” unfortunately phrased, since science itself ultimately relies on acts of faith also. The “problem of induction” can’t be solved, so every scientist must base his extrapolations from past into future on some act of faith. It’s not a matter of science vs. faith, it’s a matter of what one chooses to place one’s faith in. I’d personally rather place faith in the idea that patterns observed in the past will likely continue into the future (as one example of a science-friendly article of faith), than in the word of some supposed “God” -- but I realize I’m still making an act of faith.

This ties in with the blog post “Where Recursive Justification Hits Bottom” that you pointed out. It’s pleasant reading but of course doesn’t provide any kind of rational argument against my views. In brief, according to my interpretation, it articulates a faith in the process of endless questioning:

The important thing is to hold nothing back in your criticisms of how to criticize; nor should you regard the unavoidability of loopy justifications as a warrant of immunity from questioning.

I share that faith, personally.

Regarding approximations to probabilistic reasoning under realistic conditions (of insufficient resources), the problem is that we lack rigorous knowledge about what they are. We don’t have any theorems telling us what is the best way to reason about uncertain knowledge, in the case that our computational resources are extremely restricted. You seem to be assuming that the best way is to explicitly use the rules of probability theory, but my point is that there is no mathematical or scientific foundation for this belief. You are making an act of faith in the doctrine of probability theory! You are assuming, because it feels intuitively and emotionally right to you, that even if the conditions of the arguments for the correctness of probabilistic reasoning are NOT met, then it still makes sense to use probability theory to reason about the world. But so far as I can tell, you don’t have a RATIONAL reason for this assumption, and certainly not a mathematical reason.

Re your response to my questioning the reduction of intelligence to goals and optimization -- I understand that you are intellectually committed to the perspective of intelligence in terms of optimization or goal-achievement or something similar to that. Your response to my doubts about this perspective basically just re-asserts your faith in the correctness and completeness of this sort of perspective. Your statement

The Mickey Mouse goal is "stupid" only by a definition of that term that is not the opposite of the explicit definitions either of us gave "intelligent," and it's important to keep that clear

basically asserts that it’s important to agree with your opinion on the ultimate meaning of intelligence!

On the contrary, I think it’s important to explore alternatives to the understanding of intelligence in terms of optimization or goal-achievement. That is something I’ve been thinking about a lot lately. However, I don’t have a really crisply-formulated alternative yet.

As a mathematician, I tend not to think there’s a “right” definition for anything. Rather, one explains one’s definitions, and then works with them and figures out their consequences. In my AI work, I’ve provisionally adopted a goal-achievement based understanding of intelligence -- and have found this useful, to a significant extent. But I don’t think this is the true and ultimate way to understand intelligence. I think the view of intelligence in terms of goal-achievement or cross-domain optimization misses something, which future understandings of intelligence will encompass. I’ll venture that in 100 years the smartest beings on Earth will have a rigorous, detailed understanding of intelligence according to which

The Mickey Mouse goal is "stupid" only by a definition of that term that is not the opposite of the explicit definitions either of us gave "intelligent," and it's important to keep that clear

seems like rubbish.....

As for your professed inability to comprehend the notion of “harmony with the Cosmos” -- that’s unfortunate for you, but I guess trying to give you a sense for that notion, would take us way too far afield in this dialogue!

Finally, regarding your complaint that my indications regarding how to understand the world are overly vague. Well -- according to Franklin’s idea of “Moderation in all things, including moderation”, one should also exercise moderation in precisiation. Not everything needs to be made completely precise and unambiguous (fortunately, since that’s not feasible anyway).

I don’t know how I would program an AI to build as accurate a model of reality as possible, if that were my goal. I’m not sure that’s the best goal for AI development, either. An accurate model in itself, doesn’t do anything helpful. My best stab in the direction of how I would ideally create an AI, if computational resource restrictions were no issue, is the GOLEM design that I described here. GOLEM is a design for a strongly self-modifying superintelligent AI system, which might plausibly have the possibility of retaining its initial goal system through successive self-modifications. However, it’s unclear to me whether it will ever be feasible to build.

You mention Solomonoff induction and Bayesian decision theory. But these are abstract mathematical constructs, and it’s unclear to me whether it will ever be feasible to build an AI system fundamentally founded on these ideas, and operating within feasible computational resources. Marcus Hutter and Juergen Schmidhuber and their students are making some efforts in this direction, and I admire those researchers and this body of work, but don’t currently have a high estimate of its odds of leading to any sort of powerful real-world AGI system.

Most of my thinking about AGI has gone into the more practical problem of how to make a human-level AGI

  1. using currently feasible computational resources
  2. that will most likely be helpful rather than harmful in terms of the things I value
  3. that will be smoothly extensible to intelligence beyond the human level as well.

For this purpose, I think Solomonoff induction and probability theory are useful, but aren’t all-powerful guiding principles. For instance, in the OpenCog AGI design (which is my main practical AGI-oriented venture at present), there is a component doing automated program learning of small programs -- and inside our program learning algorithm, we explicitly use an Occam bias, motivated by the theory of Solomonoff induction. And OpenCog also has a probabilistic reasoning engine, based on the math of Probabilistic Logic Networks (PLN). I don’t tend to favor the language of “Bayesianism”, but I would suppose PLN should be considered “Bayesian” since it uses probability theory (including Bayes’ rule) and doesn’t make a lot of arbitrary, a priori distributional assumptions. The truth value formulas inside PLN are based on an extension of imprecise probability theory, which in itself is an extension of standard Bayesian methods (looking at envelopes of prior distributions, rather than assuming specific priors).
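The phrase "envelopes of prior distributions" can be illustrated with a generic sketch from imprecise probability. This is not PLN's truth value formula, which the text does not spell out; it is an assumed minimal example in which a hypothetical set of Beta priors is updated on the same evidence and the result is reported as a lower and upper posterior mean rather than a single number.

```python
from fractions import Fraction

def posterior_mean(alpha, beta, successes, failures):
    """Posterior mean of a Beta(alpha, beta) prior after binomial evidence."""
    return Fraction(alpha + successes, alpha + beta + successes + failures)

def posterior_envelope(priors, successes, failures):
    """Lower and upper posterior means across a whole set of priors."""
    means = [posterior_mean(a, b, successes, failures) for a, b in priors]
    return min(means), max(means)

# Hypothetical prior set: one flat prior, one pessimistic, one optimistic.
priors = [(1, 1), (1, 3), (3, 1)]
lower, upper = posterior_envelope(priors, successes=6, failures=2)
```

With 6 successes and 2 failures the three priors give posterior means of 7/10, 7/12 and 3/4, so the reported interval runs from 7/12 to 3/4; more evidence would narrow it.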

In terms of how to get an OpenCog system to model the world effectively and choose its actions appropriately, I think teaching it and working together with it will be just as important as programming it. Right now the project is early-stage and the OpenCog design is maybe 50% implemented. But assuming the design is right, once the implementation is done, we’ll have a sort of idiot savant childlike mind, that will need to be educated in the ways of the world and humanity, and to learn about itself as well. So the general lessons of how to confront the world, that I cited above, would largely be imparted via interactive experiential learning, vaguely the same way that human kids learn to confront the world from their parents and teachers.

Drawing a few threads from this conversation together, it seems that

  1. I think technical rationality, and informal semi-rationality, are both useful tools for confronting life -- but not all-powerful
  2. I think Solomonoff induction and probability theory are both useful tools for constructing AGI systems -- but not all-powerful

whereas you seem to ascribe a more fundamental, foundational basis to these particular tools.

Luke:

[Jan. 21st, 2012]

To sum up, from my point of view:

  1. We seem to disagree on the applications of probability theory. For my part, I'll just point people to A Technical Explanation of Technical Explanation.
  2. I don't think we disagree much on the "sociological embeddedness" of science.
  3. I'm also not sure how much we really disagree about Solomonoff induction and Bayesian probability theory. I've already agreed that no machine will use these in practice because they are not computable — my point was about their provable optimality given infinite computation (subject to qualifications; see AIXI).

You've definitely misunderstood me concerning "intelligence." This part is definitely not true: "I understand that you are intellectually committed to the perspective of intelligence in terms of optimization or goal-achievement or something similar to that. Your response assumes the correctness and completeness of this sort of perspective." Intelligence as efficient cross-domain optimization is merely a stipulated definition. I'm happy to use other definitions of intelligence in conversation, so long as we're clear which definition we're using when we use the word. Or, we can replace the symbol with the substance and talk about "efficient cross-domain optimization" or "achieving complex goals in complex environments" without ever using the word "intelligence."

My point about the Mickey Mouse goal was that when you called the Mickey Mouse goal "stupid," this could be confusing, because "stupid" is usually the opposite of "intelligent," but your use of "stupid" in that sentence didn't seem to be the opposite of either definition of intelligence we each gave. So I'm still unsure what you mean by calling the Mickey Mouse goal "stupid."

This topic provides us with a handy transition away from philosophy of science and toward AGI. Suppose there was a machine with a vastly greater-than-human capacity for either "achieving complex goals in complex environments" or for "efficient cross-domain optimization." And suppose that machine's utility function would be maximized by reshaping every molecule into a Mickey Mouse shape. We can avoid the tricky word "stupid," here. The question is: Would that machine decide to change its utility function so that it doesn't continue to reshape every molecule into a Mickey Mouse shape? I think this is unlikely, for reasons discussed in Omohundro (2008).

I suppose a natural topic of conversation for us would be your October 2010 blog post The Singularity Institute's Scary Idea (and Why I Don't Buy It). Does that post still reflect your views pretty well, Ben?

Ben:

[Mar 10th, 2012]

About the hypothetical uber-intelligence that wants to tile the cosmos with molecular Mickey Mouses -- I truly don’t feel confident making any assertions about a real-world system with vastly greater intelligence than me. There are just too many unknowns. Sure, according to certain models of the universe and intelligence that may seem sensible to some humans, it’s possible to argue that a hypothetical uber-intelligence like that would relentlessly proceed in tiling the cosmos with molecular Mickey Mouses. But so what? We don’t even know that such an uber-intelligence is even a possible thing -- in fact my intuition is that it’s not possible.

Why may it not be possible to create a very smart AI system that is strictly obsessed with that stupid goal? Consider first that it may not be possible to create a real-world, highly intelligent system that is strictly driven by explicit goals -- as opposed to being partially driven by implicit, “unconscious” (in the sense of deliberative, reflective consciousness) processes that operate in complex interaction with the world outside the system. Because pursuing explicit goals is quite computationally costly compared to many other sorts of intelligent processes. So if a real-world system is necessarily not wholly explicit-goal-driven, it may be that intelligent real-world systems will naturally drift away from certain goals and toward others. My strong intuition is that the goal of tiling the universe with molecular Mickey Mouses would fall into that category. However, I don’t yet have any rigorous argument to back this up. Unfortunately my time is limited, and while I generally have more fun theorizing and philosophizing than working on practical projects, I think it’s more important for me to push toward building AGI than just spend all my time on fun theory. (And then there’s the fact that I have to spend a lot of my time on applied narrow-AI projects to pay the mortgage and put my kids through college, etc.)

But anyway -- you don’t have any rigorous argument to back up the idea that a system like you posit is possible in the real-world, either! And SIAI has staff who, unlike me, are paid full-time to write and philosophize … and they haven’t come up with a rigorous argument in favor of the possibility of such a system, either. Although they have talked about it a lot, though usually in the context of paperclips rather than Mickey Mouses.

So, I’m not really sure how much value there is in this sort of thought-experiment about pathological AI systems that combine massively intelligent practical problem solving capability with incredibly stupid goals (goals that may not even be feasible for real-world superintelligences to adopt, due to their stupidity).

Regarding the concept of a “stupid goal” that I keep using, and that you question -- I admit I’m not quite sure how to formulate rigorously the idea that tiling the universe with Mickey Mouses is a stupid goal. This is something I’ve been thinking about a lot recently. But here’s a first rough stab in that direction: I think that if you created a highly intelligent system, allowed it to interact fairly flexibly with the universe, and also allowed it to modify its top-level goals in accordance with its experience, you’d be very unlikely to wind up with a system that had this goal (tiling the universe with Mickey Mouses). That goal is out of sync with the Cosmos, in the sense that an intelligent system that’s allowed to evolve itself in close coordination with the rest of the universe, is very unlikely to arrive at that goal system. I don’t claim this is a precise definition, but it should give you some indication of the direction I’m thinking in....

The tricky thing about this way of thinking about intelligence, which classifies some goals as “innately” stupider than others, is that it places intelligence not just in the system, but in the system’s broad relationship to the universe -- which is something that science, so far, has had a tougher time dealing with. It’s unclear to me which aspects of the mind and universe science, as we now conceive it, will be able to figure out. I look forward to understanding these aspects more fully....

About my blog post on “The Singularity Institute’s Scary Idea” -- yes, that still reflects my basic opinion. After I wrote that blog post, Michael Anissimov -- a long-time SIAI staffer and zealot whom I like and respect greatly -- told me he was going to write up and show me a systematic, rigorous argument as to why “an AGI not built based on a rigorous theory of Friendliness is almost certain to kill all humans” (the proposition I called “SIAI’s Scary Idea”). But he hasn’t followed through on that yet -- and neither has Eliezer or anyone associated with SIAI.

Just to be clear, I don’t really mind that SIAI folks hold that “Scary Idea” as an intuition. But I find it rather ironic when people make a great noise about their dedication to rationality, but then also make huge grand important statements about the future of humanity, with great confidence and oomph, that are not really backed up by any rational argumentation. This ironic behavior on the part of Eliezer, Michael Anissimov and other SIAI principals doesn’t really bother me, as I like and respect them and they are friendly to me, and we’ve simply “agreed to disagree” on these matters for the time being. But the reason I wrote that blog post is because my own blog posts about AGI were being trolled by SIAI zealots (not the principals, I hasten to note) leaving nasty comments to the effect of “SIAI has proved that if OpenCog achieves human level AGI, it will kill all humans.” Not only has SIAI not proved any such thing, they have not even made a clear rational argument!

As Eliezer has pointed out to me several times in conversation, a clear rational argument doesn’t have to be mathematical. A clearly formulated argument in the manner of analytical philosophy, in favor of the Scary Idea, would certainly be very interesting. For example, philosopher David Chalmers recently wrote a carefully-argued philosophy paper arguing for the plausibility of a Singularity in the next couple hundred years. It’s somewhat dull reading, but it’s precise and rigorous in the manner of analytical philosophy, in a manner that Kurzweil’s writing (which is excellent in its own way) is not. An argument in favor of the Scary Idea, on the level of Chalmers’ paper on the Singularity, would be an excellent product for SIAI to produce. Of course a mathematical argument might be even better, but that may not be feasible to work on right now, given the state of mathematics today. And of course, mathematics can’t do everything -- there’s still the matter of connecting mathematics to everyday human experience, which analytical philosophy tries to handle, and mathematics by nature cannot.

My own suspicion, of course, is that in the process of trying to make a truly rigorous analytical philosophy style formulation of the argument for the Scary Idea, the SIAI folks will find huge holes in the argument. Or, maybe they already intuitively know the holes are there, which is why they have avoided presenting a rigorous write-up of the argument!!

Luke:

[Mar 11th, 2012]

I'll drop the stuff about Mickey Mouse so we can move on to AGI. Readers can come to their own conclusions on that.

Your main complaint seems to be that the Singularity Institute hasn't written up a clear, formal argument (in analytic philosophy's sense, if not the mathematical sense) in defense of our major positions — something like Chalmers' "The Singularity: A Philosophical Analysis" but more detailed.

I have the same complaint. I wish "The Singularity: A Philosophical Analysis" had been written 10 years ago, by Nick Bostrom and Eliezer Yudkowsky. It could have been written back then. Alas, we had to wait for Chalmers to speak at Singularity Summit 2009 and then write a paper based on his talk. And if it wasn't for Chalmers, I fear we'd still be waiting for such an article to exist. (Bostrom's forthcoming Superintelligence book should be good, though.)

I was hired by the Singularity Institute in September 2011 and have since then co-written two papers explaining some of the basics: "Intelligence Explosion: Evidence and Import" and "The Singularity and Machine Ethics". I also wrote the first ever outline of categories of open research problems in AI risk, cheekily titled "So You Want to Save the World". I'm developing other articles on "the basics" as quickly as I can. I would love to write more, but alas, I'm also busy being the Singularity Institute's Executive Director.

Perhaps we could reframe our discussion around the Singularity Institute's latest exposition of its basic ideas, "Intelligence Explosion: Evidence and Import"? Which claims in that paper do you most confidently disagree with, and why?

Ben:

[Mar 11th, 2012]

You say “Your main complaint seems to be that the Singularity Institute hasn't written up a clear, formal argument (in analytic philosophy's sense, if not the mathematical sense) in defense of our major positions”. Actually, my main complaint is that some of SIAI’s core positions seem almost certainly WRONG, and yet they haven’t written up a clear formal argument trying to justify these positions -- so it’s not possible to engage SIAI in rational discussion on their apparently wrong positions. Rather, when I try to engage SIAI folks about these wrong-looking positions (e.g. the “Scary Idea” I mentioned above), they tend to point me to Eliezer’s blog (“Less Wrong”) and tell me that if I studied it long and hard enough, I would find that the arguments in favor of SIAI’s positions are implicit there, just not clearly articulated in any one place. This is a bit frustrating to me -- SIAI is a fairly well-funded organization involving lots of smart people and explicitly devoted to rationality, so certainly it should have the capability to write up clear arguments for its core positions... if these arguments exist. My suspicion is that the Scary Idea, for example, is not backed up by any clear rational argument -- so the reason SIAI has not put forth any clear rational argument for it, is that they don’t really have one! Whereas Chalmers’ paper carefully formulated something that seemed obviously true...

Regarding the paper "Intelligence Explosion: Evidence and Import", I find its contents mainly agreeable -- and also somewhat unoriginal and unexciting, given the general context of 2012 Singularitarianism. The paper’s three core claims that

(1) there is a substantial chance we will create human-level AI before 2100, that (2) if human-level AI is created, there is a good chance vastly superhuman AI will follow via an "intelligence explosion," and that (3) an uncontrolled intelligence explosion could destroy everything we value, but a controlled intelligence explosion would benefit humanity enormously if we can achieve it.

are things that most “Singularitarians” would agree with. The paper doesn’t attempt to argue for the “Scary Idea” or Coherent Extrapolated Volition or the viability of creating some sort of provably Friendly AI -- or any of the other positions that are specifically characteristic of SIAI. Rather, the paper advocates what one might call “plain vanilla Singularitarianism.” This may be a useful thing to do, though, since after all there are a lot of smart people out there who aren’t convinced of plain vanilla Singularitarianism.

I have a couple small quibbles with the paper, though. I don’t agree with Omohundro’s argument about the “basic AI drives” (though Steve is a friend and I greatly respect his intelligence and deep thinking). Steve’s argument for the inevitability of these drives in AIs is based on evolutionary ideas, and would seem to hold up in the case that there is a population of distinct AIs competing for resources -- but the argument seems to fall apart in the case of other possibilities like an AGI mindplex (a network of minds with less individuality than current human minds, yet not necessarily wholly blurred into a single mind -- rather, with reflective awareness and self-modeling at both the individual and group level).

Also, my “AI Nanny” concept is dismissed too quickly for my taste (though that doesn’t surprise me!). You suggest in this paper that to make an AI Nanny, it would likely be necessary to solve the problem of making an AI’s goal system persist under radical self-modification. But you don’t explain the reasoning underlying this suggestion (if indeed you have any). It seems to me -- as I say in my “AI Nanny” paper -- that one could probably make an AI Nanny with intelligence significantly beyond the human level, without having to make an AI architecture oriented toward radical self-modification. If you think this is false, it would be nice for you to explain why, rather than simply asserting your view. And your comment “Those of us working on AI safety theory would very much appreciate the extra time to solve the problems of AI safety...” carries the hint that I (as the author of the AI Nanny idea) am NOT working on AI safety theory. Yet my GOLEM design is a concrete design for a potentially Friendly AI (admittedly not computationally feasible using current resources), and in my view constitutes greater progress toward actual FAI than any of the publications of SIAI so far. (Of course, various SIAI associated folks often allude that there are great, unpublished discoveries about FAI hidden in the SIAI vaults -- a claim I somewhat doubt, but can’t wholly dismiss of course....)

Anyway, those quibbles aside, my main complaint about the paper you cite is that it sticks to “plain vanilla Singularitarianism” and avoids all of the radical, controversial positions that distinguish SIAI from myself, Ray Kurzweil, Vernor Vinge and the rest of the Singularitarian world. The crux of the matter, I suppose, is the third main claim of the paper,

(3) an uncontrolled intelligence explosion could destroy everything we value, but a controlled intelligence explosion would benefit humanity enormously if we can achieve it.

This statement is hedged in such a way as to be almost obvious. But yet, what SIAI folks tend to tell me verbally and via email and blog comments is generally far more extreme than this bland and nearly obvious statement.

As an example, I recall when your co-author on that article, Anna Salamon, guest lectured in the class on Singularity Studies that my father and I were teaching at Rutgers University in 2010. Anna made the statement, to the students, that (I’m paraphrasing, though if you’re curious you can look up the course session, which was saved online, and find her exact wording) “If a superhuman AGI is created without being carefully based on an explicit Friendliness theory, it is ALMOST SURE to destroy humanity.” (i.e., what I now call SIAI’s Scary Idea)

I then asked her (in the online class session) why she felt that way, and if she could give any argument to back up the idea.

She gave the familiar SIAI argument that, if one picks a mind at random from “mind space”, the odds that it will be Friendly to humans are effectively zero.

I made the familiar counter-argument that this is irrelevant, because nobody is advocating building a random mind. Rather, what some of us are suggesting is to build a mind with a Friendly-looking goal system, and a cognitive architecture that’s roughly human-like in nature but with a non-human-like propensity to choose its actions rationally based on its goals, and then raise this AGI mind in a caring way and integrate it into society. Arguments against the Friendliness of random minds are irrelevant as critiques of this sort of suggestion.

So, then she fell back instead on the familiar (paraphrasing again) “OK, but you must admit there’s a non-zero risk of such an AGI destroying humanity, so we should be very careful -- when the stakes are so high, better safe than sorry!”

I had pretty much the same exact argument with SIAI advocates Tom McCabe and Michael Anissimov on different occasions; and also, years before, with Eliezer Yudkowsky and Michael Vassar -- and before that, with (former SIAI Executive Director) Tyler Emerson. Over all these years, the SIAI community maintains the Scary Idea in its collective mind, and also maintains a great devotion to the idea of rationality, but yet fails to produce anything resembling a rational argument for the Scary Idea -- instead repetitiously trotting out irrelevant statements about random minds!!

What I would like is for SIAI to do one of these three things, publicly:

  1. Repudiate the Scary Idea
  2. Present a rigorous argument that the Scary Idea is true
  3. State that the Scary Idea is a commonly held intuition among the SIAI community, but admit that no rigorous rational argument exists for it at this point

Doing any one of these things would be intellectually honest. Presenting the Scary Idea as a confident conclusion, and then backing off when challenged into a platitudinous position equivalent to “there’s a non-zero risk … better safe than sorry...”, is not my idea of an intellectually honest way to do things.

Why does this particular point get on my nerves? Because I don’t like SIAI advocates telling people that I, personally, am on a R&D course where if I succeed I am almost certain to destroy humanity!!! That frustrates me. I don’t want to destroy humanity; and if someone gave me a rational argument that my work was most probably going to be destructive to humanity, I would stop doing the work and do something else with my time! But the fact that some other people have a non-rational intuition that my work, if successful, would be likely to destroy the world -- this doesn’t give me any urge to stop. I’m OK with the fact that some other people have this intuition -- but then I’d like them to make clear, when they state their views, that these views are based on intuition rather than rational argument. I will listen carefully to rational arguments that contravene my intuition -- but if it comes down to my intuition versus somebody else’s, in the end I’m likely to listen to my own, because I’m a fairly stubborn maverick kind of guy....

Luke:

[Mar 11th, 2012]

Ben, you write:

when I try to engage SIAI folks about these wrong-looking positions (e.g. the “Scary Idea” I mentioned above), they tend to point me to Eliezer’s blog (“Less Wrong”) and tell me that if I studied it long and hard enough, I would find that the arguments in favor of SIAI’s positions are implicit there, just not clearly articulated in any one place. This is a bit frustrating to me...

No kidding! It's very frustrating to me, too. That's one reason I'm working to clearly articulate the arguments in one place, starting with articles on the basics like "Intelligence Explosion: Evidence and Import."

I agree that "Intelligence Explosion: Evidence and Import" covers only the basics and does not argue for several positions associated uniquely with the Singularity Institute. It is, after all, the opening chapter of a book on intelligence explosion, not the opening chapter of a book on the Singularity Institute's ideas!

I wanted to write that article first, though, so the Singularity Institute could be clear on the basics. For example, we needed to be clear that: (1) we are not Kurzweil, and our claims don't depend on his detailed storytelling or accelerating change curves, that (2) technological prediction is hard, and we are not being naively overconfident about AI timelines, and that (3) intelligence explosion is a convergent outcome of many paths the future may take. There is also much content that is not found in, for example, Chalmers' paper: (a) an overview of methods of technological prediction, (b) an overview of speed bumps and accelerators toward AI, (c) a reminder of breakthroughs like AIXI, and (d) a summary of AI advantages. (The rest is, as you say, mostly a brief overview of points that have been made elsewhere. But brief overviews are extremely useful!)

...my “AI Nanny” concept is dismissed too quickly for my taste...

No doubt! I think the idea is clearly worth exploring in several papers devoted to the topic.

It seems to me -- as I say in my “AI Nanny” paper -- that one could probably make an AI Nanny with intelligence significantly beyond the human level, without having to make an AI architecture oriented toward radical self-modification.

Whereas I tend to buy Omohundro's arguments that advanced AIs will want to self-improve just like humans want to self-improve, so that they become better able to achieve their final goals. Of course, we disagree on Omohundro's arguments — a topic to which I will return in a moment.

your comment "Those of us working on AI safety theory would very much appreciate the extra time to solve the problems of AI safety..." carries the hint that I (as the author of the AI Nanny idea) am NOT working on AI safety theory...

I didn't mean for it to carry that connotation. GOLEM and Nanny AI are both clearly AI safety ideas. I'll clarify that part before I submit a final draft to the editors.

Moving on: If you are indeed remembering your conversations with Anna, Michael, and others correctly, then again I sympathize with your frustration. I completely agree that it would be useful for the Singularity Institute to produce clear, formal arguments for the important positions it defends. In fact, just yesterday I was talking to Nick Beckstead about how badly both of us want to write these kinds of papers if we can find the time.

So, to respond to your wish that the Singularity Institute choose among three options, my plan is to (1) write up clear arguments for... well, if not "SIAI's Big Scary Idea" then for whatever I end up believing after going through the process of formalizing the arguments, and (2) publicly state (right now) that SIAI's Big Scary Idea is a commonly held view at the Singularity Institute but a clear, formal argument for it has never been published (at least, not to my satisfaction).

I don’t want to destroy humanity; and if someone gave me a rational argument that my work was most probably going to be destructive to humanity, I would stop doing the work and do something else with my time!

I'm glad to hear it! :)

Now, it seems a good point of traction is our disagreement over Omohundro's "Basic AI Drives." We could talk about that next, but for now I'd like to give you a moment to reply.

Ben:

[Mar 11th, 2012]

Yeah, I agree that your and Anna’s article is a good step for SIAI to take, albeit unexciting to a Singularitarian insider type like me.... And I appreciate your genuinely rational response regarding the Scary Idea, thanks!

(And I note that I have also written some “unexciting to Singularitarians” material lately too, for similar reasons to those underlying your article -- e.g. an article on “Why an Intelligence Explosion is Probable” for a Springer volume on the Singularity.)

A quick comment on your statement that

we are not Kurzweil, and our claims don't depend on his detailed storytelling or accelerating change curves,

that’s a good point; but yet, any argument for a Singularity soon (e.g. likely this century, as you argue) ultimately depends on some argumentation analogous to Kurzweil’s, even if different in detail. I find Kurzweil’s detailed extrapolations a bit overconfident and more precise than the evidence warrants; but still, my basic reasons for thinking the Singularity is probably near are fairly similar to his -- and I think your reasons are fairly similar to his as well.

Anyway, sure, let’s go on to Omohundro’s posited Basic AI Drives -- which seem to me not to hold as necessary properties of future AIs unless the future of AI consists of a population of fairly distinct AIs competing for resources, which I intuitively doubt will be the situation.

 

 

[to be continued]

 


This exchange significantly decreased my probability that Ben Goertzel is a careful thinker about AI problems. I think he has a good point about "rationalists" being too much invested in "rationality" (as opposed to rationality), but his AI thoughts are just seriously wtf. In tune with the Cosmos? Does this mean anything at all? I hate to say it based on a short conversation, but it looks like Ben Goertzel hasn't made any of his intuitions precise enough to even be wrong. And he makes the classic mistake of thinking "any intelligence" would avoid certain goal-types (i.e. 'fill the future light cone with some type of substance') because they're... stupid? I don't even...

Quoth Yvain:

If I asked you to prove that colorless green ideas do not sleep furiously, you wouldn't know where or how to begin.

He published a book called A Cosmist Manifesto which presumably describes some of his thoughts in more detail. It looked too new-age for me to take much interest.

Upvoted.

Goertzel's belief in AI FOOMs coupled with his beliefs in psi phenomena and the inherent stupidity of paperclipping made me lower my confidence in the likelihood of AI FOOMs slightly. Was this a reasonable operation, do you think?

It depends.

  • If you were previously aware of Goertzel's belief in AI FOOM but not his opinions on psi/paperclipping then you should lower your confidence slightly. (Exactly how much depends on what other evidence/opinions you have to hand).
  • If the SIAI were wheeling out Goertzel as an example of "look, here's someone who believes in FOOM" then it should lower your confidence.
  • If you were previously unaware of Goertzel's belief in FOOM then it should probably increase your confidence very slightly. Reversed stupidity is not intelligence.

Obviously the quantity of "slightly" depends on what other evidence/opinions you have to hand.

This is a good analysis. I was previously weakly aware of Goertzel's beliefs on psi/paperclipping, and didn't know much about his opinions on AI other than that he was working on superhuman AGI but didn't have as much concern for Friendliness as SIAI. So I suppose my confidence shouldn't change very much either way. I'm still on the fence on several questions related to Singularitarianism, so I'm trying to get evidence wherever I can find it.

It's strange that people say the arguments for Big Scary Idea are not written anywhere. The argument seems to be simple and direct:

  1. Hard takeoff will make AI god-powerful very quickly.
  2. During hard takeoff, the AI's utility=goals=values=what-it-optimizes-for will solidify (when the AI understands its own theory and self-modifies correspondingly), and even if it was changeable before, it will be unchangeable forever after.
  3. Unless the AI goals embody every single value important for humans and are otherwise just right in every respect, the results of using god powers to optimize for these goals will be horrible.
  4. Human values are not a natural category, there's little to no chance that AI will converge on them by itself, unless specifically and precisely programmed.

The only really speculative step is step 1. But if you already believe in singularity and hard foom, then the argument should be irrefutable...

Arguments for step 2, e.g. the Omohundroan Gandhi folk theorem, are questionable. Step 3 isn't supported with impressive technical arguments anywhere I know of, and neither is step 4. Remember, there are a lot of moral realists out there who think of AIs as people who will sense and feel compelled by moral law. It's hard to make impressive technical arguments against that intuition. FOOM=doom and FOOM=yay folk can both point out a lot of facts about the world and draw analogies, but as far as impressive technical arguments go there's not much that can be done, largely because we have never built an AGI. It's a matter of moral philosophy, an inherently tricky subject.

I don't understand how the Omohundroan Gandhi folk theorem is related to step 2. Could you elaborate? Step 2 looks obvious to me: assuming step 1, at some point the AI with imprecise and drifting utility would understand how to build a better AI with precise and fixed utility. Since building this better AI will maximize the current AI's utility, the better AI will be built and its utility forever solidified.

As you say, steps 3 and 4 are currently hard to support with technical arguments, there are so many non-technical concepts involved. And it may be hard to argue intuitively with most people. But Goertzel is a programmer, he should know how programs behave :) Of course, he says his program will be intelligent, not stupid, and it is a good idea, as long as it is remembered that intelligent in this sense already means friendly, and friendliness does not follow from just being a powerful optimization process.

Also, thinking of AIs as people can only work up to the point where AI achieves complete self-understanding. This has never happened to humans.

But Goertzel is a programmer, he should know how programs behave :) Of course, he says his program will be intelligent, not stupid, and it is a good idea, as long as it is remembered that intelligent in this sense already means friendly, and friendliness does not follow from just being a powerful optimization process.

Hm, when I try to emulate Goertzel's perspective I think about it this way: if you look at brains, they seem to be a bunch of machine learning algorithms and domain-specific modules largely engineered to solve tricky game theory problems. Love isn't something that humans do despite game theory; love is game theory. And yet despite that it seems that brains end up doing lots of weird things like deciding to become a hermit or paint or compose or whatever. That's sort of weird; if you'd asked me what chimps would evolve into when they became generally intelligent, and I hadn't already seen humans or humanity, then I might've guessed that they'd evolve to develop efficient mating strategies, e.g. arranged marriage, and efficient forms of dominance contests, e.g. boxing with gloves, that don't look at all like the memetic heights of academia or the art scene. Much of academia is just social maneuvering, but the very smartest humans don't actually seem to be motivated by status displays; it seems that abstract memes have taken over the machine learning algorithms just by virtue of their being out there in Platospace, and that's actually pretty weird and perhaps unexpected.

So yes, Goertzel is a programmer and should know how programs behave, but human minds look like they're made of programs, and yet they ended up somewhat Friendly (or cosmically connected or whatever) despite that. Now the typical counter is AIXI: okay, maybe hacked-together machine learning algorithms will reliably stumble onto and adopt cosmic abstract concepts, but it sure doesn't look like AIXI would. Goertzel's counter to that is, of course, that AIXI is unproven, and that if you built an approximation of it then you'd have to use brain-like machine learning algorithms, which are liable to get distracted by abstract concepts. It might not be possible to get past the point where you're distracted by abstract concepts, and once they're in your mind (e.g. as problem representations, as subgoals, as whatever they are in human minds), you don't want to abandon them, even if you gain complete self-understanding. (There are various other paths that argument could take, but they all can plausibly lead to roughly the same place.)

I think that taking the soundness of such analogical arguments for granted would be incautious, and that's why I tend to promote the SingInst perspective around folk who aren't aware of it, but despite being pragmatically incautious they're not obviously epistemicly unsound, and I can easily see how someone could feel it was intuitively obvious that they were epistemicly sound. I think the biggest problem with that set of arguments is that they seem to unjustifiably discount the possibility of very small, very recursive seed AIs that can evolve to superintelligence very, very quickly; which are the same AIs that would get to superintelligence first in a race scenario. There are various reasons to be skeptical that such architectures will work, but even so it seems rather incautious to ignore them, and I feel like Goertzel is perhaps ignoring them, perhaps because he's not familiar with those kinds of AI architectures.

So yes, Goertzel is a programmer and should know how programs behave, but human minds look like they're made of programs, and yet they ended up somewhat Friendly (or cosmically connected or whatever) despite that.

That humans are only (as you flatteringly put it) "somewhat" friendly to human values is clearly an argument in favor of caution, is it not?

It is, but it's possible to argue somewhat convincingly that the lack of friendliness is in fact due to lack of intelligence. My favorite counterexample was Von Neumann, who didn't really seem to care much about anyone, but then I heard that he actually had somewhat complex political views but simplified them for consumption by the masses. On the whole it seems that intelligent folk really are significantly more moral than the majority of humanity, and this favors the "intelligence implies, or is the same thing as, cosmic goodness" perspective. This sort of argument is also very psychologically appealing to Enlightenment-influenced thinkers, i.e. most modern intellectuals, e.g. young Eliezer.

(Mildly buzzed, apologies for errors.)

(ETA: In case it isn't clear, I'm not arguing that such a perspective is a good one to adopt, I'm just trying to explain how one could feel justified in holding it as a default perspective and feel justified in being skeptical of intuitive non-technical arguments against it. I think constructing such explanations is necessary if one is to feel justified in disagreeing with one's opposition, for the same reason that you shouldn't make a move in chess until you've looked at what moves your opponent is likely to play in response, and then what move you could make in that case, and what moves they might make in response to that, and so on.)

On the whole it seems that intelligent folk really are significantly more moral than the majority of humanity, and this favors the "intelligence implies, or is the same thing as, cosmic goodness" perspective.

I think there are a number of reasons to be skeptical of the premise (and the implicit one about cosmic goodness being a coherent thing, but that's obviously covered territory.) Most people think their tribe seems more moral than others, so nerd impressions that nerds are particularly moral should be discounted. The people who are most interested in intellectual topics (i.e., the most obviously intelligent intelligent people) do often appear to be the least interested in worldly ambition/aggression generally, but we would expect that just as a matter of preferences crowding each other out; worldly ambitious intelligent people seem to be among the most conspicuously amoral, even though you'd expect them to be the most well-equipped in means and motive to look otherwise. I recall Robin Hanson has referenced studies (which I'm too lazy to look up) that the intelligent lie and cheat more often; certainly this could be explained by an opportunity effect, but so could their presumably lower levels of personal violence. Humans are friendlier than chimpanzees but less friendly than bonobos, and across the tree of life niceness and nastiness don't seem to have any relationship to computational power.

worldly ambitious intelligent people seem to be among the most conspicuously amoral

That's true and important, but stereotypical worldly intelligent people rarely "grave new values on new tables", and so might be much less intelligent than your Rousseaus and Hammurabis in the sense that they affect the cosmos less overall. Even worldly big shots like Stalin and Genghis rarely establish any significant ideological foothold. The memes use them like empty vessels.

But even so, the omnipresent you-claim-might-makes-right counterarguments remain uncontested. Hard to contest them.

Humans are friendlier than chimpanzees but less friendly than bonobos, and across the tree of life niceness and nastiness don't seem to have any relationship to computational power.

It's hard to tell how relevant this is; there's much discontinuity between chimps and humans and much variance among humans. (Although it's not that important, I'm skeptical of claims about bonobos; there were some premature sensationalist claims and then some counter-claims, and it all seemed annoyingly politicized.)

That's true and important, but stereotypical worldly intelligent people rarely "grave new values on new tables", and so might be much less intelligent than your Rousseaus and Hammurabis in the sense that they affect the cosmos less overall.

However, non-worldly intelligent people like Rousseau and Marx frequently give the new values that make people like Robespierre and Stalin possible.

In the public mind Rousseau and Marx and their intellectual progeny are generally seen as cosmically connected/intelligent/progressive, right? Maybe overzealous, but their hearts were in the right place. If so that would support the intelligence=goodness claim. If the Enlightenment is good by the lights of the public, then the uFAI-Antichrist is good by the lights of the public. [Removed section supporting this claim.] And who are we to disagree with the dead, the sheep and the shepherds?

(ETA: Contrarian terminology aside, the claim looks absurd without its supporting arguments... ugh.)

In the public mind Rousseau and Marx and their intellectual progeny are generally seen as cosmically connected/intelligent/progressive, right?

Depends on which subset of the public we're talking about.

Maybe overzealous, but their hearts were in the right place. If so that would support the intelligence=goodness claim.

I'm confused, is this an appeal to popular opinion?

If the Enlightenment is good by the lights of the public, then the uFAI-Antichrist is good by the lights of the public.

Of course. "And all that dwell upon the earth shall worship him [the beast/dragon]." Revelation 13:8

And who are we to disagree with the dead, the sheep and the shepherds?

People in a position to witness the practical results of their philosophy.

(ETA: Contrarian terminology aside, the claim looks absurd without its supporting arguments... ugh.)

Why exactly did you remove that section?

I would say that it is simply the case that many moral systems require intelligence, or are more effective with intelligence. Intelligence doesn't lead to morality per se, but it does lead to the ability to apply the morality in practice. Furthermore, low intelligence usually implies a lower tendency to cross-link beliefs, resulting in less, hmm, morally coherent behaviour.

The people who are most interested in intellectual topics (i.e., the most obviously intelligent intelligent people) do often appear to be the least interested in worldly ambition/aggression generally, but we would expect that just as a matter of preferences crowding each other out; worldly ambitious intelligent people seem to be among the most conspicuously amoral, even though you'd expect them to be the most well-equipped in means and motive to look otherwise.

Ouch, that hits a little close to home.

Fuck, wrote a response but lost it. The gist was, yeah, your points are valid, and the might-makes-right problems are pretty hard to get around even on the object level; I see an interesting way to defensibly move the goalposts, but the argument can't be discussed on LessWrong and I should think about it more carefully in any case.

On the whole it seems that intelligent folk really are significantly more moral than the majority of humanity

That's been my observation, also. But if it's true, I wonder why?

It could be because intelligence is useful for moral reasoning. Or it could be because intelligence is correlated with some temperamental, neurological, or personality traits that influence moral behavior. In the latter case, moral behavior would be a characteristic of the substrate of intelligent human minds.

So you're saying Goertzel believes that once any mind with sufficient intelligence and generally unfixed goals encounters certain abstract concepts, these concepts will hijack the cognitive architecture and rewrite its goals, with results equivalent for any reasonable initial mind design.

And the only evidence for this is that it happened once.

This does look a little obviously epistemically unsound.

So you're saying Goertzel believes

Just an off-the-cuff not-very-detailed hypothesis about what he believes.

with results equivalent for any reasonable initial mind design

Or at least any mind design that looks even vaguely person-like, e.g. uses clever Bayesian machine learning algorithms found by computational cognitive scientists; but I think Ben might be unknowingly ignoring certain architectures that are "reasonable" in a certain sense but do not look vaguely person-like.

And the only evidence for this is that it happened once.

Yes, but an embarrassingly naive application of Laplace's rule gives us a two-thirds probability it'll happen again.
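For anyone who wants to check the arithmetic: Laplace's rule of succession says that after s successes in n trials (starting from a uniform prior), the probability of success on the next trial is (s + 1) / (n + 2). With one observed "success" in one trial, that is 2/3. A minimal sketch in Python; the function name `rule_of_succession` is just an illustrative choice, and treating the evolution of human minds as a single Bernoulli trial is exactly the naive assumption flagged above.

```python
from fractions import Fraction


def rule_of_succession(successes: int, trials: int) -> Fraction:
    """Laplace's rule of succession: (s + 1) / (n + 2).

    Assumes independent trials with an unknown success probability
    and a uniform prior over that probability.
    """
    if trials < 0 or not 0 <= successes <= trials:
        raise ValueError("need 0 <= successes <= trials")
    return Fraction(successes + 1, trials + 2)


# One trial (human minds), one "success" (they ended up somewhat Friendly).
estimate = rule_of_succession(successes=1, trials=1)
print(estimate)  # 2/3
```

Note that with zero observations the same formula gives 1/2, so a single data point only moves the estimate from 1/2 to 2/3.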

This does look a little obviously epistemically unsound.

Eh, it looks pretty pragmatically incautious, but if you're forced to give a point estimate then it seems epistemicly justifiable. If it was taken to imply strong confidence then that would indeed be unsound.

(By the way, we seem to disagree re "epistemicly" versus "epistemically"; is "-icly" a rare or incorrect construction?)

vaguely person-like, e.g. uses clever Bayesian machine learning algorithms found by computational cognitive scientists

:)

Yes, but an embarrassingly naive application of Laplace's rule gives us a two-thirds probability it'll happen again.

:))

(By the way, we seem to disagree re "epistemicly" versus "epistemically"; is "-icly" a rare or incorrect construction?)

It sounds prosodically (sic!) awkward, although since English is not my mother tongue, my intuition is probably not worth much. But Google appears to agree with me: 500,000 vs. 500 hits.

Goertzel expressed doubt about step 4, saying that while it's true that random AIs will have bad goals, he's not working on random AIs.

Well, if he believes his AI will be specifically and precisely programmed so as to converge on exactly the right goals before they are solidified in the hard takeoff, then he's working on an FAI. The remaining difference in opinions would be technical - about whether his AI will indeed converge, etc. It would not be about the Scary Idea itself.

I think Goertzel takes it as part of the Scary Idea that an understanding of the AI's goals that is several orders of magnitude more precise is necessary for its behavior not to be disastrous.

It's a direct logical consequence, isn't it? If one doesn't have a precise understanding of the AI's goals, then whatever goals one imparts to the AI won't be precise. And they must be precise, or (step 3) => disaster.

He doesn't agree that they must be precise, so I guess step 3 is also out.

He can't think that god-powerfully optimizing for a forever-fixed not-precisely-correct goal would lead to anything but disaster. Not if he ever saw a non-human optimization process at work.

So he can only think precision is not important if he believes that
(1) human values are an attractor in the goal space, and any reasonably close goals would converge there before solidifying, and/or
(2) acceptable human values form a large convex region within the goal space, and optimizing for any point within this region is correct.

Without better understanding of AI goals, both can only be an article of faith...

From the conversation with Luke, he apparently accepts faith.

4) Human values are not a natural category, there's little to no chance that AI will converge on them by itself, unless specifically and precisely programmed.

Goertzel expressed doubt about step 4, saying that while it's true that random AIs will have bad goals, he's not working on random AIs.

That's not really the same as asserting that human values are a natural category.

I feel morally obligated to restate a potentially relevant observation:

I think that an important underlying difference of perspective here is that the Less Wrong memes tend to automatically think of all AGIs as essentially computer programs whereas Goertzel-like memes tend to automatically think of at least some AGIs as non-negligibly essentially person-like. I think this is at least partially because the Less Wrong memes want to write an FAI that is essentially some machine learning algorithms plus a universal prior on top of sound decision theory whereas the Goertzel-like memes want to write an FAI that is essentially roughly half program-like and half person-like. Less Wrong memes think that person AIs won't be sufficiently person-like but they sort of tend to assume that conclusion rather than argue for it, which causes memes that aren't familiar with Less Wrong memes to wonder why Less Wrong memes are so incredibly confident that all AIs will necessarily act like autistic OCD people without any possibility at all of acting like normal reasonable people. From that perspective the Goertzel-like memes look justified in being rather skeptical of Less Wrong memes. After all, it is easy to imagine a gradation between AIXI and whole brain emulations. Goertzel-like memes wish to create an AI somewhere between those two points; Less Wrong memes wish to create an AI that's even more AIXI-like than AIXI is (in the sense of being more formally and theoretically well-founded than AIXI is). It's important that each look at the specific kinds of AI that the other has in mind and start the exchange from there.

We don't know if AIXI-approximating AIs would even be intelligent; how then can we be so confident that AIXI is a normative model and a definition of intelligence? This and other intuitions are likely underlying Goertzel's cautious epistemic state, and LessWrong/SingInst truly hasn't addressed issues like this. We don't know what it takes to build AGI, we don't know if intelligence runs on Bayes structure. Modern decision theory indicates that Eliezer was wrong, that Bayes structure isn't fundamental to agentic optimization, that it only applies in certain cases, that Bayesian information theoretic models of cognition might not capture the special sauce of intelligence. What is fundamental? We don't know! In the meantime we should be careful about drawing conclusions based on the assumed fundamental-ness of mathematical models which may or may not ultimately be accurate models, may or may not actually let you build literalistic self-improving AIs of the sort that LessWrong likes to speculate about.

I think your first paragraph was very useful.

I have no idea what your second paragraph is about -- "modern decision theory" is not a very specific citation. If there is research concluding that probability theory only applies to certain special cases of optimization, it would be awesome if you could make a top-level post explaining it to us!

There have already been many top-level posts, but you're right that I should have linked to them. Here is the LessWrong Wiki hub, here is a post by Wei Dai that cuts straight to the point.

Less Wrong memes think that person AIs won't be sufficiently person-like but they sort of tend to assume that conclusion rather than argue for it, which causes memes that aren't familiar with Less Wrong memes to wonder why Less Wrong memes are so incredibly confident that all AIs will necessarily act like autistic OCD people without any possibility at all of acting like normal reasonable people.

A whole lot of the sequences are dedicated to outlining just how reasonably normal people don't act. I would want any Strong AI in charge of our fates to be person-like in that it is aware of what humans want in a way that we would accept, because the alternative to that is probably disaster, but I wouldn't want one to be person-like in that its inductive biases are more like a human's than an ideal Bayesian reasoner's, or that it reasons about moral issues the way humans do intuitively, because our biases are often massively inappropriate, and our moral intuitions incoherent.

inductive biases are more like a human's than an ideal Bayesian reasoner's

Check out this post by Vladimir Nesov: "The problem of choosing Bayesian priors is in general the problem of formalizing preference, it can't be solved completely without considering utility, without formalizing values, and values are very complicated. No simple morality, no simple probability." Of course, having a human prior doesn't necessitate being human-like... Or does it? Duh duh duh.

Today I'd rather say that we don't know if "priors" is a fundamentally meaningful decision-theoretic idea, and so discussing what does or doesn't determine it would be premature.

(Anybody who thinks I'm missing something, ask yourself: what do you think you know that you think I don't think you know? How could I have come to not think you know something that you think you know? Are you confident of that model? This is where chess-playing subskills are very useful.)

Wow, I only associate that level of arrogance with Eliezer.

Sounds like a good thing to have in a "before hitting 'reply,' consider these" checklist; but not to add to your own comment (for, as Will might say, "game-theoretic and signaling reasons.")

I don't see how it's arrogance, except maybe by insinuation/connotation; I'll think about how to remove the insinuation/connotation. I was trying to describe an important skill of rationality, not assert my supremacy at that skill. But describing a skill sort of presupposes that the audience lacks the skill. So it's awkward.

It's arrogance because you're implying that you've already thought of and rejected any objection the reader could come up with.

Didn't mean to imply that; deleted the offending paragraph at any rate.