A Brief Overview of Machine Ethics

lukeprog

5th Mar 2011

Earlier, I lamented that even though Eliezer named scholarship as one of the Twelve Virtues of Rationality, there is surprisingly little interest in (or citing of) the academic literature on some of Less Wrong's central discussion topics.

Previously, I provided an overview of formal epistemology, that field of philosophy that deals with (1) mathematically formalizing concepts related to induction, belief, choice, and action, and (2) arguing about the foundations of probability, statistics, game theory, decision theory, and algorithmic learning theory.

Now, I've written Machine Ethics is the Future, an introduction to machine ethics, the academic field that studies the problem of how to design artificial moral agents that act ethically (along with a few related problems). There, you will find PDFs of a dozen papers on the subject.

Enjoy!

91 comments, sorted by top scoring

[-]Scott Alexander15y310

I started looking through some of the papers and so far I don't feel enlightened.

I've never been able to tell whether I don't understand Kantian ethics, or Kantian ethics is just stupid. Take Prospects For a Kantian Machine. The first part is about building a machine whose maxims satisfy the universalizability criterion: that they can be universalized without contradicting themselves.

But this seems to rely a lot on being very good at parsing categories in exactly the right way to come up with the answer you wanted originally.

For example, it seems reasonable to have maxims that only apply to certain portions of the population, for example: "I, who am a policeman, will lock up this bank robber awaiting trial in my county jail" generalizes to "Other policemen will also lock up bank robbers awaiting trial in their county jails" if you're a human moral philosopher who knows how these things are supposed to work.

But I don't see what's stopping a robot from coming up with "Everyone will lock up everyone else" or "All the world's policemen will descend upon this one bank robber and try to lock him up in their own county jails". After all, Kant univer... (read more)

[-]Scott Alexander15y190

Allen - Prolegomena to Any Future Moral Agent places a lot of emphasis on figuring out if a machine can be truly moral, in various metaphysical senses like "has the capacity to disobey the law, but doesn't" and "deliberates in a certain way". Not only is it possible that these are meaningless, but in a superintelligence the metaphysical implications should really take second place to the not-getting-turned-into-paperclips implications.

He proposes a moral Turing Test, where we call a machine moral if it can answer moral questions indistinguishably from a human. But Clippy would also pass this test, if a consequence of passing was that the humans lowered their guard/let him out of the box. In fact, every unfriendly superintelligence with a basic knowledge of human culture and a motive would pass.

Utilitarianism is considered difficult to implement because it's computationally impossible to predict all consequences. Given that any AI worth its salt would have a module for predicting the consequences of its actions anyway, and that the potential danger of the AI is directly related to how good this module is, that seems like a non-problem. It wouldn't be perfect, but it... (read more)

[-]Scott Alexander15y120

Mechanized Deontic Logic is pretty okay, despite the dread I had because of the name. I'm no good at formal systems, but as far as I can understand it looks like a logic for proving some simple results about morality: the example they give is "If you should see to it that X, then you should see to it that you should see to it that X."

I can't immediately see a way this would destroy the human race, but that's only because it's nowhere near the point where it involves what humans actually think of as "morality" yet.
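For readers who want to see the shape of the quoted result, it can be written as a single formula. This is only a sketch: the notation is borrowed from the "sees to it that" (stit) tradition in deontic logic and is an assumption here, not necessarily the paper's own. Read $\odot$ as "it ought to be that" and $[\alpha\ \mathsf{stit}\colon X]$ as "agent $\alpha$ sees to it that $X$".

```latex
\[
  \odot[\alpha\ \mathsf{stit}\colon X]
  \;\rightarrow\;
  \odot\bigl[\alpha\ \mathsf{stit}\colon \odot[\alpha\ \mathsf{stit}\colon X]\bigr]
\]
```

That is, if the agent ought to see to it that X, then the agent ought to see to it that it ought to see to it that X.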

[-]Scott Alexander15y160

Utilibot Project is about creating a personal care robot that will avoid accidentally killing its owner by representing the goal of "owner health" in a utilitarian way. It sounds like it might work for a robot with a very small list of potential actions (like "turn on stove" and "administer glucose") and a very specific list of owner health indicators (like "hunger" and "blood glucose level"), but it's not very relevant to the broader Friendly AI program.
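To make the "small list of potential actions" picture concrete, here is a minimal sketch of expected-utility action selection. It is not taken from the Utilibot paper: the action names come from the comment above, while the probabilities, utility numbers, and function names are all hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Tuple

# (probability of outcome, change in an assumed "owner health" utility)
Outcome = Tuple[float, float]


def expected_utility(outcomes: List[Outcome]) -> float:
    """Probability-weighted sum of utilities for one action."""
    return sum(p * u for p, u in outcomes)


def choose_action(actions: Dict[str, List[Outcome]]) -> str:
    """Return the action whose expected utility is highest."""
    return max(actions, key=lambda name: expected_utility(actions[name]))


# Hypothetical numbers: the stove usually helps a little but carries a
# small risk of a very bad outcome; glucose helps more with a mild downside.
ACTIONS: Dict[str, List[Outcome]] = {
    "turn on stove": [(0.95, 2.0), (0.05, -50.0)],
    "administer glucose": [(0.90, 5.0), (0.10, -1.0)],
    "do nothing": [(1.0, 0.0)],
}

if __name__ == "__main__":
    for name, outcomes in ACTIONS.items():
        print(f"{name}: {expected_utility(outcomes):+.2f}")
    print("chosen:", choose_action(ACTIONS))
```

With a handful of actions and indicators the whole table can be enumerated, which is why this approach seems workable for a care robot and says little about agents with open-ended action spaces.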

Having read as many papers as I have time to before dinner, my provisional conclusion is that Vladimir Nesov hit the nail on the head.

5lukeprog15y

I don't disagree with much of anything you've said here, by the way. Remember that I'm writing a book that, for most of its length, will systematically explain why the proposed solutions in the literature won't work. The problem is that SIAI is not even engaging in that discussion. Where is the detailed explanation of why these proposed solutions won't work? I don't get the impression someone like Yudkowsky has even read these papers, let alone explained why the proposed solutions won't work. SIAI is just talking a different language than the professional machine ethics community is. Most of the literature on machine ethics is not that useful, but that's true of almost any subject. The point of a literature hunt is to find the gems here and there that genuinely contribute to the important project of Friendly AI. Another point is to interact with the existing literature and explain to people why it's not going to be that easy.

9Vladimir_Nesov15y

My sentiment about the role of engaging existing literature on machine ethics is analogous to what you describe in a recent post on your blog. Particularly this: [...] You either push the boundaries, or fight the good fight. And the good fight is best fought by writing textbooks and opening schools, not by public debates with distinguished shamans. But it's not entirely fair, since some of machine ethics addresses a reasonable problem of making good-behaving robots, which just happens to have the same surface feature of considering moral valuation of decisions of artificial reasoners, but on closer inspection is mostly unrelated to the problem of FAI.

4lukeprog15y

Sure. One of the hopes of my book is, as stated earlier, to bring people up to where Eliezer Yudkowsky was circa 2004. Also, I worry that something is being overlooked by the LW / SIAI community because the response to suggestions in the literature has been so quick and dirty. I'm on the prowl for something that's been missed because nobody has done a thorough literature search and detailed rebuttal. We'll see what turns up.

2lukeprog15y

BTW, I so identify with this quote: [...] In fact, I've said the same thing myself, in slightly different words.

1AlephNeil15y

Every sufficiently smart person who thinks about Kantian ethics comes up with this objection. I don't believe it's possible to defend against it entirely. However... [...] That may be what Kant actually says (does he?) but if he does then I think he's wrong about his own theory. As I understand it, what you're supposed to do is look at the bit of reasoning which is actually causing you to want to do X and see whether that generalizes, not cast around for a bit of reasoning which would (or in this case, would not) generalize, and then pretend to be basing your action on that. In the example you mention, you should only generalize to "everyone will deceive everyone all the time" if what you're considering doing is deceiving this person simply because he's a person. If you want to deceive him because of his intention to commit murder, and would not want to otherwise, then the thing you generalize must have this feature. Similarly, I might try to justify lying to someone this morning on the basis that it generalizes to "I, who am AlephNeil, always lie on the morning of the 13th day of March 2011 if it is to my advantage" which is both consistent and advantageous (to me). But really I would be lying purely because it's to my advantage - the date and time, and the fact that I am AlephNeil, don't enter into the computation.

0lukeprog15y

For Googleability, I'll note that this objection is called the problem of maxim specification.

0Document15y

That currently has no Google results besides your post.

1lukeprog15y

Yes, sorry. "Maxim specification" won't give you much, but variations on that will. People don't usually write "the problem of maxim specification" but instead things like "...specifying the maxim..." or "the maxim... specified..." and so on. It in general isn't easily Googled like "is-ought gap" is. But here is one use.

[-]Wei Dai15y150

Earlier, I lamented that even though Eliezer named scholarship as one of the Twelve Virtues of Rationality, there is surprisingly little interest in (or citing of) the academic literature on some of Less Wrong's central discussion topics.

Eliezer defined the virtue of scholarship as (a) "Study many sciences and absorb their power as your own." He was silent on whether, after you survey a literature and conclude that nobody has the right approach yet, you should (b) still cite the literature (presumably to show that you're familiar with it), and/or (c) rebut the wrong approaches (presumably to try to lead others away from the wrong paths).

I'd say that (b) and (c) are much more situational than (a). (b) is mostly a signaling issue. If you can convince your audience to take you seriously without doing it, then why bother? And (c) depends on how much effort you'd have to spend to convince others that they are wrong, and how likely they are to contribute to the correct solution after you turn them around. Or perhaps you're not sure that your approach is right either, and think it should just be explored alongside others.

At least some of the lack of scholarship that you see h... (read more)

1lukeprog15y

This is an excellent comment, and you're probably right to some degree. But I will say, I've learned many things already from the machine ethics literature, and I've only read about 1/4 of it so far.

1Vladimir_Nesov15y

Such as?

0lukeprog15y

Hold, please. I'm writing several articles and a book on this. :)

1lukeprog15y

But for now, this was Louie Helm's favorite paper among those we read during our survey of the literature on machine ethics.

1Pavitra15y

Citing the literature makes it easier for your reader to verify your reasoning. If you don't, then a proper confirmation or rebuttal requires (more) independent scholarship to discover the relevant existing literature from scratch.

[-]XiXiDu15y80

...there is surprisingly little interest in (or citing of) the academic literature on some of Less Wrong's central discussion topics.

I think one of the reasons is that the LW/SIAI crowd thinks all other people are below their standards. For example:

I tried - once - going to an interesting-sounding mainstream AI conference that happened to be in my area. I met ordinary research scholars and looked at their posterboards and read some of their papers. I watched their presentations and talked to them at lunch. And they were way below the level of the b

... (read more)

[-]Vladimir_Nesov15y210

I think one of the reasons is that the LW/SIAI crowd thinks all other people are below their standards.

"Below their standards" is a bad way to describe this situation, it suggests some kind of presumption of social superiority, while the actual problem is just that the things almost all researchers write presumably on this topic are not helpful. They are either considering a different problem (e.g. practical ways of making real near-future robots not kill wrong people, where it's perfectly reasonable to say that philosophy of consequentialism is useless, since there is no practical way to apply it; or applied ethics, where we ask how humans should act), or contemplate the confusingness of the problem, without making useful progress (a lot of philosophy).

This property doesn't depend on whether we are making progress ourselves, so it's perfectly possible (and to a large extent true) that progress that is up to the standard of being useful is not made by SIAI either.

A point where SIAI makes visible and useful progress is in communicating the difficulty of the problem, the very fact that most of what is purportedly progress on FAI is actually not.

3lukeprog15y

This is, in fact, the main goal of my book on the subject. Except, I'll do it in more detail, and spend more time citing the specific examples from the literature that are wrong. Eliezer has done some of this, but there's lots more to do.

5benelliott15y

Your definition of 'LW/SIAI crowd' appears to be 'Eliezer Yudkowsky'.

-4XiXiDu15y

My current perception is that there are not many independent minds to be found here. I perceive there to be a strong tendency to jump if Yudkowsky tells people to jump. I'm virtually the only true critic of the SIAI, which is really sad and frightening. There are many examples that show how people just 'trust' him or believe in him and I haven't been able to figure out good reasons to do so. ETA: I removed the links to various 'examples' of what I have written above. Please PM me if you are curious.

6benelliott15y

Your karma balance should be enough to prove that you definitely aren't the only critic on LW. Others who also disagree with him about various things have even higher balances. There are definitely a number of true fanboys on this site, they may even be the majority (although I hope not), but they certainly aren't the whole of the LW crowd, and it is intellectually dishonest to put words in the rest of our mouths just by quoting Eliezer. As for SIAI, by its very purpose it only attracts people who agree with Eliezer's philosophy of AI. There is nothing wrong with this. There is no good reason for someone who doesn't believe in the necessity or possibility of FAI to go work there. Would you also object if it seemed like everyone working for Village Reach agreed about giving vaccinations to African children being a good idea?

6XiXiDu15y

See, that one person who donated the current balance of his bank account got 52 upvotes for it. Now I'm not particularly shocked by him doing that or the upvotes. I don't worry that all that money might be better spent somehow. What drives me is curiosity mixed with my personality, I want to do what's right. That is the reason why I criticize and why some comments may seem, or actually are, derogatory. I think it needs to be said, I believe I can provoke feedback that way and learn more about the underlying rationale. I desperately try to figure out if there is something I am missing. I haven't read most of the sequences yet, let me explain why. I'm a really slow reader, I have almost no education and need a lot of time to learn anything. I did a lot of spot tests, reading various posts and came across people who read the sequences but haven't been able to conclude that they should stop doing anything except trying to earn money for the SIAI. My conclusion is that reading the sequences shouldn't be a priority right now but rather learning the mathematical basics, programming and reading various books. But I still try to spend some time here to see if that assessment might be wrong. My current take on the whole issue is that the sequences do not provide many useful insights. I already know that by all that we know today AGI is possible and that it is unlikely that humans are the absolute limit when it comes to intelligence. I intuitively agree with the notion that AGI in its abstract form (intelligence as an algorithm) doesn't share our values if you do not deliberately 'tell' it to care. I see that one can outweigh even a low probability of risks from AI by assuming a future galactic civilization that is at stake. So what is my problem? I've written hundreds of comments about all kinds of problems I have with it, but maybe the biggest problem is a simple bias. I have an overwhelming gut feeling telling me that something is wrong with all this. I also do not trus

9Zack_M_Davis15y

It's worth noting that AGI is decades away; no one's trying to take over the universe just yet. In this light, donations to SingInst now are better seen as funding preliminary research and outreach regarding this important problem, rather than funding AI construction. [...] What sort of data and progress reports are you looking for? Glancing at the first two pages of the SingInst blog, I see a list of 2010 publications, and newsletters for last July and October. There's certainly room for criticism (e.g., "Why no newsletter since last October?" or "All this outreach is not very useful; I want to see incremental progress towards FAI"), but I wouldn't say there've been no progress reports.

7XiXiDu15y

* What are they working on right now?
* Why are they working on it?
* What constitutes a success of the current project?
* How much money was spent on that project?
* What could be done with more or less money?

As far as I know Yudkowsky is currently writing a book. He earnt $95,550 last year. What I can't reconcile right now is the strong commitment and what is actually being done. Quite a few people here actually seem to donate considerable amounts of their income to the SIAI. No doubt writing the sequences, a book and creating an online community is pretty cool but does not seem to be too cost intensive. At least others manage to do that without lots of people sending them money. I myself donated 3 times. But why are many people acting like the SIAI is the only charity that currently deserves funding, why is nobody asking if they actually need more money or if they are maybe sustainable right now? I haven't heard anything about the acquisition of a supercomputer, field experiments in neuroscience or the hiring of mathematicians. All that would justify further donations. I feel people here are not critical and demanding enough.

5Nissa Seru15y

Upvoting for honesty and posting a true rejection. [...] Even if you're a slow reader, I think that it is very, very worth it to read most of the sequences. I've not read QM, Evolution, Decision Theory, and parts of Metaethics/ Human's guide to words, but I think that reading the others has drastically increased my rationality (especially the Core Sequences.) I don't think that reading technical books would have done so nearly as much because I find reading prose much more engaging than math. [...] I've recently concluded that I should place a 'highly suspect' marker on my thoughts (especially negative generalizations) if I am very hungry or tired. I tend to be quite irritable in both cases -- I'll get into arguments in which I'm really not interested in finding truth, but just getting a high from bashing the other person into the ground (please note that I am sharing my own experiences, not accusing you of this.) You may want to type these comments out so that you don't lose the thought but wait to post them until you're feeling better. [...] I've had these same thoughts before and since resolved them, but I've run out of mental steam and need to do some schoolwork. I may edit this or make a separate reply to this later. Edit: Bolded script in this post was added for clarification -- bolding does not indicate emphasis here.

3benelliott15y

Interesting thought, I'll admit I hadn't actually considered that (I have a general problem with being too trusting and not seeing ulterior motives, although I suspect most people really aren't very dishonest). I can see a few reasons why others might not be asking:
1) It's unlikely to get an answer. There hasn't been a whole lot of willingness to respond to similar requests in the past, EY has a thing about not giving in to demands. This doesn't really explain why people are still donating.
2) The number of genuine Dr Evils in the world is very small. Historically the most dangerous individuals have been the well-intentioned but deluded rather than the rationally malicious, which is odd since the latter category seem much more dangerous and therefore provides evidence of their rarity. Maybe people are just making an expected utility calculation and determining that the Dr Evil hypothesis is unlikely enough to trust SIAI anyway.
3) Eliezer is not the whole of SIAI, he is not even in charge. Some of the people involved have existing track records, if there is a conspiracy it runs very deep. I suppose it's possible he has tricked every other member of the organization, but we are now adding a lot of burdensome details to what was already a fairly unlikely hypothesis.
4) If there are any real Dr Evils out there, then SIAI transparency might actually help them by giving away SIAI ideas while Dr Evil keeps his ideas to himself and as a result finishes his design first.
5) If I was Dr Evil trying to build an AI, then I wouldn't say that was what I was doing, since AI is quite a hard sell and will only get donations from a limited demographic (even more so for an out-of-the-mainstream idea like FAI). I would found the "organization for the protection of puppies kittens and bunnies" or something like that, which will probably get more donation money (or maybe even go into business rather than charity, since current evidence suggests that is overwhelmingly the most effecti

2wedrifid15y

Yes, it is impossible to distinguish a sincere optimist from a perfectly selfish sociopath. At least until they gain power (or move to an audience where the signalling game is played at a higher level of sophistication than that of conveying altruism).

0Vladimir_Nesov15y

In that case, I would expect a stupid Eliezer Yudkowsky. But one shouldn't actually reason this way, the question is, what do you anticipate, given observations actually made; not how plausible are the observations actually made, given an uncaused hypothesis.

3Pavitra15y

You can't compute P(H|E) without computing P(E|H).
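The dependence is visible in Bayes' theorem, where P(E|H) appears in both the numerator and the denominator. A minimal sketch follows; the function and parameter names are illustrative assumptions, not anything from the thread.

```python
def posterior(prior_h: float, e_given_h: float, e_given_not_h: float) -> float:
    """Return P(H|E) by Bayes' theorem.

    prior_h        -- P(H), the prior probability of the hypothesis
    e_given_h      -- P(E|H), how likely the evidence is if H is true
    e_given_not_h  -- P(E|not H), how likely the evidence is otherwise
    """
    # Total probability of the evidence, P(E), over both cases.
    evidence = e_given_h * prior_h + e_given_not_h * (1.0 - prior_h)
    if evidence == 0.0:
        raise ValueError("P(E) is zero, so P(H|E) is undefined")
    return e_given_h * prior_h / evidence


if __name__ == "__main__":
    # Illustrative numbers only: an even prior, with the evidence four
    # times as likely under H as under not-H.
    print(posterior(0.5, 0.8, 0.2))
```

The likelihood is a required input, but it is not the answer: the same P(E|H) yields very different posteriors depending on the prior and on P(E|not H).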

2Vladimir_Nesov15y

But one shouldn't confuse the two.

0[anonymous]15y

What's an uncaused hypothesis? And didn't you just accidentally forbid people to think properly?

0XiXiDu15y

Why is evil stupid and what evidence is there that Yudkowsky is smart enough not to be evil? [...] If you got someone working on friendly AI you better ask if the person is friendly in the first place. You also shouldn't make conclusions based on the output of the subject of your conclusions. If Yudkowsky states what is right and states that he will do what is right that provides no evidence about the rightness and honesty of those statements. Besides, the most advanced statements about Yudkowsky's intentions are CEV and the meta-ethics sequence. Both are either criticized or not understood. The question should be, what is the worst-case scenario regarding Yudkowsky and the SIAI and how can we discern it from what he is signaling? If the answer isn't clear, one should ask for transparency and oversight.

[-]Quirinus_Quirrell15y190

You seem to be under the impression that Eliezer is going to create an artificial general intelligence, and oversight is necessary to ensure that he doesn't create one which places his goals over humanity's interests. It is important, you say, that he is not allowed unchecked power. This is all fine, except for one very important fact that you've missed.

Eliezer Yudkowsky can't program. He's never published a nontrivial piece of software, and doesn't spend time coding. In the one way that matters, he's a muggle. Ineligible to write an AI. Eliezer has not positioned himself to be the hero, the one who writes the AI or implements its utility function. The hero, if there is to be one, has not yet appeared on stage. No, Eliezer has positioned himself to be the mysterious old wizard - to lay out a path, and let someone else follow it. You want there to be oversight over Eliezer, and Eliezer wants to be the oversight over someone else to be determined.

But maybe we shouldn't trust Eliezer to be the mysterious old wizard, either. If the hero/AI programmer comes to him with a seed AI, then he knows it exists, and finding out that a seed AI exists before it launches is the hardest part of any... (read more)

7Eliezer Yudkowsky13y

(For the record: I've programmed in C++, Python, Java, wrote some BASIC programs on a ZX80 when I was 5 or 6, and once very briefly when MacOS System 6 required it I wrote several lines of a program in 68K assembly. I admit I haven't done much coding recently, due to other comparative advantages beating that one out.)

0topynate13y

I can't find it by search, but haven't you stated that you've written hundreds of KLOC?

2BT_Uytya13y

Yep, he has.

0Eliezer Yudkowsky13y

Sounds about right. It wasn't good code, I was young and working alone. Though it's more like the code was strategically stupid than locally poorly written.

1XiXiDu15y

I disagree based on the following evidence: [...] You further write: [...] I'm not aware of any reason to believe that recursively self-improving artificial general intelligence is going to be something you can 'run away with'. It looks like some people here think so, that there will be some kind of, with hindsight, simple algorithm for intelligence that people can just run and get superhuman intelligence. Indeed, transparency could be very dangerous in that case. But that doesn't mean it is an all or nothing decision. There are many other reasons for transparency, including reassurance and the ability to discern a trickster or impotent individual from someone who deserves more money. But as I said, I don't see that anyway. It'll more likely be a blue sheet of different achievements that are each not dangerous on their own. I further think it will be not just a software solution but also a conceptual and computational revolution. In those cases an open approach will allow public oversight. And even if someone is going to run with it, you want them to use your solution rather than one that will most certainly be unfriendly.

2Vladimir_Nesov15y

Evil is not necessarily stupid (well, it is, if we are talking about humans, but let's abstract from that). Still, it would take a stupid Dr Evil to decide that pretending to be Eliezer Yudkowsky is the best available course of action.

0timtyler15y

You don't think that being Eliezer Yudkowsky is an effective way to accomplish the task at hand? What should Dr Evil do, then? FWIW, my usual comparison is not with Dr Evil, but with Gollum. The Singularity Institute have explicitly said they are trying to form "The Fellowship of the AI". Obviously we want to avoid Gollum's final scene. Gollum actually started out good - it was the exposure to the ring that caused problems later on.

0Leonhart15y

I seem to remember Smeagol being an unpleasant chap even before Deagol found the ring. But admittedly, we weren't given much.

-3timtyler15y

Transparency is listed as being desirable here: [...] However, apparently, this doesn't seem to mean open source software - e.g. here: [...]

2Vladimir_Nesov15y

You equivocate two unrelated senses of "transparency".

-4timtyler15y

Uh, what? Transparency gets listed as a "socially important" virtue in the PR documents - but the plans apparently involve keeping the source code secret.

4jimrandomh15y

He means "transparent" as in "you can read its plans in the log files/with a debugger", not as in "lots of people have access". Transparency in the former sense is a good thing, since it lets the programmer verify that it's sane and performing as expected. Transparency in the latter sense is a bad thing, because if lots of people had access then there would be no one with the power to say the AI wasn't safe to run or give extra hardware, since anyone could take a copy and run it themselves.

-6timtyler15y

-3Perplexed15y

You are confusing socially important with societally important. Microsoft, for example, seeks to have its source code transparent to inspection, because Microsoft, as a corporate culture, produces software socially - that is, utilizing an evil conspiracy involving many communicating agents.

0timtyler15y

I deny confusing anything. I understand that transparency can be a matter of degree and perspective. What I am pointing out is lip-service to transparency. Full transparency would be different. Microsoft's software is not very transparent - and partly as a result it is some of the most badly-designed, insecure and virus-ridden software the planet has ever seen. We can see the mistake, can see its consequences - and know how to avoid it - but we have to, like actually do that - and that involves some alerting of others to the problems often associated with closed-source proposals.

-8timtyler15y

-4XiXiDu15y

If I would disagree and believe that it is worth it to voice my disagreement, then yes. You just can't compare that though. Can you name another group of people who try to take over the universe? [...] Jehovah's Witnesses also only attract certain people. A lot of money is being donated and spent on brainwashing material designed to get even more money to spend on brainwashing. I think that is wrong. The problem is that nobody there is deliberately doing something 'wrong'. There is no guru, they all believe they are doing what is 'right'. Nobody is critical. But if they had a forum where one could openly discuss their ideas with them then I'd be there and challenge them. Not that I want to compare them with LW, that would be crazy, but I want to challenge your argument.

0benelliott15y

The Village Reach argument was referring to SIAI, not Less Wrong. They are distinct entities, one is a forum for discussion and the other is an organization with the aim of doing something. It is quite right that the first has many dissenting opinions, whereas the latter does not. SIAI may be able to benefit from dissent on the many sub-issues related to FAI, but not to the fundamental idea that FAI is important. Imagine a company where about 40% of the employees, even at the highest levels, disagreed with the premise that they should be trying to make money and instead either intentionally tried to lose the company money, or argued constantly with the other 60%. Nothing would get done. Disagreement about FAI may be good for LW but it is probably not good for SIAI. Since there is disagreement on LW, I really don't see the problem.

2Vladimir_Nesov15y

If FAI is unimportant, SIAI should conclude that FAI is unimportant. Hence it's not clear where the following distinction happens. [...]

2benelliott15y

I don't think it's the best use of any organization's money to employ people who disagree with the premise that the organization should exist.

0Vladimir_Nesov15y

But disagreement itself is not the reason for this being a bad strategy.

4benelliott15y

I don't quite follow. The only point I was trying to make was that "everybody in SIAI agrees about FAI, therefore they're all a bunch of brainwashed zombies" is not a valid complaint.

0Vladimir_Nesov15y

Yes.

0Vladimir_Nesov15y

What argument? benelliott suggested that your argument makes use of a very weak piece of evidence (presence of significant agreement). Obviously, interpreted as counterevidence of the opposite claim, it is equally weak.

5Emile15y

Maybe it's because this "being an independent mind" thing isn't as great as you think it is? Like most people here, I've been raised hearing about the merits of challenging authority, thinking for yourself, questioning everything, not following the herd, etc. But there's a dark side to that, and it's thinking that when you disagree with the experts, you're right and the experts are wrong. I now think that a lot of those "think for yourself" and "listen to your heart" things are borderline dark-side epistemology, and that by default, the experts are right and I should just shut up until I have some very good reasons to disagree. Any darn fool can decide the experts are victims of groupthink, or don't dare think outside the box, or just want to preserve the status quo. I think changing one's mind when faced with disagreeing expert opinion is a better sign of rationality than "thinking for oneself". I think that many rationalists' self-image as iconoclasts is harmful. I'm willing to call myself an "Eliezer Yudkowsky fanboy" in a bullet-biting kind of way. I don't see the lack of systematic disagreement as a bad thing, and I don't care about looking like a cult member.

0XiXiDu15y

Yet you decided to trust Yudkowsky, not the experts. [...] I don't, that is why I am asking experts, many seem not to share Yudkowsky's worries. [...] I actually got a link to his homepage and the SIAI on my homepage for a few years under 'favorites sites'.

3lukeprog15y

I doubt you're "virtually the only true critic of the SIAI." But if you think I'm not much of a critic of SIAI/Yudkowsky, you're right. Many of my posts have included minor criticisms, but that's because it's not as valued here to just repeat all the thousands of things on which I agree with Eliezer.

4XiXiDu15y

I actually messaged him telling him that he can edit/delete any harmful submissions of mine without having to expect harmful protest. Does that look like I particularly disagree with him, or assign a high probability to him being Dr. Evil? I don't, but it is a possibility and it is widely ignored. To get provably friendly AI you'll need provably friendly humans. If that isn't possible you'll need oversight and transparency.

* Smart people can be wrong.
* Smart people can be evil.
* People can appear smarter than they are.

That's why I demand...

* Third-party peer-review of Yudkowsky's work.
* Oversight and transparency.
* Progress reports, roadmaps and confirmable success.

2wedrifid15y

Not actually true.

1XiXiDu15y

Technically it isn't, of course. But I don't expect unfriendly humans not to show me friendly AI but actually implement something else. What I meant is that you'll need friendly humans to not end up with some trickster who takes your money and in 30 years you notice that all he has done is to code some chat bot. There are a lot of reasons that the trustworthiness of the humans involved is important. Of course, provably friendly AI is provably friendly no matter who coded it.

0wedrifid15y

I criticise Eliezer frequently. I manage to do so without being particularly negatively received by the alleged Yudkowsky hive mind. Note: My criticisms of EY/SIAI are specific even if consistent. Like lukeprog I do not feel the need to repeat the thousands of things about which I agree with EY. Further Note: There are enough distinct things that I disagree with Eliezer about that, given my metacognitive confidence levels, I can expect that on at least one of them I am mistaken. Which is a curious epistemic state to be in but purely tangential. ;) Yet another edit: A clear example of criticism of Eliezer is with respect to his discussion of his metaethics and CEV. I didn't find his contribution in the linked conversation satisfactory and consider it representative of his other recent contributions on the subject. Everything except his sequence on the subject has been nowhere near the standard I would expect from someone dedicating their life to studying a subject that will rely on reasoning flawlessly in the related area!

0XiXiDu15y

You think I don't? I agree with almost everyone about thousands of things. I perceive myself to be an uneducated fool. If I read a few posts of someone like Yudkowsky and intuitively agree, that is very weak evidence that I should trust him, or of his superior intellect. I still think that he's one of the smartest people though. But there is a limit to what I'll just accept on mere reassurance. And I have seen nothing that would allow me to conclude that he could accomplish much regarding friendly AI without a billion dollars and a team of mathematicians and other specialists.

0wedrifid15y

No, that wasn't for your benefit at all. Just disclaiming limits. Declarations of criticism are sometimes worth tempering just a tad. :)

2David_Gerard15y

By the MWI sequence, I presume he means the QM sequence, which appears clear to me but bogus to physicists I've asked ... and, more importantly, to the physicists who commented on the posts in it and said that he couldn't do what he'd just done (to which he answered that he doesn't claim to be a physicist). Also, judging by the low votes and small number of commenters, it seems that even people who claim to have read the sequences have tended to tl;dr at the QM sequence. (I finally finished a first run through the million words of sequences and the millions of words of comments. I only finally tipped my tl;dr tilt sensor at the decision theory sequence, which isn't actually very sequential.)

-1wedrifid15y

I love those quotes. The one about negatively useful AI doctorates is a favourite of mine. :)

-1Manfred15y

Huh, just read So You Want To Be A Seed AI Programmer. Appears to be from 2009. I would recommend http://www.fastcompany.com/magazine/06/writestuff.html as a highly contrasting frame of thought.

4komponisto15y

It's from much earlier than that (like 2005 or something). That particular wiki isn't the original source.

[-]DanB15y20

With regards to your (and Eliezer's) quest, I think Oppenheimer's Maxim is relevant:

It is a profound and necessary truth that the deep things in science are not found because they are useful, they are found because it was possible to find them.

A theory of machine ethics may very well be the most useful concept ever discovered by humanity. But as far as I can see, there is no reason to believe that such a theory can be found.

7lukeprog15y

Daniel_Burfoot, I share your pessimism. When superintelligence arrives, humanity is almost certainly fucked. But we can try.

[-]timtyler15y00

For the list:

The Ethics of Artificial Intelligence http://www.nickbostrom.com/ethics/artificial-intelligence.pdf

Ethical Issues in Advanced Artificial Intelligence http://www.nickbostrom.com/ethics/ai.html

Beyond AI http://mol-eng.com/

0lukeprog15y

Tim, I have hundreds of papers I could upload and put on the list. The list was just a preview. Thanks anyway.

[-]DanB15y00

Cloos, "The Utilibot Project: An Autonomous Mobile Robot Based on Utilitarianism"

!!!
