...and no, it's not because of potential political impact on its goals.  Although that's also a thing.

The Politics problem is, at its root, about forming a workable set of rules by which society can operate, which society can agree with.

The Friendliness Problem is, at its root, about forming a workable set of values which are acceptable to society.

Politics as a process (I will use "politics" to refer to the process of politics henceforth) doesn't generate values; values are strictly an input, which the process converts into rules intended to maximize them.  And while that is true, the process is value-agnostic; it doesn't care what the values are, or where they come from.  Which is to say, provided you solve the Friendliness Problem, its solution provides a valuable input into politics.

Politics is also an intelligence.  Not in the "self aware" sense, or even in the "capable of making good judgments" sense, but in the sense of an optimization process.  We're each nodes in this alien intelligence, and we form what looks, to me, suspiciously like a neural network.
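The "optimization process" framing can be made concrete with a toy sketch (my own illustration, not from the post): each node (person) holds a value vector, the polity's objective is a weighted aggregate of those values, and a dumb local search plays the role of the political process producing "rules". All names and numbers here are hypothetical.

```python
import random

def aggregate_values(members, weights):
    """Weighted sum of each member's value vector (the 'input' to politics)."""
    dims = len(members[0])
    return [sum(w * m[d] for m, w in zip(members, weights)) for d in range(dims)]

def hill_climb_rules(objective, dims, steps=1000, seed=0):
    """Crude local search: the 'political process' as a blind optimizer."""
    rng = random.Random(seed)
    rules = [0.0] * dims
    best = objective(rules)
    for _ in range(steps):
        cand = [r + rng.uniform(-0.1, 0.1) for r in rules]
        score = objective(cand)
        if score > best:
            rules, best = cand, score
    return rules

# Three members with different values over two policy dimensions.
members = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
weights = [1.0, 1.0, 1.0]
target = aggregate_values(members, weights)

# Rules score higher the closer they match the aggregated values.
def objective(rules):
    return -sum((r - t) ** 2 for r, t in zip(rules, target))

rules = hill_climb_rules(objective, dims=2)
```

The point of the sketch is only that the process itself is value-agnostic: swap in any `members` matrix and it grinds out rules for those values just the same.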

The Friendliness Problem is equally applicable to Politics as it is to any other intelligence.  Indeed, provided we can provably solve the Friendliness Problem, we should be capable of creating Friendly Politics.  Friendliness should, in principle, be equally applicable to both.  Now, there are some issues with this - politics is composed of unpredictable hardware, namely, people.  And it may be that the neural architecture is fundamentally incompatible with Friendliness.  But that is discussing the -output- of the process.  Friendliness is first an input, before it can be an output.

More, we already have various political formations, and can assess their Friendliness levels, merely in terms of the values that went -into- them.

Which is where I think politics offers a pretty strong hint to the possibility that the Friendliness Problem has no resolution:

We can't agree on which political formations are more Friendly.  That's what "Politics is the Mindkiller" is all about; our inability to come to an agreement on political matters.  It's not merely a matter of the rules - which is to say, it's not a matter of the output: We can't even come to an agreement about which values should be used to form the rules.

This is why I think political discussion is valuable here, incidentally.  Less Wrong, by and large, has been avoiding the hard problem of Friendliness, by labeling its primary functional outlet in reality as a mindkiller, not to be discussed.

Either we can agree on what constitutes Friendly Politics, or not.  If we can't, I don't see much hope of arriving at a Friendliness solution more broadly.  Friendly to -whom- becomes the question, if it was ever anything else.  Which suggests a division in types of Friendliness; Strong Friendliness, which is a fully generalized set of human values, and acceptable to just about everyone; and Weak Friendliness, which isn't fully generalized, and perhaps acceptable merely to a plurality.  Weak Friendliness survives the political question.  I do not see that Strong Friendliness can.

(Exemplified: When I imagine a Friendly AI, I imagine a hands-off benefactor who permits people to do anything they wish that won't result in harm to others.  Why, look, a libertarian/libertine dictator.  Does anybody envisage a Friendly AI which doesn't correspond more or less directly with their own political beliefs?)


The Friendliness Problem is, at its root, about forming a workable set of values which are acceptable to society.

No, that's the special bonus round after you solve the real friendliness problem. If that were the real deal, we could just tell an AI to enforce Biblical values or the values of Queen Elizabeth II or the US Constitution or something, and although the results would probably be unpleasant they would be no worse than the many unpleasant states that have existed throughout history.

As opposed to the current problem of having a very high likelihood that the AI will kill everyone in the world.

The Friendliness problem is, at its root, about communicating values to an AI and keeping those values stable. If we tell the AI "do whatever Queen Elizabeth II wants" - which I expect would be a perfectly acceptable society to live in - the Friendliness problem is how to get the AI to properly translate that into statements like "Queen Elizabeth wants a more peaceful world" and not things more like "INCREASE LEVEL OF DOPAMINE IN QUEEN ELIZABETH'S REWARD CENTER TO 3^^^3 MOLES" or "ERROR: QUEEN ELIZABETH NOT AN OBVIOUSLY CLOSED SYSTEM, CONVERT EVERYTH...

That depends on whether we mean 2013!Queen Elizabeth II or Queen Elizabeth after the resulting power goes to her head.
I don't think you get the same thing from that document that I do. (Incidentally, I disagree with a lot of the design decisions inherent in that document, such as self-modifying AI, which I regard as inherently and uncorrectably dangerous. When you stop expecting the AI to make itself better, the "Keep your ethics stable across iterations" part of the problem goes away.) Either that or I'm misunderstanding you. Because my current understanding of your view of the Friendliness problem has less to do with codifying and programming ethics and more to do with teaching the AI to know exactly what we mean and not to misinterpret what we ask for. (Which I hope you'll forgive me if I call "Magical thinking." That's not necessarily a disparagement; sufficiently advanced technology and all that. I just think it's not feasible in the foreseeable future, and such an AI makes a poor target for us as we exist today.)

When I imagine a Friendly AI, I imagine a hands-off benefactor who permits people to do anything they wish that won't result in harm to others.

Yeah, I like personal freedom, too, but you have to realize that this is massively, massively underspecified. What exactly constitutes "harm", and what specific mechanisms are in place to prevent it? Presumably a punch in the face is "harm"; what about an unexpected pat on the back? What about all other possible forms of physical contact that you don't know how to consider in advance? If loud verbal abuse is harm, what about polite criticism? What about all other possible ways of affecting someone via sound waves that you don't know how to consider in advance? &c., ad infinitum.

Does anybody envisage a Friendly AI which doesn't correspond more or less directly with their own political beliefs?

I'm starting to think this entire idea of "having political beliefs" is crazy. There are all sorts of possible forms of human social organization, which result in various outcomes for the humans involved; how am I supposed to know which one is best for people? From what I know about economics, I can point out some ...

I'm starting to think this entire idea of "having political beliefs" is crazy.

Most of my "political beliefs" are awareness of specific failures in other people's beliefs.

That's fairly common, and rarely realized, I think.
Fairly common among rational (I don't mean LW-style) people. But I also know people who really believe things, and it's kind of scary.
These examples also only compare things with status quo. Status quo is most likely itself "harm" when compared to many of the alternatives.
There are many more ways to arrange things in a defective manner than an effective one. I'd consider deviations from the status quo to be harmful until proven otherwise.
Or in other words: most mutations are harmful.
(Fixed the wording to better match the intended meaning: "compared to the many alternatives" -> "compared to many of the alternatives".)
All formulations of human value are massively underspecified. I agree that expecting humans to know what sorts of things would be good for humans in general is terrible. The problem is that we also can't get an honest report of what people think would be good for them personally because lying is too useful/humans value things hypocritically.

Which is where I think politics offers a pretty strong hint to the possibility that the Friendliness Problem has no resolution:

We can't agree on which political formations are more Friendly. That's what "Politics is the Mindkiller" is all about; our inability to come to an agreement on political matters. It's not merely a matter of the rules - which is to say, it's not a matter of the output: We can't even come to an agreement about which values should be used to form the rules.

I'm pretty sure this is a problem with human reasoning abilities, and not a problem with friendliness itself. Or in other words, I think this is only very weak evidence that friendliness is unresolvable.

Ben Pace
Indeed. If we were perfect Bayesians who had unlimited introspective access, and we STILL couldn't agree after an unconscionable amount of argument and discussion, then we'd have a bigger problem.
Are perfect Bayesians with unlimited introspective access more inclined to agree on matters of first principles? I'm not sure. I've never met one, much less two.

They will agree on what values they have, and what the best action is relative to those values, but they still might have different values.

Ben Pace
My point exactly. Only if we are sure agents are best representing themselves, can we be sure their values are not the same. If an agent is unsure of zir values, or extrapolates them incorrectly, then there will be disagreement that doesn't imply different values. With seven billion people, none of which are best representing themselves (they certainly aren't perfect bayesians!) then we should expect massive disagreement. This is not an argument for fundamentally different values.
I disagree with the first statement, but agree with the second. That is, I disagree with a certainty that the problem is with our reasoning abilities, but agree that the evidence is very weak.
Um, I said I was "pretty sure". Not absolutely certain.
Upvoted, and I'll consider it fair if you downvote my reply. Sorry about that!
No worries!
I'm amused that you've retracted the post in question after posting this.

There are some analogies between politics and friendliness, but the differences are also worth mentioning.

In politics, you design a system which must be implemented by humans. Many systems fail because of some property of human nature. Whatever rules you give to humans, if they have incentives to act otherwise, they will. Also, humans have limited intelligence and attention, a lot of biases and hypocrisy, and their brains are not designed to work in communities with over 300 members, or to resist all the superstimuli of modern life.

If you construct a friendly AI, you don't have a problem with humans, besides the problem of extracting human values.

I fully agree. I don't think even a perfect Friendliness theorem would suffice to make politics well and truly Friendly. Such an expectation is like expecting Friendly AI to work even while it's being bombarded with ionizing radiation (or whatever) that is randomly flipping bits in its working memory.
Actually it's worse: It's like expecting to build a Friendly AI using a computer with no debugging utilities, an undocumented program interpreter, and a text editor that has a sense of humor. You have to implement it.

Politics is a harder problem than friendliness: politics is implemented with agents. Not only that, but largely self-selected agents who are thus usually not the ideal selections for implementing politics.

Friendliness is implemented (inside an agent) with non-agents you can build to task.

(edited for grammarz)

Friendliness can only be implemented after you've solved the problem of what, exactly, you're implementing.
Right, but the point is you don't need to get everyone to agree what's right (there's always going to be someone out there who's going to hate it no matter what you do). You just need it to actually be friendly... and, as hard as that is, at least you don't have to work with only corrupted hardware.

We can't agree on which political formations are more Friendly.

We also can't agree on, say, the correct theory of quantum gravity. But reality is there and it works in some particular way, which we may or may not be able to discover.

The values of a friendly AI are usually assumed to be an idealization of universal human values. More precisely: when someone makes a decision, it is because their brain performs a particular computation. To the extent that this computation is the product of a specific cognitive architecture universal to our species (and no...

If such an idealization exists, that would of course be preferable. I suspect it doesn't, which may color my position here, but I think it's important to consider the alternatives if there isn't a generalizable ideal; specifically, we should be working from the opposing end, and try to generalize from the specific instances; even if we can't arrive at Strong Friendliness (the fully generalized ideal of human morality), we might still be able to arrive at Weak Friendliness (some generalized ideal that is at least acceptable to a majority of people). Because the alternative for those of us who aren't neurologists, as far as I can tell, is to wait.

That's what "Politics is the Mindkiller" is all about; our inability to come to an agreement on political matters.

In a sense, but most would not agree. I think all would agree that motivated cognition on strongly held values makes for some of the mindkilling.

I agree with what I take as your basic point, that people have different preferences, and Friendliness, political or AI, will be a trade off between them. But, many here don't. In a sense, you and I believe they are mindkilled, but in a different way - structural commitment to an incorre...

The real political question is: should the US government invest money in creating FAI, preventing existential risks, and extending life?

Why just the US government?
Of course, not only the US government, but those of all other countries which have the potential to influence AI research and existential risks. For example, North Korea could play an important role in existential risk, as it is said to be developing smallpox bioweapons. In my opinion, we need a global government to address existential risks, and an AI which takes over the world would be a form of global government. I have routinely been downvoted for such posts and comments on LW, so it is probably not an appropriate place to discuss these issues.
Smallpox isn't an existential risk-- existential risks affect the continuation of the human race. So far as I know, the big ones are UFAI and asteroid strike. I don't know of classifications for very serious but smaller risks.
Look, common smallpox is not an existential risk, but biological weapons could be if they were specially designed to be. The simplest way to do that is the simultaneous use of many different pathogens. If we have 10 viruses with 50 percent mortality each, that would mean a 1000-fold reduction of the human population, and the last million people would be so scattered and unadapted that they could continue on to extinction. North Korea is said to be developing 8 different bioweapons, but with the progress of biotechnology it could be hundreds. But my main idea here was not a classification of existential risks; it was to address the idea that preventing them is a question of global politics - or at least it should be, if we want to survive.
Infectious agents with high mortality rates tend to weed themselves out of the population. There's a sweet spot for infectious disease; prolific enough to pass themselves on, not so prolific as to kill their host before they got the opportunity. Additionally, there's a strong negative feedback to particularly nasty disease in the form of quarantine. A much bigger risk to my mind actually comes from healthcare, which can push that sweet spot further into the "mortal peril" section. Healthcare provokes an arms race with infectious agents; the better we are at treating disease and keeping it from killing people, the more dangerous an infectious agent can be and still successfully propagate.

There's a value, call it "weak friendliness", that I view as a prerequisite to politics: it's a function that humans already implement successfully, and is the one that says "I don't want to be wire-headed, drugged into a stupor, victim of a nuclear winter, or see Earth turned into paperclips".

A hands-off AI overlord can prevent all of that, while still letting humanity squabble over gay rights and which religion is correct.

And, well, the whole point of an AI is that it's smarter than us, and thus has a chance of solving harder problems.

I'm not sure this is true in any useful sense. Louis XIV probably agrees with me that "I don't want to be wire-headed, drugged into a stupor, victim of a nuclear winter, or see Earth turned into paperclips." But I think it is pretty clear that the Sun King was not implementing my moral preferences, and I am not implementing his. Either one of us is not "weak friendly", or "weak friendly" is barely powerful enough to answer really easy moral questions like "should I commit mass murder for no reason at all?" (Hint: no). If weak friendly morality is really that weak, then I have no confidence that a weak-FAI would be able to make a strong-FAI, or even would want to. In other words, I suspect that what most people mean by "weak friendly" is highly generalized applause lights that widely diverging values could agree with without any actual agreement on which actions are more moral.
I think a lower bound on weak friendliness is whether or not entities living within the society consider their lives worthwhile. Of course this opens up debate about house elves and such but it's a useful starting point.
That (along with this semi-recent exchange) reminds me of a stupid idea I had for a group decision process a while back.

* Party A dislikes the status quo. To change it, they declare to the sysop that they would rather die than accept it.
* The sysop accepts this and publicly announces a provisionally scheduled change.
* Party B objects to the change and declares that they'd rather die than accept A's change.
* If neither party backs down, a coin is flipped and the "winner" is asked to kill the loser in order for their preference to be realized; face-to-face to make it as difficult as possible, thereby maximizing the chances of one party or the other backing down.
* If the parties consist of multiple individuals, the estimated weakest-willed person on the majority side has to kill (or convince to forfeit) the weakest person on the minority side; then the next-weakest, until the minority side is eliminated. If they can't or won't, then they're out of the fight, and replaced with the next-weakest person, et cetera until the minority is eliminated or the majority becomes the minority.

Basically, formalized war, only done in the opposite way of the strawman version in A Taste of Armageddon; making actual killing more difficult rather than easier.

A few reasons it's stupid:

* People will tolerate conditions much worse than death (for themselves, or for others unable to self-advocate) rather than violate the taboo against killing or against "threatening" suicide.
* The system may make bad social organizations worse by removing the most socially enlightened and active people first.
* People have values outside themselves, so they'll stay alive and try to work for change rather than dying pointlessly and leaving things to presumably get worse and worse from their perspective.
* Prompting people to kill or die for their values will galvanize them and make reconciliation less likely.
* Real policy questions aren't binary, and how a question is framed or what ord
Actually, I think I'm now remembering a better (or better-sounding) idea that occurred to me later: rather than something as extreme as deletion, let people "vote" by agreeing to be deinstantiated, giving up the resources that would have been spent instantiating them. It might be essentially the same as death if they stayed that way til the end of the universe, but it wouldn't be as ugly. Maybe they could be periodically awakened if someone wants to try to persuade them to change or withdraw their vote. That would hopefully keep people from voting selfishly or without thorough consideration. On the other hand, it might insulate them from the consequences of poor policies.

Also, how to count votes is still a problem; where would "the resources that would have been spent instantiating them" come from? Is this a socialist world where everyone is entitled to a certain income, and if so, what happens when population outstrips resources? Or, in a laissez-faire world where people can run out of money and be deinstantiated, the idea amounts to plain old selling of votes to the rich, like we have now.

Basically, both my ideas seem to require a eutopia already in place, or at least a genuine 100% monopoly on force. I think that might be my point. Or maybe it's that a simple-sounding, socially acceptable idea like "If someone would rather die than tolerate the status quo, that's bad, and the status quo should be changed" isn't socially acceptable once you actually go into details and/or strip away the human assumptions.
Can this be set up in a round robin fashion with sets of mutually exclusive values such that everyone who is willing to kill for their values kills each other?
Maybe if the winning side's values mandated their own deaths. But then it would be pointless for the sysop to respond to their threat of suicide to begin with, so I don't know. I'm not sure if there's something you're getting at that I'm not seeing.
"I'm not going to live there. There's no place for me there... any more than there is for you. Malcolm... I'm a monster.What I do is evil. I have no illusions about it, but it must be done. " * The Operative, from Serenity. (On the off-chance that somebody isn't familiar with that quote.)
I'm thinking if you do the matchups correctly you only wind up with one such person at the end, whom all the others secretly precommit to killing. ...maybe this shouldn't be discussed publicly.
I don't think the system works in the first place without a monopoly on lethal force. You could work within the system by "voting" for his death, but then his friends (if any) get a chance to join in the vote, and their friends, til you pretty much have a new war going. (That's another flaw in the system I could have mentioned.)
I think the vast majority of the population would agree that genocide and mass murder are bad, same as wireheading and turning the Earth into paperclips. A single exception isn't terribly noteworthy - I'm sure there are at least a few pro-wireheading people out there, and I'm sure at least a few people have gotten enraged enough at humanity to think paperclips would be a better use of the space. If you have a reason to suspect that "mass murder" is a common preference, that's another matter.
Mass murder is an easy question. Is the Sun King (who doesn't particularly desire pointless mass murder) more moral than I am? Much harder, and your articulation of "weak Friendliness" seems incapable of even trying to answer. And that doesn't even get into actual moral problems society actually faces every day (i.e. what is the most moral taxation scheme?). If weak-FAI can't solve those types of problems, or even suggest useful directions to look, why should we believe it is a step on the path to strong-FAI?
That's my point. I'm not sure where the confusion is, here. Why would you call it useless to prevent wireheading, UFAI, and nuclear winter, just because it can't also do your taxes? If it's easier to solve the big problems first, wouldn't we want to do that? And then afterwards we can take our sweet time figuring out abortion and gay marriage and tax codes, because a failure there doesn't end the species.
For reasons related to Hidden Complexity of Wishes, I don't think weak-FAI actually is likely to prevent "wireheading, UFAI, and nuclear winter." At best, it prohibits the most obvious implementations of those problems. And it is terribly unlikely to be helpful in creating strong-FAI. And your original claim was that common human preferences already implement weak-FAI preferences. I think that the more likely reason why we haven't had the disasters you reference is that for most of human history, we lacked the capacity to cause those problems. As actual society shows, the hidden complexity of wishes makes implementing social consensus hopeless, much less whatever smaller set of preferences is weak-FAI preferences.
My basic point was that we shouldn't worry about politics, at least not yet, because politics is a wonderful example of all the hard questions in CEV, and we haven't even worked out the easy questions like how to prevent nuclear winter. My second point was that humans do seem to have a much clearer CEV when it comes to "prevent nuclear winter", even if it's still not unanimous. Implicit in that should have been the idea that CEV is still ridiculously difficult. Just like intelligence, it's something humans seem to have and use despite being unable to program for it. So, then, summarized, I'm saying that we should perhaps work out the easy problems first, before we go throwing ourselves against harder problems like politics.
There's not a clear dividing line between "easy" moral questions and hard moral questions. The Cold War, which massively increased the risk of nuclear winter, was a rational expression of Great Power relations between two powers. Until we have mutually acceptable ways of resolving disputes when both parties are rationally protecting their interests, we can't actually solve the easy problems either.
from you: and from me: So, um, we agree, huzzah? :)
Sure, genocide is bad. That's why the Greens — who are corrupting our precious Blue bodily fluids to exterminate pure-blooded Blues, and stealing Blue jobs so that Blues will die in poverty — must all be killed!
We usually call that the 'sysop AI' proposal, I think.
There's a bootstrapping problem inherent to handing AI the friendliness problem to solve. Edit: Unless you're suggesting we use a Weakly Friendly AI to solve the hard problem of Strong Friendliness?
Your edit pretty much captures my point, yes :) If nothing else, a Weak Friendly AI should eliminate a ton of the trivial distractions like war and famine, and I'd expect that humans have a much more unified volition when we're not constantly worried about scarcity and violence. There's not a lot of current political problems I'd see being relevant in a post-AI, post-scarcity, post-violence world.
The problem is that we have to guarantee that the AI doesn't do something really bad while trying to stop these problems; what if it decides it really needs more resources suddenly, or needs to spy on everyone, even briefly? And it seems (to me at least) that stopping it from having bad side effects is pretty close, if not equivalent to, Strong Friendliness.
I should have made that more clear: I still think Weak-Friendliness is a very difficult problem. My point is simply that we only need an AI that solves the big problems, not an AI that can do our taxes. My second point was that humans seem to already implement weak-friendliness, barring a few historical exceptions, whereas so far we've completely failed at implementing strong-friendliness. I'm using Weak vs Strong here in the sense of Weak being a "SysOP" style AI that just handles catastrophes, whereas Strong is the "ushers in the Singularity" sort that usually gets talked about here, and can do your taxes :)
This... may be an amazing idea. I'm noodling on it.
Edit: Completely misread the parent.
I know this wasn't the spirit of your post, but I wouldn't refer to war and famine as "trivial distractions".
Wait, if you're regarding the elimination of war, famine and disease as consolation prizes for creating an wFAI, what are people expecting from a sFAI?
God. Either with or without the ability to bend the currently known laws of physics.
No, really.
Really. That really is what people are expecting of a strong FAI. Compared with us, it will be omniscient, omnipotent, and omnibenevolent. Unlike currently believed-in Gods, there will be no problem of evil because it will remove all evil from the world. It will do what the Epicurean argument demands of any God worthy of the name.
Are you telling me that if a wFAI were capable of eliminating war, famine and disease, it wouldn't be developed first?
Well, I don't take seriously any of these speculations about God-like vs. merely angel-like creations. They're just a distraction from the task of actually building them, which no-one knows how to do anyway.
But still, if a wFAI was capable of eliminating those things, why be picky and try for sFAI?
Because we have no idea how hard it is to specify either. If, along the way, it turns out to be easy to specify wFAI and risky to specify sFAI, then the reasonable course is clear. Doubly so since a wFAI would almost certainly be useful in helping specify a sFAI. Seeing as human values are a minuscule target, it seems probable that specifying wFAI is harder than sFAI, though.
"Specify"? What do you mean?
specifications a la programming.
Why would it be harder? One could tell the wFAI to improve factors that are strongly correlated with human values, such as food stability, resources to cure preventable diseases (such as diarrhea, which, as we know, kills way more people than it should), and security from natural disasters.
Because if you screw up specifying human values, you don't get wFAI; you just die (hopefully).
It's not optimizing human values, it's optimizing circumstances that are strongly correlated with human values. It would be a logistics kind of thing.
Have you ever played corrupt a wish?
No, but I'm guessing I'm about to.

"I wish for a list of possibilities for sequences of actions, any of whose execution would satisfy the following conditions:

* Within twenty years, for Nigeria to have standards of living such that it would receive the same rating as Finland on [Placeholder UN Scale of People's-Lives-Not-Being-Awful]."

The courses of action would be evaluated by a think-tank, until they decided that one of them was acceptable, and the wFAI would be given the go.
The AI optimizes only for that and doesn't generate a list of non-obvious side effects. You implement one of them and something horrible happens to Finland, and/or to countries besides Nigeria.

Or: in order to generate said list, I simulate Nigeria millions of times at a resolution such that entities within the simulation pass the Turing test. Most of the simulations involve horrible outcomes for all involved.

Or: I generate such a list including many sequences of actions that lead to a small group being able to take over Nigeria and/or Finland and/or the world (or generate some other power differential that screws up international relations).

Or: in order to execute such an action I need more computing power, and you forgot to specify what are acceptable actions for obtaining it.

Or: the wFAI is much cleverer than a single human thinking about this for two minutes, and can screw things up in ways that are as opaque to you as human actions are to a dog.

In general, specifying an oracle/tool AI is not safe: http://lesswrong.com/lw/cze/reply_to_holden_on_tool_ai/

Even more generally, our ability to build an AI that is friendly will have nothing to do with our ability to generate clauses in English that sound reasonable.

Part of the problem is the many factors involved in the political issues. People explain things through their own specialty, but lack knowledge of other specialties.

Why do you restrict Strong Friendliness to human values? Is there some value which an intelligence can have that can never be a human value?

Because we're the ones who have to live with the thing, and I don't know, but my inclination is that the answer is "Yes".
Implication: A Strongly Friendly (paperclip maximizer) AI is actually a meaningful phrase. (As opposed to all Strongly Friendly AIs being compatible with everyone) Why all human values?

You're making the perfect the enemy of the good.

I'm fine with at least a thorough framework for Weak Friendliness. That's not gonna materialize out of nothing. There are no actual Turing Machines (infinite tapes required), yet it is a useful model and its study yields useful results for real world applications.

Studying Strong Friendliness is a useful activity in finding a heuristic for best-we-can-do friendliness, which is way better than nothing.

Politics as a process doesn't generate values; they're strictly an input,

Politics is partly about choosing goals/values. (E.g., do we value equality or total wealth?) It is also about choosing the means to achieve those goals. And it is also about signaling power. Most of these are not relevant to designing a future Friendly AI.

Yes, a polity is an "optimizer" in some crude sense, optimizing towards a weighted sum of the values of its members with some degree of success. Corporations and economies have also been described as optimizers. But I don't see too much similarity to AI design here.

Deciding what we value isn't relevant to friendliness? Could you explain that to me?
The whole point of CEV is that we give the AI an algorithm for educing our values, and let it run. At no point do we try to work them out ourselves.
I mentally responded to you and forgot to, you know, actually respond. I'm a bit confused by this and since it was upvoted I'm less sure I get CEV.... It might clear things up to point out that I'm making a distinction between goals or preferences vs. values. CEV could be summarized as "fulfill our ideal rather than actual preferences", yeah? As in, we could be empirically wrong about what would maximize the things we care about, since we can't really be wrong about what to care about. So I imagine the AI needing to be programmed with our values- the meta wants that motivate our current preferences- and it would extrapolate from them to come up with better preferences, or at least it seems that way to me. Or does the AI figure that out too somehow? If so, what does an algorithm that figures out our preferences and our values contain?
Ha, yes, I often do that. The motivation behind CEV also includes the idea we might be wrong about what we care about. Instead, you give your FAI an algorithm for:

* Locating people
* Working out what they care about
* Working out what they would care about if they knew more, etc.
* Combining these preferences

I'm not sure what distinction you're trying to draw between values and preferences (perhaps a moral vs non-moral one?), but I don't think it's relevant to CEV as currently envisioned.
Actually, when I said "most" in "most of these are not relevant to designing a future Friendly AI," I was thinking that values are the exception, they are relevant.
Oh. Then yeah ok I think I agree.