Danger(s) of theorem-proving AI?

Since the question is about potential dangers, I think it is worth assuming the worst here. Also, realistically, we don't have a magic wand to pop things into existence by fiat, so I would guess that by default, if such an AI were created, it would be created with ML.

So let's say that this is trained largely autonomously with ML. Is there some way that would result in dangers outside the four already-mentioned categories?

Thinking About a Technical Solution to Coordination Problems

Clearly, you and I have different definitions of "easy".

Open thread, Aug. 10 - Aug. 16, 2015

This was a terrific post; insightful and entertaining in excess of what can be conveyed by an upvote. Thank you for making it.

Beware the Nihilistic Failure Mode

What you're proposing sounds more like moral relativism than moral nihilism.

Ah, yes. My mistake. I stand corrected. Some cursory googling suggests that you are right. With that said, to me Moral Nihilism seems like a natural consequence of Moral Relativism, but that may be a fact about me and not the universe, so to speak (though I would be grateful if you could point out a way to be a moral relativist without being a moral nihilist).

I think that you're confusing moral universalism with moral absolutism and value monism.

The last paragraph of my previous post was a claim that unless you have an objective way of ordering conflicting preferences (and I don't see how you can), you are forced to work under value pluralism. I did use this as an argument against moral universalism, though that argument may not be entirely correct. I concede the point.

Beware the Nihilistic Failure Mode

But there's that language again that people use when they talk about moral nihilism, where I can't tell if they're just using different words, or if they really think that morality can be whatever we want it to be, or that it doesn't mean anything to say that moral propositions are true or false.

Okay. Correct me if any of this doesn't sound right. When a person talks about "morality", you imagine a conceptual framework of some sort - some way of distinguishing what makes actions "good" or "bad", "right" or "wrong", etc. Different people will imagine different frameworks, possibly radically so - but there is generally a lot of common ground (or so we hope), which is why you and I can talk about "morality" and more or less understand the gist of each other's arguments. Now, I would claim that what I mean when I say "morality", or what you mean, or what a reasonable third party may mean, or any combination thereof - that each of these is entirely unrelated to ground truth.

Basically, moral propositions (e.g. "Murder is Bad") contain unbound variables (in this case, "Bad") which are only defined in select subjective frames of reference. "Bad" does not have a universal value in the sense that "Speed of Light" or "Atomic Weight of Hydrogen" or "The top LessWrong contributor as of midnight January 1st, 2015" do. That is the main thesis of Moral Nihilism as far as I understand it. Does that sound sensible?

I wouldn't ask people those questions. People can be wrong about what they value. The point of moral philosophy is to know what you should do.

Alright; let me rephrase my point. Let us say that you have access to everything that can be known about an individual X. Can you explain how you compute their objective contingent morality to an observer who has no concept of morality? Your previous statement of "what is moral is what you value" would need to define "what you value" before it would suffice. Note that unless you can do this construction, you don't actually have something objective.

Beware the Nihilistic Failure Mode

I think that using this notation is misleading. If I am understanding you correctly, you are saying that given an individual, we can derive their morality from their (real/physically grounded) state, which gives real/physically grounded morality (for that individual). Furthermore, you are using "objective" where I used "real/physically grounded". Unfortunately, one of the common meanings of objective is "ontologically fundamental and not contingent", so your statement sounds like it is saying something that it isn't.

On a separate note, I'm not sure why you are casually dismissing moral nihilism as wrong. As far as I am aware, moral nihilism is the position that morality is not ontologically fundamental. Personally, I am a moral nihilist; my experience shows that morality as typically discussed refers to a collection of human intuitions and social constructs - it seems bizarre to believe that to be an ontologically fundamental phenomenon. I think a sizable fraction of LW is of like mind, though I can only speak for myself.

I would even go further and say that I don't believe in objective contingent morality. Certainly, most people have an individual idea of what they find moral. However, this only establishes that there is an objective contingent response to the question "what do you find moral?" There is similarly an objective contingent response to the related question "what is morality?", or the question "what is the difference between right and wrong?" Sadly, I expect the responses in each case to differ (due to framing effects, at the very least). To me, this shows that unless you define "morality" quite tightly (which could require some arbitrary decisions on your part), your construction is not well defined.

Note that I expect that last paragraph to be more relativist than most other people here, so I definitely speak only for myself there.

I need a protocol for dangerous or disconcerting ideas.

Okay, but at best, this shows that the immediate cause of you being shaken and coming out of it is related to fearful epiphanies. Is it not plausible that whether, at a given time, you find a particular idea horrific or are able to accept a solution as satisfying depends on your mental state?

Consider this hypothetical narrative. Let Frank (name chosen at random) be a person suffering from occasional bouts of depression. When he is healthy, he notices and enjoys interacting with the world around him. When he is depressed, he instead focuses on real or imagined problems in his life - and in particular, how stressful his work is.

When asked, Frank explains that his depression is caused by problems at work. He explains that when he gets assigned a particularly unpleasant project, his depression flares up. The depression doesn't clear up until things get easier. Frank explains that once he finishes a project and is assigned something else, his depression clears up (unless the new project is just as bad); or sometimes, through much struggle, he figures out how to make the project bearable, and that resolves the depression as well.

Frank is genuine in expressing his feelings, and correct about work problems being correlated with his depression, but he is wrong about causation between the two.

Do you find this story analogous to your situation? If not, why not?

I need a protocol for dangerous or disconcerting ideas.


Just wanted to say that this is well thought out and well written - it is what I would have tried to say (albeit perhaps less eloquently) if it hadn't been said already. I wish I had more than one up-vote to give.


I would urge you to give the ideas here more thought. Part of the point here is that you are going to be strongly biased toward thinking your explanations are of the first sort and not the second. By virtue of being human, you are almost certainly biased in certain predictable ways, this being one of them. Do you disagree?

Let me ask you this: what would it take to change your mind, i.e. to convince you that the explanation for this pattern is one of the latter three reasons and not one of the former three?

I need a protocol for dangerous or disconcerting ideas.

I definitely know that my depression is causally tied to my existential pessimism.

Out of curiosity, how do you know that this is the direction of the causal link? The experiences you have mentioned in the thread seem to also be consistent with depression causing you to get hung up on existential pessimism.

A Proposal for Defeating Moloch in the Prison Industrial Complex

Your argument assumes that the algorithm and the prisons have access to the same data. This need not be the case - in particular, if a prison bribes a judge to over-convict, the algorithm will be (incorrectly) relying on said conviction as data, skewing the predicted recidivism measure.

That said, the perverse incentive you mentioned is absolutely in play as well.
