Yet Another "Rational Approach To Morality & Friendly AI Sequence"

by mwaser · 1 min read · 6th Nov 2010 · 61 comments

-13

Personal Blog

Premise:  There exists a community whose top-most goal is to maximally and fairly fulfill the goals of all of its members.  They are approximately as rational as the 50th percentile of this community.  They politely invite you to join.  You are in no imminent danger.

 

Do you:

  • Join the community with the intent to wholeheartedly serve their goals
  • Join the community with the intent to be a net positive while serving your goals
  • Politely decline with the intent to trade with the community whenever beneficial
  • Politely decline with the intent to avoid the community
  • Join the community with the intent to only do what is in your best interest
  • Politely decline with the intent to ignore the community
  • Join the community with the intent to subvert it to your own interest
  • Enslave the community
  • Destroy the community
  • Ask for more information, please

 

Premise:  The only rational answer given the current information is the last one.

 

The hypothesis that I'm investigating (originally worded as "what I'm attempting to eventually prove") is whether "Option 2 is the only long-term rational answer". (Yes, this directly challenges several major current premises so my arguments are going to have to be totally clear.  I am fully aware of the rather extensive Metaethics sequence and the vast majority of what it links to and will not intentionally assume any contradictory premises without clear statement and argument.)

 

It might be an interesting and useful exercise for the reader to stop and specify what information they would look for next before continuing.  It would be nice if an ordered list could be developed in the comments.

 

Obvious Questions:

 

<Spoiler Alert>

 

 

  1. What happens if I don’t join?
  2. What do you believe that I would find most problematic about joining?
  3. Can I leave the community and, if so, how and what happens then?
  4. What are the definitions of "maximally" and "fairly"?
  5. What are the most prominent subgoals? What are the rules?

 


I feel like a jerk for saying this, but in the four days since you announced your intention to cut back on top-level posting for a while, this is the second top-level post you've made.

To be blunt, you are violating community norms by posting large quantities of material despite general disinterest or disapproval from other community members. Before making future posts, please try to calculate the expected value of their content for your readers. When in doubt, refrain from posting.

This discussion section is intended to have lower standards than the main section of the site, but even so those standards are much higher than those of almost any other internet discussion forum.

While I can't speak for anyone else here, I would appreciate it if you would cease making top-level posts entirely until your karma rises above 0.

Too ambiguous (something of premature abstraction). It's not clear what most of the elements of this post refer to, so it's not possible to have a clear discussion about them.

Too ambiguous. It's not clear which elements aren't clear to you, so it's not possible to fix the problem.

Pretty much everything. To fix the problem, give an example.

Thank you very much.

Premise: Most developed nations are such a community although the goal is certainly not explicit.

Do you believe that premise is flawed?

I think that premise is very wrong. If "developed nations" is the model you had in mind while writing, I can understand why most commenters find this post confusing. I guessed you meant something like an internet community like LW. Attempting to abstract over these things seems problematic, as pointed out by Vladimir Nesov.

What does it mean to "join" a nation? To be "invited to join"? To choose whether to do so or not? In what sense does a nation have a top-level goal (explicit or otherwise)? In what sense is a nation rational or otherwise? How does a nation identify the goals of its members?

Acquiring citizenship is joining a nation. People who are not only allowed to acquire citizenship but encouraged to do so are "invited to join". To choose whether to do so or not is to file the necessary papers and perform the necessary acts. I think that these answers should be obvious.

A nation has a top-most goal if all of its goals do not conflict with that goal. This is more specific than a top-level goal.

A nation is rational to the extent that its actions promote its goals. Did you really have to ask this?

How does a nation identify the goals of its members? My immediate reaction is the quip "Not very well". A better answer is "that is what government is supposed to be for". I have no interest and no intention to get into politics. The problem with my providing a specific example, particularly one that falls short in the rationality department from what was stated in the premise, is that people tend to latch on to the properties of the example in order to argue rather than considering the premise. Current "developed nations" are a very poor, imperfect, irrational echo of the model I had in mind but they are the closest existing (and therefore easily/clearly cited) example I could think of.

In fact, let me change my example to a theoretical nation where Eliezer has led a group of the best and brightest LessWrong individuals to create a new manmade-island-based nation with a unique new form of government. Would you join if invited?

Would you join if invited?

And this is still too abstract. Depending on detail of the situation, either decision might be right. For example, I might like to remain where I am, thank you very much.

Worse, so far I've seen no motivation for the questions of this post, and what discussion happened around it was fueled by making equally unmotivated arbitrary implicit assumptions not following from the problem statement in the post. It's the worst kind of confusion when people start talking about the topic as if understanding each other, when in fact the direction of their conversation is guided by any reasons but the content of the topic in question. Cargo cult conversation (or maybe small talk).

And this is still too abstract. Depending on detail of the situation, either decision might be right. For example, I might like to remain where I am, thank you very much.

So I take it that you are heavily supporting the initial post's "Premise: The only rational answer given the current information is the last one."

Worse, so far I've seen no motivation for the questions of this post, and what discussion happened around it was fueled by making equally unmotivated arbitrary implicit assumptions not following from the problem statement in the post.

Thank you. I didn't clearly understand the need for the explicit inclusion of motivation before.

The reason I ask questions which you think have obvious answers is that I think the easily-stated obvious answers make large, blurry assumptions. For example:

A nation is rational to the extent that its actions promote its goals.

What are the actions of a nation? The aggregate actions of the population? Those of the head of state? What about lower-level officials in government? Large companies based in the nation?

A nation has a top-most goal if all of its goals do not conflict with that goal.

Ok, I should have started with a more basic question then. What does it mean for a nation to have any goal?

I agree that nations are not a great example. After all, acquiring citizenship usually means emigration, new rights of travel, change in economic circumstances and often loss of previous citizenship. All of these overwhelm any considerations about rationality of the new nation.

Ah. Now I see your point.

The actions of a nation are those which are caused by its governance structure, just as your actions are those which are caused by your brain. A fever or your stomach growling is not your action in the same sense that actions by lower-level officials and large companies are not the actions of a nation -- particularly when those officials and companies are subsequently censured or there is some later attempt to rein them in. Actions of the duly recognized head of state acting in a national capacity are actions of the nation unless they are subsequently overruled by the rest of the governance structure -- which is pretty much the equivalent of your having an accident or making a mistake.

A nation has explicit goals when it declares those goals through its governance structure.

A nation has implicit goals when its governance structure appears to be acting in a fashion resembling rational behavior for having those goals and there is not an alternative explanation.

I proposed a specific source of ambiguity elsewhere in the thread.

[anonymous] 11y 6

How is "investigating is whether Option 2 is the only long-term rational answer" different from investigating which options are long-term rational answers? And why are you choosing to focus on the former, rather than on the latter?

As a heads up, that point has been addressed (from a slightly different angle) elsewhere in the thread. You might yet get it across better than I did, though.

ETA: Oh, I guess you're objecting to the wording change which was made in response to the earlier comment. Carry on then.

"Option 2 is the only long-term rational answer" is a clear hypothesis. It is disproved if any of the other options is also a long-term rational answer. "Which options are long-term rational answers?" is a question, not a hypothesis.

Reread Einstein's Arrogance

[anonymous] 11y 4

In other words, you're still investigating the same things (possibly with different stopping criteria -- e.g. you'd be done if you disproved your hypothesis), but you have substantial evidence in favor of your hypothesis already. Am I understanding you correctly?

I'm not sure the blog post you're linking to is helpful, though. One could come up with your list of options without having done any prior investigation. In other words, unlike Einstein, it's entirely plausible to be at the stage where you're considering Option 2 without having evidence favoring Option 2 over the others. And even if you have 50% certainty in Option 2, that only implies 3-4 bits of evidence.
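As a rough check on the "3-4 bits" figure: the following is only a sketch, assuming a uniform prior over the ten options listed in the post (the comment does not state which prior it has in mind) and measuring evidence as the base-2 log of the ratio of posterior odds to prior odds. The function name `bits_of_evidence` is made up for illustration.

```python
import math


def bits_of_evidence(prior: float, posterior: float) -> float:
    """Bits of evidence needed to move a probability from prior to posterior.

    Evidence is measured as log2(posterior odds / prior odds).
    """
    prior_odds = prior / (1 - prior)
    posterior_odds = posterior / (1 - posterior)
    return math.log2(posterior_odds / prior_odds)


# Assumption: ten options, each equally likely before any investigation.
n_options = 10
uniform_prior = 1 / n_options

# Moving Option 2 from 1-in-10 to 50% certainty.
bits = bits_of_evidence(uniform_prior, 0.5)
print(f"{bits:.2f} bits")  # roughly 3.17 bits, i.e. log2(9)
```

Under that assumption the figure comes out a little over 3 bits, which is consistent with the "3-4 bits" estimate. A different prior would give a different number.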

And I think the mistrust you see in the comments is due precisely to the absence of evidence from your post. Which is weakly evidence of absence. Granted, I don't think your post is intended to present all your evidence, but seeing some of it first would help frame your discussion.

Upvote from me! Yes, you are understanding me correctly.

One could indeed come up with my list of options without having done any prior investigation. But would one share it with others? My pointing at that particular post is meant to be a signal that I grok that it is not rational to share it with others until I believe that I have strong evidence that it is a strong hypothesis and have pretty much run out of experiments that I can conduct by myself that could possibly disprove the hypothesis.

Skepticism is desired as long as it doesn't interfere with the analysis of the hypothesis. If mistrust leads someone to walk away from a hypothesis that would be of great interest to them, if true, without fairly analyzing the hypothesis, that's a problem.

Yes, I realize that I still am lacking some of the skills necessary to present and frame a discussion here. I should have presented an example as Vladimir pointed out. I'm under the impression that evidence isn't necessarily appropriate at this point. If people would leap in to correct me if that is incorrect, it would be appreciated.

The question "which options are long-term rational answers?" corresponds immediately to the hypothesis "among the options are some long-term rational answers" and can be investigated in the same way.

Mind you, "long-term rational answer" is not well-defined; I guess you mean something influenced by ideas like Nash equilibrium and evolutionarily stable strategy. What is a "short-term rational answer"?

The post you link to is irrelevant to Misha's reasonable question, except insofar as it contains discussion of hypotheses. If you really think that people here need to be educated as to what a hypothesis is, then a) it'd be better to link to a Wikipedia definition and b) why are you bothering to post here?

The question "which options are long-term rational answers?" corresponds immediately to the hypothesis "among the options are some long-term rational answers" and can be investigated in the same way.

Incorrect. Prove that one option is a long-term rational answer and you have proved the hypothesis "among the options are some long-term rational answers". That is nowhere near completely answering the question "which options are long-term rational answers?"

My hypothesis was much, much more limited than "among the options are some long-term rational answers". It specified which of the options was a long-term rational answer. It further specified that all of the other options were not long-term rational answers. It is much, much easier to disprove my hypothesis than the broader hypothesis "among the options are some long-term rational answers" which gives it correspondingly more power.

If you really think that people here need to be educated as to what a hypothesis is, then a) it'd be better to link to a Wikipedia definition and b) why are you bothering to post here?

Fully grokking Eliezer's post that I linked would have given you all of the above reply. The Wikipedia definition is less clear than Eliezer's post. I post here because this community is more than capable of helping/forcing me to clarify my logic and rationality.

Could someone give me a hint as to why this particular comment which was specifically in answer to a question is being downvoted? I don't get it.

I didn't downvote because you were right that the hypothesis I provided (there are some rational options) was not equivalent to the question (which are the rational options). This is quite a fundamental point, so extra black marks to me for being careless.

However, Einstein's Arrogance doesn't deal with this fundamental point, so I disagree with "would have given you all of the above reply" and still dispute its relevance to Misha's original comment.

ETA: also you didn't address "what is a short-term rational answer?". Maybe these are possible reasons for downvoting?

Conditional on the weaker claim that you should join the community, option 5 is tautologically the correct one; you just happened to phrase it in a way that sounds evil. "Do what is rational" and "do what is in your best interest" both mean "take the most effective actions to optimise your utility function".

My first question would be "How do you go about trying to fulfill the community's metagoal?" which is very nearly the same question as "What does it mean to be a member of this community?"

But my question for you is, why do you already know what you're eventually trying to prove when you haven't even settled on which questions to ask yet? Data (even hypothetical data) first, conclusions after.

Too abstract, I don't understand. Please explain the motivation and describe the question more thoroughly.

Also, upvoted because while I think this post was in error, I think it is better that buggy thinking be exposed and corrected rather than continue to be held in private. Rationality isn't about being more right, it's about becoming more right than you currently are, and it appears (maybe I'm wrong about this?) that mwaser has good intentions in this regard.

Thank you. As I said below, I didn't clearly understand the need for the explicit inclusion of motivation before. I now see that I need to massively overhaul the question and include motivation (as well as make a lot of other recommended changes).

The post has a ton of errors but I don't understand why you think it was in error. Given that your premise about my intentions is correct, doesn't your argument mean that posting was correct? Or, are you saying that it was in error due to the frequency of posting?

The post has a ton of errors but I don't understand why you think it was in error.

Tricky words. I meant simply that it had errors. Of course I agree that even a flawed post is useful (in that it helps to expose buggy thinking), but here it seems like you're attempting to argue about what it means for a post to be "in error." Taboo the word "error" and I don't think we disagree.

[anonymous] 11y 0

Your "first question" is excellent. Your question for me is even better.

What I'm trying to eventually prove is called a hypothesis. If I can disprove it, that is equally valuable to me.

Hypothesis first. Experimental design second. Then data and conclusions.