Common mistakes people make when thinking about decision theory

Here are a few more from my list:

Answering "The right solution to this decision problem is X" (and seemingly being satisfied with that) when the answer that's generally wanted is of the form "Y is the right decision theory, and here's why Y gives the right answers to this and other tricky decision problems".

Taking speculative ideas too seriously and trying to apply them to real life before the necessary details have been worked out.

Doing decision theory research might be a mistake in itself, if your goal is a positive Singularity and not advancing decision theory per se or solving interesting puzzles. (I had a philosophical interest in decision theory before I came to OB/LW. Cousin_it sees it mainly as a source of cool math problems. So both of us are excused. :)

[-][anonymous]14y60

Doing decision theory research might be a mistake in itself, if your goal is a positive Singularity and not advancing decision theory per se or solving interesting puzzles.

Isn't it sort of a moral imperative to familiarize oneself with the foundations of decision theory? It seems sort of important for understanding the foundations of epistemology, morality, ontology of agency, et cetera, which are things it'd be helpful to understand if you were trying to be morally justified. I guess this is to some extent what you meant by "philosophical interest"? -- Will Newsome on Luke's computer

[-]cousin_it14y10

The first one is a perfect fit, I've seen many people get stuck in exactly that way. Thanks!

[-]patrickscottshields14y30

My education in decision theory has been fairly informal so far, and I've had trouble understanding some of your recent technical posts because I've been uncertain about what assumptions you've made. I think more explicitly stating your assumptions could lessen the frequency of arguments about assumptions by decreasing the frequency of readers mistakenly believing you've made different assumptions. It could also decrease inquiries about your assumptions, like the one I made on your post on the limited predictor problem.

One way to do this could be to, in your posts, link to other works that define your assumptions. Such links could also function to connect less-experienced readers with relevant background reading.

[-]cousin_it14y30

Do you understand these posts now, or do you have any other questions about them? I'll be glad to answer your questions and use that to learn how to communicate more clearly.

[-]patrickscottshields14y20

I have several questions. I hadn't asked them because I thought I should do more research before taking up your time. Here are some examples:

What does it mean to solve the limited predictor problem? In what form should a solution be—an agent program?
What is a decision, more formally? I'm familiar with the precondition/effect paradigm of classical AI planning but I've had trouble conceptualizing Newcomb's problem in that paradigm.
What, formally, is an agent? What parameters/inputs do your agent programs take?
What does it mean for an agent to prove a theorem in some abstract formal system S?

I will plan to do more research and then ask more detailed questions in the relevant discussion threads if I still don't understand.

I think my failure to comprehend parts of your posts is more due to my lack of familiarity with the subject matter than your communication style. Adding links to works that establish the assumptions or formal systems you're using could help less advanced readers start learning that background material without you having to significantly lengthen your posts.

Thanks for the help!

[-]cousin_it14y110

1) Yes, the solution should be an agent program. It can't be something as simple as "return 1", because when I talk about solving the LPP, there's an implicit desire to have a single agent that solves all problems similar enough to the LPP, for example the version where the agent's actions 1 and 2 are switched, or where the agent's source code has some extra whitespace and comments compared to its own quined representation, etc.

2) We imagine the world to be a program with no arguments that returns a utility value, and the agent to be a subprogram within the world program. Even though the return value of an argumentless program is just a constant, the agent can still try to "maximize that constant", if the agent is ignorant about it in just the right way. For example, if the world program calls the agent program and then returns 0 or 1 depending on whether the agent's return value was even or odd, and the agent can prove a theorem to that effect by looking at the world program's source code, then it makes sense for the agent to return an odd value.
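The parity example can be sketched in a few lines of Python. This is only an illustration under loud assumptions: the world is modeled as a function of the agent's return value (not as an argumentless program containing the agent), and the agent evaluates that function directly instead of searching for proofs. The names `parity_world`, `agent`, and `candidate_actions` are hypothetical.

```python
from typing import Callable, Iterable

# Assumption: a "world" is a function from the agent's return value to a
# utility. The setting in the comment uses an argumentless world program
# that calls the agent internally; this is a simplification.
World = Callable[[int], int]


def parity_world(agent_output: int) -> int:
    """Return utility 1 if the agent's output is odd, 0 if it is even."""
    return 1 if agent_output % 2 == 1 else 0


def agent(world: World, candidate_actions: Iterable[int]) -> int:
    """Pick the candidate action that leads to the highest utility.

    Assumption: calling world(a) stands in for proving a theorem of the
    form "if the agent returns a, then the world returns u".
    """
    return max(candidate_actions, key=world)


chosen = agent(parity_world, range(4))
utility = parity_world(chosen)
```

Here the agent ends up returning an odd value, which is the behavior the paragraph above argues for. A real agent of the kind discussed would reach it through proof search over the world's source code, not through direct evaluation.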

Newcomb's Problem can be formalized as a world program that makes two calls to the agent (or maybe one call to the agent and another call to something provably equivalent). The first call's return value is used to set the contents of the boxes, and the second one represents the agent's actual decision. If a smart enough agent receives the world's source code as an argument (which includes possibly mangled versions of the agent's source code inside), and the agent knows its own source code (by quining), then the agent can prove a theorem saying that one-boxing would logically imply higher utility than two-boxing. That setting is explored in a little more detail here.
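As a rough sketch of the two-call structure, assuming the conventional payoffs of 1,000,000 and 1,000 (the comment does not specify amounts) and using hypothetical names throughout:

```python
from typing import Callable

# Assumption: an agent is an argumentless function returning
# 1 (take only the big box) or 2 (take both boxes).
Agent = Callable[[], int]

BIG_PRIZE = 1_000_000  # assumed payoff, not stated in the comment
SMALL_PRIZE = 1_000    # assumed payoff, not stated in the comment


def newcomb_world(agent: Agent) -> int:
    """World program that calls the agent twice and returns a utility."""
    # First call: the predictor's simulation, used to fill the boxes.
    prediction = agent()
    big_box = BIG_PRIZE if prediction == 1 else 0
    # Second call: the agent's actual decision.
    decision = agent()
    return big_box if decision == 1 else big_box + SMALL_PRIZE


def one_boxer() -> int:
    return 1


def two_boxer() -> int:
    return 2


one_box_utility = newcomb_world(one_boxer)
two_box_utility = newcomb_world(two_boxer)
```

Because both calls go to the same deterministic agent, the prediction always matches the decision, so the one-boxer comes out ahead. The hard part discussed in this thread, an agent that derives this by reasoning about the world's source code, is not modeled here.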

Before you ask: no, we don't know any rigorous definition of what it means to "maximize" the return value of an argumentless program in general. We're still fumbling with isolated cases, hoping to find more understanding. I'm only marginally less confused than you about the whole field.

3) You can think about an agent as a program that receives the world's source code as an argument, so that one agent can solve many possible world programs. I usually talk about agents as if they were argumentless functions that had access to the world's source code via quining, but that's just to avoid cluttering up the proofs. The results are the same either way.

4) Usually you can assume that S is just Peano arithmetic. You can represent programs by their Gödel numbers, and write a PA predicate saying "program X returns integer Y". You can also represent statements and proofs in PA by their Gödel numbers, and write a PA predicate saying "proof X is a valid proof of statement Y". You can implement both these predicates in your favorite programming language, as functions that accept two integers and return a boolean. You can have statements referring to these predicates by their Gödel numbers. The diagonal lemma gives you a generalized way to make statements refer to themselves, and quining allows you to have programs that refer to their own source code. You can have proofs that talk about programs, programs that enumerate and check proofs, and generally go wild.

For example, you can write a program P that enumerates all possible proofs trying to find a valid proof that P itself returns 1, and returns 1 if such a proof is found. To prove that P will in fact return 1 and not loop forever, note that it's just a restatement of Löb's theorem. That setting is explored in a little more detail here.

Please let me know if the above makes sense to you!

[-]patrickscottshields14y00

Thank you. Your comment resolved some of my confusion. While I didn't understand it entirely, I am happy to have accrued a long list of relevant background reading.

[-]Wei Dai14y30

Which of these mistakes do you attribute to insufficient rationality? Which are due to insufficient intelligence? Which are "I made the best bet possible given the information I had or could have obtained, and just turned out to be wrong"?

1: Nash equilibria (such as (D,D) in PD) essentially assume independence between different agents' decisions: P(B()=b | A()=a) = P(B()=b | A()!=a). It took Eliezer to realize that this assumption is not always valid and the opposite assumption may be more relevant for some decision problems, especially those involving AIs. If he didn't "argue" about assumptions, how would he transmit his insight to others? You observe a correlation between less arguing over assumptions and more interesting discussions/results, but isn't it possible that both are caused by higher intelligence (i.e., smarter people could more quickly see that Eliezer had a point)?

2: This is more clearly a failure of rationality. In these situations I think one ought to ask "Why do I think I have the right solution when there are so many smart people who profess to be confused or disagree with me? Am I sure they haven't already thought through my proposed solution and found it wanting, and I'm also confused but just don't realize it?"

2A: From Eliezer's perspective at that time, he thought he had the right central insight to the problem and there were just technical loose ends to be tied up, and his time could be better spent doing other things. It's not clear that was a mistake, even in retrospect, especially if you consider that his interest was eventually building an FAI, not doing decision theory research per se.

[-]cousin_it14y30

All these mistakes look more like failures of rationality to me, because smart people make them too.

It took Eliezer to realize that this assumption is not always valid

At least some of the credit goes to Hofstadter. In any case, I think people listened to Eliezer more because he said things like "I have worked out a mathematical analysis of these confusing problems", not just "my intuition says the basic assumptions of game theory don't sound right". If you do explore the implications of your alternative set of assumptions and they turn out to be interesting, you're exempt from mistake #1.

You observe a correlation between less arguing over assumptions and more interesting discussions/results, but isn't it possible that both are caused by higher intelligence

Intelligence is certainly a common factor, but I also observe that correlation by looking at myself at different times. If I get more interesting results when I view problems on their own terms, that strategy might work for other people too.

[-]Wei Dai14y60

At least some of the credit goes to Hofstadter.

The difference between Hofstadter and Eliezer is that Hofstadter couldn't make a convincing enough case for his assumptions, because he was talking about humans instead of AIs, and it's just not clear that human decision procedures are similar enough to each other for his assumptions to hold. Eliezer also thought his ideas applied to humans, but he had a backup argument to the effect "even if you don't think this applies to humans, at least it applies to AIs who know each others' source code, so it's still important to work on" and that's what convinced me.

BTW, for historical interest, I found a 2002 post by Hal Finney that came pretty close to some of the ideas behind TDT:

I have a problem with this application of game theory to a situation where A and B both know that they are going to choose the same thing, which I believe is the case here. [...]

[...] They are two instances of the same deterministic calculation, with exactly the same steps being executed for both.

[...] And the best of the two possible outcomes is when both parties cooperate rather than defect.

I responded to Hal, and stated my agreement, but neither of us followed it up at the time. I even forgot about the post until I found it again yesterday, but I guess it must have influenced my thinking once Eliezer started talking about similar ideas.

[-]Wei Dai14y50

In any case, I think people listened to Eliezer more because he said things like "I have worked out a mathematical analysis of these confusing problems", not just "my intuition says the basic assumptions of game theory don't sound right".

Personally, I thought he made a good case that the basic assumptions of game theory aren't right, or rather won't be right in a future where superintelligent AIs know each others' source code. I don't think I would have been particularly interested if he just said "these non-standard assumptions lead to some cool math" since I don't have that much interest in math qua math.

Similarly, I explore other seemingly strange assumptions like the ones in Newcomb's Problem or Counterfactual Mugging because I think they are abstracted/simplified versions of real problems in FAI design and ethics, designed to isolate and clarify some particular difficulties, not because they are "interesting when taken on its own terms".

I guess it appears to you that you are working on these problems because they seem like interesting math, or "interesting when taken on its own terms", but I wonder why you find these particular math problems or assumptions interesting, and not the countless others you could choose instead. Maybe the part of your brain that outputs "interesting" is subconsciously evaluating importance and relevance?

[-]cousin_it14y130

I guess it appears to you that you are working on these problems because they seem like interesting math, or "interesting when taken on its own terms", but I wonder why you find these particular math problems or assumptions interesting, and not the countless others you could choose instead. Maybe the part of your brain that outputs "interesting" is subconsciously evaluating importance and relevance?

An even more likely explanation is that my mind evaluates reputation gained per unit of effort. Academic math is really crowded, chances are that no one would read my papers anyway. Being in a frustratingly informal field with a lot of pent-up demand for formality allows me to get many people interested in my posts while my mathematician friends get zero feedback on their publications. Of course it didn't feel so cynical from the inside, it felt more like a growing interest fueled by constant encouragement from the community. If "Re-formalizing PD" had met with a cold reception, I don't think I'd be doing this now.

[-]Wei Dai14y70

In that case you're essentially outsourcing your "interestingness" evaluation to the SIAI/LW community, and I think we are basing it mostly on relevance to FAI.

[-]cousin_it14y50

Yeah. Though that doesn't make me adopt FAI as my own primary motivation, just like enjoying sex doesn't make me adopt genetic fitness as my primary motivation.

[-]Wei Dai14y10

My point is that your advice isn't appropriate for everyone. People who do care about FAI or other goals besides community approval should think/argue about assumptions. Of course one could overdo that and waste too much time, but they clearly can't just work on whatever problems seem likely to offer the largest social reward per unit of effort.

Though that doesn't make me adopt FAI as my own primary motivation

What if we rewarded you for adopting FAI as your primary motivation? :)

[-]cousin_it14y00

What if we rewarded you for adopting FAI as your primary motivation? :)

That sounds sideways. Wouldn't that make the reward my primary motivation? =)

[-]Wei Dai14y40

No, I mean what if we offered you rewards for changing your terminal goals so that you'd continue to be motivated by FAI even after the rewards end? You should take that deal if we can offer big enough rewards and your discount rate is high enough, right? Previous related thread

[-]roystgnr14y30

You're trying to affect the motivation of a decision theory researcher by offering a transaction whose acceptance is itself a tricky decision theory problem?

Upvoted for hilarious metaness.

Now, all we need to do is figure out how humans can modify their own source code and verify those modifications in others...

[-]cousin_it14y00

That could work, but how would that affect my behavior? We don't seem to have any viable mathematical attacks on FAI-related matters except this one.

[-]torekp14y00

If you do explore the implications of your alternative set of assumptions and they turn out to be interesting, you're exempt from mistake #1.

I suggest editing the post to include this point.

[-]orthonormal14y10

In retrospect, someone definitely needed to post about this. Thanks for thinking of it!

[-]Pavitra14y00

I didn't want to name any names in this post because my status on LW puts me in a kinda position of power.

Have you considered using a different pseudonym for each post, and never decloaking any of the names?

The social implications of such a practice would interact with the existence of the anti-kibitzer, I think; it would amount to forcing others to experience (some) antikibitz features even if they had chosen not to. On the other hand, if the anti-kibitzer didn't already exist, I probably would have advocated writer-side over reader-side implementation/enforcement of pseudonymity.

[-]cousin_it14y00

I think people have more reasons to trust a post like this if they know it's coming from someone who got actual results. As for my more mathy posts, their style is probably so distinctive that using a pseudonym would be futile.

[-]John_Maxwell14y00

To what degree do these points apply to mathematics research in general?

[-]jsteinhardt14y00

I think they apply pretty widely. Or rather, the extent to which they apply to decision theory is roughly the extent they apply to mathematics in general.

[-]cousin_it14y00

I don't know, but would guess that the wider you try to apply them, the less correct they become.

Mistake #1: Arguing about assumptions

Mistake #2: Stopping when your idea seems good enough

Mistake #2A: Stopping when your idea actually is good enough