Decisions are not about changing the world, they are about learning what world you live in

[-]Richard_Kennaway7y222

We know that physics does not support the idea of metaphysical free will. By metaphysical free will I mean the magical ability of agents to change the world by just making a decision to do so.

According to my understanding of the ordinary, everyday, non-magical meanings of the words "decide", "act", "change", etc., we do these things all the time. So do autonomous vehicles, for that matter. So do cats and dogs. Intention, choice, and steering the world into desired configurations are what we do, as do some of our machines.

It is strange that people are so ready to deny these things to people, when they never make the same arguments about machines. Instead, for example, they want to know what a driverless car saw and decided when it crashed, or protest that engine control software detected when it was under test and tuned the engine to misleadingly pass the emissions criteria. And of course there is a whole mathematical field called "decision theory". It's about decisions.

After all, what's the point of making decisions if you are just a passenger spinning a fake steering wheel not attached to any actual wheels?

The simile contradicts yo... (read more)

1Shmi7y

We perceive the world as if we were intentionally doing them, yes. But there is no "top-down causation" in physics that supports this view. And our perspective on agency depends on how much we know about the "agent": the more we know, the less agenty the entity feels. It's a known phenomenon. I mentioned it before a couple of times, including here and on my blog.

9Richard_Kennaway7y

"The sage is one with causation." The same argument that "we" do not "do" things, also shows that there is no such thing as a jumbo jet, no such thing as a car, not even any such thing as an atom; that nothing made of parts exists. We thought protons were elementary particles, until we discovered quarks. But no: according to this view "we" did not "think" anything, because "we" do not exist and we do not "think". Nobody and nothing exists. All that such an argument does is redefine the words "thing" and "exist" in ways that no-one has ever used them and no-one ever consistently could. It fails to account for the fact that the concepts work. You say that agency is bugs and uncertainty, that its perception is an illusion stemming from ignorance; I say that agency is control systems, a real thing that can be experimentally detected in both living organisms and some machines, and detected to be absent in other things.

7Shmi7y

and Actually, using the concepts that work is the whole point of my posts on LW, as opposed to using the concepts that feel right. I dislike the terms like "exist" as pointing to some objective reality, and this is where I part ways with Eliezer. To me it is "models all the way down." Here is another post on this topic from a few years back: Mathematics as a lossy compression algorithm gone wild. Once you consciously replace "true" with "useful" and "exist" with "usefully modeled as," a lot of confusion over what exists and what does not, what is true and what is false, what is knowledge and what it belief, what is objective and what is subjective, simply melts away. In this vein, it is very much useful to model a car as a car, not as a transient spike in quantum fields. In the same vein, it is useful to model the electron scattering through double slits as a transient spike in quantum fields, and not as a tiny ping-ping ball that can sometimes turn into a wave. I agree that a lot of agent-looking behavior can be usefully modeled as a multi-level control system, and, if anything, this is not done enough in biology, neuroscience or applied philosophy, if the latter is even a thing. By the same token, the control system approach is a useful abstraction for many observed phenomena, living or otherwise, not just agents. It does not lay claim to what an agent is, just what approach can be used to describe some agenty behaviors. I see absolutely no contradiction with what I said here or elsewhere. Maybe one way to summarize my point in this post is that modeling the decisions as learning about oneself and the world is more useful for making good decisions that modeling an agent as changing the world with her decisions.

6Richard_Kennaway7y

It seems to me that the concepts "jumbo jet", "car", and "atom" all work. If they "feel right", it is because they work. "Feeling right" is not some free-floating attribute to be bestowed at will on this or that. A telling phrase in the post you linked is "for some reason": Unless you can expand on that "some reason", this is just pushing under the carpet the fact that certain things work spectacularly well, and leaving Wigner's question unanswered. Thought and action are two different things, as different as a raven and a writing desk.

2Shmi7y

Will only reply to one part, to highlight our basic (ontological?) differences: A thought is a physical process in the brain, which is a part of the universe. An action is also a physical process in the universe, so it is very much like a thought, only more visible to those without predictive powers.

-1TAG7y

If choice and counterfactuals exist, then an action is something that can affect the future, while a thought is not. Of course, that difference no longer applies if your ontology doesn't feature choices and countefactuals... What your ontology should be is "nothing" or "mu". You are not keeping up to your commitments.

2Shmi7y

We seem to have very different ontologies here, and not converging. Also, telling me what my ontology "should" be is less than helpful :) It helps to reach mutual understanding before giving prescriptions to the other person. Assuming you are interested in more understanding, and less prescribing, let me try again to explain what I mean. In the view I am describing here "choice" is one of the qualia, a process in the brain. Counterfactuals is another, related, quale, the feeling of possibilities. Claiming anything more is a mind projection fallacy. The mental model of the world changes with time. I am not even claiming that time passes, just that there is a mental model of the universe, including the counterfactuals, for each moment in the observer's time. I prefer the term "observer" to agent, since it does not imply having a choice, only watching the world (as represented by the observer's mental model) unfold.

1TAG7y

And very different epistemologies. I am not the one denying the very possibility of knowing things about reality. All I am doing is taking you at your word. You keep saying that it is models all the way down, and there is no way to make true claims about reality. If I am not to take those comments literally, how am I to take them? How am I to guess the correct non-literal interpretation, out of the many possible ones.? That's an implicit claim about reality. Something can only be a a mind projection if there is nothing in reality corresponding to it. It is not sufficient to say that it is in the head or the model, it also has to not be in the territory, or else it is a true belief, not a mind projection.. To say that something doesn't exist in reality is to make a claim about reality as much as to say that something does. Again "in the model" does not imply "not in the territory".

1TAG7y

You seem happy enough with "not exist" as in "agents, counterfactuals and choices don't exist" If it is really possible for an agent to affect the future or street themselves into alternative futures, then there is a lot of potential utility in it, in that you can end up in a higher-utility future than you would otherwise have. OTOH, if there are no counterfactuals, then whatever utility you gain is predetermined. So one cannot assess the usefulness, in the sense of utility gain, of models, in a way independent of the metaphysics of determinism and counterfactuals. What is useful, and how useful is, depends on what is true. It contradicts the "agents don't exist thing" and the "I never talk about existence thing". If you only objective to reductively inexplicable agents, that would be better expressed as "there is nothing nonreductive". Although that still wouldn't help you come to the conclusion that there is no choice and no counterfactuals, because that is much more about determinism than reductionism.

2Shmi7y

Yep, some possible worlds have more utility for a given agent than others. And, yes, sort of. Whatever utility you gain is not your free choice, and not necessarily predetermined, just not under your control. You are a mere observer who thinks they can change the world. I don't see how. Seems there is an inferential gap there we haven't bridged.

1TAG7y

That's a statement about the world. Care to justify it?

1TAG7y

How do you know that the people who say "agents exist" don't mean "some systems can be usefully modelled as agents"? You are making a claim about reality, that counterfactuals don't exist., even though you are also making a meta claim that you don't make claims about reality. If probablistic agents[], and counterfactuals are both useful models (and I don't see how you can consistentlt assert the former and deny the latter) then counterfactuals "exist" by your* lights. [*] Or automaton, if you prefer. If someone builds a software gismo that is probablistic and acts without specific instruction, then it is an agetn and an automaton all at the same time.

1TAG7y

There is no full strength top-down determinism, but systems-level behaviour is enough to support a common-sense view of decision making.

2Shmi7y

I agree, the apparent emergent high-level structures look awfully like agents. That intentional stance tends to dissipate once we understand them more.

1TAG7y

If intentionality just mean seeking to pursue or maximise some goal, there is no reason an artificial system should not have it. But the answer is different if intentionality means having a ghost or homunculus inside. And neither is the same as the issue of whether an agent is deterministic , or capable of changing the future. More precision is needed.

1jessicata7y

Even when the agent has more compute than we do? I continue to take the intentional stance towards agents I understand but can't compute, like MCTS-based chess players.

2Shmi7y

What do you mean by taking the intentional stance in this case?

1jessicata7y

I would model the program as a thing that is optimizing for a goal. While I might know something about the program's weaknesses, I primarily model it as a thing that selects good chess moves. Especially if it is a better chess player than I am. See: Goal inference as inverse planning.

[-]Heighn4y110

Great post overall, you're making interesting points!

Couple of comments:

There are 8 possible worlds here, with different utilities and probabilities

your utility for "To smoke" and "No lesion, no cancer" should be 1,000,000 instead of 0
your utility for "Not to smoke" and "No lesion, no cancer" should be 1,000,000 instead of 0

Some decision theorists tend to get confused over this because they think of this magical thing they call "causality," the qualia of your decisions being yours and free, causing the world to change upon your metaphysical command. They draw fancy causal graphs like this one:

That seems like an unfair criticism of the FDT paper. Drawing such a diagram doesn't imply one believes causality to be magic any more than making your table of possible worlds.

Specifically, the diagrams in the FDT paper don't say decisions are "yours and free", at least if I understand you correctly. Your decisions are caused by your decision algorithm, which in some situations is implemented in other agents as well.

[-]Gordon Seidoh Worley7y80

This seems to cut through a lot of confusion present in decision theory, so I guess the obvious question to ask is why don't we already work things this way instead of the way they are normally approached in decision theory?

[-]jessicata7y190

To the extent that this approach is a decision theory, it is some variant of UDT (see this explanation). The problems with applying and formalizing it are the usual problems with applying and formalizing UDT:

How do you construct "policy counterfactuals", e.g. worlds where "I am the type of person who one-boxes" and "I am the type of person who two-boxes"? (This isn't a problem if the environment is already specified as a function from the agent's policy to outcome, but that often isn't how things work in the real world)
How do you integrate this with logical uncertainty, such that you can e.g. construct "possible worlds" where the 1000th digit of pi is 2 (when in fact it isn't)? If you don't do this then you get wrong answers on versions of these problems that use logical pseudorandomness rather than physical randomness.
How does this behave in multi-agent problems, with other versions of itself that have different utility functions? Naively both agents would try to diagonalize against each other, and an infinite loop would result.

4Shmi7y

Those are excellent questions! Thank you for actually asking them, instead of simply stating something like "What you wrote is wrong because..." Let me try to have a crack at them, without claiming that "I have solved decision theory, everyone can go home now!" "I am a one-boxer" and "I am a two-boxer" are both possible worlds, and by watching yourself work through the problem you learn in which world you live. Maybe I misunderstand what you are saying though. As of this moment, both are possible worlds for me. If I were to look up or calculate the 1000th digit of Pi, I would learn a bit more about the world I am in. Not including the lower-probability worlds like having calculating the result wrongly and so on. Or I might choose not to look it up, and both worlds would remain possible until and unless I gain, intentionally or accidentally (there is no difference, intentions and accidents are not a physical thing, but a human abstraction at the level of intentional stance), some knowledge about the burning question of the 1000th digit of Pi. Can you give an example of a problem "that uses logical pseudorandomness" where simply enumerating worlds would give a wrong answer? I am not sure in what way an agent that has a different utility function is at all yourself. An example would be good. My guess is that you might be referring to a Nash equilibrium that is a mixed strategy, but maybe I am wrong.

[-]jessicata7y110

“I am a one-boxer” and “I am a two-boxer” are both possible worlds, and by watching yourself work through the problem you learn in which world you live. Maybe I misunderstand what you are saying though.

The interesting formal question here is: given a description of the world you are in (like the descriptions in this post), how do you enumerate the possible worlds? A solution to this problem would be very useful for decision theory.

If an agent knows its source code, then "I am a one-boxer" and "I am a two-boxer" could be taken to refer to currently-unknown logical facts about what its source code outputs. You could be proposing a decision theory whereby the agent uses some method for reasoning about logical uncertainty (such as enumerating logical worlds), and selects the action such that its expected utility is highest conditional on the event that its source code outputs this action. (I am not actually sure exactly what you are proposing, this is just a guess).

If the logical uncertainty is represented by a logical inductor, then this decision theory is called "LIEDT" (logical inductor EDT) at MIRI, and it has a few problems, as explained in this p... (read more)

3Shmi7y

Thank you for your patience explaining the current leading edge and answering my questions! Let me try to see if my understanding of what you are saying makes sense. By "source code" I assume you mean the algorithm that completely determines the agent's actions for a known set of inputs, though maybe calculating these actions is expensive, hence some of them could be "currently unknown" until the algorithm is either analyzed or simulated. Let me try to address your points in the reverse order. ... Enumerating does not require simulating. It is descriptive, not prescriptive. So there are 4 possible worlds, 00, 01, 10 and 11, with rewards for player 1 being 0, 9, -1, 8, and for player 2 being 0, -1, 10, 9. But to assign prior probabilities to these worlds, we need to discover more about the players. For pure strategy players some of these worlds will be probability 1 and others 0. For mixed strategy players things get slightly more interesting, since the worlds are parameterized by probability: Let's suppose that player 1 picks each world with probabilities p and 1-p and player 2 with probabilities q and 1-q. Then the probabilities of each world are pq, p(1-q), (1-p)q and (1-p)(1-q). Then the expected utility for each world is for player 1: 0, 9p(1-q), -(1-p)q, 8(1-p)(1-q), and for player 2 0, -p(1-q), 10(1-p)q, 9(1-p)(1-q). Out of the infinitely many possible worlds there will be one with the Nash equilibrium, where each player is indifferent to which decision the other player ends up making. This is, again, purely descriptive. By learning more about what strategy the agents use, we can evaluate the expected utility for each one, and, after the game is played, whether once or repeatedly, learn more about the world the players live in. The question you posed is in tension with the whole idea of agents not being able to affect the world, only being able to learn about the world it lives in. There is no such thing as a WEDT agent. If one of the players is the type

9jessicata7y

OK, I misinterpreted you as recommending a way of making decisions. It seems that we are interested in different problems (as I am trying to find algorithms for making decisions that have good performance in a variety of possible problems). Re top down causation: I am curious what you think of a view where there are both high and low level descriptions that can be true at the same time, and have their own parallel causalities that are consistent with each other. Say that at the low level, the state type is L and the transition function is tl:L→L. At the high level, the state type is H and the nondeterministic transition function is th:H→Set(H), i.e. at a high-level sometimes you don't know what state things will end up in. Say we have some function f:L→H for mapping low-level states to high-level states, so each low-level state corresponds to a single high-level state, but a single high-level state may correspond to multiple low-level states. Given these definitions, we could say that the high and low level ontologies are compatible if, for each low level state l, it is the case that f(tl(l))∈th(f(l)), i.e. the high-level ontology's prediction for the next high-level state is consistent with the predicted next high-level state according to the low-level ontology and f. Causation here is parallel and symmetrical rather than top-down: both the high level and the low level obey causal laws, and there is no causation from the high level to the low level. In cases where things can be made consistent like this, I'm pretty comfortable saying that the high-level states are "real" in an important sense, and that high-level states can have other high-level states as a cause. EDIT: regarding more minor points: Thanks for the explanation of the multi-agent games; that makes sense although in this case the enumerated worlds are fairly low-fidelity, and making them higher-fidelity might lead to infinite loops. In counterfactual mugging, you have to be able to enumerate both t

2Shmi7y

Right. I would also be interested in the algorithms for making decisions if I believed we were agents with free will, freedom of choice, ability to affect the world (in the model where the world is external reality) and so on. Absolutely, once you replace "true" with "useful" :) We can have multiple models at different levels that make accurate predictions of future observations. I assume that in your notation tl:L→L is an endomorphism within a set of microstates L, and th:H→Set(H) is a map from a macrostate type H, (what would be an example of this state type?) to a wider set of macrostates (like what?). I am guessing that this may match up with the standard definitions of microstates and macrostates in statistical mechanics, and possibly some kind of a statistical ensemble? Anyway, so your statement is that of emergence: the evolution of microstates maps into an evolution of macrostates, sort of like the laws of statistical mechanics map into the laws of thermodynamics. In physics it is known as an effective theory. If so, I have no issue with that. Certainly one can call, say, gas compression by an external force as a cause of it absorbing mechanical energy and heating up. In the same sense, one can talk about emergent laws of human behavior, where a decision by an agent is a cause of change in the world the agent inhabits. So, a decision theory is an emergent effective theory where we don't try to go to the level of states L, be those at the level of single neurons, neuronal electrochemistry, ion channels opening and closing according to some quantum chemistry and atomic physics, or even lower. This seems to be a flavor of compatibilism. What I have an issue with is the apparent break of the L→H mapping when one postulates top-down causation, like free choice, i.e. multiple different H's reachable from the same microstate. I am confused about the low/high-fidelity. In what way what I suggested is low-fidelity? What is missing from the picture? Why would it b

9jessicata7y

My guess is that you, in practice, actually are interested in finding decision-relevant information and relevant advice, in everyday decisions that you make. I could be wrong but that seems really unlikely. Re microstates/macrostates: it seems like we mostly agree about microstates/macrostates. I do think that any particular microstate can only lead to one macrostate. By "low-fidelity" I mean the description of each possible world doesn't contain a complete description of the possible worlds that the other agent enumerates. (This actually has to be the case in single-person problems too, otherwise each possible world would have to contain a description of every other possible world) An issue with imagining a possible world where 1+1=3 is that it's not clear in what order to make logical inferences. If you make a certain sequence of logical inferences with the axiom 1+1=3, then you get 2=1+1=3; if you make a difference sequence of inferences, then you get 2=1+1=(1+1-1)+(1+1-1)=(3-1)+(3-1)=4. (It seems pretty likely to me that, for this reason, logic is not the right setting in which to formalize logically impossible counterfactuals, and taking counterfactuals on logical statements is confused in one way or another) If we fix a particular mental model of this world, then we can answer questions about this model; part of the decision theory problem is deciding what the mental model of this world should be, and that is pretty unclear.

2Shmi7y

Yes, if course I do, I cannot help it. But just because we do something doesn't mean we have the free will to either do or not do it. Right, I cannot imagine it being otherwise, and that is where my beef with "agents have freedom of choice" is. Since possible worlds are in the observer's mind (obviously, since math is a mental construction to begin with, no matter how much people keeps arguing whether mathematical laws are invented or discovered), different people may make a suboptimal inference in different places. We call those "mistakes". Most times people don't explicitly use axioms, though sometimes they do. Some axioms are more useful than others, of course. Starting with 1+1=3 in addition to the usual remaining set, we can prove that all numbers are equal. Or maybe we end up with a mathematical model where adding odd numbers only leads to odd numbers. In that sense, not knowing more about the world, we are indeed in a "low-fidelity" situation, with many possible (micro-)worlds where 1+1=3 is an axiom. Some of these worlds might even have a useful description of observations (imagine, for example, one where each couple requires a chaperone, there 1+1 is literally 3).

1TAG7y

In other words. usefulness (which DT to use) depends on truth (Which world model to use).

1TAG7y

If there is indeterminism at the micro level , there is not the slightest doubt that it can be amplified to the macro level, because quantum mechanics as an experimental science depends on the ability to make macroscopic records of events involving single particles.

2Shmi7y

Amplifying microscopic indeterminism is definitely a thing. It doesn't help the free choice argument though, since the observer is not the one making the choice, the underlying quantum mechanics does.

1TAG7y

Macroscopic indeterminism is sufficient to establish real, not merely logical, counterfactuals. Besides that, It would be helpful to separate the ideas of dualism , agency and free choice. If the person making the decision is not some ghost in the machine, then they the only thing they can be is the machine, as a total system,. In that case, the question becomes the question of whether the system as a whole can choose, could have chosen otherwise, etc. But you're in good company: Sam Harris is similarly confused.

2Shmi7y

Not condescending in the least :P There are no "real" counterfactuals, only the models in the observer's mind, some eventually proven better reflecting observations than others. It would be helpful, yes, if they were separable. Free choice as anything other than illusionism is tantamount to dualism.

1TAG7y

You need to argue for that claim, not just state it. The contrary claim is supported by a simple argument: if an even is indeterministic, it need not have happened, or need not have happened that way. Therefore, there is a real possibility that it did not happened, or happened differently -- and that is a real counterfactual. You need to argue for that claim as well.

2Shmi7y

There is no such thing as "need" in Physics. There are physical laws, deterministic or probabilistic, and that's it. "Need" is a human concept that has no physical counterpart. Your "simple argument" is an emotional reaction.

-3TAG7y

Your comment has no relevance, because probablistic laws automatically imply counterfactuals as well. In fact it's just another way of saying the same thing. I could have shown it in modal logic, too.

2Shmi7y

Well, we have reached an impasse. Goodbye.

2Shmi7y

Thank you, I am glad that I am not the only one for whom causation-free approach to decision theory makes sense. UDT seems a bit like that.

[-]Dacyn7y60

I note here that simply enumerating possible worlds evades this problem as far as I can tell.

The analogous unfair decision problem would be "punish the agent if they simply enumerate possible worlds and then choose the action that maximizes their expected payout". Not calling something a decision theory doesn't mean it isn't one.

2Shmi7y

Please propose a mechanism by which you can make an agent who enumerates the worlds seen as possible by every agent, no matter what their decision theory is, end up in a world with lower utility than some other agent.

4dxu7y

Say you have an agent A who follows the world-enumerating algorithm outlined in the post. Omega makes a perfect copy of A and presents the copy with a red button and a blue button, while telling it the following: "I have predicted in advance which button A will push. (Here is a description of A; you are welcome to peruse it for as long as you like.) If you press the same button as I predicted A would push, you receive nothing; if you push the other button, I will give you $1,000,000. Refusing to push either button is not an option; if I predict that you do not intend to push a button, I will torture you for 3^^^3 years." The copy's choice of button is then noted, after which the copy is terminated. Omega then presents the real agent facing the problem with the exact same scenario as the one faced by the copy. Your world-enumerating agent A will always fail to obtain the maximum $1,000,000 reward accessible in this problem. However, a simple agent B who chooses randomly between the red and blue buttons has a 50% chance of obtaining this reward, for an expected utility of $500,000. Therefore, A ends up in a world with lower expected utility than B. Q.E.D.

4Said Achmiz7y

Your scenario is somewhat ambiguous, but let me attempt to answer all versions of it that I can see. First: does the copy of A (hereafter, A′) know that it’s a copy? If yes, then the winning strategy is “red if I am A, blue if I am A′”. (Or the reverse, of course; but whichever variant A selects, we can be sure that A′ selects the same one, being a perfect copy and all.) If no, then indeed A receives nothing, but then of course this has nothing to do with any copies; it is simply the same scenario as if Omega predicted A’s choice, then gave A the money if A chose differently than predicted—which is, of course, impossible (Omega is a perfect predictor), and thus this, in turn, is the same as “Omega shows up, doesn’t give A any money, and leaves”. Or is it? You claim that in the scenario where Omega gives the money iff A chooses otherwise than predicted, A could receive the money with 50% probability by choosing randomly. But this requires us to reassess the terms of the “Omega, a perfect predictor” stipulation, as previously discussed by cousin_it. In any case, until we’ve specified just what kind of predictor Omega is, and how its predictive powers interact with sources of (pseudo-)randomness—as well as whether, and how, Omega’s behavior changes in situations involving randomness—we cannot evaluate scenarios such as the one you describe.

6Dacyn7y

dxu did not claim that A could receive the money with 50% probability by choosing randomly. They claimed that a simple agent B that chose randomly would receive the money with 50% probability. The point is that Omega is only trying to predict A, not B, so it doesn't matter how well Omega can predict B's actions. The point can be made even more clear by introducing an agent C that just does the opposite of whatever A would do. Then C gets the money 100% of the time (unless A gets tortured, in which case C also gets tortured).

2Said Achmiz7y

This doesn’t make a whole lot of sense. Why, and on what basis, are agents B and C receiving any money? Are you suggesting some sort of scenario where Omega gives A money iff A does the opposite of what Omega predicted A would do, and then also gives any other agent (such as B or C) money iff said other agent does the opposite of what Omega predicted A would do? This is a strange scenario (it seems to be very different from the sort of scenario one usually encounters in such problems), but sure, let’s consider it. My question is: how is it different from “Omega doesn’t give A any money, ever (due to a deep-seated personal dislike of A). Other agents may, or may not, get money, depending on various factors (the details of which are moot)”? This doesn’t seem to have much to do with decision theories. Maybe shminux ought to rephrase his challenge. After all— … can be satisfied with “Omega punches A in the face, thus causing A to end up with lower utility than B, who remains un-punched”. What this tells us about decision theories, I can’t rightly see.

5dxu7y

Yes, this is correct, and is precisely the point EYNS was trying to make when they said "Omega doesn't give A any money, ever (due to a deep-seated personal dislike of A)" is a scenario that does not depend on the decision theory A uses, and hence is an intuitively "unfair" scenario to examine; it tells us nothing about the quality of the decision theory A is using, and therefore is useless to decision theorists. (However, formalizing this intuitive notion of "fairness" is difficult, which is why EYNS brought it up in the paper.) I'm not sure why shminux seems to think that his world-counting procedure manages to avoid this kind of "unfair" punishment; the whole point of it is that it is unfair, and hence unavoidable. There is no way for an agent to win if the problem setup is biased against them to start with, so I can only conclude that shminux misunderstood what EYNS was trying to say when he (shminux) wrote

2Said Achmiz7y

I didn’t read shminux’s post as suggesting that his scheme allows an agent to avoid, say, being punched in the face apropos of nothing. (And that’s what all the “unfair” scenarios described in the comments here boil down to!) I think we can all agree that “arbitrary face-punching by an adversary capable of punching us in the face” is not something we can avoid, no matter our decision theory, no matter how we make choices, etc.

5Dacyn7y

I am not sure how else to interpret the part of shminux's post quoted by dxu. How do you interpret it?

2Shmi7y

It seems to be a good summary of what dxu and Dacyn were suggesting! I think it preserves the salient features without all the fluff of copying and destroying, or having multiple agents. Which makes it clear why the counterexample does not work: I said "the worlds seen as possible by every agent, no matter what their decision theory is," and the unpunched world is not a possible one for the world enumerator in this setup. My point was that CDT makes a suboptimal decision in Newcomb, and FDT struggles to pick the best decision in some of the problems, as well, because it is lost in the forest of causal trees, or at least this is my impression from the EYNS paper. Once you stop worrying about causality and the agent's ability to change the world by their actions, you end up with a simper question "what possible world does this agent live in and with what probability?"

3Dacyn7y

A mind-reader looks to see whether this is an agent's decision procedure, and then tortures them if it is. The point of unfair decision problems is that they are unfair.

2philh7y

Can you clarify this? One interpretation is that you're talking about an agent who enumerates every world that any agent sees as possible. But your post further down seems to contradict this, "the unpunched world is not a possible one for the world enumerator". And it's not obvious to me that this agent can exist. Another is that the agent enumerates only the worlds that every agent sees as possible, but that agent doesn't seem likely to get good results. And it's not obvious to me that there are guaranteed to be any worlds at all in this intersection. Am I missing an interpretation?

[-]Said Achmiz7y60

Great post!

I have a question, though, about the “adversarial predictor” section. My question is: how is world #3 possible? You say:

Agent uses DT1 when rewarded for using DT1 and DT2 when rewarded for using DT2

However, the problem statement said:

Imagine I have a copy of Fiona, and I punish anyone who takes the same action as the copy.

Are we to suppose that the copy of Fiona that the adversarial predictor is running does not know that an adversarial predictor is punishing Fiona for taking certain actions, but that the actual-Fiona does know this, ... (read more)

2Shmi7y

One would have to ask Eliezer and Nat what they really meant, since it is easy to end up in a self-contradictory setup or to ask a question about an impossible world, like to asking what happens if in the Newcomb's setup the agent decided to switch to two-boxing after the perfect predictor had already put $1,000,000 in. My wild guess is that the FDT Fiona from the paper uses a certain decision theory DT1 that does not cope well with the world with adversarial predictors. She uses some kind of causal decision graph logic that would lead her astray instead of being in the winning world. I also assume that Fiona makes her "decisions" while being fully informed about the predictor's intentions to punish her and just CDT-like throws her hands in the air and cries "unfair!"

[-]BurntVictory7y40

Hey, noticed what might be errors in your lesion chart: No lesion, no cancer should give +1m utils in both cases. And your probabilities don't add to 1. Including p(lesion) explicitly doesn't meaningfully change the EV difference, so eh. However, my understanding is that the core of the lesion problem is recognizing that p(lesion) is independent of smoking; EYNS seems to say the same. Might be worth including it to make that clearer?

(I don't know much about decision theory, so maybe I'm just confused.)

[-]Chris_Leong7y*40

Assuming that an agent who doesn't have the lesion gains no utility from smoking OR from having cancer changes the problem.

But apart from that, this post is pretty good at explaining how to approach these problems from the perspective of Timeless Decision Theory. Worth reading about it if you aren't familiar.

Also, is generally agreed that in a deterministic world we don't really make decisions as per libertarian free will. The question is then how to construct the counterfactuals for the decision problem. I'm in agreement with you TDT is much more consistent as the counterfactuals tend to describe actually consistent worlds.

[-]SMK3y*30

From Arif Ahmed's Evidence, Decision and Causality (ch. 5.4, p. 142-143; links mine):

Deliberating agents should take their choice to be between worlds that differ over the past as well as over the future. In particular, they differ over the effects of the present choice but also over its unknown causes. Typically these past differences will be microphysical differences that don’t matter to anyone. But in Betting on the Past they matter to Alice.

. . .

On this new picture, which arises naturally from [evidential decision theory]. . ., it is misleading t

... (read more)

[-]romeostevensit7y30

I'm slightly confused. Is it that we're learning about which world we are in or, given that counterfactuals don't actually exist, are we learning what our own decision theory is given some stream of events/worldline?

2Shmi7y

What is the difference between the two? The world includes the agent, and discovering more about the world implies self-discovery

[-]TAG7y10

The compatibilist concept of free will is practical. It tells you under which circumstances someone can be held legally or ethically responsible. It does not require global additions about how the laws of the universe work. Only when compatibilist free will is asserted as being the only kind does it become a metaphysical claim, or rather an anti metaphysical one. The existence of compatibilist free will isn't worth arguing about: it's designed to be compatible with a wide variety of background assumptions.

Magical, or "counter causal" free will is... (read more)

3Shmi7y

Yep, no qualms there. it is definitely the pragmatic approach that works in the usual circumstances. The problem arise when you start exploring farther from the mainstream, where your intuition fails, like the Newcomb's problem. I don't really understand the rest of your point. The libertarian free will "our choices are free from the determination or constraints of human nature and free from any predetermination by God" is pure magical thinking not grounded in science. There is no difference between determinism and chance in that sense, neither is top-down causation. Scott Aaronson suggested the Knightian free bit option as a source of true unpredictability, which seems to be an inherent requirement for the libertarian free will not based on magic. Being in a simulation is an old standby, of course. In what way?

3TAG7y

Perhaps I should have been clearer that complete determinism versus indeterminism is an open question in science . But then maybe you knew, because your you made a few references to indeterminism already. And maybe you knew because the issue is crucial to the the correct interpration of QM , which is discussed interminably here. You hint very briefly at the he idea that randomness doesn't support libertarian FW, but that is an open question in philosophy. It has been given book-length treatments -: James,Kane Belaguer, etc. Which? Is indeterminism incapable of supporting FW as stated in the first quote , or capable as in the second? But that is slightly beside the point, since you are arguing against counterfactuals, and the existence of counterfactuals follows tatologously from the absence of strict determinism, questions of free will aside

[-]TAG7y10

We know that physics does not support the idea of metaphysical free will. By metaphysical free will I mean the magical ability of agents to change the world by just making a decision to do so. To the best of our knowledge, we are all (probabilistic) automatons who think themselves as agents with free choice

If a probablistic agent can make a decision that is not fully determined by previous events, then the consequences of that decision trace back to the agent, as a whole system, and no further. That seems to support a respectable enough version of "... (read more)

2Shmi7y

Yes, if that view were supported by evidence, that would count as free will. Thus far, whenever we gain the tools to look further, we can trace the consequences further back, with no clear boundary in sight, beyond the inherent randomness of the ion channels in the neurons firing according to a suitable Markov chain model.

1TAG7y

Well, which? Iron chains of causality stretching back to infinity, or inherent randomness? You may be taking it as obvious that both randomness and determinism exclude (soem version of ) free will, but that needs to be spelt out.

2Shmi7y

Scott Aaronson in The Ghost in the Quantum Turing Machine does a good job spelling all this out. There is no physical distinction between an agent and a non-agent.

3TAG7y

Scott Aaronson in The Ghost in the Quantum Turing Machine uses the word "agent" 37 times. The building of agents is an engineering discipline. Much of the discussion on this board is about AIs' which are agentive as well as intelligent. You might mean there is no fundamental difference between an agent and a non-agent. But then you need to show that someone, somewhere has asserted that, rather than using the word "agent" merely as a "useful" way of expressing something non-fundamental. More precision is needed.

[-]Dacyn7y10

Again, this is just a calculation of expected utilities, though an agent believing in metaphysical free will may take it as a recommendation to act a certain way.

Are you not recommending agents to act in a certain way? You are answering questions from EYNS of the form "Should X do Y?", and answers to such questions are generally taken to be recommendations for X to act in a certain way. You also say things like "The twins would probably be smart enough to cooperate, at least after reading this post" which sure sounds like a recommendation of cooperation (if they do not cooperate, you are lowering their status by calling them not smart)

2Shmi7y

I have mentioned in the title and in the first part that I do not subscribe to the idea of the metaphysical free will. Sure, subjectively it feels like "recommending" or "deciding" or "acting," but there is no physical basis for treating it as actually picking one of the possible worlds. What feels like making a decision and seeing the consequences is nothing but discovering which possible world is actual. Internally and externally.j "smart" is a statement about the actual world containing the twins, and if intelligence corresponds to status in that world, then making low-utility decisions would correspond to low status. In general, I reject the intentional stance in this model. Paradoxically, it results in better decision making for those who use it to make decisions.

1Dacyn7y

My point was that intelligence corresponds to status in our world: calling the twins not smart means that you expect your readers to think less of them. If you don't expect that, then I don't understand why you wrote that remark. I don't believe in libertarian free will either, but I don't see the point of interpreting words like "recommending" "deciding" or "acting" to refer to impossible behavior rather than using their ordinary meanings. However, maybe that's just a meaningless linguistic difference between us.

2Shmi7y

I can see why you would interpret it this way. That was not my intention. I don't respect Forrest Gumps any less than Einsteins.

1Dacyn7y

You don't harbor any hopes that after reading your post, someone will decide to cooperate in the twin PD on the basis of it? Or at least, if they were already going to, that they would conceptually connect their decision to cooperate with the things you say in the post?