I think this argument doesn't deserve anywhere near as much thought as you've given it. Caplan is committing a logical error, nothing else.
He probably reasoned as follows:
If determinism is true, I am computable.
Therefore, a large enough computer can compute what I will say.
Since my reaction is just more physics, it should be computable as well, hence it should also be possible to tell me what I will do after hearing the result.
This is wrong because "what Caplan outputs after seeing the prediction of our physics simulator" is a system larger than the physics simulator and hence not computable by the physics simulator. Caplan's thought experiment works as soon as you make it so the physics simulator is not causally entangled with Caplan.
I don't think fixed points have any place in this analysis. Obviously, Caplan can choose to implement a function without a fixed point, like (edit: rather ), in fact he's saying this in the comment you quoted. The question is why he can do this, since (as by the above) he supposedly can't.
See also my phrasing of this problem and Richard_Kennaway's answer. I think the real problem with his quote is that it's so badly phrased that the argument isn't even explicit, which paradoxically makes it harder to refute. You first have to reconstruct the argument, and then it gets easier to see why it's wrong. But I don't think there's anything interesting there.
I agree that Caplan is actually committing a logical error, but I don't think the way I present the argument is uninteresting. I don't think Caplan actually considered any of what I bring up in the post.
If you don't make the physics simulator causally entangled with Caplan then his argument doesn't work at all. It's true that you don't run into these problems of a simulator having to simulate a larger system than itself, but on the other hand the compelling first-hand intuition that "I could just do the opposite of what I'm told" melts away once you're not actually being told what you will do. That's why the causal entanglement is crucial to the argument.
Also, is a function with a fixed point, namely . I assume your minus sign is meant to be a logical complement?
Right, I was just thinking negation, but yeah the formalization doesn't make a lot of sense. (Also, there was some frustration with Caplan bleeding into my comment, apologies.)
If you don't make the physics simulator causally entangled with Caplan then his argument doesn't work at all.
Yes, because it doesn't work at all!
Bryan is saying he can avoid fixed points. This is trivially true in the deterministic setting, and your post is asking whether it's also true in the probabilistic setting. But it doesn't matter whether it's true in the probabilistic setting! The entanglement argument shows that even if you have a function without fixed points, this still doesn't give you LFW.
(I think you got confused in the post and thought that having fixed points shows LFW, but if anything it's the opposite; Bryan's point is that he can avoid fixed points. So the deterministic setting is the more favorable one for his argument, and since it doesn't work there, it just doesn't work period.)
(I think you got confused in the post and thought that having fixed points shows LFW, but if anything it's the opposite; Bryan's point is that he can avoid fixed points. So the deterministic setting is the more favorable one for his argument, and since it doesn't work there, it just doesn't work period.)
Not at all. Having fixed points proves LFW wrong, not right.
The whole point of my post is that LFW advocates would say that they can avoid fixed points, while if some theory such as hard determinism or compatibilism is correct then this argument shows that there's a situation in which you can't avoid fixed points.
Bryan is saying he can avoid fixed points. This is trivially true in the deterministic setting, and your post is asking whether it's also true in the probabilistic setting. But it doesn't matter whether it's true in the probabilistic setting! The entanglement argument shows that even if you have a function without fixed points, this still doesn't give you LFW.
The point of my post is not that not having fixed points implies LFW, it's that LFW implies not having fixed points. Obviously there are other arguments for why you would not have fixed points, e.g. g might fail to be continuous.
Not at all. Having fixed points proves LFW wrong, not right.
Okay; I got the impression that you had it backward from the post, one reason was this quote:
Fortunately for the argument, the assumption of continuity for g is plausible in some real-world settings
Where I thought "the argument" refers to Caplan, but continuity is bad news for Caplan, not good news.
Anyway, if your point is that showing the existence of fixed points would disprove LFW (the contrapositive of LFW -> no fixed points), then I take back what I said about your post not being relevant for LFW. However, I maintain that Caplan's argument has no merit either way.
I think we agree that Caplan's argument has no merit. As I've said in another comment, the reason I quote Caplan is that his remark in the podcast is what prompted me to think about the subject. I don't attribute any of my arguments to Caplan and I don't claim that his argument makes sense the way he conceived of it.
I see a gap in the argument at the point where you "give" the agent a probability distribution, and it "gets" from that a probability distribution to use to make its choice. A probability distribution is an infinite object, in this case an assignment of a real number to each element of A. You can only give such a thing to a finite automaton by streaming it as a string of bits that progressively give more and more information about the distribution, approaching it in the limit. The automaton must calculate from this a stream of bits that similarly represent its probability distribution, each bit of output depending on only a finite part of the input. However, it cannot use the distribution to calculate its action until it has the exact values, which it never will. (That is generally true even if the automaton is not calculating based on an input distribution. In practice, of course, one just uses the computer's IEEE 754 "reals" and hopes the result will be close enough. But that will not do for Mathematics.)
If it is going to repeatedly choose actions from the same distribution, then I imagine there may be ways for it to sample from increasingly good finite approximations to the distribution in such a way that as it approximates it more and more precisely, it can correct earlier errors by e.g. sampling an action with a little higher probability than the distribution specifies, to correct for too low an approximation used earlier. But for a single choice, a finite automaton cannot sample from a probability distribution specified as a set of real numbers, unless you allow it a tolerance of some sort.
Furthermore, to do these calculations, some representation of the real numbers must be used that makes the operations computable. The operations must be continuous functions, because all computable functions on real numbers are continuous. This generally means using an ambiguous representation of the reals, that gives more than one representation to most real numbers. Each real is represented by an infinite string drawn from some finite alphabet of "digits". However, the topology on the space of digit-strings in this context has to be the one whose open sets are all the strings sharing any given finite prefix. These are also the closed sets, making the space totally disconnected. Brouwer's fixed point theorem therefore does not apply. There are trivial examples of continuous functions without fixed points, e.g. permute the set of digits. Such a function may not correspond to any function on the reals, because multiple representations of the same real number might be mapped to representations of different real numbers.
I think this is actually not as serious of a problem as you make it out to be, for various reasons.
Finite automatons don't actually exist in the real world. This was my whole point about (for example) transistors not actually working as binary on/off switches. In the real world there are ways to give agents access to real numbers, just with some measurement error both on the part of the forecaster and on the part of the agent reading the input. The setup of giving a wave-function input to a QTM is where this works best: you can't get the squared norms of the coefficients exactly right, but you can get them right "in probability", in the sense that an error of will have a probability of , etc.
Given that you don't have a continuous function but (say) a convex combination of a continuous function with some random noise, Brouwer's fixed point theorem still tells you that the continuous part has a fixed point, and then the random noise added on top of it just throws you off by some small distance. In other words, there isn't a catastrophic failure where a little bit of noise on top of a continuous process totally throws you off; continuous function + some noise on top still has an "almost fixed point" with probability .
I agree with you that if you had a finite automaton then this whole trick doesn't work. The relevant notion of continuity for Turing machines is continuity on the Cantor set, which must hold since a TM that halts can only read finitely many bits of the input and so can only figure out it's in some nonempty open subset of the Cantor set. However, as you point out the Cantor set is totally disconnected and there's no way to map that notion of continuity to the one that uses real numbers.
In contrast, quantum Turing machines that take inputs from a Hilbert space are actually continuous in the more traditional real or complex analytic sense of that term, so here we have no such problems.
I feel like a lot of the angst about free will boils down to conflicting intuitions.
The way to reconcile these intuitions is to recognize that yes, all the decisions you make are in a sense predetermined, but a lot of what is determining those decisions is who you are and what sort of thing you would do in a particular circumstance. You are making decisions, that experience is not invalidated by a fully deterministic universe. It's just that you are who you are and you'll make the decision that you would make.
This is the most interesting assessment of free will I've seen in a long time. I've thought about this a great deal before, particularly when trying to work out my feelings on this post. Obviously we shouldn't expect any predictor to also be able to predict itself, for halting problem reasons. But yeah, why couldn't it shop around to find a way of telling the prediction which preserves the outcome? Eventually I decided that probably there was something like that, but it would look a lot more like coercion than prediction. I, for instance, could probably predict a person's actions fairly well if I was holding them at gunpoint. Omega could do the same by digging through possible sentences until it worked out how to tell you what you'd do without swaying the odds.
I agree with e.g. Rafael that Caplan's argument is just silly and wrong and doesn't deserve this much analysis.
But: I don't see how any version of this could possibly be the thing that makes Bryan Caplan say "oh, silly me, obviously my argument was no good". It doesn't amount to saying that there must be a way of saying "you will raise your hand" or "you will not raise your hand" that guarantees that he will do what you say. It's more like this: if you say "the probability that you will raise your hand is p" and he responds by raising his hand with probability q, and if q is a continuous function of p, then there is a choice of p for which q=p. True, but again: can you imagine this observation being convincing to him? How? Won't he just say "well, duh, you replaced my thought experiment with a different one and it came out differently; so what?"?
Of course the argument may not be convincing to Caplan specifically, since I don't have that good of a model of how his mind works. I don't see why this matters, though.
For some reason people seem to be reading my post as an attempt to say Caplan's argument is actually good, when I'm just saying that Caplan's argument is what got me thinking about this issue. The rest of what I write is my argument and doesn't have much to do with what Caplan may or may not have thought.
As for the argument being convincing to libertarian free will advocates generally, to me the existence of such a probabilistic oracle that can inform you of your own actions in advance seems like a solid argument against it. There's no way in which this oracle can be well-calibrated against someone who deliberately chooses to mess with it. For example, if your action space has two elements and , then if the oracle gives you picking one of them >= 75% chance you pick the other one, and if it gives both options > 25% chance you just always pick . Even a cursory statistical analysis of that data will show that the oracle is failing to predict your behavior correctly.
If you can't even implement such a simple strategy to beat this oracle, what grounds do we have for saying that you have libertarian free will? It seems like the whole concept becomes incoherent if you will say that this oracle can correctly anticipate your behavior no matter how incentivized you are to try to beat it, and yet you actually "can choose otherwise".
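The strategy described above can be sketched in a few lines. This is only an illustration under stated assumptions: the action space has two elements, `p_first` is the probability the oracle assigns to the first of them, and the first action is assumed to be the one that is "always picked" when both options get more than 25% (the comment does not say which one). The function name is hypothetical.

```python
def contrarian_response(p_first):
    """Probability with which the agent actually picks the first action,
    given the oracle's announced probability p_first for that action."""
    if p_first >= 0.75:
        return 0.0  # oracle favours the first action: pick the other one
    if p_first <= 0.25:
        return 1.0  # oracle favours the second action: pick the first one
    return 1.0      # both options above 25%: always pick the first (assumed)


# Distance between the announced and the realised probability on a fine grid.
gaps = [abs(contrarian_response(k / 1000) - k / 1000) for k in range(1001)]
print(min(gaps))
```

On this grid the announced probability is never within 0.25 of the realised one, so no announcement can be a fixed point, and the response jumps at 0.75, which is the discontinuity the argument turns on.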
g being continuous does not appear to actually help to resolve predictor problems, as a fixed point of 50%/50% left/right^{[1]} is not excluded, and in this case Omega has no predictive power over the outcome^{[2]}.
If you try to use this to resolve the Newcomb problem, for instance, you'll find that an agent that simply flips a (quantum) coin to decide does not have a fixed point in , and does have a fixed point in , as expected, but said fixed point is 50%/50%... which means the Omega is wrong exactly half the time. You could replace the Omega with an inverted Omega or a fair coin. They all have the same predictive power over the outcome - i.e. none.
Is there e.g. an additional quantum circuit trick that can resolve this? Or am I missing something?
Or one-box/two-box, etc.
For the same reason that a fair coin has no predictive power over a (different) fair coin.
You're overcomplicating the problem. If Omega predicts even odds on two choices and then you always pick one you've determined in advance, it will be obvious that Omega is failing to predict your behavior correctly.
Imagine that you claim to be able to predict the probabilities of whether I will choose left or right, and then predict 50% for both. If I just choose "left" every time then obviously your predictions are bad - you're "well calibrated" in the sense that 50% of your 50% predictions come true, but the model that both of my choices have equal probability will just be directly rejected.
In contrast, if this is actually a fixed point of g, I will choose left about half the time and right about half the time. There's a big difference between those two cases and I can specify an explicit hypothesis test with p-values, etc. if you want, though it shouldn't be difficult to come up with your own.
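One such hypothesis test, sketched here as an assumption about what "explicit" could look like rather than as the test the author had in mind, is an exact two-sided binomial test of the claim that "left" has probability 0.5. The function name and the sample sizes are made up for illustration.

```python
from math import comb


def two_sided_binomial_p_value(successes, trials, p=0.5):
    """Exact two-sided p-value: total probability, under the null hypothesis,
    of all outcomes that are no more likely than the observed one."""
    pmf = [comb(trials, k) * p**k * (1 - p) ** (trials - k)
           for k in range(trials + 1)]
    observed = pmf[successes]
    # Small relative tolerance so that ties are counted despite rounding.
    return sum(x for x in pmf if x <= observed * (1 + 1e-9))


# An agent that always picks "left" after a 50/50 prediction, 20 times in a row.
print(two_sided_binomial_p_value(20, 20))
# An agent that really is at the 50/50 fixed point might pick "left" 11 times.
print(two_sided_binomial_p_value(11, 20))
```

The first p-value is tiny, so the 50/50 model is rejected; the second is large, so the data are consistent with the prediction.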
If Omega predicts even odds on two choices and then you always pick one you've determined in advance
(I can't quite tell if this was intended to be an example of a different agent or if it was a misconstrual of the agent example I gave. I suspect the former but am not certain. If the former, ignore this.) To be clear: I meant an agent that flips a quantum coin to decide at the time of the choice. This is not determined, or determinable, in advance^{[1]}. Omega can predict here fairly easily, but not .
There's a big difference between those two cases
Absolutely. It's the difference between 'average' predictive power over all agents and worst-case predictive power over a single agent, give or take, assuming I am understanding you correctly.
If Omega predicts even odds on two choices and then you always pick one you've determined in advance, it will be obvious that Omega is failing to predict your behavior correctly.
Ah. I think we are thinking of two different variants of the Newcomb problem. I would be interested in a more explicit definition of your Newcomb variant. The original Newcomb problem does not allow the box to contain odds, just a binary money/no money.
I agree that if Omega gives the agent the odds explicitly it's fairly simple for an agent to contradict Omega.
I was assuming that Omega would treat the fixed point instead as a mixed strategy (in the game theory sense) - that is if Omega predicted, say, 70/30 was a fixed point, Omega would roll a d10^{[2]} and 70% of the time put the money in.
This "works", in the sense of satisfying the fixed point, and in this case results in Omega having... nowhere near perfect predictive power versus the agent, but some predictive power at least (58% correct, if I calculated it correctly; 0.7 × 0.7 + 0.3 × 0.3 = 0.58).
If the fixed point was 50/50 however, the fixed point is still satisfied by Omega putting in the money 50% of the time, but Omega is left with no predictive power over the outcome.
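The two accuracy figures can be checked with a short calculation. The sketch assumes that Omega's private randomisation is independent of the agent's own random choice, and that "correct" means Omega's sampled guess matches the agent's sampled action; the function name is hypothetical.

```python
def omega_accuracy(p):
    """Probability that Omega's sampled guess matches the agent's sampled action
    when both independently pick the first option with probability p."""
    return p * p + (1 - p) * (1 - p)


print(omega_accuracy(0.7))  # the 70/30 mixed strategy
print(omega_accuracy(0.5))  # the 50/50 mixed strategy
```

Matching happens when both pick the first option or both pick the second, which gives p² + (1 − p)². This is smallest at p = 0.5, where it equals one half.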
To the best of our knowledge, anyway.
Read 'private random oracle'.
It's intended to be an example of a different agent. I don't care much about the Newcomb problem setting since I think it's not relevant in this context.
My point is that the fact that Omega can guess 50/50 in the scenario I set up in the post doesn't allow it to actually have good performance and it's easy to tell this by running any standard hypothesis test. So I don't get how your Newcomb setup relates to my proposed setup in the post.
My point is that the fact that Omega can guess 50/50 in the scenario I set up in the post doesn't allow it to actually have good performance
That... is precisely my point? See my conclusion:
If the fixed point was 50/50 however, the fixed point is still satisfied by Omega putting in the money 50% of the time, but Omega is left with no predictive power over the outcome.
(Emphasis added.)
Then I don't understand what you're trying to say here. Can you explain exactly what you think the problem with my setup is?
I think you're just not understanding what I'm trying to say. My point is that if Omega actually has no knowledge, then predicting 50/50 doesn't allow him to be right. On the other hand, if 50/50 is actually a fixed point of g, then Omega predicting 50/50 will give him substantial predictive power over outcomes. For example, it will predict that my actions will be roughly split 50/50, when there's no reason for this to be true if Omega has no information about me; I could just always pick an action I've predetermined in advance whenever I see a 50/50 prediction.
Alright, we are in agreement on this point.
I have a tendency to start on a tangential point, get agreement, then show the implications for the main argument. In practice a whole lot more people are open to changing their minds in this way than directly. This may be somewhat less important on this forum than elsewhere.
You stated:
In contrast, I think almost all proponents of libertarian free will would agree that their position predicts that an agent with such free will, such as a human, could always just choose to not do as they are told. If the distribution they are given looks like it's roughly uniform they can deterministically pick one action, and if it looks like it's very far from uniform they can just make a choice uniformly at random. The crux is that the function g this defines can't be continuous, so I believe this forces advocates of libertarian free will to the position that agents with free will must represent discontinuous input-output relations.
(Emphasis added.)
One corollary of your conclusion is that a continuous g implies a lack of free will. - or in this case .
However, I've shown a case where a continuous g nevertheless results in Omega having zero predictive power over the agent:
If the fixed point was 50/50 however, the fixed point is still satisfied by Omega putting in the money 50% of the time, but Omega is left with no predictive power over the outcome.
This then either means that:
I don't know why you're talking about the Newcomb problem again. I've already said I don't see how that's relevant. Can you tell me how, in my setup, the fixed point being 50/50 means the oracle has no predictive power over the agent?
If 50/50 is a fixed point then the agent clearly has predictive power, just like we have predictive power over what happens if you measure a qubit in the state . 50% doesn't imply "lack of predictive power".
My view of this is that Caplan (and likely the poster) are likely confused about what it means for physics to "predict" something. Assuming something that looks vaguely like a multiverse interpretation is true, a full prediction is across the full set of downstream universes, not a single possible downstream universe out of the set.
From my standpoint, the only reason the future appears to be unpredictable is because we have this misguided notion that there is only "one" future, the one we will find ourselves in. If the reality is that we're simultaneously in all of those possible futures, then a comprehensive future prediction has to contain all of them, and by containing all of them the prediction will be exact.
I think this is interesting because our understanding of physics seems to exclude effects that are truly discontinuous.
This is not true. An electron and a positron will, or will not, annihilate. They will not half-react.
For example, real-world transistors have resistance that depends continuously on the gate voltage
This is incorrect. It depends on the # of electrons, which is a discrete value. It's just that most of the time transistors are large enough that it doesn't really matter. That being said, it's absolutely important for things like e.g. flash memory. Modern flash memory cell might have ~400 electrons per cell^{[1]}.
This is not true. An electron and a positron will, or will not, annihilate. They will not half-react.
The Feynman diagrams of that process give you a scattering amplitude which will tell you the rate at which that process is going to occur. The probability of it occurring will be continuous as a function on the Hilbert space.
This is incorrect. It depends on the # of electrons, which is a discrete value. It's just that most of the time transistors are large enough that it doesn't really matter. That being said, it's absolutely important for things like e.g. flash memory. Modern flash memory cell might have ~400 electrons per cell[1].
In quantum mechanics, even if the states of a system are quantized/discrete, the probability of you being in those states behaves continuously under unitary evolution or collapse.
You can't get around the continuity of unitary time evolution in QM with these kinds of arguments.
Sorry, are we talking about effects that are continuous, or effects that are discontinuous but which have probability distributions which are continuous?
I was rather assuming you meant the former considering you said 'effects that are truly discontinuous.'.
Both of your responses are the latter, not the former, assuming I am understanding correctly.
*****
You can't get around the continuity of unitary time evolution in QM with these kinds of arguments.
And now we're into the measurement problem, which far better minds than mine have spent astounding amounts of effort on and not yet resolved. Again, assuming I am understanding correctly.
Sorry, are we talking about effects that are continuous, or effects that are discontinuous but which have probability distributions which are continuous?
We're talking about the continuity of the function g. I define it in the post, so you can check the post to see exactly what I'm talking about.
And now we're into the measurement problem, which far better minds than mine have spent astounding amounts of effort on and not yet resolved. Again, assuming I am understanding correctly.
This has nothing to do with how you settle the measurement problem. As I say in the post, a quantum Turing machine would have the property that this g I've defined is continuous, even if it's a version that can make measurements on itself mid-computation. That doesn't change the fact that g is continuous, roughly because the evolution before the measurement is unitary, and so perturbing the initial state by a small amount in norm perturbs the probabilities of collapsing to different eigenstates by small amounts as well.
The result is that the function g is continuous, even though the wave-function collapse is a discontinuous operation on the Hilbert space of states. The conclusion generalizes to any real-world quantum system, essentially by the same argument.
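The continuity claim can be made quantitative with a one-line estimate. The symbols here are introduced only for illustration and are not from the post: ψ and ψ′ are unit vectors for the original and perturbed initial states, U is the unitary evolution before the measurement, P is the orthogonal projector onto the eigenspace of one outcome, and the norm is assumed to be the Hilbert-space norm.

```latex
\[
\Bigl|\,\|PU\psi\|^{2} - \|PU\psi'\|^{2}\Bigr|
  = \Bigl|\,\|PU\psi\| - \|PU\psi'\|\Bigr| \,\bigl(\|PU\psi\| + \|PU\psi'\|\bigr)
  \le 2\,\|PU(\psi - \psi')\|
  \le 2\,\|\psi - \psi'\| .
\]
```

The first inequality uses the reverse triangle inequality together with ‖PUψ‖ ≤ 1 and ‖PUψ′‖ ≤ 1; the second uses ‖P‖ ≤ 1 and the fact that U preserves norms. So each outcome probability is Lipschitz in the initial state, with constant at most 2.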
I think this is a terminological dispute
Fair.
and is therefore uninteresting.
Terminology is uninteresting, but important.
There is a false proof technique of the form:
Whereas your argument was:
These, at first glance, are the same.
By ignoring terminology as uninteresting and constructing arguments that initially are consistent with these false proof techniques, you're downgrading yourself in the eyes of anyone who uses Bayesian reasoning (consciously or unconsciously) and assigns a non-zero prior to you (deliberately or unwittingly) using a false proof technique.
In this case you've recovered by repairing the proof; this does not help for initial impressions (anyone encountering your reasoning for the first time, in particular before the proof has been repaired).
If you've already considered this and decided it wasn't worth it, fair I suppose. I don't think that's a good idea, but I can see how someone with different priors could plausibly come to a different conclusion. If you haven't considered this... hopefully this helps.
As far as I can see my argument was clear from the start and nobody seems to have been confused by this point of it other than you. I'll admit I'm wrong if some people respond to my comment by saying that they too were confused by this point of my argument and your comment & my response to it helped clear things up for them.
It seems to me that you're (intentionally or not) trying to find mistakes in the post. I've seen you do this in other posts as well and have messaged you privately about it, but since you said you'd rather discuss this issue in public I'm bringing it up here.
Any post relies on some amount of charity on the part of the reader to interpret it correctly. It's fine if you're genuinely confused about what I'm saying or think I've made a mistake, but your behavior seems more consistent with a fishing expedition in which you're hunting for technically wrong statements to pick on. This might be unintentional on your part or a reflection of my impatience with this kind of response, but I find it exhausting to have to address these kinds of comments that appear to me as if they are being made in bad faith.
It seems to me that you're (intentionally or not) trying to find mistakes in the post.
It is obvious we have a fundamental disagreement here, and unfortunately I doubt we'll make progress without resolving this.
Euclid's Parallel Postulate was subtly wrong. 'Assuming charity' and ignoring that would not actually help. Finding it, and refining the axioms into Euclidean and non-Euclidean geometry, on the other hand...
In his most recent appearance on the 80,000 Hours podcast, Bryan Caplan gave the following argument in favor of libertarian free will:
At first I thought this argument was obviously misguided, because we can never do this for a simple program which takes in one bit as input and then prints its complement. There's no reason that in general a computable function must have a fixed point^{[1]}, so even if we know the source code of a program perfectly we may not be able to come up with a way in which to tell the program what to do in a way that results in the program following the behavior we said it would follow. However, the issue actually turns out to be more subtle than this.
Let's make this more formal. If we have a deterministic agent such that we tell the agent what we think it will do and then the agent takes an action conditional on that information, for the purposes of our little experiment we can abstract away from all other details and just represent the agent by a function f:A→A from a space of actions A to itself. If A is just some discrete space then f could just be an arbitrary fixed point-free permutation of A, and there will be at least one such permutation so long as |A|>1. Therefore it seems like the argument doesn't work: here we have a completely deterministic agent, and yet it's impossible for us to tell it what it will do in such a way that it does what we predicted in advance.
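As a concrete illustration of the deterministic case, here is a minimal sketch in which the agent is a cyclic shift of the action set. The action labels and function names are invented for the example; any fixed point-free permutation would do.

```python
def cyclic_shift(actions):
    """Return f: A -> A as a dict sending each action to the next one,
    wrapping around at the end of the list."""
    n = len(actions)
    return {actions[i]: actions[(i + 1) % n] for i in range(n)}


def fixed_points(f):
    """All actions a with f(a) == a."""
    return [a for a, b in f.items() if a == b]


actions = ["raise hand", "keep hand down"]
f = cyclic_shift(actions)
print(f)
print(fixed_points(f))
```

Whatever action we announce, this agent takes the other one, so the list of fixed points is empty as soon as there are at least two actions.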
This is actually a standard problem in fixed point theory. Maps from discrete spaces to themselves can easily fail to have fixed points, much like how there's no deterministic or pure strategy Nash equilibrium of a rock-paper-scissors game. To find the equilibrium, we need to expand the class of actions we permit the agent to take.
Suppose that instead of working with a deterministic agent, we're working with a stochastic agent. The action space of the agent is now the space of probability distributions on A, say M. This is isomorphic to a simplex with |A| vertices, so we can realize it as a convex subset of R^{|A|+1}.
If we now imagine we give the agent a probability distribution over the actions it could take, we're in a situation where instead of a function f:A→A we have a function g:M→M. The agent reads the vector of probabilities we give it, and then decides on some probability distribution to sample its next action from.
Now the key observation: if g is continuous, then Brouwer's fixed point theorem guarantees the existence of a fixed point! In other words, for stochastic agents who follow probability distributions that are continuous in the inputs they are given, we can always find a way to tell them what they will do in such a way that they will have to do what we said they would do. Of course, all of the work here is being done by the assumption that g is continuous. Without this assumption the topological structure of the set M becomes irrelevant so there's no way to do better than what we could already do for the deterministic case.
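For two actions the space M is just the interval [0, 1] (the probability of the first action), and Brouwer's theorem reduces to the intermediate value theorem applied to g(p) − p. The sketch below finds a fixed point by bisection under that assumption. The agent `lean_against` is a made-up continuous example, not something from the argument above.

```python
def find_fixed_point(g, tol=1e-9):
    """Bisection on h(p) = g(p) - p for a continuous g: [0, 1] -> [0, 1].
    Since h(0) >= 0 and h(1) <= 0, a root of h lies in [lo, hi] throughout."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if g(mid) - mid >= 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def lean_against(p):
    """Hypothetical continuous agent: the more likely we say the first action
    is, the less likely the agent makes it."""
    return 0.8 - 0.5 * p


p_star = find_fixed_point(lean_against)
print(p_star, lean_against(p_star))
```

Even though this agent always leans against the prediction, announcing a probability near 8/15 leaves it doing exactly what was announced. With more than two actions bisection no longer applies and one would need a genuine simplex-based method.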
Fortunately for the argument, the assumption of continuity for g is plausible in some real-world settings. For instance, if the agent were a quantum Turing machine that could take a wave-function as an input^{[2]}, we could communicate action probabilities to the agent by tuning the norms of the coefficients of the input wave-function in a canonical basis. In this case it's easy to see from the properties of quantum mechanics that the agent would necessarily represent a continuous function M→M and the invocation of Brouwer's fixed point theorem would therefore be valid.
If we take Caplan's argument seriously, there might be a real "test" of libertarian free will here. More precisely, it's a test that the input-output relation represented by the agent should be continuous as I've formalized it here. I think this is interesting because our understanding of physics seems to exclude effects that are truly discontinuous. For example, real-world transistors have resistance that depends continuously on the gate voltage, contrary to idealized models of transistors that people sometimes work with in textbooks.
In contrast, I think almost all proponents of libertarian free will would agree that their position predicts that an agent with such free will, such as a human, could always just choose to not do as they are told. If the distribution they are given looks like it's roughly uniform they can deterministically pick one action, and if it looks like it's very far from uniform they can just make a choice uniformly at random. The crux is that the function g this defines can't be continuous, so I believe this forces advocates of libertarian free will to the position that agents with free will must represent discontinuous input-output relations.
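The strategy in this paragraph can be written down directly for two actions. In this sketch `p_first` is the announced probability of the first action, and the cutoff for "roughly uniform" (`near_uniform=0.1`) is an arbitrary assumption, as is the function name.

```python
def defiant_response(p_first, near_uniform=0.1):
    """Probability with which the agent picks the first action."""
    if abs(p_first - 0.5) <= near_uniform:
        return 1.0  # prediction looks roughly uniform: deterministically pick one action
    return 0.5      # prediction is far from uniform: choose uniformly at random


# The response jumps by 0.5 when the prediction crosses the cutoff.
print(defiant_response(0.55), defiant_response(0.65))

# No announced probability on a fine grid is reproduced by the agent.
gaps = [abs(defiant_response(k / 1000) - k / 1000) for k in range(1001)]
print(min(gaps))
```

The jump at the cutoff is what lets the agent escape every announcement; any continuous interpolation across it would reintroduce a fixed point.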
This seems like it should be testable in theory, though in practice we may be blocked from testing it right now by our lack of understanding of how humans actually work. If libertarian free will advocates accept this framing of the problem, this hypothetical experiment might actually be a way to convince them to change their minds.
Sometimes people define a "fixed point" of a total computable function F by saying that it's some e such that e and F(e) represent the same partial recursive function in some admissible numbering. Here I actually want e=F(e), which is a stronger condition and obviously there are computable functions F which admit no such e. ↩︎
Some sources define quantum Turing machines to have discrete inputs and outputs, but allow arbitrary vectors in a Hilbert space on the tape and the states of the Turing machine, i.e. during the "internal computation". This makes it infeasible to give continuously varying inputs to the QTM and is also physically unrealistic, so I don't follow that approach here. ↩︎