No, the whole premise of the face-reading scenario is that the agent *can* tell that his face is being read, and that's why he pays the money. If the agent can't tell whether his face is being read, then his correct action (under FDT) is to pay the money if and only if (probability of being read) times (utility of returning to civilization) is greater than (utility of the money). Now, if this condition holds but in fact the driver can't read faces, then FDT does pay the $50, but this is just because it got unlucky, and we shouldn't hold that against it.
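
As a rough illustration of the condition above, here is a minimal Python sketch. The function name and the example numbers are hypothetical, and it assumes both utilities are measured on a common scale with the "left in the desert" outcome at zero.

```python
def fdt_should_pay(p_read: float, u_civilization: float, u_money: float) -> bool:
    """Return True when committing to pay has the higher expected utility.

    p_read          probability that the driver can read the agent's face
    u_civilization  utility of being driven back to civilization
    u_money         utility of keeping the money
    """
    return p_read * u_civilization > u_money


# Hypothetical numbers: rescue worth 1,000,000 utils, the $50 worth 50 utils.
print(fdt_should_pay(0.01, 1_000_000, 50))     # 10,000 > 50, so pay
print(fdt_should_pay(0.00001, 1_000_000, 50))  # about 10 < 50, so don't pay
```

With these made-up numbers, even a 1% chance of being read is enough to make paying the right policy, which is why paying when the driver in fact cannot read faces counts as bad luck rather than a mistake.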

In your new dilemma, FDT does not say to pay the $50. It only says to pay when the driver's decision of whether or not to take you to the city depends on what you are planning to do when you get to the city. Which isn't true in your setup, since you assume the driver can't read faces.

15d

The agent in this scenario doesn't necessarily know whether the driver can read
faces; in the original problem the agent isn't given this information.
Surely if FDT advises you to pay him on arrival in the face-reading scenario, you
would do the same in the non-face-reading scenario, since the agent can't tell
them apart.

1mo1212

a random letter contains about 7.8 (bits of information)

This is wrong: a random letter contains log(26)/log(2) ≈ 4.7 bits of information.
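
A quick check of that figure, as a small Python sketch. It assumes a uniform distribution over a 26-letter alphabet; the function name is just for illustration.

```python
import math


def bits_per_symbol(alphabet_size: int) -> float:
    """Information content, in bits, of one uniformly random symbol."""
    return math.log2(alphabet_size)


print(round(bits_per_symbol(26), 2))  # 4.7
```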

124d

Whoops! Thanks.

3mo41

I have tinnitus every time I think about the question of whether I have tinnitus. So do I have tinnitus all the time, or only the times when I notice?

4mo30

I was confused at first about what you meant by "1 is true" because when you copied the post from your blog you didn't copy the numbering of the claims. You should probably fix that.

5mo31

The number 99 isn’t unique—this works with any payoff between 30 and 100.

Actually, it only works with payoffs below 99.3 -- this is the payoff you get by setting the dial to 30 every round while everyone else sets their dials to 100, so any Nash equilibrium must beat that. This was mentioned in jessicata's original post.

Incidentally, this feature prevents the example from being a subgame perfect Nash equilibrium -- once someone defects by setting the dial to 30, there's no incentive to "punish" them for it, and any attempt to create such an incentive via a "punish non-punishers" rule would run into the trouble that punishment is only effective up to the 99.3 limit.
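
For reference, here is where the 99.3 figure comes from, as a small Python sketch. It assumes the setup I recall from the original post (not restated here): 100 players, and the temperature everyone experiences is the average of all dial settings.

```python
def average_temperature(n_players: int = 100,
                        deviator_dial: float = 30.0,
                        others_dial: float = 100.0) -> float:
    """Average dial setting when one player deviates and the rest do not.

    Assumption: the outcome is the plain average of all dials.
    """
    total = deviator_dial + (n_players - 1) * others_dial
    return total / n_players


print(average_temperature())  # roughly 99.3
```

A player can guarantee themselves this outcome unilaterally, so no equilibrium can leave them worse off than it.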

It's part of the "frontpage comment guidelines" that show up every time you make a comment. They don't appear on GreaterWrong though, which is why I guess you can't see them...

I explained the problem with the votes-per-dollar formula in my first post. 45% of the vote / $1 >> 55% of the vote / $2, so it is not worth it for a candidate to spend money even if they can buy 10% of the vote for $1 (which is absurdly unrealistically high).
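
To make the comparison concrete, here is a minimal Python sketch of the votes-per-dollar score using the numbers from the paragraph above. The $1 minimum spend and the vote shares are the ones assumed there; the function name is hypothetical.

```python
def votes_per_dollar(vote_share: float, dollars: float) -> float:
    """Score a candidate by vote share divided by dollars spent."""
    return vote_share / dollars


frugal = votes_per_dollar(0.45, 1.0)   # spends only the $1 minimum
spender = votes_per_dollar(0.55, 2.0)  # buys 10% more of the vote for $1 more

print(frugal, spender)  # 0.45 vs 0.275
```

Doubling the spend halves the score, so the extra 10% of the vote comes nowhere near compensating for it.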

When I said maybe a formula would help, I meant a formula to explain what you mean by "coefficient" or "effective exchange rate". The formula "votes / dollars spent" doesn't have a coefficient in it.

...If one candidate gets 200 votes and spends 200 dollars, and candidate 2 gets 201 votes and spen

15mo

"it is not worth it for a candidate to spend money even if they can buy 10% of
the vote for $1 (which is absurdly unrealistically high)."
So what is a realistic price / 'exchange rate' for this example, in your
opinion?
I provided a coefficient of '1' spelled out in the line below that, it could be
'10' or '100', etc.
"Sure, and my proposal of Votes / (10X + Y) would imply that the first candidate
wins."
Which invariant(s) would you construe this as maintaining? Why not just add a
constant coefficient? This is more efficient to compute, and the average price
is already too high, that's 'half the point'.

I don't think the data dependency is a serious problem, all we need is a very loose estimate. I don't know what you mean by a "spending barrier" or by "effective exchange rate", and I still don't know what coefficient you are talking about. Maybe it would help if you wrote down some formulas to explain what you mean.

16mo

"votes-per-dollar" IS a formula -- aka v/d -- 'per' means division -- spelling
out the coefficient we have:
1*v/d where 'undefined' is ranked last and infinity is not a return value
OR
1*v/(1+d) where 1/0 approached from the right is +inf
(there are no negative votes -- if dollars are negative eg a campaign turns a
profit we could take abs(1+d) or abs(d) as the denominator)
v = total votes for the candidate
d = total dollars spent by the candidate
But here's a basic unit test, riddle me this:
If one candidate gets 200 votes and spends 200 dollars, and candidate 2 gets 201
votes and spends two MILLION dollars, who has the strongest mandate, in the
sense that the representative actually represents the will of the people when
wealth differences are ignored?

I don't understand what you mean; multiplying the numerator by a coefficient wouldn't change the analysis. I think if you wanted to have a formula that was somewhat sensitive to campaign spending but didn't rule out campaign spending completely as a strategy, Votes/(10X+Y) might work, where Y is the amount the candidate spent on campaigning, and X is an estimate of average campaign spending. (The factor of 10 is because campaign spending just isn't that large a factor in how many votes you get in absolute terms; it's easy to get maybe 45% of the vote with no campaign spending at all, just by having (D) or (R) in front of your name.)
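
Here is a minimal Python sketch of the Votes/(10X+Y) score, applied to the 200-votes-for-$200 versus 201-votes-for-$2,000,000 example from elsewhere in this thread. The value chosen for X (average campaign spending) is a made-up placeholder, not an estimate.

```python
def damped_score(votes: float, spent: float, avg_spending: float) -> float:
    """Votes / (10X + Y), with X = avg_spending and Y = spent."""
    return votes / (10.0 * avg_spending + spent)


X = 1_000_000.0  # hypothetical average campaign spending

cheap = damped_score(200, 200, X)
expensive = damped_score(201, 2_000_000, X)

print(cheap > expensive)  # True: the low-spending candidate ranks first
```

Because 10X dominates the denominator for modest spending, the score penalizes a very large outlay without making every dollar spent a losing move.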

16mo

This counterproposal has a data dependency where we need to know averages from
the past, and also will still present a spending barrier for dirt-poor
candidates if the average happens to be large (it is).
What I meant in response to your original comment is that whether it's 'worth
it' depends on the current 'effective exchange rate' between votes and dollars,
which is represented by a coefficient of '1' in this first approximation. There
should presumably be an update rule for 'learning' the 'correct' coefficient....

6mo20

The result of this will be that no one will spend more than the $1 minimum. It's just not worth it. So your proposal is basically equivalent to illegalizing campaign spending.

16mo

what coefficient in the numerator would change your conclusion?

I wonder whether this one is true (and can be easily proved): For a normal form game G and actions a_i for a player i, removing a set of actions a_i^- from the game yields a game G^- in which the Nash equilibria are worse on average for i (or alternatively the pareto-best/pareto-worst Nash equilibrium is worse for G^- than for G).

It's false: consider the normal form game

(0,0) (2,1)

(1,1) (3,0)

For the first player the first option is dominated by the second, but once the second player knows the first player is going to choose the second option, he's moti...
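
The counterexample can be checked mechanically. The sketch below enumerates pure-strategy Nash equilibria; it assumes rows are the first player's options, columns are the second player's, and each cell is (first player's payoff, second player's payoff).

```python
from itertools import product


def pure_nash(payoffs):
    """Return all (row, col) cells where neither player gains by deviating."""
    rows = range(len(payoffs))
    cols = range(len(payoffs[0]))
    equilibria = []
    for r, c in product(rows, cols):
        u1, u2 = payoffs[r][c]
        row_is_best = all(payoffs[r2][c][0] <= u1 for r2 in rows)
        col_is_best = all(payoffs[r][c2][1] <= u2 for c2 in cols)
        if row_is_best and col_is_best:
            equilibria.append((r, c))
    return equilibria


full_game = [[(0, 0), (2, 1)],
             [(1, 1), (3, 0)]]
reduced_game = [full_game[0]]  # first player's second option removed

print(pure_nash(full_game))     # [(1, 0)], first player gets 1
print(pure_nash(reduced_game))  # [(0, 1)], first player gets 2
```

Since the second option strictly dominates the first, the full game has no other equilibria, so removing the option raises the first player's equilibrium payoff from 1 to 2.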

7mo43

Not eating meat is not a Pascal's mugging because there is a solid theoretical argument for why the expected value is positive even if the payoff distribution is somewhat unbalanced: if a large number of people decide not to eat meat, then this will necessarily have the effect of shifting production, for supply to meet demand. Since you have no way of knowing where you are in that large ensemble, the expected value of you not eating meat is equal to the size of the effect divided by the number of people in the ensemble, which is presumably what we would ex...

7mo10

A proof you don’t understand does not obligate you to believe anything; it is Bayesian evidence like anything else. If an alien sends a 1GB Coq file Riemann.v, running it on your computer does not obligate you to believe that the Riemann hypothesis is true. If you’re ever in that situation, do not let anyone tell you that Coq is so awesome that you don’t roll to disbelieve. 1GB of plaintext is too much, you’ll get exhausted before you understand anything. Do not ask the LLM to summarize the proof.

I'm not sure what you are trying to say here. Even with 1...

17mo

I think a Coq bug or the type of mistakes that are easy to make in formalization
are more likely than a transistor failure or a bad bit flip bubbling up through
the stack, but still not what I'm talking about.
In the background, I'm thinking a little about this Terry Tao post, about how
the process of grokking the proof (and being graceful with typos, knowing which
typos are bad and which are harmless, and so on) is where the state of the art
mathematics lies, not in the proof itself.
I was discussing your comment with a friend, who suggested what I'm calling
factored cognition parallel auditing (FCPA): she asked why don't we just divide
the 1e6 lines of Coq into 1000-line segments and send it out to 1000 experts,
each of whom makes sense of their segment? The realities would be a little
messier than linearly stitching together segments, but this basic setup (or
something that emphasizes recursion, like a binary search flavored version)
would I think buy us roughly 2-3x the assurance that RH is true over just blindly
trusting the type theory / laptop to execute Riemann.v in the Coq session.
In the current comment, let's assume a background of Russell-like "enfeeblement"
or Christiano-like "losing control of the world/future in a slow burn",
filtering out the more Yudkowsky-like threat models. Not to completely discount
malicious injections in Riemann.v, but they don't seem productive to emphasize.
I will invoke the vague notions of "theorem intricacy" and "proof length", but
it should be possible to not read this footnote[1] and still follow along with
the goals of the current comment.
This isn't really about RH, RH isn't necessarily important, etc. I was actually
alluding to / vaguely criticizing a few different ideas at once in the paragraph
you quoted.
1. The proof assistants or applied type theory technologies will help agent
foundations researchers progress and/or teach. I'm bearish on this largely
because I think agent foundations researchers care more

7mo30

Thus, for example, intransitivity requires giving up on an especially plausible Stochastic Dominance principle, namely: if, for every outcome o and probability of that outcome p in Lottery A, Lottery B gives a better outcome with at least p probability, then Lottery B is better (this is very similar to “If Lottery B is better than Lottery A no matter what happens, choose Lottery B” – except it doesn’t care about what outcomes get paired with heads, and which with tails).

This principle is phrased incorrectly. Taken literally, it would imply that the mixe...

27mo

Oops! You're right, this isn't the right formulation of the relevant principle.
Will edit to reflect.

It seems like "equally probable MWI microstates" is doing a lot of work here. If we have some way of determining how probable a microstate is, then we are already assuming the Born probabilities. So it doesn't work as a method of deriving them.

28mo

Well, microstates come before probabilities. They are just there, while
probabilities are in the model that describes macrostates (emergence). This is
similar to how one calculates entropy with the Boltzmann equation, assigning
microstates to (emergent) macrostates, S= k ln W. But yes, there is no known
argument that would derive the Born rule from just counting microstates.
Anything like that would be a major breakthrough.

That quote seems nonsensical. What do the Born probabilities have to do with a counting argument, or with the dimension of Hilbert space? A qubit lives in a two-dimensional space, so a dimension argument would seem to suggest that the probabilities of the qubit being 0 or 1 must both be 50%, and yet in reality the Born probabilities say they can be anything from 0% to 100%. What am I missing?
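
To illustrate the point about the qubit, here is a small Python sketch of the Born rule for a state alpha|0> + beta|1>. The function name is hypothetical; the amplitudes are normalized inside the function.

```python
def born_probabilities(alpha: complex, beta: complex) -> tuple:
    """Return (P(0), P(1)) for the qubit state alpha|0> + beta|1>."""
    norm = abs(alpha) ** 2 + abs(beta) ** 2
    return abs(alpha) ** 2 / norm, abs(beta) ** 2 / norm


print(born_probabilities(1, 1))  # (0.5, 0.5)
print(born_probabilities(3, 4))  # (0.36, 0.64)
print(born_probabilities(1, 0))  # (1.0, 0.0)
```

The space is two-dimensional in every case, yet the probabilities range over the whole interval from 0 to 1, so the dimension alone cannot fix them.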

48mo

I think what you are missing is the quantum->classical transition. In a simple
example, there are no "particles" in the expression for quantum evolution of an
unstable excited state, and yet in a classical world you observe different decay
channels, with an assortment of particles, or at least of particle momenta. They
are emergent from unitary quantum evolution, and in MWI they all happen. If one
could identify equally probable "MWI microstates" that you can count, like you
often can in statistical mechanics, then the number of microstates corresponding
to a given macrostate would be proportional to the Born probability. That is the
counting argument. Does this make sense?

-"“Percentage of marriages that end in divorce” is an underspecified concept. There is only “percentage of marriages that end in divorce after n years”. "

The concept is perfectly well specified, just take n to be e.g. 75. But of course, it can only be measured for cohorts that are at least that old. Still, I would have assumed it possible to do some extrapolation to estimate what the value will be for younger cohorts (e.g. the NYT article you linked to says "About 60 percent of all marriages that eventually end in divorce do so within the first 10 years", ...

18mo

Fair enough. I have removed that part of the post.

8mo10

-"If you have heard that “40% of marriages end in divorce” or some similar figure, you are probably misinterpreting the divorce-to-marriage ratio. "

Really? So what is the right number then? A cursory Google search shows 40-50% is a commonly repeated figure for "percentage of marriages that end in divorce", are you really claiming that all of those webpages are misinterpreting the divorce-to-marriage ratio? What is the basis for such a claim? It does not appear to be in the Wikipedia article, which says nothing about the percentage of marriages that end in divorce.

18mo

"Percentage of marriages that end in divorce" is an underspecified concept.
There is only "percentage of marriages that end in divorce after n years".
According to this NYT article, it is incredibly common to report (new divorces /
new marriages) as the "% of marriages ending in divorce" – and this is
misleading because it makes a decline in marriages look like an increase in the
probability of divorce. The very large figures, like 50% or above, seem to be
indeed reporting this figure.
You could adjust for changing demographics of course, but one would think that
introduces some uncertainty into the measurement.
I phrased my comment clumsily as a reason to link to something that would
clarify the underspecified concept of a "divorce rate". It seems like 40% is a
realistic figure for some countries after you do demographic adjustment, but,
insofar as a higher figure is reported, it is because of a misunderstanding.
I will think of a way to clarify this claim, or delete it entirely.

To explain my disagree-vote: I think such a system would necessarily create a strong bias against downvotes/disagree-votes, since most people would just not downvote rather than making a justifying comment. "Beware trivial inconveniences"

18mo

Pardon me. I guess its type is N∞.

9moΩ240

The infinite autoresponse example seems like it would be solved in practice by rational ignorance: after some sufficiently small number of autoresponses (say 5) people would not want to explicitly reason about the policy implications of the specific number of autoresponses they saw, so "5+ autoresponses" would be a single category for decisionmaking purposes. In that case the induction argument fails and "both people go to the place specified in the message as long as they observe 5+ autoresponses" is a Nash equilibrium.

Of course, this assumes people haven...

29mo

I agree that something in this direction could work, and plausibly captures
something about how humans reason. However, I don't feel satisfied. I would want
to see the idea developed as part of a larger framework of bounded rationality.
UDT gives us a version of "never be harmed by information" which is really nice,
as far as it goes. In the cases which UDT helps with, we don't need to do
anything tricky, where we carefully decide which information to look at -- UDT
simply isn't harmed by the information, so we can think about everything from a
unified perspective without hiding things from ourselves.
Unfortunately, as I've outlined in the appendix, UDT doesn't help very much in
this case. We could say that UDT guarantees that there's no need for "rational
ignorance" when it comes to observations (ie, no need to avoid observations),
but fails to capture the "rational ignorance" of grouping events together into
more coarse-grained events (eg "5+ auto responses").
So if we had something like "UDT but for coarse-graining in addition to
observations", that would be really nice. Some way to deal with things such that
you never wish you'd coarse-grained things.
Whereas the approach of actually coarse-graining things seems a bit doomed to
fragility and arbitrariness. It seems like you have to specify some procedure
for figuring out when you'd want to coarse-grain. For example, maybe you start
with only one event, and iteratively decide how to add details, splitting the
one event into more events. But I feel pessimistic about this. I feel similarly
pessimistic about the reverse, starting with a completely fine-grained model and
iteratively grouping things together.
Fortunately, the induction argument involves both agents following along with
the whole argument. If one agent doubts that the other thinks in this way, this
can sort of stabilize things. It's similar to the price-undercutting dynamic,
where you want to charge slightly less than competitors, not as little

29mo

I've been a longtime CK atheist (and have been an influence on Abram's post),
and your comment is in the shape of my current preferred approach.
Unfortunately, rational ignorance seems to require CK that agents will engage in
bounded thinking, and not be too rational!
(CK-regress like the above is very common and often non-obvious. It seems
plausible that we must accept this regress and in fact humans need to be Created
Already in Coordination, in analogy with Created Already in Motion)
I think it is at least possible to attain p-CK in the case that there are enough
people who aren't "inductively inclined". This sort of friction from people who
aren't thinking too hard causes unbounded neuroticism to stop and allow
coordination. I'm not yet sure if such friction is necessary for any agent or
merely typical.

9mo10

-"this is just the lie algebra, and is why elements of it are always invertible."

First of all, how did we move from talking about numbers to talking about Lie algebras? What is the Lie group here? The only way I can make sense of your statement is if you are considering the case of a Lie subgroup of GL(n,R) for some n, and letting 1 denote the identity matrix (rather than the number 1) [1]. But then...

Shouldn't the Lie algebra be the monad of 0, rather than the monad of 1? Because usually Lie algebras are defined in terms of being equipped with two operati...

-"On any finite dim space we have a canon inner product by taking the positive definite one."

What? A finite dimensional space has more than one positive definite inner product (well, unless it is zero-dimensional), this choice is certainly not canonical. For example in R^2 any ellipse centered at the origin corresponds to a positive definite inner product.

19mo

I was thinking the one corresponding to a unit circle, just the ordinary dot
product.
Canon is probably the wrong word in a mathy context.

9mo21

I know this is not your main point, but the Millennium Problems are not an instance of a way for an AGI to quickly get money. From the CMI website:

-"Before CMI will consider a proposed solution, all three of the following conditions must be satisfied: (i) the proposed solution must be published in a Qualifying Outlet (see §6), and (ii) at least two years must have passed since publication, and (iii) the proposed solution must have received general acceptance in the global mathematics community"

9mo40

-"The more time passes, the more evidence we get that (4) is false from computational tests and also the more we should doubt any proof (3) as its complexity grows. Therefore, the logical probability of (1) and (2) is growing over time."

The fact that the methods we use to check proofs are not perfectly reliable (and I think you greatly overstate the importance of this consideration, computer proof checkers are very reliable) does not at all imply that the probability that a proof *exists* is decreasing over time. You need to distinguish between the fact of a...

Just to be clear, Bertrand's postulate is actually a theorem (i.e. a known result), not a postulate/hypothesis. It is called a "postulate" for historical reasons.

9mo1

-"Bayesians reason about truth this way because they are designed to treat knowledge as justified, true belief, reflecting a popular theory in analytic philosophy of what knowledge is."

Hasn't that theory been discredited (by Gettier)? I don't think it is popular anymore.

29mo

Yes, but the point stands that, to the best of my understanding, this is the
sort of knowledge folks like von Neumann had in mind when they laid the
groundwork for what would become our modern model of Bayesian reasoners.

Let's limit our attention to the three hypotheses (a) there is no correlation between names and occupations, (b) the Pelham paper is right that Dennises are about 1% more likely to go into dentistry, and (c) the effect is much larger, e.g. Dennises are 100% more likely to go into dentistry. Then Bayes' theorem says observing a Dennis in dentistry multiplies the odds ratio P(b)/P(a) by a factor of 1.01 and the odds ratio P(c)/P(a) by a factor of 2. You say you consider (a) and (b) to each have prior probability of 50%, which presumably means (c) has negligi...
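
The odds update described above can be sketched in Python as follows. The likelihood ratios (1.01 for a 1% effect, 2.0 for a 100% effect) follow from the hypotheses as stated; the prior odds used for (c) are a made-up placeholder.

```python
def update_odds(prior_odds: float, likelihood_ratio: float) -> float:
    """Bayes' rule in odds form: posterior odds = prior odds * likelihood ratio."""
    return prior_odds * likelihood_ratio


# (b) vs (a): equal priors, Dennises 1% more likely to be dentists under (b).
print(update_odds(1.0, 1.01))

# (c) vs (a): hypothetical prior odds of 1:1000, Dennises twice as likely under (c).
print(update_odds(0.001, 2.0))
```

A single observation barely moves (b) against (a), while it doubles the odds of (c), however small they started.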

Dec 19, 20221-2

I think quantum mechanics and the MWI are a red herring here. The real question is whether you can compute the probability of counterfactual events like WWII not happening -- and as Viliam says, the answer to that question is that it depends on choosing a starting point in the past to diverge from. Of course, once you choose such a point, actually estimating the probability of WWII not happening is an exercise in historical reasoning, not quantum physics.

-"One way to put this is that they aren’t made true by our desires, goals, standards, values, beliefs, and so on."

OK, I am a moral realist under this formulation.

-"Rather, they are true in a way more like how claims about e.g., the mass of an object are true."

I guess it depends on what you mean by "in a way more like". Moral claims are pretty fundamentally different from physical claims, I don't see how to get around that -- one way to put it would be that the notions of right and wrong are not inductive generalizations over observed phenomena -- another w...

19mo

Moral realists are going to differ with respect to what they think the
metaphysical status of the moral facts is. Moral naturalists may see them
roughly as a kind of natural fact, so moral facts might be facts about e.g.,
increases in wellbeing, while non-naturalists would maintain that moral facts
aren't reducible to natural facts.

-"In doing so, they aren’t appealing to their own values, or anyone else’s values, but to facts about what’s morally right or wrong that are true without reference to, and in a way that doesn’t depend on, any particular evaluative standpoint."

OK, so now it sounds like I am not a moral realist! I definitely think that by making a moral claim you are appealing to other people's values, since other people's values are the only thing that could possibly cause them to accept your moral claim. However, the moral claim is still of the form "X is true regardless of whether it is consistent with anyone's values".

19mo

A moral realist would think that there are facts about what is morally right or
wrong that are true regardless of what anyone thinks about them. One way to put
this is that they aren't made true by our desires, goals, standards, values,
beliefs, and so on. Rather, they are true in a way more like how claims about
e.g., the mass of an object are true. Facts about the mass of an object aren't
made true by our believing them or preferring them to be the case.

-"I think the central question would be: Do you think that there are facts about what people morally should or shouldn’t do, or what’s morally good or bad, that are true independent of people’s goals, standards, or values? If yes, that’s moral realism. If not, that’s moral antirealism."

I certainly don't believe that the truth of moral facts is dependent on people's goals, standards, or values; the qualifier I would give is that our beliefs about moral facts are the same thing (tautologically) as our moral standards. So I guess I am a moral realist? Or mayb...

19mo

I'm not sure if you're a moral realist. What do you mean when you say this?
A moral realist may think that there are e.g., facts about what you should or
shouldn't do that you are obligated to comply with independent of whether doing
so would be consistent with your goals, standards, or values. So, for instance,
they would hold that you "shouldn't torture babies for fun," regardless of
whether doing so is consistent with your values. In doing so, they aren't appealing
to their own values, or anyone else's values, but to facts about what's morally
right or wrong that are true without reference to, and in a way that doesn't
depend on, any particular evaluative standpoint.

9mo10

(Disclaimer: didn't read the post, it is too long and I doubted it would engage with my views.)

I'm not sure how popular moral anti-realism actually is here. For example, Eliezer's position was technically moral realist, though his metaethics was kind of strange.

I'm not sure whether to classify myself as a moral realist or anti-realist. Regarding your litmus test "it's wrong to torture babies for fun" I find myself saying that it's true in a sense, but in a different sense than we normally use the word "true". How important this difference is depends on whe...

19mo

Can you elaborate on Eliezer being a moral realist? Is there a summary anywhere
or could you provide one?
Regarding this statement: "it's wrong to torture babies for fun," this is a
normative moral claim, not a metaethical one. A moral antirealist can agree with
this (I'm an antirealist, and I agree with it). Nothing about agreeing or
disagreeing with that claim entails realism.
Your position sounds like antirealism to me, but I'm not sure if it would fit
with any of the standard categories. A lot hinges on your statement that:
If you were claiming that moral claims, despite appearing to be saying things
that were true or false, were actually, instead, used to condemn acts/people,
that would sound like some type of expressivism/noncognitivism, but since you're
also trying to maintain use of the term "true," I'm not sure what to make of it.
Omnizoid's suggestion of quasi-realism makes some sense since part of the goal
is to maintain the ability to say that one's moral claims are true while still
treating them as largely serving an expressive role; those accounts hinge on
deflationary views of truth though and it doesn't sound exactly like you're
endorsing that.
I think the central question would be: Do you think that there are facts about
what people morally should or shouldn't do, or what's morally good or bad, that
are true independent of people's goals, standards, or values? If yes, that's
moral realism. If not, that's moral antirealism.

39mo

That sounds like anti-realism -- probably some type of quasi realism.

You seem to be claiming that it is possible for mathematical truths such as 2+2=4 to be other than what they are; I can agree with this on an epistemological level (since we don't know all mathematical truths) but on an ontological level, no: mathematical truths are necessary truths. This is the conventional view though I'm not really sure how to argue it to a skeptic: but if you don't see why 2+2=4 is a necessary truth then I claim you don't truly comprehend why 2+2=4.

Mathematical facts are facts about well-defined what-if scenarios. We evolved to be able to consider such scenarios because they often bear a resemblance to what happens to us. So there is an explanation for how our beliefs about mathematics could become correlated with mathematical truth, even though this explanation is not causal. However, it is not entirely obvious how to tell a similar story about moral truths -- why did we evolve to be able to perceive moral facts, if indeed we did?

19mo

I'm not saying that we perceive mathematical facts. Rather that if there is a
non-perceptual, and therefore non-causal, epistemology for mathematics, there
could be for other things.

10moΩ11-1

-"For example, I could imagine laws requiring anyone scraping the internet to ensure that they are not collecting data from people who have denied consent to have their data scraped."

In practice this is already the case: anyone who doesn't want their data scraped can put up a robots.txt file saying so, and I imagine big companies like OpenAI respect robots.txt. I guess there could be advantages in making it a legal rule, but I don't think it matters too much.

You seem to have misunderstood the problem statement [1]. If you commit to doing "FDT, except that if the predictor makes a mistake and there’s a bomb in the Left, take Right instead", then you will almost surely have to pay $100 (since the predictor predicts that you will take Right), whereas if you commit to using pure FDT, then you will almost surely have to pay nothing (with a small chance of death). There really is no "strategy that, if the agent commits to it before the predictor makes her prediction, does better than FDT".

[1] Which is fair enough, a...

110mo

Why? FDT isn't influenced in its decision by the note, so there is no loss of
subjunctive dependence when this assumption isn't added. (Or so it seems to me:
I am operating at the limits of my FDT-knowledge here.)

210mo

It sounds like you’re saying that I correctly understood the problem statement
as it was written (but it was written incorrectly); but that the post
erroneously claims that in the scenario as (incorrectly) written, FDT says to
take Left, when in fact FDT in that scenario-as-written says to take Right. Do I
understand you?

-"Charging a toll for a bridge you didn’t build is not okay; that’s pure extraction."

This is probably just a nitpick, but as worded this doesn't take into account the scenario where the builder of the bridge sells the rights to charge a toll to another party, who can then legitimately charge the toll even though they didn't build the bridge.

Yes they do. For simplicity suppose there are only two hosts, and suppose host A precommits to not putting money in host B's box, while host B makes no precommitments about how much money he will put in host A's box. Then the human's optimal strategy is "pick host A's box with probability 1 - x epsilon, where x is the amount of money in host A's box". This incentivizes host B to maximize the amount in host A's box (resulting in payoff ~101 for the human), but it would have been better for him if he had precommitted to do the same as A, since then by symmetry his box would have been picked half the time instead of 101 epsilon of the time.

Couldn't you equally argue that they will do their best not to be smallest by not putting any money in all their opponent's boxes? After all, "second-fullest" is the same as "third-emptiest".

Why would precommitting to pick the second-fullest box give an incentive for predictors to put money in everyone else’s boxes?

210mo

Only for stupid/partial predictors. If they predict each other and know they're
symmetrical, nothing the agent does matters. If they're trying to optimize
against other predictors who they don't think are as smart as they, they can
hope that some will make mistakes, and they will do their best not to be biggest
by putting money in all their opponents' boxes. This is the same incentive as
"pick the smallest amount" would be, but in the case that any mistakes DO get
made, and there are varying amounts, the agent prefers more money to less.

If the hosts move first logically, then TDT will lead to the same outcomes as CDT, since it's in each host's interest to precommit to incentivising the human to pick their own box -- once the host has precommitted to doing this, the incentive works regardless of what decision theory the human uses. In math terms, if x is the choice of which box to incentivize (with "incentivize your own box" being interpreted as "don't place any money in any of the other boxes"), the human gets to choose a box f(x) on the basis of x, and the host gets to choose x=g(f) on t...

110mo

It's in the hosts' interests to do that if they think the player is CDT, but it's
not in their interests to commit to doing that. They don't lose anything by
retaining the ability to select a better strategy later after reading the
player's mind.

310mo

If the hosts are all predicting box 1, why does it matter with what probability
the human picks box 1? (If the hosts' payoffs for all-predict-correctly and
all-predict-incorrectly are different, then their game isn't zero-sum.)

1y54

You seem to be assuming the human moves first in logical time, before the superintelligent hosts. You also seem to be assuming that the superintelligent hosts are using CDT (if they use FDT, then by symmetry considerations all of their possible actions have equal payoff, so what they do is arbitrary). Any particular reason for these assumptions?

Where do the numbers $152 and $275 come from? I would have thought they should be $100 and $200, respectively.

In the 5 box problem, why doesn't FDT force all of the incentives into box 1, thus getting $400?

110mo

The hosts aren't competing with the human, only each other, so even if the hosts
move first logically they have no reason or opportunity to try to dissuade the
player from whatever they'd do otherwise. FDT is underdefined in zero-sum
symmetrical strategy games against psychological twins, since it can foresee a
draw no matter what, but choosing optimal strategy to get to the draw still
seems better than playing dumb strategies on purpose and then still drawing
anyway.
Why do you think they should be $100 and $200? Maybe you could try simulating
it?
What happens if FDT tries to force all the incentives into one box? If the hosts
know exactly what every other host will predict, what happens to their zero-sum
competition and their incentive to coordinate with FDT?

-"The main question is: In the counter-factual scenario in which TDT recommends action X to agent A, what would another agent B do?"

This is actually not the main issue. If you fix an algorithm X for agent A to use, then the question "what would agent B do if he is using TDT and knows that agent A is using algorithm X?" has a well-defined answer, say f(X). The question "what would agent A do if she knows that whatever algorithm X she uses, agent B will use counter-algorithm f(X)" then also has a well-defined answer, say Z. So you could define "the res...

Can't this be answered by an appeal to the fact that the initial state of the universe is supposed to be low-entropy? The wavefunction corresponding to one of the worlds, run back in time to the start of the universe, would have higher entropy than the wavefunction corresponding to all of them together, so it's not as good a candidate for the starting wavefunction of the universe.