SCP

SCP6mo

Nice! The standard method would be the BDM mechanism, and what’s neat is that it works even if people aren't risk-neutral in money. Here’s the idea.

Have people report their disvalue . Independently draw a number $z$ at random, e.g. from an exponential distribution with expectation $1 / λ = X$ . Then offer to pay them $z$ iff $z \geq Y$ .

Now it’s a dominant strategy to report your disvalue truthfully: reporting above your disvalue can never raise what you’re offered but it can risk eliminating a profitable trade, while reporting below it either makes no difference or gives you an offer you wouldn’t want to take.

This is the exact same logic behind the second-price auction.

Replying toInvulnerable Incomplete Preferences: A Formal Statement

SCP1y*

Invulnerable Incomplete Preferences: A Formal Statement

My use of 'next' need not be read temporally, though it could be. You might simply want to define a transitive preference relation for the agent over {A,A+,B,B+} in order to predict what it would choose in an arbitrary static decision problem. Only the incomplete one I described works no matter what the decision problem ends up being.

As a general point, you can always look at a decision ex post and back out different ways to rationalise it. The nontrivial task is here prediction, using features of the agent.

If we want an example of sequential choice using decision trees (rather than repeated 'de novo' choice through e.g. unawareness), it'll be a bit... (read more)

Replying toInvulnerable Incomplete Preferences: A Formal Statement

SCP1y

Invulnerable Incomplete Preferences: A Formal Statement

Yes that's right (regardless of whether it's resolute or whether it's using 'strong' maximality).

A sort of of a decision tree where the agent isn't representable as having complete preferences is the one you provide here. We can even put the dynamic aspect aside to make the point. Suppose that the agent is fact inclined to pick A+ over A, but doesn't favour or disfavour B to either one. Here's my representation: maximal choice with A+ $≻$ A and B $⋈/$ A,A+. As a result, I will correctly predict its behaviour: it'll choose something other than A.

Can I also do this with another representation, using a complete preference relation? Let's try out indifference between A+ and B. I'd... (read more)

SCP2y

Thanks. Let me end with three comments. First, I wrote a few brief notes here that I hope clarify how Independence and IIA differ. Second, I want to stress that the problem with the use of Dutch books in the articles is a substantial one, not just a verbal one, as I explained here and here. Finally, I’m happy to hash out any remaining issues via direct message if you’d like—whether it’s about these points, others I raised in my initial comment, or any related edits.

SCP2y

I don't apprecaite the hostility. I aimed to be helpful in spending time documenting and explaining these errors. This is something a heathy epistemic community is appreciative of, not annoyed by. If I had added mistaken passages to Wikipedia, I'd want to be told, and I'd react by reversing them myself. If any points I mentioned weren't added by you, then as I wrote in my first comment:

...let me know that some of the issues I mention were already on Wikipedia beforehand. I’d be happy to try to edit those.

The point of writing about the mistakes here is to make clear why they indeed are mistakes, so that they aren't repeated. That has value. And although I don't think we should encourage a norm that those who observe and report a problem are responsible for fixing it, I will try to find and fix at least the pre-existing errors.

•••

SCP2y

I agree that there exists the dutch book theorem, and that that one importantly relates to probabilism

I'm glad we could converge on this, because that's what I really wanted to convey.^[1] I hope it's clearer now why I included these as important errors:

The statement that the vNM axioms “apart from continuity, are often justified using the Dutch book theorems” is false since these theorems only relate to belief norms like probabilism. Changing this to 'money pump arguments' would fix it.
There's a claim on the main Dutch book page that the arguments demonstrate that “rationality requires assigning probabilities to events [...] and having preferences that can be modeled using the von Neumann–Morgenstern axioms.” I

... (read more)

•••

SCP2y

I think it'll be helpful to look at the object level. One argument says: if your beliefs aren't probabilistic but you bet in a way that resembles expected utility, then you're succeptible to sure loss. This forms an argument for probabilism.^[1]

Another argument says: if your preferences don't satisfy certain axioms but satisfy some other conditions, then there's a sequence of choices that will leave you worse off than you started. This forms an agument for norms on preferences.

These are distinct.

These two different kinds of arguments have things in common. But they are not the same argument applied in different settings. They have different assumptions, and different conclusions. One is typically called a... (read more)

SCP2y

check the edit history yourself by just clicking on the "View History" button and then pressing the "cur" button

Great, thanks!

I hate to single out OP but those three points were added by someone with the same username (see first and second points here; third here). Those might not be entirely new but I think my original note of caution stands.

Replying toCoalitional agency

SCP2y

Coalitional agency

Scott Garrabrant rejects the Independence of Irrelevant Alternatives axiom

*Independence, not IIA. Wikipedia is wrong (as of today).

SCP2y*

I appreciate the intention here but I think it would need to be done with considerable care, as I fear it may have already led to accidental vandalism of the epistemic commons. Just skimming a few of these Wikipedia pages, I’ve noticed several new errors. These can be easily spotted by domain experts but might not be obvious to casual readers.^[1] I can’t know exactly which of these are due to edits from this community, but some very clearly jump out.^[2]

I’ll list some examples below, but I want to stress that this list is not exhaustive. I didn’t read most parts of most related pages, and I omitted many small scattered issues. In... (read 677 more words →)

•••

Invulnerable Incomplete Preferences: A Formal Statement

SCP

Produced as part of the SERI ML Alignment Theory Scholars Program - Summer 2023 Cohort. My thanks to Eric Chen, Elliott Thornley, and John Wentworth for invaluable discussion and comments on earlier drafts. All errors are mine.

This article presents a few theorems about the invulnerability of agents with incomplete preferences. Elliott Thornley’s (2023) proposed approach to the AI shutdown problem relies on these preferential gaps, but John Wentworth and David Lorell have argued that they make agents play strictly dominated strategies.^[1] I claim this is false.

Summary

Suppose there exists a formal description of an agent that willingly shuts down when a certain button is pressed. Elliott Thornley’s (2023) Incomplete Preference Proposal aims to offer... (read 10250 more words →)

136

Rational Unilateralists Aren't So Cursed

SCP

Much informal discussion of the Unilateralist’s Curse from Bostrom et al (2016) presents it as a sort of collective action problem: the chance of purely altruistic agents causing harm rises with the number of agents that act alone. What’s often left out is that these agents are irrational. The central result depends on this, and I’ll show why below.

Note that the formal result in the original paper is correct. The authors are largely aware of what I’ll explain here; they discuss much of it in section 3.2. The point of this post is (i) to correct a misconception about which agents the Curse applies to, (ii) to go through a particularly neat... (read 1740 more words →)

What’s this probability you’re reporting?

EOC

EOC, SCP

It’s unclear what people mean when saying they’re reporting a probability according to their inside view model(s). We’ll look through what this could mean and why most interpretations are problematic. Note that we’re not making claims about which communication norms are socially conducive to nice dialogue. We’re hoping to clarify some object-level claims about what kinds of probability assignments make sense, conceptually. These things might overlap.

Consider the following hypothetical exchange:

Person 1: “I assign 90% probability to X”
Person 2: “That’s such a confident view considering you might be wrong”
Person 1: “I’m reporting my inside view credence according to my model(s)”

This response looks coherent at first glance. But it’s unclear what Person 1 is actually saying.... (read 804 more words →)

LESSWRONG
LW

LESSWRONG
LW

SCP

SCP

Invulnerable Incomplete Preferences: A Formal Statement

Rational Unilateralists Aren't So Cursed

What’s this probability you’re reporting?

SCP

SCP

SCP

Invulnerable Incomplete Preferences: A Formal Statement

Rational Unilateralists Aren't So Cursed

What’s this probability you’re reporting?

Summary