Minor spoilers for mad investor chaos and the woman of asmodeus (planecrash Book 1).

Also, be warned: citation links in this post link to an NSFW subthread in the story.

Criminal Law and Dath Ilan

When Keltham was very young indeed, it was explained to him that if somebody old enough to know better were to deliberately kill somebody, Civilization would send them to the Last Resort (an island landmass that another world might call 'Japan'), and that if Keltham deliberately killed somebody and destroyed their brain, Civilization would just put him into cryonic suspension immediately.

It was carefully and rigorously emphasized to Keltham, in a distinction whose tremendous importance he would not understand until a few years later, that this was not a threat.  It was not a promise of conditional punishment.  Civilization was not trying to extort him into not killing people, into doing what Civilization wanted instead of what Keltham wanted, based on a prediction that Keltham would obey if placed into a counterfactual payoff matrix where Civilization would send him to the Last Resort if and only if he killed.  It was just that, if Keltham demonstrated a tendency to kill people, the other people in Civilization would have a natural incentive to transport Keltham to the Last Resort, so he wouldn't kill any others of their number; Civilization would have that incentive to exile him regardless of whether Keltham responded to that prospective payoff structure.  If Keltham deliberately killed somebody and let their brain-soul perish, Keltham would be immediately put into cryonic suspension, not to further escalate the threat against the more undesired behavior, but because he'd demonstrated a level of danger to which Civilization didn't want to expose the other exiles in the Last Resort.

Because, of course, if you try to make a threat against somebody, the only reason why you'd do that, is if you believed they'd respond to the threat; that, intuitively, is what the definition of a threat is.

It's why Iomedae can't just alter herself to be a kind of god who'll release Rovagug unless Hell gets shut down, and threaten Pharasma with that; Pharasma, and indeed all the other gods, are the kinds of entity who will predictably just ignore that, even if that means the multiverse actually gets destroyed.  And then, given that, Iomedae doesn't have an incentive to release Rovagug, or to self-modify into the kind of god who will visibly inevitably do that unless placated.

Gods and dath ilani both know this, and have math for defining it precisely.

Politically mainstream dath ilani are not libertarians, minarchists, or any other political species that the splintered peoples of Golarion would recognize as having been invented by some luminary or another.  Their politics is built around math that Golarion doesn't know, and can't be predicted in detail without that math.  To a Golarion mortal resisting government on emotional grounds, "Don't kill people or we'll send you to the continent of exile" and "Pay your taxes or we'll nail you to a cross" sound like threats just the same - maybe one sounds better-intentioned than the other, but they both sound like threats.  It's only a dath ilani, or perhaps a summoned outsider forbidden to convey their alien knowledge to mortals, who'll notice the part where Civilization's incentive for following the exile conditional doesn't depend on whether you respond to exile conditionals by refraining from murder, while the crucifixion conditional is there because of how the government expects Golarionites to respond to crucifixion conditionals by paying taxes.  There is a crystalline logic to it that is not like yielding to your impulsive angry defiant feelings of not wanting to be told what to do.

The dath ilani built Governance in a way more thoroughly voluntarist than Golarion could even understand without math, not (only) because those dath ilani thought threats were morally icky, but because they knew that a certain kind of technically defined threat wouldn't be an equilibrium of ideal agents; and it seemed foolish and dangerous to build a Civilization that would stop working if people started behaving more rationally.

--Eliezer Yudkowsky, planecrash

"The United States Does Not Negotiate With Terrorists"

I think the idea Eliezer is getting at here is that responding to threats incentivizes threats. Good decision theories, then, precommit to never cave in to threats made to influence you, even when caving would be the locally better option, so as to eliminate the incentive to make those threats in the first place. Agents that have made that precommitment will be left alone, while agents who haven't can be bullied by threateners. So the second kind of agent will want to appropriately patch their decision theory, thereby self-modifying into the first kind of agent.

Commitment Races and Good Decision Theory

Commitment races are a hypothesized problem in which agents might do better by, as soon as the thought occurs to them, precommitting to punishing all those who don't kowtow to their utility function, and promulgating this threat. Once this precommitted threat has been knowingly made, the locally best move for others is to cave and kowtow: they were slower on the trigger, but that's a sunk cost now, and they should just give in quietly.

I think the moral of the above dath ilani excerpt is that your globally best option[1] is to not reward threateners. A dath ilani, when so threatened, would be precommitted to making sure that their threatener gets less benefit in expectation than they would have playing fair (so as to disincentivize threats, so as to be less likely to find themselves so threatened):

That's not even getting into the math underlying the dath ilani concepts of 'fairness'!  If Alis and Bohob both do an equal amount of labor to gain a previously unclaimed resource worth 10 value-units, and Alis has to propose a division of the resource, and Bohob can either accept that division or say they both get nothing, and Alis proposes that Alis get 6 units and Bohob get 4 units, Bohob should accept this proposal with probability < 5/6 so Alis's expected gain from this unfair policy is less than her gain from proposing the fair division of 5 units apiece.  Conversely, if Bohob makes a habit of rejecting proposals less than '6 value-units for Bohob' with probability proportional to how much less Bohob gets than 6, like Bohob thinks the 'fair' division is 6, Alis should ignore this and propose 5, so as not to give Bohob an incentive to go around demanding more than 5 value-units.

A good negotiation algorithm degrades smoothly in the presence of small differences of conclusion about what's 'fair', in negotiating the division of gains-from-trade, but doesn't give either party an incentive to move away from what that party actually thinks is 'fair'.  This, indeed, is what makes the numbers the parties are thinking about be about the subject matter of 'fairness', that they're about a division of gains from trade intended to be symmetrical, as a target of surrounding structures of counterfactual actions that stabilize the 'fair' way of looking things without blowing up completely in the presence of small divergences from it, such that the problem of arriving at negotiated prices is locally incentivized to become the problem of finding a symmetrical Schelling point.

(You wouldn't think you'd be able to build a civilization without having invented the basic math for things like that - the way that coordination actually works at all in real-world interactions as complicated as figuring out how many apples to trade for an orange.  And in fact, having been tossed into Golarion or similar places, one sooner or later observes that people do not in fact successfully build civilizations that are remotely sane or good if they haven't grasped the Law governing basic multiagent structures like that.)

--Eliezer, planecrash
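The 5/6 figure in the quoted passage follows from a one-line inequality: if Bohob accepts a demand of d units with probability p, Alis's expected gain is d·p, and this stays below the fair share f only when p < f/d. Here is a minimal sketch of that arithmetic, assuming the simple one-shot setup in the quote. The function names are hypothetical and do not come from the story.

```python
from fractions import Fraction

def acceptance_cap(demand, fair_share):
    """Return the bound on the responder's acceptance probability.

    For an unfair demand (demand > fair_share), the acceptance probability
    must stay strictly below the returned value so that the proposer's
    expected gain is less than fair_share. For a demand at or below the
    fair share, returns 1 (the responder can always accept).
    """
    demand, fair_share = Fraction(demand), Fraction(fair_share)
    if demand <= fair_share:
        return Fraction(1)
    return fair_share / demand

def expected_gain(amount, accept_prob):
    """Expected payoff of receiving `amount` if the deal goes through
    with probability `accept_prob` (and nothing otherwise)."""
    return Fraction(amount) * Fraction(accept_prob)

cap = acceptance_cap(6, 5)      # Alis demands 6 of 10; the fair share is 5
p = cap - Fraction(1, 100)      # any probability strictly below the cap works
print(cap, expected_gain(6, p), expected_gain(4, p))
```

Note that Bohob's own expected payoff also drops, from 4 to 4·p, which is the price he pays for making the unfair demand unprofitable.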

[1] I am not clear on what the decision-theoretically local/global distinction I'm blindly gesturing at here amounts to. If I knew, I think I would fully understand the relevant updateless(?) decision theory.


One thing I've struggled to do is to apply this to cases where you can view both sides as threatening.

For example in the ultimatum game, isn't the statement "I will reject any offer below 50 %" a threat?

Or isn't the statement "I will ignore all threats" a threat itself - you're precommitting to act locally against your own best interests for the sake of forcing other people to do something better for you?

The word "threat" here is used as a loose translation for a mathematically much more precise concept in dath ilan. I am pretty sure that the real world doesn't have the mathematical sophistication to support such a definition, and perhaps such a distinction doesn't actually make mathematical sense after all. Multipolar game theory is very far from solved.

On the other hand, the author's description "foolish and dangerous to build a Civilization that would stop working if people started behaving more rationally" provides a guideline for how such a criterion might be intended to work, in simple cases. It seems to point toward an attempt to get more than the best coordinated rational actors could get, at the expense of ensuring that other parties always get less.

"I will reject any offer below 50%" seems to be not a threat in this sense for the usual formulation of the Ultimatum Game, since it allows for the respondent to get 50%, which is the best result that coordinating rational actors can get on average anyway. I'd call it a borderline threat in that arbitrarily small perturbations in the game could make it a threat, though. It's too sharp and fragile.

"I will ignore all threats" also fails to be a threat by this criterion, since it allows both parties to attain the coordinating optimum, but it also seems overly sharp. "I will reject threats with enough probability that on average it's not worth your while making them" seems like a smoother strategy.

There is a mathematically precise definition of "threat" in game theory.  It's approximately the one Yair semi-explicitly used above.  Alice threatens Bob when Alice says that, if Bob performs some action X, then Alice will respond with action Y, where Y (a) harms Bob and (b) harms Alice.  (If one wants to be "mathematical", then one could say that each combination of actions is associated with a set of payoffs, and that "action Y harms Bob" == "[Bob's payoff with Y] < [Bob's payoff with not-Y]".)  The threat should successfully deter Bob if, and only if, (1) Bob believes Alice's statement; (2) the harm inflicted on Bob by Y exceeds the benefit he gains from X; and (3) because of (b), Bob believes Alice wouldn't just do Y anyway.

If Alice has an action Z that harms Bob and benefits her, then she can't use it in a threat, because Bob would assume she'd do it anyway.  But what she can do is make a promise, that if Bob does what she wants, then she'll do action Q, which (a) helps Bob and (b) harms her; in this case Q would be "refrain from Z".
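A rough sketch of these definitions, under an assumed encoding that is not taken from any of the sources mentioned here: describe Alice's conditional response Y by how much it changes each party's payoff relative to not-Y, and classify by the signs. The function name and the "neither" label are made up for illustration.

```python
def classify_response(alice_delta, bob_delta):
    """Classify Alice's conditional response Y.

    alice_delta: Alice's payoff with Y minus her payoff with not-Y.
    bob_delta:   Bob's payoff with Y minus his payoff with not-Y.
    """
    if alice_delta < 0 and bob_delta < 0:
        return "threat"    # harms both: only worth announcing in order to deter Bob
    if alice_delta < 0 and bob_delta > 0:
        return "promise"   # helps Bob at Alice's expense
    # Alice is not hurt by Y (or Bob is unaffected), so no commitment is needed.
    return "neither"

print(classify_response(-1, -5))  # Y costs Alice 1 and Bob 5
print(classify_response(-1, +3))  # Q costs Alice 1 and gains Bob 3
print(classify_response(+2, -5))  # Z gains Alice 2 and costs Bob 5
```

The third case is the action Z above: since Alice gains from it, Bob would expect her to do it regardless, so announcing it cannot work as a threat in this sense.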

Of course, carrying out a threat or promise is by definition irrational.  But being able to change others' behavior is useful, so that's what creates evolutionary value in emotional responses like anger/revenge, gratitude/obligation, etc., and other methods of self-compulsion.

(I learned this from the book "Game Theory and Strategy" by Straffin, but you can see the same definitions given in e.g. http://pi.math.cornell.edu/~mec/2008-2009/Anema/stategicmoves.htm .)

I would be surprised if dath ilan didn't have the base concepts of game-theoretic threats and promises; and if they did, then I'm not sure what other names they would use for them.  I'm not certain about this (and have only read one dath ilan story and it wasn't "mad investor chaos"), but I suspect the authors would avoid giving new definitions of terms from Earth economics and game theory that already have precise definitions.

Alice threatens Bob when Alice says that, if Bob performs some action X, then Alice will respond with action Y, where Y (a) harms Bob and (b) harms Alice. (If one wants to be “mathematical”, then one could say that each combination of actions is associated with a set of payoffs, and that “action Y harms Bob” == “[Bob’s payoff with Y] < [Bob’s payoff with not-Y]”.)

Note that the dath ilan "negotiation algorithm" arguably fits this definition of "threat":

If Alis and Bohob both do an equal amount of labor to gain a previously unclaimed resource worth 10 value-units, and Alis has to propose a division of the resource, and Bohob can either accept that division or say they both get nothing, and Alis proposes that Alis get 6 units and Bohob get 4 units, Bohob should accept this proposal with probability < 5⁄6 so Alis’s expected gain from this unfair policy is less than her gain from proposing the fair division of 5 units apiece.

Because for X="proposes that Alis get 6 units and Bohob get 4 units" and Y="accepting the proposal with probability < 5/6", if Alis performs X, then Y harms both Alis and Bohob relative to not-Y (accepting the proposal with probability 1).

So I'm guessing that Eliezer is using some definition of "threat" that refers to "fairness", such that "fair" actions do not count as threats according to his definition.

By this definition any statement that sets any conditions whatsoever in the Ultimatum Game is a threat. Or indeed any statement setting conditions under which you might withdraw from otherwise mutually beneficial trade.

This is true.

I think, if there is any way to interpret any such statements as not being a threat, it would be of the form "I have already made my precommitments; I've already altered my brain so that I assign lower payoffs (due to psychological pain or whatever) to the outcomes where I fail to carry out my threat.  I'm not making a new strategic move; I'm informing you of a past strategic move."  One could argue that the game is no longer the Ultimatum Game, due to the payoffs not being what they are in the Ultimatum Game.

Of course, both sides would like to do this, and to be "first" to do it.  An extreme person in this vein could say "I've altered my brain so that I will reject anything less than 9-1 in my favor", and this could even be true.  Two such people would be guaranteed to have a bad time if they ran into one another, and a fairly bad time if they met a dath ilani; but one could choose to be such a person.

If both sides do set up their psychology well in advance of encountering the game, then the strategic moves are effectively made simultaneously.  One can then think about the game of "making your strategic move".

Eliezer is using some definition of "threat" that refers to "fairness", such that "fair" actions do not count as threats

This seems likely.  Much of Eliezer's fiction includes a lot of typical mind fallacy and a seemingly-willful ignorance of power dynamics and "unfair" results in equilibria being the obvious outcome for unaligned agents with different starting conditions.  

This kind of game-theory analysis is just silly unless it includes the information about who has the stronger/more-visible precommitments, and what extra-game impacts the actions will have.  It's actually quite surprising how deeply CDT is assumed (agents can freely choose their actions at the point in the narrative where it happens) in such analyses.

While I have seen the word used in that context in some game theory, it doesn't fit the meaning intended in the story at all. It's almost the exact opposite.

It also doesn't fit the use of the term in more general practice, where a great many "RL threats" in real life are not "GT threats" in this very different game theory definition.

Hmm, do you have examples of that?  If a robber holds a gun to someone's head and says "I'll kill you if you don't give me your stuff", that's clearly a threat, and I believe it also fits the game theory definition: most robbers would have at least a mild preference to not shoot the person (if only because of the mess it creates).

In the stated terms, Alice is the robber, Bob is the victim, X is "Bob resists Alice", Y is "Alice kills Bob and takes his stuff anyway", and not-Y is "Alice gives up".

It is uncontroversial that Bob is worse off under Y than not-Y, but much less certain that Alice is also worse off. If Bob resists Alice and Alice gives up, then Alice is probably going to prison for a very long time. Alice seems much better off killing Bob and taking his stuff, so this was not a "threat" under the proposed definition.

Hmm, this depends on assumptions not stated.  I was thinking of the situation where Alice has broken into Bob's house, and there are neighbors who might hear a gunshot and call the cops, and might be able to describe Alice's getaway car and possibly its license plate.  In other words, Alice shooting Bob carries nontrivial risk of getting her caught.

If we imagine the opposite, that Alice shooting Bob decreases her chance of getting caught, then, after Bob gives her his stuff, why shouldn't Alice just shoot Bob afterward?  In which case why should Bob cooperate?  To incentivize Bob, Alice would have to promise that she won't shoot him after he cooperates, rather than threaten him.  (And it's harder for an apparently-willing-to-murder-you criminal to make a credible promise than a credible threat.)

So let's flesh out the situation I imagined.  If Bob cooperates and then Alice kills him, the cops will seriously investigate the murder.  If Bob cooperates and Alice leaves him tied up and able to eventually free himself, then the cops won't bother putting so much effort into finding Alice.  Then Bob can really believe that, if he cooperates, Alice won't want to shoot him.  Now we consider the case where Bob refuses; does Alice prefer to shoot him?

If she does, then we could say that, if both parties understand the situation, then Alice doesn't need to threaten anything.  She may need to explain what she wants and show him her gun, but she doesn't need to make herself look like a madman, a hothead, or otherwise irrational; she just needs to honestly convey information.[1]  And Bob will benefit from learning this information; if he were deaf or uncomprehending, then Alice would have just killed him.

Whereas if Alice would rather not shoot Bob, then her attempts to convince Bob will involve either lying to him, or visibly making herself angry or otherwise trying to commit herself to the shoot-if-resist choice.  In this case, Bob does not benefit from being in a position to receive Alice's communications; if Bob were clearly deaf / didn't know the language / otherwise couldn't be communicated with, then Alice wouldn't try to enrage herself and would probably just leave.  (Technically, given that Alice thinks Bob can hear her, Bob benefits from actually hearing her.)

There is an important distinction to be made here.  The question is what words to use for each case.  I do think it's reasonably common for people to understand the distinction, and, when they are making a distinction, I think they use "threat" for the second case, while the first might be called "a fact" or possibly "a warning".

For a less violent case, consider one company telling their vendor, "If you don't drop your prices by 5% by next month, then we'll stop buying from you."  If that's actually in the company's interest—e.g. because they found a competing seller whose prices are 5% lower—then, again, the vendor is glad to know; but if the company is just trying to get a better deal and really hopes they're not put in a position where they have to either follow through or eat their words, then this is a very different thing.  I do think that common parlance would say that the latter is a threat, and the former is a (possibly friendly!) warning.

Incidentally, it's clear that people refer to "a thing that might seriously harm X" as "a threat to X".  In the "rational psychopath" case, Alice is a threat to Bob, but her words, her line of communication with Bob, are not—they actually help Bob.  In the "wannabe madman" case, Alice's words are themselves a threat (or, technically, the fact that Alice thinks Bob is comprehending her words).  Likewise, the communication (perhaps a letter) from the company that says they'll stop buying is itself a threat in the second case and not the first.  One can also say that the wannabe-madman Alice and the aggressively negotiating company are making a threat—they are creating a danger (maybe fake, but real if they do commit themselves) where none existed.

Now, despite the above arguments, it is possible that the bare word "threat" is not the best term.  The relevant Wikipedia article is called "Non-credible threat".  I don't think that's a good name, because if Alice truly is a madman (and has a reputation for shooting people who irritated her, and she's managed to evade capture), then, when Alice tells you to do something or she'll shoot you, it can be very credible.  I would probably say "game-theoretic threat".

[1] Though in practice she might need to convince Bob that she, unlike most people, is willing to kill him.  Pointing a gun at him would be evidence of this, but I think people would also tend to say that's "threatening"... though waving a gun around might indeed be "trying to convince them that you're irrational enough to carry out an irrational threat".  I dunno.  In game theory, one often prefers to start with situations in which all parties are rational...

By this definition the statement "I will not give in to threats" is a threat itself...

It is definitely true that the author was not using the word "threat" in this way. Some of the explicitly given examples fit this definition and were provided as examples of strategies that were not considered threats.

Though also consider: the character there is not from Earth and does not know what words Earth people use; they are communicating via a translation system that is known to be imprecise; the target language is also not from Earth; and that language is already known to be missing words for many simple game theory concepts. The use of the word "threat" in the text is definitely not to be taken as having exactly the same meaning as in Earth game theory.

In this paradigm it would be a threat, yes.  The distinction here is "actions that make sense whether or not the target changes their behavior in response" vs "actions that only make sense if you expect the target to change their behavior in response".  If you do not want killers to be a danger outside of Japan, it makes sense to send them to Japan even if you know that they are not going to kill any less as a result of this policy.  However, Golarion crucifying non-taxpayers only makes sense if more people will pay taxes out of fear of crucifixion; if they do not, Golarion gains nothing and is out the cost of some wood and nails.

In the ultimatum game, refusing offers below 50% leaves you worse off if your opponent doesn't respond to this by giving you 50%.
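The test described above can be written as a comparison of two payoffs for the party carrying out the policy, both evaluated in the world where the target does not change their behavior. This is a minimal sketch: the payoff numbers are invented purely for illustration, and the function name is hypothetical.

```python
def depends_on_response(enforce_payoff, abstain_payoff):
    """True if following through is worse for the enforcer than abstaining,
    holding the target's behavior fixed (i.e. the target does not respond).
    Such a policy only makes sense if the target is expected to respond."""
    return enforce_payoff < abstain_payoff

# Made-up payoffs to the enforcer, all assuming an unresponsive target.
cases = {
    # assumed: fewer victims at home outweighs the cost of transport
    "exile killers": (8, 0),
    # assumed: no extra taxes collected, minus the cost of wood and nails
    "crucify non-taxpayers": (-2, 0),
    # rejecting an offer of 4 yields 0 instead of 4
    "reject ultimatum offer of 4": (0, 4),
}
for name, (enforce, abstain) in cases.items():
    print(name, depends_on_response(enforce, abstain))
```

Under these assumed numbers, only the exile policy is worth carrying out when nobody responds to it, which is the sense in which it is not a threat in this paradigm.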

I'm just noting this for 'search' purposes, but I think the larger work from which your quotes are drawn is named/titled "planecrash".

Thanks -- as I've gotten further in the story, this title mismatch on my part has bothered me more and more, and I have now renamed all the sequence links to correct it. (The Glowfic website is … somewhat non-obvious.)

I've weirdly been less and less bothered since my previous comment! :)

I think "planecrash" is a better overall title still, so thanks for renaming all of the links.

Later on in the story there is material on "rational revenge", where a dath ilani is subject to the theft of a shirt and the cost of tracking down the thief is more than the value of the shirt. Allegedly an ilani would figure out that if the first person to be robbed started counteracting it, it would do less damage than if nobody, or only the 10th person, did so, and would therefore start the hunt at personal expense.

Individually the points seem a larger divergence but together there seems to be less difference. But I guess both have the flavour of how you deal with a non-ideal scenario rather than manipulating others into the bad situation not happening.

Later on in the story there is material on “rational revenge”, where a dath ilani is subject to the theft of a shirt and the cost of tracking down the thief is more than the value of the shirt.

This is also why it's not irrational to spend more than $5 in time and gas to save $5 on a purchase.

With the purchase, it seems more like splitting the surplus. It does benefit you to have a store nearby that sells things at a lower price than you would have to pay in total by going to the less convenient store. The question is, how much of that gain is being captured by the store owner, and how much by you? If you think that they are capturing "too much" of the gains by the prices they set, then it can be rational to refuse the offer (just as in the Ultimatum Game).

One question is whether they can provide enough evidence that the division is reasonably fair. Maybe it is! There may be legitimate costs or extra risk that the local shop owner incurs versus the alternative.

Another question is what the other potential customers are likely to do. If most of them will shop there even when the owner is capturing 80% of the surplus and leaving the customers with only 20%, then it is likely not in the owner's interest to lower the prices much below 80% surplus capture. If the other customers are likely to recognize when the shop owner is capturing too much surplus (as would happen in dath ilan), then it may not be worthwhile to set the prices higher than 50% capture.

I think the logic is actually importantly different.

I thought that the inconvenient property of robbers is that they impose their transaction on you involuntarily. Shops don't / can't impose their effect in the same way. I don't see ilani making a moral commitment to foster the best deals in the same way that they declare possibly existential war at the drop of a hat for the tiniest unilateralist harm.

If the logic applied to shops, then I would wonder: if the institution of government etc. is worth, say, 15 units to individuals, but in order to have it people need to choose a 5-utility option over a 10-utility option, one could argue that the loss of that local optimization is worth the global optimization, the costs just being an acceptable cost of coordination. But it seems that at least the local sentiment is that if becoming more Homo Economicus makes government collapse, then this proves the system is flawed.

There is the story beat where young Keltham asks for his share of the rent income of all the land. It feels like there is some wiggling there, either about how this is not a bad imposition or about the ownership being an unreal fiction. Why is taking a shirt bad, but an unbreakable loss of control over your lands fine?

The relevant property isn't that someone imposes something on you, but rather that you wish to discourage the behavior in question. Going to the store that charges you less 1) saves you $5 and 2) discourages stores from setting prices that are more expensive than other stores by an amount which is less than the transaction cost of shopping at the other store. This benefits you more than saving $5 does all by itself. In fact, if you make a binding precommitment to shop at the other store even if it costs you $6 more, the store will take this into account and probably won't set the price at $5 more in the first place. (And "'irrationally' but predictably being willing to spend money to spite the store" is the way humans precommit.)

If it costs the shop $5 to provide the item near you, because they can benefit from mass transit, but moving the item to your location costs you $6, because you can't, you could be punishing the service of making items available near your location.

Also in this case the price difference is more than the transaction cost to you.

Even in the case that the punishment works you might end up in a situation where you drive the near store to bankruptcy because they can't afford the lesser price. So you end up getting the same item and paying $1 more for it. This seems like an instance of an empty threat that works out only on the condition that it is heeded.

With the shirt the point is not to convince the robber but the order enforcement. Maybe I could understand if the price could somehow be deemed "wrong" but it being financially not perfectly optimal for you in particular is far from being forbidden conduct.

If it costs the shop $5 to provide the item near you, because they can benefit from mass transit, but moving the item to your location costs you $6, because you can’t, you could be punishing the service of making items available near your location.

Sure. The fact that putting pressure on the other store is an additional benefit beyond your savings doesn't mean that putting pressure is worth any arbitrary amount. There are certainly scenarios where shopping at the cheaper store that is expensive to reach is a bad idea.

But it's not bad just because it costs more to reach than you save on price, which is the typical rationalist line about such things.