Epistemic Status: I was bored so I wrote this, forgetting that I was probably making a poor attempt at reinventing some wheel, and realising how I forgot to solve

Motivation: To formulate the stereotypical “Obedience Test” in game theory terms.

The idea came into my mind when I saw the “Three CIA Candidates” joke:

The goal of the test was stated as “To know that you will follow instructions, no matter what.”, and apparently the sure-fire way to know this is to make the subject choose between harming themselves (or troubling their conscience) and disobeying the command. Obedience is taken as the willingness to put oneself and one’s own conscience in harm’s way in order to obey the command.

And the command in the obedience test probably serves little or no higher purpose other than being a test, and would probably appear so. (Arguably, if the subject believes the test serves a higher purpose, the instructor may still need to find out what happens when he and that purpose contradict each other.)

This pattern seems sensible, although maybe not logically rigorous.

I can think of a few other examples with similar logic:

God tested whether Abraham would be willing to sacrifice his beloved son Isaac, and when Abraham was about to kill Isaac, the Angel of the Lord (to God’s credit compared to the other examples) stopped Abraham and acknowledged that Abraham had shown his fear of God.

Ancient Chinese general Wu Qi allegedly (the historicity of this incident is in question) murdered his wife, who was from the State of Qi (the enemy of the State of Lu, which Wu Qi was serving), because the Lu leadership doubted his loyalty and was hesitant to appoint him as an army commander.

Tokugawa Ieyasu executed his own wife Lady Tsukiyama and forced his first son Nobuyasu to commit suicide, when his ally/master Oda Nobunaga presented allegations that Tsukiyama had conspired against the Oda Clan.

And I guess it is the same logic that lies behind gang initiation murders and the same logic that makes fraternity/sorority hazing rituals dangerous.

Now let’s try to formulate the process:

(This game has incomplete information, and so I haven't solved it just yet)

We will have a “monarch” M, who gives commands to the “subject” S, and M would like to figure out how “obedient” S is to him/her (denoted by *b*, a positive real number). M chooses a test with “repulsiveness” (denoted by *r*, a positive real number), and S can choose to either pass or fail the test, with payoffs to S as follows:

Pass: -*r*

Fail: -*b*

Intuitively, if S finds the test more repulsive than disobeying the command, S fails the test; if S finds the test less repulsive than disobeying the command, S obeys and passes the test.

Hence in the naïve first round, S passing the test would mean that *b* is at least as large as *r*, which assures M that S is at least somewhat obedient.

However, there is probably little purpose knowing whether S is obedient or not unless M wants to assign S some job to do.

The second round: the job, tax rate, and defection.

Assume S is an extremely capable subject, and the only reason why S might fail a job is disobedience/disloyalty to M.

Define a “tax rate” *t* as a real number between 0 and 1; *t* is known to both M and S before the start of the game.

M will choose a job with value *v*, a positive real number, and assign the job to S.

S can choose to stay loyal or defect in response to the job, with payoffs as follows:

If S stays loyal: M receives *tv*, S receives (1-*t*)*v*

(ie: M and S divide the job’s value according to the pre-defined tax rate)

If S defects: M receives *-tv*, S receives (*v-b*)

(ie: S defects and receives all utility from the job, but suffers the consequences of disobeying M)

In the second round, S will defect when *tv>b* (when the share of job value that M takes away outweighs S’s loyalty to M), so M aims for *tv=b* (when S’s obedience is fully utilised). Since M does not know *b*, it has to be inferred from the first round.
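A minimal Python sketch of the second round on its own, just to make the payoffs concrete. The function name `second_round` and the example numbers are made up for illustration, and it assumes S stays loyal when exactly indifferent (*tv=b*).

```python
def second_round(t, v, b):
    """Job stage only: return (S's choice, M's payoff, S's payoff)."""
    loyal_s = (1 - t) * v   # S keeps the untaxed share of the job
    defect_s = v - b        # S keeps everything but pays the disobedience cost b
    if defect_s > loyal_s:  # same condition as t*v > b
        return "defect", -t * v, defect_s
    # Assumption: an indifferent S (t*v == b) stays loyal.
    return "loyal", t * v, loyal_s

# With t = 0.5 and b = 6, the largest job S will not defect on is v = b/t = 12.
print(second_round(0.5, 12, 6))    # S stays loyal
print(second_round(0.5, 12.5, 6))  # slightly too valuable: S defects
```

If M knew *b*, picking *v=b/t* would squeeze the most out of S without triggering defection. The rest of the game exists because M does not know it.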

Putting the two rounds together, the game can go three ways (ignoring the possibility that S fails the test yet M still assigns S a job, despite having only negative knowledge about S’s obedience).

Assume that t and b are pre-defined, t is known to both M and S, b is only known to S.

1. M chooses r, S fails test. Payoffs: M:0, S:-b

2. M chooses r, S passes test.

2a. M chooses v, S stays loyal. Payoffs: M: tv, S: (1-t)v-r

2b. M chooses v, S defects. Payoffs: M: -tv, S: v-b-r
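The three paths above can be tabulated in a few lines of Python. This is only a sketch: `outcomes` and `best_path_for_s` are hypothetical names, and it assumes S evaluates the paths with a point guess of v, which the game as stated does not give S in the first round.

```python
def outcomes(t, v, r, b):
    """(M payoff, S payoff) for each of the three paths of the combined game."""
    return {
        "1": (0, -b),                    # S fails the test
        "2a": (t * v, (1 - t) * v - r),  # S passes, then stays loyal
        "2b": (-t * v, v - b - r),       # S passes, then defects
    }

def best_path_for_s(t, v, r, b):
    # Assumption: S compares paths using a point guess of v.
    table = outcomes(t, v, r, b)
    return max(table, key=lambda path: table[path][1])

# r = 8 exceeds b = 6, yet S still passes because the job is worth having.
print(best_path_for_s(t=0.5, v=10, r=8, b=6))
```

With these numbers S's payoffs are -6 for failing, -3 for passing and staying loyal, and -4 for passing and defecting.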

Since the second round yields positive utility to S, anticipating it means S may choose to pass the test even if r>b.

In the first round S does not know what value v will be, but we can still work something out on what would happen in this case:

If S chooses to pass the test, it means the expected v is high enough such that either

*(v-r-b)>-b* or *(1-t)v-r>-b* has to be true.

When *r<b*, *(1-t)v-r>-b* is always true since *(1-t)v>=0*

When *r>b* this becomes somewhat different:

Either *(1-t)v>r-b* (loyal), or *v-r>0* (defect)

By my rather problematic construction of the game, the payoff of defecting relative to failing the test depends only on v and r, not b. This means that if S expects a valuable job (one with v>r) to follow the test, passing it and then defecting when given the job is always better than failing the test. I’m not sure if this makes sense, although I suspect it might be the logic behind the trope, and perhaps it is routine for undercover agents (who basically have 0 loyalty whatsoever).

On second thought, since M does not have any information whatsoever before S passes or fails the test, one of r or v has to be exogenous. While in the formulation of the game it appears that r is exogenous (since M has no information in the first round), I think it is more realistic to assume that *v* is exogenous.
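A quick numeric check of that claim, as a sketch: the helper `defect_minus_fail` is a hypothetical name, and the values of v, r and b are arbitrary.

```python
def defect_minus_fail(v, r, b):
    """S's payoff from pass-then-defect (2b) minus S's payoff from failing (1)."""
    return (v - b - r) - (-b)

# The gap equals v - r whatever b is (integers keep the arithmetic exact).
for b in (0, 1, 5, 100):
    print(b, defect_minus_fail(v=10, r=8, b=b))
```

Every line prints a gap of 2, which is v - r: the b terms cancel, so a test with r below the expected v cannot screen out a subject who plans to defect.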

M knows *v* at the beginning of the game, and S does not know *v* until the start of the second round.

S knows *b*, which M will never know for sure.

Now M’s goal becomes: set *r* (ranging from excessive drinking to kicking a dog to shooting your spouse) such that, if S passes the test, M’s expected utility from passing *v* to S is positive. (In this simple formulation, it means M believes *tv<b* is more likely than not.)
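To see why "more likely than not" is the threshold, here is a small sketch of M's expected utility. The name `p_loyal` is an assumption introduced here: it stands for M's belief, after seeing S pass, that S will stay loyal. How M should form that belief from the test is exactly the unsolved part.

```python
def m_expected_utility(p_loyal, t, v):
    """M's expected payoff from handing the job to an S who passed the test."""
    # Loyal S gives M +t*v, defecting S gives M -t*v.
    return p_loyal * (t * v) + (1 - p_loyal) * (-t * v)

for p in (0.25, 0.5, 0.75):
    print(p, m_expected_utility(p, t=0.5, v=10))
```

Because M gains *tv* from loyalty and loses *tv* from defection, the expression simplifies to (2·p_loyal − 1)·tv, which is positive only when p_loyal is above one half.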

I still haven't figured out how to solve this game, good on me.

(Afterthought: I think this is definitely a rather primitive attempt at making a principal-agent game, and it is probably reinventing a wheel somewhere, not exactly the "screening game" but maybe something close. My lack of study in game theory specifics is showing, so despite my best efforts I cannot find any clue for solving this game.)

I guess that will be the end of my struggle with game theory today, until I get enough time to revisit the textbooks.

Comment removed until I can figure out getting spoilers to work

If you are using the draft-js editor, it should just be ">!" on a newline followed by a space.

Example:

This is in a spoiler

Thanks

Spoilers? That sounds intriguing, I'll wait :)

Thanks for posting, I had fun trying to solve it and I think I learned a few things.

My solution is below (I think this is correct but I’m no expert) but I’ve hidden it in a spoiler in case you’re still wanting to figure it out yourself!

M has preference order of 2a>1>2b. He wants to set r such that if S has b>tv then S will pass the test and then remain loyal. If S has b<tv then M wants S to fail the test and therefore not get the chance to defect in round 2. It is common knowledge that this is what M wants.

Starting by making S’s Payoff for 2b less than that for 1 gives a formula for r:

v−b−r<−b

v<r, so set r=v+ϵ for some small positive ϵ

With this value for r, S’s payoff matrix becomes:

1. −b

2a. −vt−ϵ

2b. −b−ϵ

We can see that if vt+ϵ<b then S’s best payoff is obtained by choosing 2a. Otherwise his best payoff is 1. This is exactly what M wants - he has changed S's payoffs to make S's preference order the same as his to the greatest extent possible.

Due to M's preference being common knowledge, S knows that M will choose this value of r and therefore knows what v is before he chooses whether to pass the test (v=r−ϵ) and can choose between the three options simultaneously.

This is an interesting result as M's decision on r does not depend on the tax rate - he must always set an obedience test to be slightly more aversive than the entire value that is at stake. The tax rate only affects whether S will choose to pass the test.
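A rough Python check of this solution, under the assumption that S knows v when choosing. The function name `s_best_path` and the value of `eps` are made up for illustration.

```python
def s_best_path(t, v, b, eps=0.01):
    """S's preferred path when M sets r = v + eps."""
    r = v + eps
    payoffs = {
        "1": -b,                # fail the test
        "2a": (1 - t) * v - r,  # pass and stay loyal: equals -t*v - eps
        "2b": v - b - r,        # pass and defect: equals -b - eps
    }
    return max(payoffs, key=payoffs.get)

print(s_best_path(t=0.5, v=10, b=6))  # b above t*v + eps
print(s_best_path(t=0.5, v=10, b=4))  # b below t*v + eps
```

The first call picks 2a and the second picks 1. Path 2b is always worse than path 1 by exactly eps, so pass-then-defect never wins.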

Thanks, the final result is somewhat surprising, perhaps it's a quirk of my construction.

Setting r to be higher than v does remove the "undercover agents" that have practically 0 obedience, but I didn't know it's the optimal choice for M.

I wonder what would happen if one were to remove b and play the game iteratively. The game stops after 50 iterations or the first time S fails the test or defects.

b is then essentially replaced by S’s expected payoff over the remaining iterations if he remains loyal. However M would know this value so the game might need further modification.

I think we should still keep b even with the iterations, since I made the assumption that "degree of loyalty" is a property of S, not entirely the outcome of rational game-playing.

(I still assume S is rational outside of having b in his payoffs)

Otherwise those kinds of tests probably make little sense.

I also wonder what happens if M doesn't know the repulsiveness of the test for certain, only a distribution of it (ie: the CIA only knows that on average killing your spouse is pretty repulsive, except this lady here really hates her husband, oops). Could that make a large impact?

I guess I was only trying to figure out whether this "repulsive loyalty test" story that seems to exist in history/mythology/real life in a few different cultures has any basis in logic.