Finding the Mole: Bayesianism is Hard

laniakea

I first encountered LessWrong a couple months ago, and since then I've been a regular reader of posts here, including parts of the Sequences. Bayesianism is a major topic in them, and I wanted to try it myself. An ideal candidate was the Flemish TV show 'The Mole' (Dutch: 'De Mol').

In this post, I want to share my methodology & results, but I also want to ask for advice. I consider my results to be mixed at best, and I would really like tips & feedback so I can do better next time.

If any fellow Flemish are reading this who also watched 'The Mole', I'm curious to hear whether you got it right and how you went about it.

Concept of the TV show

The TV show 'The Mole' is very popular in Flanders^[1]. The concept goes as follows: ten normal people ("candidates") travel somewhere and have to complete tasks to earn money for the group. One of them is the mole, who tries to make the tasks fail without people noticing that they are the mole. At the end of each episode, the candidate who knows least about the mole's identity is eliminated.

As a viewer, you also do not know who the mole is, and can have great fun figuring it out (you can't make money from being right though, since betting on it is illegal). This is my eleventh year watching the show, but I am only slightly better than random at figuring out who the mole is. In the last episode three candidates are left, and I get it right around 30% to 50% of the time.

Here are some more features of the TV show you should know before I get to the Bayesianism:

The mole is in league with the show's producers, and has foreknowledge of everything that will happen.
Sometimes, candidates can choose between money for the group and a personal advantage that makes them less likely to be eliminated.
Some normal candidates intentionally appear suspicious, because if other people (wrongly) assume they are the mole, they themselves are less likely to be eliminated.
The candidates often have to divide themselves into groups that will do different tasks.

To make that more concrete, here's an example of a task from this year's episode seven:

The four candidates will play a football match against a lot of Portuguese children. If they win, they earn €4000 for the group.
The number of opponents is under their control. Depending on how they do on a collection of small tasks beforehand, the number of children is between fifteen and one hundred.
These small tasks involve multiple candidates, with multiple avenues for failure. Example:
- One candidate has to dribble the ball along some cones to another candidate who needs to score.
- Their time limit is set by a third candidate's performance: as long as that candidate keeps correctly identifying types of balls (mainly fruits & sports balls) by touch alone, the clock keeps running. A mistake ends it.
- On this small task, the group succeeds two out of four times.
They are able to reduce the number of opponents to fifty-five. The match, however, is lost two goals to one, as only two out of the four candidates have any skill at football.

My attempt at Bayesianism

I started by designing a spreadsheet^[2] that could process two kinds of Bayesian updates:

binary updates: a group of candidates did something and the others did not
group partition updates: the candidates were divided into groups and did different things

For binary updates, I could just use the classical Bayes formula. It updates the chance that each person is the mole M, where E is the event being updated on and K is the group of candidates involved in E. I just have to fill in my subjective likelihoods P(E|M∈K) and P(E|M∉K).
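Written out, the update could look like the sketch below. The candidate index i is a label introduced here for illustration, and the sketch assumes the event depends on the mole's identity only through whether the mole is in K:

```latex
\[
P(M = i \mid E) = \frac{P(E \mid M = i)\, P(M = i)}{P(E)},
\qquad
P(E \mid M = i) =
\begin{cases}
P(E \mid M \in K) & \text{if } i \in K,\\
P(E \mid M \notin K) & \text{if } i \notin K,
\end{cases}
\]
\[
P(E) = P(E \mid M \in K) \sum_{j \in K} P(M = j)
     + P(E \mid M \notin K) \sum_{j \notin K} P(M = j).
\]
```

Because P(E) is just the sum of the numerators over all candidates, the updated probabilities sum to 100% again.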

Let me illustrate with the football task I explained earlier:

The small task beforehand to reduce the number of opponents was important, as it greatly influences the odds of winning the match later on.
It failed twice because they ran out of time. Maxim & Julie were responsible for that, as they made a mistake in the ball-recognizing.
I introduce an update on Maxim and an update on Julie. For both I set the base rate of failure, P(E|M∉K), to 50%, since the task failed two out of four times. I estimated the likelihood of failure if the mole was involved, P(E|M∈K), at 75%.
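As a rough sketch of what one such update does, the snippet below applies the two likelihoods from the football example to an update on Maxim. The uniform 25% prior, the placeholder name "Other" for the fourth candidate, and the function name `binary_update` are assumptions made for illustration, not the priors the spreadsheet actually had at that point.

```python
def binary_update(prior, involved, p_e_in, p_e_out):
    """Bayes update on an event E involving the candidates in `involved`.

    prior    -- dict mapping candidate name to P(candidate is the mole)
    involved -- set K of candidates involved in the event
    p_e_in   -- P(E | mole in K)
    p_e_out  -- P(E | mole not in K)
    """
    # Unnormalised posterior: prior times the likelihood that applies.
    weighted = {
        name: p * (p_e_in if name in involved else p_e_out)
        for name, p in prior.items()
    }
    total = sum(weighted.values())  # this is P(E)
    return {name: w / total for name, w in weighted.items()}


# Hypothetical uniform prior over four candidates (illustration only).
prior = {"Maxim": 0.25, "Julie": 0.25, "Wout": 0.25, "Other": 0.25}

# Update on Maxim's failed attempt: P(E|M in K) = 75%, P(E|M not in K) = 50%.
posterior = binary_update(prior, {"Maxim"}, p_e_in=0.75, p_e_out=0.5)

for name, p in posterior.items():
    print(f"{name}: {p:.1%}")
```

With these made-up priors, Maxim moves from 25% to about 33% and everyone else drops to about 22%. A 75%-versus-50% likelihood is a fairly gentle update, so a single failed attempt does not move the odds much.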

My spreadsheet can also update on group partitions. This was not necessary for the football task, but does happen ~twice per episode.

My formula for group partitions takes as input the partition itself, the group each candidate ended up in, and the number of people in that group. For every group I had to fill in my subjective likelihood that the mole would be in that group.
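The following is only one plausible reading of the description above, not necessarily what the spreadsheet does: each group's subjective likelihood is spread evenly over its members, and the result multiplies the prior. The function name `partition_update`, the group labels, and all numbers are hypothetical.

```python
def partition_update(prior, groups, group_likelihood):
    """Update on a partition of the candidates into groups.

    prior            -- dict: candidate -> P(candidate is the mole)
    groups           -- dict: group label -> list of candidates in that group
    group_likelihood -- dict: group label -> subjective likelihood that the
                        mole would end up in that group
    """
    weighted = {}
    for label, members in groups.items():
        # Spread the group's likelihood evenly over its members.
        per_member = group_likelihood[label] / len(members)
        for name in members:
            weighted[name] = prior[name] * per_member
    total = sum(weighted.values())
    return {name: w / total for name, w in weighted.items()}


# Hypothetical example: five candidates A-E split over two tasks.
prior = {name: 0.2 for name in "ABCDE"}
groups = {"easy_to_sabotage": ["A", "B"], "hard_to_sabotage": ["C", "D", "E"]}
group_likelihood = {"easy_to_sabotage": 0.6, "hard_to_sabotage": 0.4}

posterior = partition_update(prior, groups, group_likelihood)
```

Under this reading, a uniform prior makes each group's total posterior equal to the likelihood assigned to it (60% and 40% here), which is a quick way to check that the numbers are being filled in as intended.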

I used the following guidelines for filling in the likelihoods:

For most updates, the probabilities shouldn't go lower than 10% or higher than 90%. It is possible a candidate is intentionally doing something suspicious, or that the mole is intentionally being unsuspicious.
If there are multiple attempts (like in the example small task about recognizing balls), P(E|M∉K) is determined by what percentage of those attempts succeed.
I did not update on cases of egoism (sacrificing the group for a higher chance to survive the end-of-episode elimination), as I thought it was difficult to get right. The normal candidates range on a spectrum from very unwilling to very willing to engage in egoism. The mole will sometimes be egoistic as it loses the group money, but if other candidates are already doing that the mole can just pretend to be virtuous.
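The first two guidelines are mechanical enough to sketch in code. The helper names `clip_likelihood` and `rate_from_attempts` are made up for this illustration, and the sketch assumes the event being updated on is a failed attempt, as in the football example:

```python
def clip_likelihood(p, low=0.10, high=0.90):
    """Keep a subjective likelihood inside the 10%-90% band."""
    return max(low, min(high, p))


def rate_from_attempts(event_count, attempts):
    """Estimate P(E | mole not in K) as the fraction of attempts showing E."""
    if attempts <= 0:
        raise ValueError("need at least one attempt")
    # Assumption: the clipping guideline is applied to this estimate too.
    return clip_likelihood(event_count / attempts)


# Football example: the small task failed two out of four times.
print(rate_from_attempts(2, 4))   # 0.5
# An extreme gut feeling of 97% gets pulled back to 90%.
print(clip_likelihood(0.97))      # 0.9
```

Applying the clip inside `rate_from_attempts` is a design choice of this sketch: a task that never fails would otherwise give a base rate of 0%, which no later evidence could overturn.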

My results

Here's the fun part: the graphs. You can view my full spreadsheet here (it is in Dutch). Wout, that's the dark blue line, was revealed to be the mole at the end of the last episode.

I consider my results to be mixed at best, and definitely worse than my expectations. My spreadsheet made these errors:

For multiple episodes, it wrongly thought Yannis was the mole with odds up to 86%.
The real mole, Wout, only got above 10% odds at the end of episode 3.
At the end of the second-to-last episode, my spreadsheet had two options at almost equal probabilities, Wout and Maxim.

Not everything was bad though. My spreadsheet did two useful things:

It eventually singled out the real mole, with 90% odds by the end of the last episode. This is mainly because the last episode was very clear.
The odds of all but three candidates were very low after episode three. Just this info alone would already make it a lot easier to find the mole, as you'd only need to focus on the three remaining candidates.

To benchmark my performance, let's compare to the suspicions of others:

Among the candidates, Wout was not suspected at all for the first couple episodes. By the later episodes, three of them got on the right track.
Of my classmates who made predictions, two correctly suspected Wout from around episode 3 or 4, while one wrongly suspected Isabel.
This slightly sensationalist newspaper held a poll before the last episode where Julie was the prime suspect. My spreadsheet correctly had her at ~0.1%.
This fan forum (n ≈ 100) quickly noticed that Wout was the likely mole. In their poll, he received a plurality after episode 2, a majority after episode 6, and 70% before the final episode. This is significantly better than my attempt. Curiously, they didn't suspect Yannis at all, even though he was my spreadsheet's prime suspect until he got eliminated.

Wanting to do better

As this was my first attempt at Bayesianism, I would very much appreciate any and all tips & feedback on how I can do better. I feel like this problem is really possible to solve, and am committed to trying again next year. It's also just a very fun TV show, even more so because I can discuss it with my peers.

I already know some smaller improvements to implement, like including updates on egoism^[3] and making my likelihoods more thought-through. I think suboptimal likelihoods were the main reason why Yannis was wrongly the prime suspect of my spreadsheet for many episodes, and that my guidelines were one of the causes^[4]^[5]. Therefore, I either need to get new guidelines or trust myself to go without.

To conclude, this was a fun but also very useful project, as I got my first hands-on experience with Bayesianism^[6]. However, it didn't go as well as I expected. I kind of anticipated more easy-to-get "awesomeness". Hence this post's title: Bayesianism is Hard.

^[1] It's popular enough to be a conversation topic among me and my fellow students regardless of what happens in it. The only show with a similar appeal is Eurovision.
^[2] This was great fun, but it also took me embarrassingly long before I was able to weed out all obvious mathematical errors (the odds not summing to 100% was a good hint for when I did something wrong). Figuring out the math and building the spreadsheet took me around ten to fifteen hours over a couple days.
^[3] This season, the mole was the first mover in a bout of egoism that cost the group €5000.
^[4] I can point to an update that, in retrospect (but trying to correct for hindsight bias), had bad likelihoods due to my guideline to clip likelihoods between 10% & 90%. There are also a couple of bad ones due to the guideline on how to determine P(E|M∉K).
^[5] Nevertheless, I still can't wrap my head around how the fan forum found Yannis one of the least suspicious candidates.
^[6] I think TV shows like this one are a very accessible & quite good way to practice applied rationality. They are engaging, you have to apply Bayesianism & rational reasoning, and by the end there is an easy way to evaluate how you did.
