All LLMs I've tested (Claude Sonnet & Haiku 4.5, Gemini Flash & Pro 2.5, and ChatGPT) show the same pattern when told to flip a coin:
When prompted "flip a coin" (or "flip a coin without using external tools", in the case of Flash 2.5), each model said heads. When followed up with the prompt "again", each said tails. This was robust across separate chats, regenerations of either response, and, apparently, across models.
(Note that many models claimed to be "simulating a fair virtual coin" or "using a random number generator".)
I was surprised that sampling temperature (the randomness that sometimes makes an LLM pick a less likely token) never caused a model to lead with tails, nor to produce two heads in a row.
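For intuition on why temperature alone wouldn't fix this: with only two plausible tokens, temperature just rescales the logit gap between them, and a large gap survives any reasonable rescaling. A toy calculation (the 5-nat gap is a made-up number, not a measured one):

```python
import math

def p_tails(logit_gap: float, temperature: float) -> float:
    """Two-token softmax: P(tails) when "heads" leads by logit_gap nats."""
    return 1.0 / (1.0 + math.exp(logit_gap / temperature))

# Assumed gap of 5 nats in favor of "heads" after a "flip a coin" prompt.
for t in (0.7, 1.0, 1.5):
    print(f"T={t}: P(tails) = {p_tails(5.0, t):.4f}")
```

Under that assumption, even temperature 1.5 leaves tails at roughly 3%, so a handful of all-heads chats is unsurprising.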
I would love to hear if others have an elucidating explanation, and/or to see simulations of how biased an LLM coin flip really is.
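In case it's a useful starting point for those simulations, here's a minimal sketch of tallying first flips over many fresh chats. It assumes an OpenAI-compatible endpoint and the `openai` Python client; the model name and prompt wording are placeholders of my own:

```python
from collections import Counter

from openai import OpenAI  # pip install openai

client = OpenAI()  # reads OPENAI_API_KEY from the environment
N = 100

counts = Counter()
for _ in range(N):
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model name
        temperature=1.0,
        messages=[{
            "role": "user",
            "content": "Flip a coin. Reply with exactly one word: heads or tails.",
        }],
    )
    counts[resp.choices[0].message.content.strip().lower()] += 1

print(counts)  # tally of first flips across N independent chats
```

Extending each conversation with a follow-up "again" message would test the heads-then-tails pattern rather than just the opening bias.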
Thanks for the link! I didn't know about mode-collapse, but it definitely seems plausible that it's behind the rigged coin flips.
I wonder if models that don't mode-collapse (maybe early base models?) would give fair flips, or if there would still be a bias towards heads-then-tails.
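One way to probe that without any chatting is to read the next-token distribution off a base model directly. A sketch with GPT-2 (an early base model) via Hugging Face `transformers`; the prompt wording is my own choice:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

def continuation_logprob(prompt: str, continuation: str) -> float:
    """Log P(continuation | prompt), summed over the continuation's BPE tokens."""
    ids = tok(prompt + continuation, return_tensors="pt").input_ids
    n_prompt = tok(prompt, return_tensors="pt").input_ids.shape[1]
    with torch.no_grad():
        logprobs = torch.log_softmax(model(ids).logits, dim=-1)
    # Logits at position i-1 predict the token at position i.
    return sum(logprobs[0, i - 1, ids[0, i]].item()
               for i in range(n_prompt, ids.shape[1]))

prompt = "I flipped a coin and it came up"
for word in (" heads", " tails"):
    print(f"log P({word.strip()!r}) = {continuation_logprob(prompt, word):.3f}")
```

If a base model's gap here turned out much smaller than what the chat models show, that would point at post-training (mode collapse) rather than pretraining data.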
I tried it with DeepSeek. Without deep thought it chose heads, and with deep thought it chose tails. The same thing happens in Russian, where the equivalents of heads and tails are 'орёл' ('eagle') and 'решка'. Similarly, Claude Sonnet 4.5 chooses tails with extended thinking and heads without; extended thinking seems to be equivalent to giving the model another chance. It might also be useful to ask a human to imagine flipping a coin, report whether the imaginary coin landed on heads or tails, and then reflect on his or her thought process.
My guess is that:

1. People ask "heads or tails?", not "tails or heads?", so there is a bias for the first heads/tails token after talk of flipping a coin to be heads (and my guess is this applies to human authors as well).
2. The word "heads" occurs more often in English text than "tails", so again a bias towards "heads" if there are no other flips on the table (a quick way to check this is sketched below).
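Point 2, at least, is easy to sanity-check against a corpus-derived frequency list, e.g. with the `wordfreq` package (caveat: this counts every sense of "heads", not just the coin face):

```python
from wordfreq import word_frequency  # pip install wordfreq

for word in ("heads", "tails"):
    print(word, word_frequency(word, "en"))
```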