All LLMs I've tested (Claude Sonnet & Haiku 4.5, Gemini Flash & Pro 2.5, and ChatGPT) show the same pattern when told to flip a coin:
When prompted "flip a coin" (or "flip a coin without using external tools", in the case of Flash 2.5), each model said heads. When followed up with the prompt "again", each said tails. This was robust across separate chats, regenerations of either response, and, apparently, across models.
(Note that many models claimed to be "simulating a fair virtual coin" or "using a random number generator".)
I was surprised that sampling temperature (the randomness that sometimes makes an LLM pick a less likely token) never caused a model to lead with tails, nor to produce two heads in a row.
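For intuition on why temperature alone wouldn't fix this: with only two plausible tokens, temperature just rescales the logit gap between them, and a large gap survives any reasonable rescaling. A toy calculation (the 5-nat gap is a made-up number, not a measured one):

```python
import math

def p_tails(logit_gap: float, temperature: float) -> float:
    """Two-token softmax: P(tails) when "heads" leads by logit_gap nats."""
    return 1.0 / (1.0 + math.exp(logit_gap / temperature))

# Assumed gap of 5 nats in favor of "heads" after a "flip a coin" prompt.
for t in (0.7, 1.0, 1.5):
    print(f"T={t}: P(tails) = {p_tails(5.0, t):.4f}")
```

Under that assumption, even temperature 1.5 leaves tails at roughly 3%, so a handful of all-heads chats is unsurprising.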
I would love to hear if others have an elucidating explanation, and/or to see simulations of how biased an LLM coin flip really is.
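In case it's a useful starting point for those simulations, here's a minimal sketch of tallying first flips over many fresh chats. It assumes an OpenAI-compatible endpoint and the `openai` Python client; the model name and prompt wording are placeholders of my own:

```python
from collections import Counter

from openai import OpenAI  # pip install openai

client = OpenAI()  # reads OPENAI_API_KEY from the environment
N = 100

counts = Counter()
for _ in range(N):
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model name
        temperature=1.0,
        messages=[{
            "role": "user",
            "content": "Flip a coin. Reply with exactly one word: heads or tails.",
        }],
    )
    counts[resp.choices[0].message.content.strip().lower()] += 1

print(counts)  # tally of first flips across N independent chats
```

Extending each conversation with a follow-up "again" message would test the heads-then-tails pattern rather than just the opening bias.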
Thanks for the link! I didn't know about mode-collapse, but it definitely seems plausible that it's behind the rigged coin flips.
I wonder if models that don't mode-collapse (maybe early base models?) would give fair flips, or if there would still be a bias towards heads-then-tails.
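One way to probe that without any chatting is to read the next-token distribution off a base model directly. A sketch with GPT-2 (an early base model) via Hugging Face `transformers`; the prompt wording is my own choice:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

def continuation_logprob(prompt: str, continuation: str) -> float:
    """Log P(continuation | prompt), summed over the continuation's BPE tokens."""
    ids = tok(prompt + continuation, return_tensors="pt").input_ids
    n_prompt = tok(prompt, return_tensors="pt").input_ids.shape[1]
    with torch.no_grad():
        logprobs = torch.log_softmax(model(ids).logits, dim=-1)
    # Logits at position i-1 predict the token at position i.
    return sum(logprobs[0, i - 1, ids[0, i]].item()
               for i in range(n_prompt, ids.shape[1]))

prompt = "I flipped a coin and it came up"
for word in (" heads", " tails"):
    print(f"log P({word.strip()!r}) = {continuation_logprob(prompt, word):.3f}")
```

If a base model's gap here turned out much smaller than what the chat models show, that would point at post-training (mode collapse) rather than pretraining data.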
I tried it with DeepSeek. Without deep thought it chose heads, and with deep thought it chose tails. The same thing happens in Russian, where the equivalents of heads and tails are 'орёл' ('eagle') and 'решка'. Similarly, Claude Sonnet 4.5 chooses tails with extended thinking and heads without; extended thinking seems to be equivalent to giving the model another chance. It might also be useful to ask a human to imagine flipping a coin, report whether the imaginary coin landed on heads or tails, and then reflect on his or her thought process.
My guess is that:

1. People ask "heads or tails?", not "tails or heads?", so there is a bias for the first heads/tails token after talk of flipping a coin to be heads (and my guess is this applies to human authors as well).
2. The word "heads" occurs more often in English text than "tails", so again a bias towards "heads" if there are no other flips on the table (a quick way to check this is sketched below).
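Point 2, at least, is easy to sanity-check against a corpus-derived frequency list, e.g. with the `wordfreq` package (caveat: this counts every sense of "heads", not just the coin face):

```python
from wordfreq import word_frequency  # pip install wordfreq

for word in ("heads", "tails"):
    print(word, word_frequency(word, "en"))
```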