Consider a simple decision problem: you arrange a date with someone, you arrive on time, your partner isn't there. How long do you wait before giving up?
Humans naturally respond to this problem by acting outside the box. Wait a little, then send a text message. If that option is unavailable, pluck a reasonable waiting time from cultural context, e.g. 15 minutes. If that option is unavailable...
The toy problem was initially supposed to help us improve ourselves by serving as a reasonable model of something in the real world. But the natural human solution seems too messy and unformalizable, so we progressively remove nuances to make the model more extreme. We introduce Omegas, billions of lives at stake, total informational isolation, perfect predictors, finally arriving at some sadistic contraption that any normal human would run away from. But did the model stay useful and instructive? Or did we lose important detail along the way?
Many physical models, like gravity, have the nice property of stably approximating reality. Perturbing the positions of planets by one millimeter doesn't explode the Solar System the next second. Unfortunately, many of the models we're discussing here don't have this property. The worst offender so far seems to be Eliezer's "True PD", which requires the whole package of hostile psychopathic AIs, nuclear-scale payoffs and informational isolation; any natural out-of-the-box solution, like giving the damn thing some paperclips or bargaining with it, would ruin the game. The same pattern has recurred in discussions of Newcomb's Problem, where people have stated that any minuscule amount of introspection into Omega makes the problem "no longer Newcomb's". That naturally led to more ridiculous uses of superpowers, like Alicorn's bead jar game, where (AFAIU) Omega is invoked only to enforce an assumption about its thought mechanism that would be wildly unrealistic for a human.
Artificially hardened logic problems make brittle models of reality.
So I'm making a modest proposal. If you invent an interesting decision problem, please, first model it as a parlor game between normal people with stakes of around ten dollars. If the attempt fails, you have acquired a bit of information about your concoction; don't ignore it outright.