aphyer

I am Andrew Hyer, currently living in New Jersey and working in New York (in the finance industry).

Comments

I actually edited to include your PVE change: you did manage a 64% winrate.  Sorry not to give you more time; I didn't realize there was work still ongoing.

Sorry, wasn't expecting anything today!  I'll update the wrapup doc to reflect your PVE answer: sadly, even if you had an updated PVP answer, I won't let you change that now :P

Sure, no objections.  In the absence of further requests I'll aim to post the wrapup doc Friday the 9th: I'm fairly busy midweek and might not get around to posting it sooner.

Very minor gripe: '22m' parses to me as '22 years old and male', which was briefly confusing. Maybe '22mo' would be clearer?

For example, here’s a Nash equilibrium: “Everyone agrees to put 99 each round. Whenever someone deviates from 99 (for example to put 30), punish them by putting 100 for the rest of eternity.” 


I don't think this is actually a Nash equilibrium?  It is dominated by the strategy  "put 99 every round.  Whenever someone deviates from 99, put 30 for the rest of eternity."

I believe the original post solved this by instead making the equilibrium "Everyone agrees to put 99 each round. Whenever someone deviates from 99 (for example to put 30), punish them by putting 100 for the next 2 rounds." I think that is a Nash equilibrium, because the punishment being finite means you're incentivized to stick with the algorithm even after a punishment occurs.
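The credibility point can be checked numerically with a toy model. Every number in this sketch (the discount factor, the temptation payoff of 130, the 0-vs-30 payoffs during punishment) is an invented assumption for illustration, not the original game's payoffs; the structure is just the standard discounted-repeated-game comparison:

```python
# Toy check of the two punishment schemes above. Assumed stage payoffs
# (pure inventions, NOT from the original game):
#   everyone plays 99             -> 99 each
#   you play 30 while others 99   -> 130 for you (the one-shot temptation)
#   others play 100 (punishing)   -> 0 if you also play 100, 30 if you play 30
DELTA = 0.9   # assumed discount factor
COOP, TEMPT, PUNISH_COMPLY, PUNISH_SHIRK = 99, 130, 0, 30

def discounted(payoffs, delta=DELTA):
    """Present value of a finite stream of per-round payoffs."""
    return sum(p * delta**t for t, p in enumerate(payoffs))

V_COOP = COOP / (1 - DELTA)   # value of everyone playing 99 forever

# Grim trigger ("100 for the rest of eternity"): once punishment starts,
# playing 100 forever earns you 0 per round, while playing 30 forever
# earns 30 per round -- carrying out the infinite punishment is dominated.
grim_comply = 0.0
grim_shirk = PUNISH_SHIRK / (1 - DELTA)
assert grim_shirk > grim_comply

# Finite (2-round) punishment: comply and you're back on the 99-path
# afterwards; shirk once and (assumed rule) the 2-round clock restarts.
def punish_value(rounds_left):
    if rounds_left == 0:
        return V_COOP
    return PUNISH_COMPLY + DELTA * punish_value(rounds_left - 1)

comply = punish_value(2)                        # 0, 0, then back to the path
shirk = PUNISH_SHIRK + DELTA * punish_value(2)  # 30 now, clock restarts
assert comply > shirk   # punishing is now in your own interest

# The one-shot deviation from the 99-path is still unprofitable.  After
# the 3-round window both streams rejoin the 99-path, so comparing the
# truncated streams is enough:
on_path = discounted([COOP, COOP, COOP])
deviate = discounted([TEMPT, PUNISH_COMPLY, PUNISH_COMPLY])
assert on_path > deviate

print(f"grim: comply={grim_comply:.1f} < shirk={grim_shirk:.1f}")
print(f"2-round: comply={comply:.1f} > shirk={shirk:.1f}")
```

Under these toy numbers the infinite punishment is dominated exactly as described, while the 2-round punishment passes both the on-path and the in-punishment deviation checks.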

Apologies, I was a bit blunt here.

It seems to me that the most obvious reading of "the burden of proof is on developers to show beyond-a-reasonable-doubt that models are safe" is in fact "all AI development is banned".  It's...not clear at all to me what a proof of a model being safe would even look like, and based on everything I've heard about AI Alignment (admittedly mostly from elsewhere on this site) it seems that no-one else knows either. 

A policy of 'developers should have to prove that their models are safe' would make sense in a world where we had a clear understanding that some types of model were safe, and wanted to make developers show that they were doing the safe thing and not the unsafe thing.  Right now, to the best of my understanding, we have no idea what is safe and what isn't.

If you have some idea of what a 'proof of safety' would look like under your system, could you say more about that?  Are there any existing AI systems you think can satisfy this requirement?  

From my perspective the most obvious outcomes of a burden-of-proof policy like you describe seem to be:

  • If it is interpreted literally and enforced as written, it will in fact be a full ban on AI development.  Actually proving an AI system to be safe is not something we can currently do.
  • Many possible implementations of it would not in fact ban AI development, but it's not clear that what they would do would actually relate to safety.  For instance, I can easily imagine outcomes like:
    • AI developers are required to submit a six-thousand-page 'proof' of safety to the satisfaction of some government bureau.  This would work out to something along the lines of 'only large companies with compliance departments can develop AI', which might be beneficial under some sets of assumptions that I do not particularly share?
    • AI developers are required to prove some narrow thing about their AI (e.g. that their AI will never output a racial slur under any circumstances whatsoever).  While again this might be beneficial under some sets of assumptions, it's not clear that it would in fact have much relationship to AI safety.

There's a model of white-collar employment I think is missing here.  (I also thought it was missing at some points in the Moral Mazes sequence, but never got around to writing it down then).

The model is underutilized employees as option value.

Imagine yourself as a manager, running a small team at some company somewhere.

Most weeks, your team has 40 hours of work to do.

Every few months, there is a crisis.  Perhaps your firm's product is scheduled to release in 2 weeks when a regulator suddenly dumps 500 pages of compliance questions on you and refuses to approve your product until they are answered. Perhaps a security flaw is discovered in a major library you use and you need to refactor your whole codebase.  Perhaps someone makes a mistake and your firm's biggest client is ticked off.  In any case, you have 200 hours of work to do that week, and if it is not done that will be Very Bad.

How many employees should you hire?

You could hire one employee.  This employee would be busy but not unmanageably so most weeks...and then as soon as something went wrong, there would be a disaster.

So instead you hire perhaps four employees.  Most weeks they each have 10 hours of work to do, and spend the rest of the time chatting around the water cooler or playing solitaire on work computers or whatever.  And when a fire drill happens, you get them to drop the solitaire games and maybe put in a few extra hours and you have enough people to handle it.

The thing you are buying with these three extra employees is not the 10 hours they each work in a typical week.  It is them being around and available and familiar with the job when something goes wrong.
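The arithmetic of the model can be sketched in a few lines. The 40-hour normal week and 200-hour crisis week are from the model above; the 50-hour surge capacity per employee (40 normal hours plus some overtime) is my assumption for illustration:

```python
# Toy version of the staffing-as-option-value model.
NORMAL_LOAD = 40     # hours of work in a typical week (from the model)
CRISIS_LOAD = 200    # hours of work in a crisis week (from the model)
SURGE_CAPACITY = 50  # assumed max hours one employee can put in during a crisis

def crisis_covered(headcount, capacity=SURGE_CAPACITY):
    """Can the team absorb a crisis week?"""
    return headcount * capacity >= CRISIS_LOAD

def typical_hours_each(headcount):
    """Hours of real work per employee in a normal week."""
    return NORMAL_LOAD / headcount

for n in (1, 4):
    print(n, typical_hours_each(n), crisis_covered(n))
# -> 1 40.0 False   (busy every week, but a crisis is a disaster)
# -> 4 10.0 True    (lots of visible slack, but the crisis is covered)
```

The "wasted" 30 hours per employee per normal week is exactly the slack that makes `crisis_covered(4)` come out true.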

-------END OF MODEL, BEGINNING OF ARGUMENT------

Many people seem to me to be operating on a model something like this:


"There is X hours of work a week to do in your job.  If you do those X hours of work, you are Doing Your Job.  Employers want you to be e.g. in the office 9-to-5, even if that is not needed to Do Your Job, because they are evil monsters who enjoy the taste of your tears."

Under that model, if you can do your job in 10 hours a week with GPT that is great!  This high productivity should lead to some improvements, either in you having more free time, or in you being able to get more jobs and make more money.

Under my model, of course you can do your job in 10 hours most weeks.  This has nothing to do with GPT!  Your job can be done in 10 hours most weeks because most weeks nothing is on fire.

But if you have two employees:

  • Alice shows up at the office, does 10 hours of work, and plays solitaire for 30 hours.
  • Bob works from home, does 10 hours of work, and then switches to Job #2.

Bob is actually worth much less as an employee.  Because when something goes wrong, you can grab Alice, exercise your option, get her to stop playing solitaire, and have her do a bunch of work that was needed...but when you try to exercise your option on Bob, he'll be on a call with his manager from Job #3.  Your actual binding constraint is how much work needs to get done during a crisis, and Bob contributes very little to that.
