This is an entry in the 'Dungeons & Data Science' series, a set of puzzles where players are given a dataset to analyze and an objective to pursue using information from that dataset.
You were saddled with debt, and despair, and regret;
But you left it behind to embark,
With a visiting ship who were planning a trip,
Hunting some strange sea-beasts they call . . . “Snark”?
(After climbing aboard and departing the shore,
Your life is if anything worse.
The grog makes you groggy; the sea makes you soggy;
The songs leave you thinking in verse.)
Snark-hunting, you find, is a peaceful pastime.
By now, every crew knows the way,
To - with ease! - guarantee their success and safety,
As they seek, and they lure, and they slay.
A single exception proves the above rule:
While with most Snarks, Snark-hunting’s a breeze,
If your Snark is a Boojum, it hunts you right back,
And slays you with similar ease.
When you learn the detail that a Snark-hunt can fail,
You speak with the crew and insist,
On wielding Science skills to discern which Snarks kill,
And becoming the ship’s Analyst.
(The Butcher and Baker and Barnacle-Scraper,
All tell you that simply won’t do.
“You’ll be telling us what is a Boojum, what’s not;
Thus, you’re our Boojumologist; true?”
When you meekly protest you’ve not earned that address,
They rebut: “As you clearly percieve,
The Bellman can’t ring, the Belter can’t sing,
And the Beaver knows not how to Beave.”)
Notwithstanding contention viz naming conventions,
Your task is as clear as could be:
Using data from trips made by similar ships,
Choose which Snarks to fight, which to flee.
If you post a selection of Snarks before the deadline, and your selection is on the efficient frontier – that is, if you’re the first to submit that selection, and if no other submission has both better EV and better survival probability – you will be entitled to receive [UNSPECIFIED BENEFIT (UNDERWHELMING)] at [UNSPECIFIED TIME (DISTANT)]. Every player can make one submission; if you post multiple plans, please specify clearly which of these is for the Bonus Task.
I’ll post an interactive you can use to test your choices, along with an explanation of how I generated the dataset, sometime on Monday the 12th. I’m giving you nine days, but the task shouldn’t take more than an evening or two; use Excel, R, Python, recollections from a previous timeloop, or whatever other tools you think are appropriate. Let me know in the comments if you have any questions about the scenario.
If you want to investigate collaboratively and/or call your decisions in advance, feel free to do so in the comments; however, please use spoiler tags or rot13 when sharing inferences/strategies/decisions, so people intending to fly solo can look for clarifications without being spoiled.
I did not get make a chance to work on this but I really want to. I commit to making my own solution before Monday January 9th and to avoid looking at the posted solution until I've made my own.
I tried for a while to identify
the histogram of snarks by waking-time as a mixture of gaussians
but was unable to make much progress; my guess is that either
there are 2+ classes with very high variance, polluting everything, or maybe some snarks have a different distribution with a longer right tail than normal.
I did note that
Looks like we don't hunt Crumbling and are strongly biased against hunting Blunt, so maybe I'll just keep to conventional wisdom there. Of the remaining, very close to exactly 3% aren't hunted, with no discernible correlations? Maybe that's just the baseline "don't hunt this" chance.
So I fell back to
First remove all Crumbling and Blunt, then Just pretend you can treat every variable as independent, then pretend you can treat every pair of variables as independent after accounting for first-order effects. Grab log-likelihoods and smush them all together.
Which led me to
Not sure how many to hunt, but in this order:['V', 'Y', 'G', 'P', 'Q', 'L', 'H', 'W', 'C', 'M', 'B', 'N', 'J', 'R', 'D', 'X', 'K', 'F']The two suggested criteria would lead to either['V', 'Y', 'G', 'P', 'Q', 'L'] (maximize my survival), or['V', 'Y', 'G', 'P', 'Q', 'L', 'H', 'W', 'C', 'M', 'B', 'N', 'J', 'R', 'D'] (maximize snark count EV)Had I submitted in time, I probably would have chosen to stop at either H, W, or C, because my estimated boojum probabilities have an inflection point there. If I were trying to "beat" everyone else, I'd have stopped at C; if I were ignoring everyone else, I'd have stopped at H, which, interestingly, is only one more snark than the bare minimum.
Trying to pick a word of different length to other people for prizes, my final answer is
Didn't find much out, but a couple notes:
Taste generally seems to be upstream of other factors, and to screen them off a bit, e.g. later-waking Snarks are better but a lot of this effect is that later-waking Snarks are more likely to have good tastes.
Crumbling Snarks are never hunted at all. We don't have any idea what Crumbling means, so can't tell for sure whether this is 'they will definitely eat you and people know this' or 'people are silly', but they do seem to look bad on some other metrics and I've avoided them to be safe.
As hinted in my other comment, I believe that this problem is simply a gaussian mixture model.
I assume there are d species of snarks. Species i is defined by a set of variables:
On top of those variables, there is also a hidden matrix A: Ai,j=1 if and only if snark i belongs to species j. We want to estimate all of those numbers
I must confess I don't really remember how the approach I implemented is called, but basically it has to do with alternating estimations:
This approach is super slow, especially because I am trying several numbers of species d, I will reply to this comment when I get results
So, how many species are there?
Hum, about 12 ? Probably at least 8, maybe up to 20, or even more, I don't really know
One interesting point, it looks like there is one species, the Goojam, (or maybe 2 Goojam species), which always attacks, while all other species never attack. I.e. k only contains ones and zeros. This is not something I anticipated, it appeared when running my code, but it makes sense, so this might confirm the approach
BPGYHQ should be the set of 6 letters minimizing the probability of death
The letters, in increasing order of danger, are BPGYHQTSVINRDMKLOJUACWXEF, with BPGYHQTSVINRDMKLOJUACW (without XEF) being the solution maximizing the expected return, at least according to my models
If this is true, the pareto frontier is exactly the set of prefixes of this word, but since one can only submit one word for the bonus task, I will submit BPGYHQTS, a word of length 8 with a very low probability of death
It is remarkable to see that I find similar results to Yonge, despite having wildly different approaches!
I haven't had a chance to analyse any effects that depend on Waking-Time yet. But based on the other data every other Snark that was hunted that matched one of B/G/H/P/Q/Y was successfully hunted so if this was for real I would play it safe and go with:
B G H P Q Y
To maximise value:
A B C D G H J K L M N P Q R V W Y
Looks like it will yield an expected value of just over 14, albeit with an uncomfortably low 84 percent chance of survival. This is my provisonal entry to the bonus task if I don't get a chance to analyse Waking-Time dependant effects.
(ordered from best to not-as-best looking to me)
my total ordering from best to worst would have been (this is not my selection:)
where I think EV starts to drop after about 19 picks
further commentary to be included in reply
baseline was just the numbers of snarks vs. the number of boojums for cases where all data was the same except waking time, then I fudged based on:
Also, I tried to figure stuff out about the nature of the dataset but didn't end up with particularly actionable information; I did come to a strong suspicion based on comparing the Crumbling data with the Blunt data that the late Crumbling ones are boojums and the early ones are snarks, but I wasn't confident enough in this to actually pick any Crumbling ones.
Also, early on I was making a poem in homage to the OP poem as I went along, which was kind of silly I guess, but included below. I didn't want to post it at the time since I hadn't finished either the poem or the analysis which I was writing about...and it is still not updated with my later analysis, but posting anyway. I never did follow up the issue I was reporting on at the end...
The Snarks, so it seems, know just when to wake up;
From one beyond seven, the sun's nigh still up!
But the graph of those times, it hops and it hops:
A possible sign of some smaller sub-pops!
One-forty, two-thirty, two-fifty or so,
These are some peaks one can spot on the go.
But there looks like a broad one whose lateness abounds,
Though it could be a mishmash of all groups not found.
Four-fifty the top of another small mound.
And the Boojums, you see, don't quite seem to respect
The waking-time patterns that one would expect.
But enough of the times - on to matters of Taste!
It looks like the Crumbling ones all go to waste!
But the Crumbling wake times! - one peak is at two,
And it doesn't line up with the Snarks one can view.
Another one peaks around four thirty-five,
Are they from the broad set of Snarks found alive?
But that seems to peak quite a lot before then,
Unless I am missing a time where again
Some Snark group decidedly tends to revive,
So why take the risk if you like to survive?
The Snarks that are Artless are heaped to be broad
With some showing up in the two thirty pod,
Especially when they are Crisp as a pier,
But looking at those that are Neat and are Clear,
The peaks don't line up! Could it be that these here ...
The unsolved issue being - does each sub-population have a consistent wake time distribution regardless of other characteristics individual members may have, or do the other characteristics alter the wake times? I was implicitly assuming the former, but I didn't follow up to really convince myself that this was the case.
A few observations:
Big spoiler (but short)
This is a gaussian mixture
Big spoiler (but long)
Each observation of snarks contains several pieces of information:
Now that we know that, we could try to observe how each trait impacts the dataset. For example, the most common main phenotype is Hollow, yet Crisp. We could try to plot a histogram of the times at which each phenotype 0 appeared.
Aaaaand..... It's obviously the mixture of two gaussians. This hints several things: