D&D.Sci December 2022: The Boojumologist

[-]SarahNibs3y20

I did not ~~get~~ make a chance to work on this but I really want to. I commit to making my own solution before Monday January 9th and to avoid looking at the posted solution until I've made my own.

[-]SarahNibs3y30

I tried for a while to identify

the histogram of snarks by waking-time as a mixture of gaussians

but was unable to make much progress; my guess is that either

there are 2+ classes with very high variance, polluting everything, or maybe some snarks have a different distribution with a longer right tail than normal.

I did note that

Looks like we don't hunt Crumbling and are strongly biased against hunting Blunt, so maybe I'll just keep to conventional wisdom there. Of the remaining, very close to exactly 3% aren't hunted, with no discernible correlations? Maybe that's just the baseline "don't hunt this" chance.

So I fell back to

First remove all Crumbling and Blunt, then Just pretend you can treat every variable as independent, then pretend you can treat every pair of variables as independent after accounting for first-order effects. Grab log-likelihoods and smush them all together.

Which led me to

Not sure how many to hunt, but in this order:
['V', 'Y', 'G', 'P', 'Q', 'L', 'H', 'W', 'C', 'M', 'B', 'N', 'J', 'R', 'D', 'X', 'K', 'F']

The two suggested criteria would lead to either
['V', 'Y', 'G', 'P', 'Q', 'L'] (maximize my survival), or
['V', 'Y', 'G', 'P', 'Q', 'L', 'H', 'W', 'C', 'M', 'B', 'N', 'J', 'R', 'D'] (maximize snark count EV)

Had I submitted in time, I probably would have chosen to stop at either H, W, or C, because my estimated boojum probabilities have an inflection point there. If I were trying to "beat" everyone else, I'd have stopped at C; if I were ignoring everyone else, I'd have stopped at H, which, interestingly, is only one more snark than the bare minimum.

[-]aphyer3y20

Trying to pick a word of different length to other people for prizes, my final answer is

ABGHLPQVWY

Didn't find much out, but a couple notes:

Taste generally seems to be upstream of other factors, and to screen them off a bit, e.g. later-waking Snarks are better but a lot of this effect is that later-waking Snarks are more likely to have good tastes.

Crumbling Snarks are never hunted at all. We don't have any idea what Crumbling means, so can't tell for sure whether this is 'they will definitely eat you and people know this' or 'people are silly', but they do seem to look bad on some other metrics and I've avoided them to be safe.

[-]Thomas Sepulchre3y20

My approach

As hinted in my other comment, I believe that this problem is simply a gaussian mixture model.

I assume there are d species of snarks. Species i is defined by a set of variables:

the probability of a snark to belong to this species
$μ_{i}, σ_{i}$ the parameters of the gaussian of this species
$k_{i}$ the probability for members of this species to kill
$P h p_{i}$ the vector giving, for each phenotype k , the probability for a member of species i to have phenotype k. Remember that there are 675 phenotypes in total, but I believe that there are way less species, thus there must be some phenotypic variation within each species

On top of those variables, there is also a hidden matrix $A$ : $A_{i, j} = 1$ if and only if snark i belongs to species j. We want to estimate all of those numbers

I must confess I don't really remember how the approach I implemented is called, but basically it has to do with alternating estimations:

Assuming known $A$ , we can compute the MLE for $p, μ, σ, k, P h p$ .
- $μ$ is just the weighted average of the times of sights, $\$ $σ$ is simply the standard deviation
- $p$ is the proportion of each species in $A$
- $P h p$ is the proportion of each phenotype in each species
- $k$ is a bit tricky, I estimated it using only the data for which the crew actually tried to hunt. This is not a problem if nobody is capable of knowing what a Boojan looks like. This might be a problem if, for any reason, the decision to give up or not is not random, and especially if the crew has access to information hidden to us. Anyhow, I had to make an assumption
Assuming known $p, μ, σ, k, P h p$ , we can estimate $A$ by computing for each observation $x$ , the likelihood of $x$ belonging to each species, and then making $A$ proportional to this likelihood
By randomly initializing everything, and going back and forth between the two estimates, we should end up with a pretty satisfying result

This approach is super slow, especially because I am trying several numbers of species $d$ , I will reply to this comment when I get results

[-]Thomas Sepulchre3y20

So, how many species are there?

Hum, about 12 ? Probably at least 8, maybe up to 20, or even more, I don't really know

One interesting point, it looks like there is one species, the Goojam, (or maybe 2 Goojam species), which always attacks, while all other species never attack. I.e. only contains ones and zeros. This is not something I anticipated, it appeared when running my code, but it makes sense, so this might confirm the approach

My submission

BPGYHQ should be the set of 6 letters minimizing the probability of death

The letters, in increasing order of danger, are BPGYHQTSVINRDMKLOJUACWXEF, with BPGYHQTSVINRDMKLOJUACW (without XEF) being the solution maximizing the expected return, at least according to my models

If this is true, the pareto frontier is exactly the set of prefixes of this word, but since one can only submit one word for the bonus task, I will submit BPGYHQTS, a word of length 8 with a very low probability of death

It is remarkable to see that I find similar results to Yonge, despite having wildly different approaches!

[-]Yonge3y20

I haven't had a chance to analyse any effects that depend on Waking-Time yet. But based on the other data every other Snark that was hunted that matched one of B/G/H/P/Q/Y was successfully hunted so if this was for real I would play it safe and go with:

B G H P Q Y

To maximise value:

A B C D G H J K L M N P Q R V W Y

Looks like it will yield an expected value of just over 14, albeit with an uncomfortably low 84 percent chance of survival. This is my provisonal entry to the bonus task if I don't get a chance to analyse Waking-Time dependant effects.

[-]simon3y10

My selection:

YNVGQRPHDBL

(ordered from best to not-as-best looking to me)

my total ordering from best to worst would have been (this is not my selection:)

YNVGQRPHDBLJMCASWXOIKTUFE

where I think EV starts to drop after about 19 picks

further commentary to be included in reply

[-]simon3y70

techniques used:

baseline was just the numbers of snarks vs. the number of boojums for cases where all data was the same except waking time, then I fudged based on:

if there are few or no boojums, see what numbers you get if you relax the non-taste characteristics one by one
is the waking time typical for a boojum? for a snark? can relax multiple non-taste characteristics for more data
If Crumbling, I assumed that the early part of the Crumbling distribution is probably snarks and the late is probably boojums, see below. Though, I wasn't actually confident enough to pick any Crumbling ones.

Also, I tried to figure stuff out about the nature of the dataset but didn't end up with particularly actionable information; I did come to a strong suspicion based on comparing the Crumbling data with the Blunt data that the late Crumbling ones are boojums and the early ones are snarks, but I wasn't confident enough in this to actually pick any Crumbling ones.

Also, early on I was making a poem in homage to the OP poem as I went along, which was kind of silly I guess, but included below. I didn't want to post it at the time since I hadn't finished either the poem or the analysis which I was writing about...and it is still not updated with my later analysis, but posting anyway. I never did follow up the issue I was reporting on at the end...

The Snarks, so it seems, know just when to wake up;

From one beyond seven, the sun's nigh still up!

But the graph of those times, it hops and it hops:

A possible sign of some smaller sub-pops!

One-forty, two-thirty, two-fifty or so,

These are some peaks one can spot on the go.

But there looks like a broad one whose lateness abounds,

Though it could be a mishmash of all groups not found.

Four-fifty the top of another small mound.

And the Boojums, you see, don't quite seem to respect

The waking-time patterns that one would expect.

But enough of the times - on to matters of Taste!

It looks like the Crumbling ones all go to waste!

But the Crumbling wake times! - one peak is at two,

And it doesn't line up with the Snarks one can view.

Another one peaks around four thirty-five,

Are they from the broad set of Snarks found alive?

But that seems to peak quite a lot before then,

Unless I am missing a time where again

Some Snark group decidedly tends to revive,

So why take the risk if you like to survive?

The Snarks that are Artless are heaped to be broad

With some showing up in the two thirty pod,

Especially when they are Crisp as a pier,

But looking at those that are Neat and are Clear,

The peaks don't line up! Could it be that these here ...

The unsolved issue being - does each sub-population have a consistent wake time distribution regardless of other characteristics individual members may have, or do the other characteristics alter the wake times? I was implicitly assuming the former, but I didn't follow up to really convince myself that this was the case.

[-]Thomas Sepulchre3y10

A few observations:

Big spoiler (but short)

This is a gaussian mixture

Big spoiler (but long)

Each observation of snarks contains several pieces of information:

The time at which the snark was sighted. I will come back to this one very soon.
Several traits of the snark. I call them the phenotype of the snark.
- The first two adjectives each have 5 possibilities. This gives 25 combinations of what I call the main phenotype.
- The last three characteristics each have 3 possibilities, giving 27 combinations. I call this part the secondary phenotype.
Whether the snark was a Boojum. This last one is a bit messy because there are 2 booleans, but only 3 possibilities
- either the boat didn't check
- or checked, and it wasn't a Boojum
- or checked, and it was a Boojum

Now that we know that, we could try to observe how each trait impacts the dataset. For example, the most common main phenotype is Hollow, yet Crisp. We could try to plot a histogram of the times at which each phenotype 0 appeared.

Aaaaand..... It's obviously the mixture of two gaussians. This hints several things:

Snarks can be divided into species.
The density of sights for a given species is a gaussian.
Main phenotype 0 is made of 2 species. Maybe we could refine this observation by including secondary phenotype into the analysis. Maybe one can deduce or guess the species from the phenotype.

LESSWRONG
LW

LESSWRONG
LW

32

D&D.Sci December 2022: The Boojumologist

32

32

STORY (Skippable)

DATA & OBJECTIVES

BONUS TASK