All of abstractapplic's Comments + Replies

>You link to index C twice, rather than linking to index D. 

Whoops! Fixed now, thank you.

Reflections on my performance:

I failed to stick the landing for PVE; looking at gjm’s work, it seems like what I was most missing was feature-engineering while/before building ML models. I’ll know better next time.

For PVP, I did much better. My strategy was guessing (correctly, as it turned out) that everyone else would include a Professor, noticing that they’re weak to Javelineers, and making sure to include one as my backmidline.

Reflections on the challenge:

I really appreciated this challenge, largely because I got to use it as an excuse to teach myself ... (read more)

I think calling anything I did "feature engineering" is pretty generous :-). (I haven't checked whether the model still likes FGP without the unprincipled feature-tweaking I did. It might.)

Just recording for posterity that yes, I have noticed that

Rangers are unusually good at handling Samurai, so it might make sense to have one on my PVE team.

However, I've also noticed that

Rangers are unusually BAD at handling Felons, to a similar or greater degree.

As such,

I think it makes more sense to keep Pyro Professor as my mid-range heavy-hitter in PVE.

(. . . to my surprise, this seems to be the only bit of hero-specific rock-paper-scissors that's relevant to the PVE challenge. I suspect I'm missing something here.)

Threw XGBoost at the problem and asked it about every possible matchup with FRS; it seems to think

my non-ML-based pick is either optimal or close-to-optimal for countering that lineup.

(I'm still wary of using ML on a problem instead of thinking things through, but if it confirms the answer I got by thinking things through, that's pretty reassuring.)

Therefore, I've decided

to keep HLP as my PVE team.

And I've DM'd aphyer my PVP selection.
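For anyone curious how the "ask the model about every possible matchup" step works mechanically, here's a minimal self-contained sketch. The roster letters, the FRS opponent, and the scoring function are toy stand-ins; in the actual approach the score would come from a trained XGBoost model's predicted winrate.

```python
from itertools import combinations

# Toy roster of hero initials; stand-ins for the scenario's actual heroes.
ROSTER = list("CFGHJLMPRS")

def predicted_winrate(team, opponent):
    # Stand-in for model.predict_proba on the (team, opponent) features.
    # Any deterministic scoring rule works for illustrating the search.
    return sum((ord(a) - ord(b)) % 7 for a in team for b in opponent) / 63.0

opponent = ("F", "R", "S")  # the lineup being countered

# Enumerate every possible 3-hero team and keep the best-scoring counter.
best_team = max(combinations(ROSTER, 3),
                key=lambda team: predicted_winrate(team, opponent))
```

The same enumerate-and-rank loop works for any fixed opponent; with ten heroes there are only 120 candidate teams, so brute force is cheap.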


My main finding thus far:

There's a single standard archetype which explains all the most successful teams. It goes like this: [someone powerful from the MPR cluster, ideally P], [a frontman, selected from GLS], [someone long-ranged, selected from CHJ]. In other words, this one is all about getting a good range of effective ranges in your team.

My tentative PVE submission is therefore:

Hurler, Legionary, Professor


  • I'm pretty sure there's some second-order rock-paper-scissors stuff going on that I'm not accounting for: Rangers seem better than Professor
... (read more)

Well, that's embarrassing. Fixed now; thank you.

Reflections x3 combo:

Just realized this could have been a perfect opportunity to show off that modelling library I built, except:

A) I didn't have access to the processing power I'd need to make it work well on a dataset of this size.

B) I was still thinking in terms of "what party archetype predicts success", when "what party archetype predicts failure" would have been more enlightening. Or in other words . . .

. . . I forgot to flip the problem turn-ways.

Reflections on my performance:

This stings my pride a little; I console myself with the fact that my "optimize conditional on Space and Life" allocation got a 64.7% success rate.

If I'd allocated more time, I would have tried a wider range of ML algorithms on this dataset, instead of just throwing XGBoost at it. I'm . . . not actually sure if that would have helped; in hindsight, trying the same algorithms on different subsets ("what if I built a model on only the 4-player games?") and/or doing more by-hand analysis ("is Princeliness like Voidliness, and if ... (read more)

Good to know, thank you!  I think my main takeaway is that I am really bad at judging difficulty levels on these: I actually expected this scenario to be easier than the previous Dwarves & D.Sci scenario, but that one had three different near-perfect solutions while this one only had one noticeably-better-than-random solution.

Long-winded and empirically incorrect argument that led me to that expectation follows: I was aware of the large number of possible characters - this is why the dataset ended up being so big, because I wanted to be sure it was large enough to allow simple analyses to work in spite of that.

One sample approach I tried out on my end as part of designing the scenario was this:

  • Take only teams that contained a Knight of Blood and a Mage of Time (but of any size).
  • For each possible classpect, find its winrate on those teams.

This would have given you ~4k teams, with ~120 with each possible other classpect, which wasn't enough to get an optimal solution but would have been an excellent first step:

  • Page of Heart has a 59.46% winrate
  • Maid of Heart has a 57.01% winrate
  • Maid of Breath has a 51.55% winrate
  • ...
  • ...
  • Heir of Hope has a 27.10% winrate
  • Heir of Rage has a 26.85% winrate
  • Maid of Void has a 22.64% winrate

As I envisioned things playing out:

  • Just running this approach and grabbing the two highest characters you could:
    • You would have picked a Page of Heart (3-9-3) and a Maid of Breath (2-12-1)
    • This would have given you stats of 18-25-17, for a lowest stat of 17 and a 64% winrate.
    • This isn't optimal (it over-invests in Friendship, since you've picked two different high-Friendship characters), but it's noticeably better than random.
  • Additionally, looking at the high/low scores might point you further in useful directions:
    • For instance, Heart/Breath/Life showed up an awful lot in the top on a variety of different classes.
    • This might have pointed you in the direction of 'there's a
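The filter-then-winrate analysis described above can be sketched in pandas. The miniature DataFrame and column names below are hypothetical stand-ins for the real dataset:

```python
import pandas as pd

# Toy stand-in: one row per team, a list of classpects plus a win flag.
df = pd.DataFrame({
    "classpects": [
        ["Knight of Blood", "Mage of Time", "Page of Heart"],
        ["Knight of Blood", "Mage of Time", "Maid of Void"],
        ["Knight of Blood", "Mage of Time", "Page of Heart"],
        ["Seer of Light", "Mage of Time", "Page of Heart"],
    ],
    "won": [1, 0, 1, 1],
})

# Step 1: keep only teams containing both anchor characters.
anchors = {"Knight of Blood", "Mage of Time"}
subset = df[df["classpects"].apply(lambda cs: anchors <= set(cs))]

# Step 2: winrate of each *other* classpect on those teams.
exploded = subset.explode("classpects")
others = exploded[~exploded["classpects"].isin(anchors)]
rates = others.groupby("classpects")["won"].mean().sort_values(ascending=False)
```

On the real ~4k-team subset the resulting table is exactly the ranked winrate list quoted above.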

The jankiness here is deliberate (which doesn't preclude it from being a mistake). My class on Bayesianism is intended to also be a class on the limitations thereof: that it fails when you haven't mapped out the entire sample space, that it doesn't apply 'cleanly' to any but the most idealised use cases, and that once you've calculated everything out you'll still be left with irreducible judgement calls.

(I have the "show P(sniper)" feature always enabled to "train" my neural network on this data, rather than trying to calculate this in my head)

That's among the intended use cases; I'm pleased to see someone thought of it independently.

If it helps, I for one am completely okay with you taking the weekend.

I used the python package Pandas.

(I also tried Excel, but the dataset was too large to load everything in. In retrospect, I realize I could have just loaded in the first million rows - 2/3 of the dataset, more than enough to get statistically significant results from - and analyzed that, possibly keeping the remaining ~400k rows as a testing set.)
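The partial-load workaround is a one-argument change in pandas: `read_csv` can cap the rows it reads, and skip rows to form a holdout. A small in-memory CSV stands in for the real file here:

```python
import io
import pandas as pd

# Stand-in for the real file; with a file on disk this would be
# pd.read_csv("dataset.csv", nrows=1_000_000).
csv_data = "\n".join(["x,y"] + [f"{i},{i % 2}" for i in range(10)])

train = pd.read_csv(io.StringIO(csv_data), nrows=7)  # the "first million rows"
# Holdout: skip the rows already used for training (row 0 is the header).
test = pd.read_csv(io.StringIO(csv_data), skiprows=range(1, 8))
```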

My solution for winrate maximization:

Add a Page of Mind and a Seer of Void. (this should get us slightly better than 50% chance of success)

My solution conditional on the new universe having both Space and Life (I think Time, Space and Life are prerequisites for a universe I'd like):

Add a Prince of Space and a Sylph of Life; if the gender situation doesn't line up with that, replace the Prince with an Heir and/or replace the Sylph with a Page. (this should get us slightly worse than 50% chance of success)

My attempt at ranking the party members, based on cha... (read more)

Several very strange trolls have replied to your comment with a series of rude, poorly-typed, confusing, and sometimes-profane messages.

gallowsCalibrator says:

arachnidsGrip says:

adiosToreador says:

terminallyCapricious says:

I just checked and while the other answers are perfect, math.log(2)**math.exp(2) is 0.06665771193088375. ChatGPT is off by almost an order of magnitude when given a quantitative question it can't look up in its training data.
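For reference, the check is reproducible directly:

```python
import math

# ln(2) raised to the power e^2 -- the quantity ChatGPT was asked about.
value = math.log(2) ** math.exp(2)
print(value)  # ≈ 0.0667
```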

Gerald Monroe (1y):

Yep. 2/3 is still beyond most human savants, but it is a failure that the machine won't try to do "mental math" to see that its answer is off by a lot. Obviously future versions of the product will just have isolated/containerized Linux terminals and python interpreters they can query, so this is a temporary problem.

Thanks for putting in the time to make sense of my cryptic and didactic ranting.

You don't specify exactly how this second function can vary: does it have one parameter, a few, or many?

Segmented linear regression usually does the trick. There's only one input, and I've never seen discontinuities be necessary when applying this method, so only a few segments (<10) are needed.

I didn't specify this because almost any regression algorithm would work and be interpretable, so readers can do whatever is most convenient to them.


... (read more)
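A minimal sketch of segmented linear regression as described: with fixed breakpoints it reduces to ordinary least squares on a hinge-function basis (the breakpoint location and the data here are synthetic):

```python
import numpy as np

# Synthetic one-input data with a single slope change at x = 4.
rng = np.random.default_rng(0)
x = np.sort(rng.uniform(0, 10, 200))
y = np.where(x < 4, 2 * x, 8 + 0.5 * (x - 4)) + rng.normal(0, 0.1, 200)

# One hinge column per (assumed) breakpoint keeps the fit continuous,
# so no discontinuities are introduced.
breaks = [4.0]
X = np.column_stack([np.ones_like(x), x] + [np.maximum(0, x - b) for b in breaks])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
# coef = [intercept, slope below the break, slope change at the break]
```

With under 10 segments this design matrix stays tiny, which is part of why the method remains interpretable.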

There was a similar question a few months back; you may find the answers there helpful.

Nope. (Though since both that game and this one are weird administration-centric takes on Harry-Potter-style magical schools, I imagine there may have been some convergent evolution.)

It was, though fortunately that was just the random Houses they would have been Allocated to, and as such provides no meaningful information. Still, I've updated the file to not have that column; thank you.

Buy battery packs for charging phones so you can stay connected during a local blackout.

Wait. As . . . a software developer? Not as a Data Scientist, even though you have experience with ML?

At least as far as I know, Data work is better paid, uses more LessWrong-ish skills, and (crucially) is more of a frontier situation: Software ate the world a while ago, but Data is still chewing, so there's been much less time for credentialism to seep in.

(I'm from the UK, and it's been a few years since I did a 'normal' jobhunt, so I could be wrong about this as it applies today and on your side of the Atlantic. But even taking that into account, I notice I'm still surprised.)

I would love to be a Data Scientist even more than a software developer, but I'm even more confused about how to find a job as a Data Scientist than as a software developer. But you make a good point. Maybe I should be aiming to become a Machine Learning Engineer instead.

I'm curious as to what exactly you found there.

Briefly: I told my learner "assume there are two sources of income for Light Forest forts; assume they are log-linked functions of the data provided with no interactions between features; characterize these income sources."

The output graphs, properly interpreted, said back:

  • The larger source of income benefits greatly from Miners, benefits from the presence of every ore (especially Haematite), likes coal, and benefits from having one Smith.
  • The smaller source of income benefits from Woodcutters, benefits from ha
... (read more)
  Ah, I see!  That is a meaningful interpretation of reality, but rather than 'ore-based vs wood-based' I'd phrase it as a distinction between:

  • Staying inside and mining.  Benefits from all ores, and miners.  Makes only a few finished goods (smelting only with coal) but still benefits from higher coal level and one or two dwarves to smelt.
  • Also getting outside and getting fuel.  Needs warriors to get you outside, benefits a lot from woodcutters as well, smelts whatever ores are available and crafts wood if it's left over.

Reflections on my attempt:

It looks like I was basically right. I could have done slightly better by looking more closely at interactions between features, ore types especially; still, I (comfortably) survived and (barely) proved my point to the King, so I'm happy with the outcome I got.

(I'm also very pleased by the fact that I picked up on the ore-based-vs-wood-based distinction; or, rather, that the ML library I've been building automatically picked up on it. Looks like my homebaked interpretability tools work outside their usual contexts!)

Reflections on ... (read more)

> I'm curious as to what exactly you found there.

Ore-based vs wood-based production wasn't really an intended distinction - rather, ore and wood were intended to be used together.  I added woodcrafting as an afterthought late in development (when Crafters were performing very poorly due to only having two craftable ores), and it still isn't a major source of income.  Even in your SHAMEFULLY ELFISH fort, your Crafters spend most of their time on Silver, and only make things out of wood when poor mining yield/alcohol-fueled crafting frenzies make them run out of silver.

The intended distinctions were:

  • Wood-based vs Coal-based fuel - you need one or the other, with Wood becoming more important the less coal you have.
  • Precious vs nonprecious metals - you need Crafters to work gold and silver, but Smiths to work iron/bronze/etc.

There were two goals on my end from this, one of which succeeded and one of which did not:

  • Providing multiple levels of success.
    • The aim was to make 'stay alive' a relatively easy goal that most players could accomplish with a little work.
    • 'Maximize value' was meant to be a harder goal that required more effort.
    • I'm reasonably happy with how this went - most players hit 100% survival, and everyone found some effects (e.g. Farmers).
  • Providing a relatively clean environment where effects stood out, as a window into deeper mechanics that players could use to help understand the world-model.  This is a bit fuzzy, and didn't actually end up happening, so I'll flail at a couple vague examples of what I mean and hope I can convey it:
    • Several people noticed that there were two ways a fort could die ('starving' and 'digging too deep').  Some people noticed that Digging Too Deep happened a lot with 7 miners, fairly often with 6, and rarely with 5 (I think simon was the most explicit here).
    • A thing that could have been noticed but wasn't was that all of the Digging Too Deep forts had either 7+ miners or 5-6 min

My allocations:

4x Miner, 2x Woodcutter, 2x Warrior, 2x Crafter, 1x Brewer, 1x Farmer, 1x Smith

The handful of (dubious) insights that no-one seems to have had yet, which motivate the (slight) differences between this setup and everyone else's:

  • We have enough data that it makes sense to filter out everything that isn't Light Forest Biome before doing any analysis.
  • There seems (?) to be a threeway synergy between Warriors, Woodcutters and Crafters in this biome. (Ad hoc explanation: Woodcutters cut down trees, Crafters make things from the wood, Warriors stop t
... (read more)

I liked this one a lot. In particular, I appreciate that it defied my expectations of a winning strategy: i.e., I couldn't get an optimal or leaderboard-topping solution with the "throw a GBT at the problem, then iterate over possible inputs" approach which won the last two games like this.

I think the Dark mana thing was a good sub-puzzle, and the fact that it was so soluble is a point in favor of that. It seemed a little unfair that it wasn't directly useful in getting a good answer, but on reflection I consider that unfairness to be a valuable lesson abo... (read more)

Huh. I kind of imagined it would be very important to understand Dark mana in order to e.g. assign elements to spells, and I don't know how deeper analysis would have been possible without doing that. To the extent that there was a specific 'intended' use for understanding Dark mana/trap for not doing so, it was this: Dark heavily anticorrelates with Light. Therefore, if you don't know about Dark mana, spells that use Dark will naively appear to be stronger when Light is weak and weaker when Light is strong. At the time of the scenario, though, both Dark and Light are low, and so if you haven't figured out Dark you could get misled into assuming that the Dark-using spells are all strong because Light is low.

You make a valid point, but . . .

basic encryption

The 'basic encryption' you have in mind is a Computer Thing. To the journalists in question, it was a New Computer Thing. If you're a Computer Person, you're probably underestimating the reticence associated with attempting New Computer Things when you're not a Computer Person.

much easier to use

I think that's false, albeit on the merest technicalities. The OTP system I have in mind is awkward and time-consuming ( . . . and probably inferior to Tor for Wikileaks' use case), but in terms of easiness it's some... (read more)

It's not just Tor. It's also "please use Signal instead of unencrypted email" that wasn't an easy sell. Part of Wikileaks' work was not just receiving documents but coordinating with journalists who write stories about those documents - in part, documents that resulted in people like Andy Müller-Maguhn getting a hardware bug on his cell phone. As far as I know, the New York Times still runs on Slack instead of using a solution that provides end-to-end encryption. Another illustrative episode was the Guardian journalists who famously published the password to decrypt the Wikileaks insurance file that contained the clear names in the carefully redacted diplomatic cables and documents about the wars. Teaching someone to use PGP is also something you can do with an average teenager. It's not that complicated. It's just awkward enough to use that it's very hard to get the people who share important information to do so.

I took an ML-based approach which gave me radically different answers; the machine seems to think that

Matching currently-strong mana types is much more important than countering your opponent's choices.

As such, my new best guess is 

Fireball, Rays, Vambrace

Which should

give my master roughly 2:1 odds in favor.

Oh, also:

I deduced the existence of Darkness Mana, determined that it almost certainly has a value in the 16-18 range, and then . . . couldn't figure out any clever way to use that information when strategizing. I suspect I'm missing something here.

My provisional answer is:

Fireball, Levee, Hammer

This is supported by the reasoning that:

Levee (Fire/Earth) does a passably mediocre job protecting against Missiles (Earth/Water) and Fireball (Air/Fire); Fireball (Air/Fire) and Hammer (Light/Air) can both sneak past Solar (Fire/Light) by sharing an element.

And more prosaically by the fact that:

When I filtered the dataset to have Wizard A with the opponent's spell list, the spells which most raised Wizard B's winrate were those three.


I've had a hard time figuring out how to weight "counter the oppone

... (read more)

Misc. notes:

  • As we've all discovered, the data is most productively viewed as a sequence of 2095 8-byte blocks.
  • The eighth byte in each block takes the values 64, 63, 192, and 191.  64 and 192 are much less common than 63 and 191.
  • The seventh byte takes a value between 0 and 16 for 64/192 rows, weighted to be more common at the 0 end of the scale. For 63/191 rows, it takes a value between ??? and 256, strongly weighted to be more common at the 256 end of the scale (the lowest is 97 but there's nothing special about that number so the generator probably
... (read more)

If (like me) you're having a hard time reading the .bin format, here's a plaintext version of it in hexadecimal.
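Both steps (splitting the file into 8-byte blocks and rendering a plaintext hex view) take only a few lines of Python; a small in-memory byte string stands in for the actual .bin file here:

```python
# Stand-in for open("data.bin", "rb").read(); three example 8-byte blocks.
data = bytes([1, 2, 3, 4, 5, 6, 10, 63,
              9, 8, 7, 6, 5, 4, 3, 191,
              0, 0, 0, 0, 0, 0, 15, 64])

blocks = [data[i:i + 8] for i in range(0, len(data), 8)]
eighth_bytes = [block[7] for block in blocks]  # the 64/63/192/191 byte

# Plaintext hexadecimal view, one block per line.
hex_lines = [block.hex(" ") for block in blocks]
```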

Confirmed and corrected; thank you again.

I would give you more time, but

you've already reached an optimal answer.

(Also, you can always just refuse to read the ruleset until you're done with the data.)

As DM, I can confirm that skills provided with the help of the Chaos Deity or Eldritch Abomination are identical to those provided by the goddess alone.


Nope. But (I'll edit the op to clarify this) the only effect a collaborator has is on which cheat skills are provided, so you could get the same effect as the Eldritch Abomination by choosing MR+AA, and get the same effect as the Chaos Deity by choosing randomly.


Currently in the UK, near London. Remote work is both a possibility and a preference.

I do a variety of Data Science/Analysis work, but my niche is producing unusually human-legible predictive models. Further details are on my website; let me know if you have comments or questions.

My client is running low on things I can usefully do for them, so this post is relevant again. In the ~year since I posted it, I've tested my interpretability-first modelling approach in real-world contexts, confirmed it works, and found that in a few - admittedly niche - cases, it can not only match but actually outperform industry-standard black-box models.

I have a website here which elaborates on what I'm offering. If you have any comments or questions, don't hesitate to message me.

What I was (haphazardly, inarticulately) getting at is that I never used any built-in functions with 'join' in the name, or for that matter thought anything along the lines of "I will Do a Join now". In other words, I don't think needing to know about joins was a barrier to entry, because I never explicitly used that information when working on this problem.

I found this challenge difficult and awkward due to the high number of possible response-predictor pairs (disaster A in province B is predicted by disaster/omen X in province Y with a Z-year delay), low number of rows (if you look at each province separately there are only 1080 records to play with), and probabilistic linkages (if events had predicted each other more reliably, the shortage of data would have been less of an issue).

This isn't necessarily a criticism - sometimes reality is difficult and awkward, and it's good to prepare for that - and I get t... (read more)

  My hope was that people would figure out the existence of the Population and Wealth sub-variables, at which point I think figuring out what effects omens had would have been much, much easier.  Sadly it seems I illusion-of-transparencied myself on how hard that would be to work out.  People figured out a lot of the intermediate correlations I expected to be useful there (enough to get some very good answers), but no-one seems to have actually drawn the link that would have connected them.

My hope was that you would start with sub-results like:

  • Famine in Year X means that Famine is unlikely in Year X+1
  • Plague in Year X also means that Famine is unlikely in Year X+1
  • Either Famine or Plague in Year X means that you are unlikely to Pillage a neighbor in Year X+1
  • Omens in Year X that predict a high/low likelihood of Famine in Year X+1 (e.g. Moon Turns Red/Rivers of Blood) also predict a high/low likelihood of you Pillaging a neighbor in Year X+1

and eventually arrive at the conclusion of 'maybe there is an underlying Population variable that many different things interact with'.

(I even tried to drop a hint about the Population and Wealth variables in the problem statement.  I guess it's just much harder than I expected to make deductions like that.)

Insane, unendorsed bonus plan:

Spend most of the money on earthquakeproofing all nine provinces (including the six we don't own), to greatly decrease the probability of black dove sightings (doves and quakes correlate super hard for some reason), so they can't predict plagues, so no plagues happen.

In addition to this being inherently ridiculous, it's rendered extra-implausible by the fact that:

Doves seem to have been getting slightly more common over time, but plagues (and for that matter every other omen and disaster) haven't, suggesting that causality doesn't work that way.

I hope to do more digging and build off other people's comments later in the week, but my preliminary/solo answer would be:

Grainhoard and plagueproof in all three provinces.

On the basis that:

Doves in the previous year strongly predict global plague and weakly predict local famine; also, a crude "ignore every predictor, just look at average output of response variables lol" approach suggests that stockpiling grain is the highest-EV intervention.

However . . .

I haven't been able to figure out how pillaging works at all, and I really doubt it's as random/irrel

... (read more)

The title says 'nine black doves', but the dataset says Germania (and only Germania) had no black dove sightings in 1080. Was this intentional?

The dataset you've received is what was reported.

Strong-upvoted for reminding me how much I miss teaching/tutoring.

It's definitely a feature as well; the exact tradeoff comes down to personal taste.

Reflections on my attempt:

My PvE approach, as I mentioned, was to copy the plan that worked best in a comparable game: train a model to predict deck success, feed it the target deck, then optimize the opposing deck for maximum success chance. I feel pretty good about how well this worked. If I'd allocated more time, I would have tried to figure out analytically why the local maxima I found worked (my model noticed Lotus Ramp as well as Sword Aggro but couldn't optimize it as competently for some reason), and/or try multiple model types to see what they agr... (read more)

  I actually considered this to be mostly a feature rather than a bug?  I think real-world data science problems also benefit from having some knowledge of the domain in question.

It's possible to apply data science techniques to a completely unfamiliar domain - you don't need to know anything about card games to notice that 'P' and 'S' showing up together, or 'L' and 'E' showing up together, improves your payoff function, and to try submitting an answer that has lots of 'P's and lots of 'S's in it.

But if you have some level of domain knowledge, you have more ability to guess what kind of patterns are likely to appear, and to extrapolate details.  When you see that 'L' works well with 'D', 'E' and 'A' that doesn't tell you much else: when you notice that 'L' works well with all three of the cards that have long and bombastic names, that lets you start guessing things like 'there are some kind of costs to playing these powerful cards, and L helps you pay those costs to play them'.  This lets you guess in turn things like 'adding more Emperors might make the deck stronger against other decks like itself but weaker against faster decks' that would be very hard to pull out of the data directly without some amount of domain knowledge to help.

This is part of the reason why I gave the cards names instead of just saying 'Card ID 1', 'Card ID 2', etc.  (The other part is of course to sound cooler :P)

I did things this way because my applied stats knowledge is almost entirely self-taught, with all the random gaps in knowledge that implies. Thank you for letting me know about Stan and related techs: while it's hard to tell whether they would have been a better match for my modelling context (which had some relevant weirdnesses I can't share because confidentiality), they definitely would have made a better backup plan than "halt a key part of your organization for a few days every time you need a new model". I'll be sure to look into MCMC next time I deal with a comparable problem.

(Typo: 'Cornelis'.)

Fixed, thank you again.

My deck:

2x Angel, 3x Minotaur Hooligan, 3x Pirate, 4x Sword.

My reasoning:

Absolutely no reasoning was applied in reaching this conclusion; all my attempts to solve this one analytically met dead ends. Instead, I copied the ML-based approach gjm won Defenders of the Storm with - except using gradient descent to search deckspace instead of trying all possible options - and got an answer I have no way to explain or evaluate. I'm very curious to see if this works!

Misc insights:

  • As a general rule, a more diverse deck leads to a better outcome. (Our opponent has t
... (read more)