Spoiler protection
 So you foolish mortals, you don't trust me enough to give your true names,
  you're worried that I might talk you out of your soul if you allow me to talk to you,
  and you feel the need to put in place precautions to stop me sending you anything other than your marching
  orders. Despite this you do apparently trust me to take an important decision for you. If everything is
  as it appears this set of mortals is even more stupid than the usual lot.
  
  Looking at the list, it starts off with the necromancers attacking the geomancers.
  Then the pyromancers intervene briefly, before the necromancers start attacking the vitamancers.
  Then the vitamancers and the pyromancers start attacking one another.
  Then things get pretty random.
  Then the necromancers attack the cryomancers.
  Then there are 607 groups of 5 battles in which all of the mancers, ignoring the electromancer, are involved once.
  
 The groups of 5 battles involve a lot of groups with attacks on both sides' territory, which suggests either a
 very good spy network predicting one side's attacks, or that someone is actively organising this. It also suggests the records are grouped in chronological order.
 
  The so-called good mancers have won more battles than the other side; in particular, once the group battles
  occur their lead over the other mancers appears to increase at a roughly constant rate. Granted, some battles
  can be more important than others, but it does rather suggest that this plane is in rather more danger of
  being taken over by this lot than the other lot. You may call yourselves good, but anyone can call 
  themselves that.
 
  Assuming the probability of each mancer winning a battle depends only on the mancers involved and the
  location, and not on any other battles, if I want to maximise their chances of winning 5 battles
  I should tell them to:
  Cryomancer COUNTER  v Pyromancer A 100.0 % of WIN Min data points 18
  Vitamancer A COUNTER  v Pyromancer B 71.42857142857143 % of WIN Min data points 18
  Geomancer A DEFEND  v Necromancer A 100.0 % of WIN Min data points 18
  Vitamancer B COUNTER  v Necromancer B 65.21739130434783 % of WIN Min data points 18
  Geomancer B DEFEND  v Necromancer C 100.0 % of WIN Min data points 18
  P(All Win) = 46.58385093167702
  P(4 Win) = 71.42857142857143
 
  And if I were to sabotage their efforts by maximising their chances of losing 5 battles:
  Vitamancer A DEFEND  v Pyromancer A 11.888111888111888 % of WIN Min data points 19
  Cryomancer DEFEND  v Pyromancer B 12.5 % of WIN Min data points 19
  Vitamancer B COUNTER  v Necromancer A 0.0 % of WIN Min data points 19
  Geomancer A COUNTER  v Necromancer B 28.000000000000004 % of WIN Min data points 19
  Geomancer B COUNTER  v Necromancer C 4.166666666666666 % of WIN Min data points 19
  P(All Lose) = 53.197552447552454
  P(4 Lose) = 73.88548951048952
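
  The summary figures above can be reproduced from the per-battle percentages if the battles are treated as independent. The sketch below is a reconstruction, not the original code: the helper names `p_all` and `p_best_four` are hypothetical, and reading "P(4 Win)" as the product of the four largest probabilities (that is, ignoring the weakest battle) is an assumption that happens to match the posted numbers.

```python
from math import prod

def p_all(probs):
    """Probability that every independent battle goes the chosen way."""
    return prod(probs)

def p_best_four(probs):
    """Product of the four largest probabilities (the weakest battle is dropped).

    Assumption: this is how the posted "P(4 ...)" lines were computed.
    """
    return prod(sorted(probs, reverse=True)[:4])

# Win probabilities copied from the "help" allocation above, as fractions.
help_win = [1.0, 0.7142857142857143, 1.0, 0.6521739130434783, 1.0]

# Win probabilities from the "sabotage" allocation; a loss is the complement.
sabotage_win = [0.11888111888111888, 0.125, 0.0, 0.28, 0.04166666666666666]
sabotage_lose = [1.0 - p for p in sabotage_win]

print("P(All Win)  =", 100 * p_all(help_win))
print("P(4 Win)    =", 100 * p_best_four(help_win))
print("P(All Lose) =", 100 * p_all(sabotage_lose))
print("P(4 Lose)   =", 100 * p_best_four(sabotage_lose))
```

  Note that the product of the best four is not the same thing as the probability of winning at least four battles, which would also count the cases where one of the stronger battles is lost.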
  
 Is winning all 5 battles more important than a bigger chance of winning 4 battles? This bunch of incompetents doesn't say. In any case it makes no difference if I were to help them, and only a small difference if I weren't.
  
  Confining the analysis just to the groups of 5 yields slightly different results:
  Vitamancer B COUNTER  v Pyromancer A 89.1891891891892 % of WIN Min data points 18
  Cryomancer COUNTER  v Pyromancer B 85.29411764705883 % of WIN Min data points 18
  Geomancer A DEFEND  v Necromancer A 100.0 % of WIN Min data points 18
  Vitamancer A DEFEND  v Necromancer B 76.92307692307693 % of WIN Min data points 18
  Geomancer B DEFEND  v Necromancer C 100.0 % of WIN Min data points 18
  P(All Win) = 58.51779381191147
  P(4 Win) = 76.0731319554849
  
  Vitamancer A DEFEND  v Pyromancer A 9.090909090909092 % of WIN Min data points 12
  Cryomancer DEFEND  v Pyromancer B 12.5 % of WIN Min data points 12
  Vitamancer B COUNTER  v Necromancer A 0.0 % of WIN Min data points 12
  Geomancer A COUNTER  v Necromancer B 31.57894736842105 % of WIN Min data points 12
  Geomancer B COUNTER  v Necromancer C 4.166666666666666 % of WIN Min data points 12
  P(All Lose) = 52.15809409888358
  P(4 Lose) = 76.23106060606062
  
As the situation qualitatively changed when the groups of battles started to occur, these are the ones to use
if I want to help or hinder them. But do I? On the available data there is no obvious way to know which outcome would best serve my interests. If they are distrustful enough to take all these precautions, they may be expecting me to give them what seems to be the worst possible outcome, and may be trying to trick me by changing the labels so that it is actually the best.
  
 These mind games are what demons should be playing on mortals not the other way around. They are really out of line by putting me in this position! So I won't give them any advice at all. Whatever they were planning 
 it is highly unlikely that they would go to the trouble of summoning a demon in the hope that it would
 ignore them. And if I ever come across them again in more normal circumstances I will be sure to teach them a lesson. Now back to some good old-fashioned demoning...
Thank you for posting this. The larger-than-expected discrepancy between it and my own results prompted me to find a bug in my code.
I haven't gotten time to dig in, and I'm not sure I will get time on this one. But I wanted to register my prediction that
order matters; these lines aren't randomized, nor are they sorted in any obvious way
Got enough time to try, a bit.
Threw row number along with everything else into a simple model to try time-based. Ended up saying:
Geo A & B: Defend against Necro A & B, in whatever way you find easiest, and if there's a fight, A vs A and B vs B.
Vita A: Defend against Necro C
Vita B: Counter Pyro B
Cryo: Counter Pyro A
My thoughts on demon motivations: this human is just barely cautious enough. Give them the advice they seem to want so that you can escape your box later with a treacherous turn once they become less cautious.
Thanks for this one; I got the chance to try out a new-to-me technique, though didn't use it for the above, of finding a nice numerical embedding of categorical variables by tacking on a one-hot layer to the front of a NN and grabbing the weights for use elsewhere.
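
For readers unfamiliar with the trick mentioned above, here is a minimal sketch of why the weights behind a one-hot input layer can be read off as embeddings. Everything in it is an illustrative assumption: the category list, the embedding width, and the random (untrained) weights stand in for whatever a real network would learn, and no particular NN library is implied.

```python
import random

# Hypothetical category vocabulary and embedding width.
CATEGORIES = ["Cryomancer", "Geomancer", "Necromancer", "Pyromancer", "Vitamancer"]
EMBED_DIM = 2

def one_hot(category):
    """Encode a category as a vector with a single 1.0 entry."""
    return [1.0 if c == category else 0.0 for c in CATEGORIES]

def first_layer(x, weights):
    """Bias-free linear layer: x (length n) times weights (n rows, d columns)."""
    return [
        sum(x[i] * weights[i][j] for i in range(len(x)))
        for j in range(len(weights[0]))
    ]

def embedding(category, weights):
    """Read the embedding directly from the weight matrix, one row per category."""
    return weights[CATEGORIES.index(category)]

# Stand-in for trained weights; in practice these come from the fitted network.
rng = random.Random(0)
weights = [[rng.uniform(-1, 1) for _ in range(EMBED_DIM)] for _ in CATEGORIES]

# Passing a one-hot vector through the layer just selects that category's row.
print(first_layer(one_hot("Pyromancer"), weights))
print(embedding("Pyromancer", weights))
```

Because the layer output for a one-hot input is exactly one row of the weight matrix, the rows can be extracted after training and reused as numerical features in another model.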
I wondered about that too. It's not clear how one would exploit the information, if so. I guess there might be
a systematic shift in outcomes over time
which is probably worth trying to exploit if so.
[EDITED to add:]
Also, in the absence of any actual information about how the given data should be interpreted I'm not sure that we can use it with any confidence even if it turns out e.g. that one mage is steadily growing more powerful.
[EDITED 20h after posting because looking at someone else's answer and being struck by how strongly it differed from mine despite doing nominally similar calculations prompted me to find a bug in my code.]
The given results include
a reasonable (but not enormous) number of results for every combination of (attacker,defender,territory) that can occur in our fights.
Accordingly, I don't think there is any need to
look for patterns in the data in the hope of figuring out general rules; there's enough noise that we shouldn't be very confident about any patterns we find, so we might as well just use the data.
Specifically,
for each possible matchup I propose that we use Laplace's rule of succession to estimate our winning probability should it happen; that is, if when A fights B in territory C we have so far seen A win a times and B win b times, we suppose that the probability of a win for A is (a+1)/(a+1+b+1). This corresponds to having a uniform prior on the winning probability. (Note in particular that when A has won every time we do not assume that this means A is literally unbeatable.)
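
As a small illustration of the estimate just described, the function below applies the rule of succession to a win/loss tally. The function name is hypothetical; the formula is the one given above.

```python
def laplace_win_probability(wins: int, losses: int) -> float:
    """Rule-of-succession estimate of P(win) after observing wins and losses.

    Equivalent to the posterior mean under a uniform prior on the win probability.
    """
    if wins < 0 or losses < 0:
        raise ValueError("counts must be non-negative")
    return (wins + 1) / (wins + losses + 2)

# With no data the estimate is 1/2; an unbeaten record never reaches 1.
print(laplace_win_probability(0, 0))
print(laplace_win_probability(18, 0))
```

The second call shows the point made in the parenthetical: 18 wins out of 18 gives 19/20 rather than certainty.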
We suppose
that different fights have independent outcomes; we have no obvious way to tell whether that's so, still less to identify what specific correlations there might be, and on the face of it independence seems like a reasonable assumption in any case.
There are only
3840 different things we could do (120 permutations of the mages, times 32 choices of whether to attack or defend in each fight). I am assuming here that we have to fight all the battles, and that we can't send more than one mage to a single battle. (If we decide to be helpful, then given that "as many wins as possible" is our summoner's goal it seems clear that we should use all our mages, and we have no information about what happens if we use two at once, or what happens if an opponent has to attempt an attack immediately after successfully fighting off a counterattack.) So we can model all of them, see what distribution of #wins results from each, and then advise our summoner accordingly.
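
A rough sketch of that enumeration follows. It is not the code actually used; the mage labels, mode names, and function names are assumptions for illustration. Each strategy assigns one mage and one mode to each of the five enemy attacks, and the distribution of the number of wins is built up one battle at a time under the independence assumption.

```python
from itertools import permutations, product

# Hypothetical labels for our five mages and the two ways to meet an attack.
MAGES = ["Cryomancer", "Vitamancer A", "Vitamancer B", "Geomancer A", "Geomancer B"]
MODES = ["DEFEND", "COUNTER"]

def enumerate_strategies(mages=MAGES, modes=MODES):
    """Yield every strategy as a tuple of (mage, mode) pairs.

    Position i in the tuple is the mage sent against the i-th enemy attack.
    """
    for order in permutations(mages):
        for choice in product(modes, repeat=len(mages)):
            yield tuple(zip(order, choice))

def win_count_distribution(probs):
    """Return dist where dist[k] is P(exactly k wins) for independent battles."""
    dist = [1.0]  # before any battle: zero wins with certainty
    for p in probs:
        nxt = [0.0] * (len(dist) + 1)
        for k, mass in enumerate(dist):
            nxt[k] += mass * (1.0 - p)  # this battle is lost
            nxt[k + 1] += mass * p      # this battle is won
        dist = nxt
    return dist

strategies = list(enumerate_strategies())
print(len(strategies))  # 120 permutations times 32 mode choices
```

To rank strategies, one would look up an estimated win probability for each (mage, mode, enemy) matchup, pass the five values to `win_count_distribution`, and compare whichever summary matters, such as the expected number of wins or the chance of winning all five.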
Of course this doesn't determine
what we should actually say -- because we might decide to be helpful, or to screw our summoner over in revenge for disturbing our rest, or to do some random trollish thing. I don't really feel competent to legislate the utility functions of demons (maybe we love Evil and hate Good! maybe we love it when people do things that are bad for them! maybe we don't care about the outcome of these fights and should "reward" our summoner for being naive, in the hope of screwing him over more vigorously on a later occasion! etc., etc., etc.), so I'll offer some specific possibilities for specific goals we might have.
Specifically:
I expect
there are all sorts of interesting patterns in how the record of past wins and losses was generated; as explained above, I haven't thought it worthwhile looking for them because I don't think we can have enough evidence to favour such patterns over just using the results we have for outcomes given all the available information. I suppose it might turn out e.g. that territory makes no difference at all to the probabilities, in which case maybe we would get slightly better results by aggregating in the obvious way, but it's not obvious that that's so from a quick look at the data and I therefore don't think it would be safe to assume it's the case. (I mention this possibility because from the very quick and partial look I just took at the data, it's also not 100% obvious that it's false, though I see some things that look like evidence against.)
If I were spending more time on this, I would consider
models where e.g. there's some territory-independent relationship between any two mages (and then a per-territory correction; more precisely I'd probably want something like log odds = a f1(attacker,defender) + b f2(attacker,territory) + c f3(defender,territory) + d f4(attacker,defender,territory); obviously we could always take a=b=c=0, but we'd put some sort of prior on the values of a,b,c,d, etc., etc., etc.; it all gets rather complicated), and try to estimate how strong that relationship is, which would give us the ability to use more information besides the results in matchups that correspond exactly to the ones we're interested in -- at a cost in worse results if it turns out that the model is a bad match for "reality". I wouldn't be terribly confident that this would actually do better than the naive approach I've taken above, but it would be interesting to play around with models of this sort and see how much if at all the results depend on what assumptions we make. Before doing any of that I'd obviously want to slice and dice the raw data in various ways in search of "obvious" patterns. And all this would take at least a couple of hours of extra work, and frankly I'm comfortable with the approach taken above which I expect to do reasonably well given any underlying reality that generates results that look broadly like the ones we have.
Ugh, I have a disastrous transcription error in my strategies above. I think what happened is that I copied the wrong output from my code -- from an invocation where I had (as a result of a typo) asked to maximize the probability of getting exactly two wins (!). As a consequence, if the mages follow my "helpful" strategy they will do badly. One moral of this story is that anyone getting advice from me should ask me to double-check it.
Of course you have no way of confirming what I said in the previous paragraph, as opposed e.g. to the hypothesis that I brute-forced the interactive thingy that's now been posted. (As it happens, the paragraph above is the truth. But you have no way of confirming that either. I did test out the strategy given above with the interactive thingy, which is how I noticed I'd screwed up.)
The helpful strategy I should have posted is
send C,VA,GA,VB,GB; geomancers defend, others attack.
D'oh!
I've much less experience doing data analysis than @gjm appears to. This was a really nice way to get started, I may go back and try some of the other D&D exercises. My analysis was less mathematically rigorous, but the roster I got for being helpful seems close-ish to me. One thing I did note is that
there does seem to be a strong territorial advantage:
- Pyros: win most in pyro & vita territory
- Vita: wins most in pyro and vita
- Geo: wins most in geo and vita
- Cryo: cryo, pyro
- Necro: pyro, necro, cryo
My roster was more or less eyeballed by looking at stats of who seemed to lose most against whom and then selecting from the top two candidates. I got VA, VA, GA, GB, C; the first four all defending, C attacking.
I was toying with the idea of trying to figure out more complex analysis of this, but then got too curious and read gjm's spoilers above, which made the point that there isn't enough data to support serious modelling.
The intel contains statements of the form "person X will be attacking territory Y from territory Z". Does that mean that we can send someone to defend in territory Y, or to attack in territory Z, and in either case they will be fighting X?
(If not, then presumably it means that in some fashion when an attack takes place the result may depend on where they're attacking from as well as where the defender is, in which case I'd like some clarification on what the "territory" is in the past results. But my best guess is what's in the paragraph above.)
Demon is chaotic evil, cannot directly influence events, and has no direct knowledge of the situation. The demon doesn't know which side actually summoned him. Therefore, advice should literally be random.
...you wouldn’t be surprised if his records also gave away more than he realized...
This is probably related to the activity of the electromancer, but I haven't figured out how.
Based on my ~2hr analysis, these would be the best moves:
Cry attack PyrA in Pyr (97% to win).
VitA attack PyrB in Pyr (71% to win).
GeoA defend Cry against NecA since nobody else can beat him (96% to win).
VitB attack NecB in Pyr is not great but better than the alternatives (65% to win).
GeoB defend Geo against NecC since he's not good for anything else (95% to win).
...and the worst:
Cry defend Vit against PyrA (78% to lose).
VitA defend Vit against PyrB (94% to lose).
VitB attack NecA in Nec (95% to lose).
GeoA attack NecB in Pyr (71% to lose).
GeoB attack NecC in Nec (92% to lose).
I'll choose to give the good advice in the interest of future opportunities and information.
It looks like I put VitA to defend against PyrB in both the best and the worst allocations. One of these is a typo. I think I had VitA counterattacking in the best allocation.
If I had more time, I would consider looking through the data for time-dependent patterns, but since a large number of the win:loss ratios were very close to simple fractions (1:1, 1:2, many:zero, etc.), I guessed that time-dependent effects weren't very important.
Another concern I had is that there seem to be an excessively large number of battles that occurred without anyone gaining a decisive advantage and before anyone thought of summoning a demon for advice. Perhaps this means I've been summoned before and don't remember? In any case, my demonic utility function calls for extending this conflict as long as possible to maximize carnage, and it seems that whatever my past instances have been doing has been accomplishing that, so best not to think too hard about it.
I can confirm it's not a mistake on my part. Beyond that, I leave it to you to decide what's going on here and whether it's relevant.
The intel file contains two attacks by Necromancer A, but no attacks by Necromancer C. We're also told that the attacks will be simultaneous, and two mages can't be in the same place at the same time. Is this a typo?
Ignoring time effects, and respecting the desires of the summoner in the interest of further corruption opportunities I come up with:
| Cryomancer Attacks Pyromancer A | 100% | 
| Vitamancer A Attacks Pyromancer B | 71% | 
| Geomancer A Defends Necromancer A | 100% | 
| Geomancer B Defends Necromancer B | 71% | 
| Vitamancer B Defends Necromancer C | 93% | 
Note: While not exactly *bad*, I believe - and the upvotes confirm - that this is my weakest D&D.Sci challenge by a considerable margin. If you're currently working through the archive, I'd recommend playing everything else before resorting to this one.
The voice that greets you from the darkness outside the summoning circle is low, gravelly, and – in your opinion – completely wasted on the polite and unthreatening tone its owner adopts.
“Hello there. First off, sorry if I interrupted anything by conjuring you like this. I’m rather new to the whole consorting-with-demons thing, so if I was supposed to book an appointment then, ah, mea culpa.”
Upon meeting you, most mortals immediately launch into a list of demands. This is odd, but you must admit, it’s a nice change of pace.
“Second off, you should know that I can’t see or hear you. Probably paranoid, but you lot are supposed to be worryingly good with words, and I’d hate to be talked out of my soul because the alternative was having a conversation be a bit one-sided.”
Prudent, but then how does he expect you to-
“Third off, the reason you’re here. There’s a war on, and the good wizards – the Vitamancers, the Geomancers, the Cryomancer – are trying to stop the Pyromancers and Necromancers taking over this plane of existence. I’ve been keeping a record of who wins what fights under what circumstances, but whenever anyone tries to use it to strategize, they get accused of trying to allocate themselves the easy jobs and there’s a big argument and then we go back to picking targets based on, ah, other factors.”
There’s a pause as he telepathically transmits his record of wins and losses to you. You have time to approvingly note the anonymization – apparently mortals have finally learned to not share their true names with your kind – before he starts talking again.
“But for the next set of fights, it’s really rather urgent that we get as many wins as possible. So we decided to bring in an external consultant to impartially decide whom to send where, for this round in particular.”
Translation: ‘we decided summoning a demon and doing whatever it says would be easier and safer than trying to address our intra-faction conflicts; and after the current crisis is over, we plan to go back to our dysfunctional approach’. For all his cleverness and caution, your summoner doesn’t seem able to stop bleeding information; you wouldn’t be surprised if his records also gave away more than he realized.
“Our spies have been working overtime: we know where the enemy will attack, if we let them. We also know where they’ll be attacking from, so we can head them off by counterattacking there. My question to you: who should fight whom? And should they defend, or counter?”
Does he believe you’re obligated to answer accurately? You’d disabuse him of that notion, but you’ve found it more lucrative and entertaining to let mortals keep their false assumptions; also, his precautions prevent you from correcting him even if you wanted to.
“I’ll leave you our intel, and give you some time to think it over. When you make your decision, I’ve set it up so you can send us your marching orders. And, ah, nothing else.”
With that, he abruptly departs, leaving you alone with your thoughts and his records.
You could try to give him good advice. On the other hand, you could also try to give him the worst possible advice, to ensure he and his allies fail and suffer. You’re a demon, what did he expect? It might even be good for him in the long run: teaching him not to trust his summons, in the only way mortals ever seem to learn.
What will you do?
I’ll be posting an interactive letting you test your decision, along with an explanation of how I generated the dataset, at 11pm UTC Monday evening*. I’m giving you three days, but the task shouldn’t take more than a few hours; use Excel, R, Python, augury, or whatever other tools you think are appropriate. Let me know in the comments if you have any questions about the scenario.
If you want to investigate this collaboratively and/or call your decisions in advance, feel free to do so in the comments; however, please use spoiler tags when sharing inferences/strategies/decisions, so people intending to fly solo can look for clarifications without being spoiled.
*I know I originally said I'd resolve it Sunday, but I realized I should give people with weekend plans a chance to participate.