First to superintelligence wins
This phrasing seems ambiguous between the claims "the first agent to BE superintelligent wins" and "the first agent to CREATE something superintelligent wins".
This distinction might be pretty important to your strategy.
Thoughts that occurred...
This reminds me of "Social status part 1/2: negotiations over object-level preferences", particularly because of your comment that Japan might develop a standard of greater subtlety because they can predict each other better.
Among other points in the essay, they have a model of "pushiness" where people can be more direct/forceful in a negotiation (e.g. discussing where to eat) to try to take more control over the outcome, or more subtle/indirect to take less control.
They suggest that if two people are both trying to get more control, they can end up escalating until they're shouting at each other, but that it's actually more common for both people to be trying to get less control: the reputational penalty for being too domineering is often bigger than whatever's at stake in the current negotiation, so people try to be a little more accommodating than necessary, to be "on the safe side". This results in people spiraling into indirection until they can no longer understand each other.
They suggested that more homogenized cultures can spiral farther into indirection because people understand each other better, while more diverse cultures are forced to stop sooner because they have more misunderstandings, and so e.g. the melting-pot USA ends up being more blunt than Japan.
They also suggested that "ask culture" and "guess culture" can be thought of as different expectations about what point on the blunt/subtle scale is "normal". The same words, spoken in ask culture, could be a bid for a small amount of control, but when spoken in guess culture, could be a bid for a large amount of control.
I'm quite glad to be reminded of that essay in this context, since it provides a competing explanation of how ask/guess culture can be thought of as different amounts of a single thing, rather than two fundamentally different things. I'll have to do some thinking about how these two models might complement or clash with each other, and how much I ought to believe each of them where they differ.
The paper actually includes a second experiment where they had observers watch a video recording of a conversation and say whether they thought the person on the video was flirting. Results in table 4, page 15; copied below, but there doesn't seem to be a way to format them as a table in a LessWrong comment:
Observer | Target | Condition | Accuracy (n)
Female | Female | Flirting | 51% (187)
Female | Female | Non-flirting | 67% (368)
Female | Male | Flirting | 22% (170)
Female | Male | Non-flirting | 64% (385)
Male | Female | Flirting | 43% (76)
Male | Female | Non-flirting | 68% (149)
Male | Male | Flirting | 33% (64)
Male | Male | Non-flirting | 62% (158)
Among third-party observers, females observing females had the highest accuracy, though even they perceived flirting only 18 percentage points more often when it was actually happening than when it wasn't (51% vs. 100% − 67% = 33%).
Third-party observers in all categories had a larger bias towards perceiving flirting than the people who were actually in the conversation, though this experimental setup also had a larger percentage of people actually flirting, so that bias was reasonably well calibrated to the data they were shown.
Though, again, this study looks shoddy and should be taken with a lot of salt.
I'm confused by the study you cited. It seems to say that 14 females self-reported as flirting and that "18% (n = 2)" of their partners correctly believed they were flirting, but 2/14 ≈ 14% and 3/14 ≈ 21%. To get 18% of 14, about 2.5 partners would have to be right. Maybe someone said "I don't know" and that was counted as half-correct? If so, that wasn't mentioned in the procedure section.
It also says that 11 males self-reported as flirting, and lists accuracy as "36% (n = 5)", but 5/11 would be 45%; an accuracy of 36% corresponds to 4/11.
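For concreteness, here's the arithmetic redone as a throwaway Python sketch, using only the numbers quoted above (the labels and variable names are mine, not the paper's):

```python
# Check whether each reported accuracy percentage matches the reported count
# and the number of self-reported flirters. Numbers are the ones quoted above.
reports = [
    # (label, self-reported flirters, reported correct count, reported accuracy %)
    ("female flirters", 14, 2, 18),
    ("male flirters", 11, 5, 36),
]

for label, total, count, reported_pct in reports:
    implied_pct = 100 * count / total           # what the count implies
    implied_count = reported_pct / 100 * total  # what the percentage implies
    print(f"{label}: {count}/{total} = {implied_pct:.0f}% "
          f"(paper reports {reported_pct}%); "
          f"{reported_pct}% of {total} = {implied_count:.1f} people")
```

Neither pair is internally consistent, which is the mismatch described above.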
I don't think I trust this paper's numbers.
If we were to take the numbers at face value, though, the paper is effectively saying that female flirting is invisible: 18% correctly believed the girls were flirting when they were, but 17% believed they were flirting even when they weren't, and with only 14 girls flirting, a 1-percentage-point difference is a rounding error. So this is saying that actual female flirting has zero effect on whether their partners perceive them as flirting.
Agree that other players having tools, social connections, and intelligence in general all make it much harder to judge when you have the advantage. But I don't see how this answers the question of "why create underdog bias instead of just increasing the threshold required to attack?"
Strong disagree on the ancient world being zero-sum. A lion eating an antelope harms the antelope far more than it helps the lion. Thog murdering Mog to steal Mog's meal harms Mog far more than it helps Thog. I think very little in nature is zero-sum.
Seems weird to posit that evolution performed a hack to undermine an instinct that was, itself, evolved. If getting into conflicts that you think you can win is actually bad, why did that instinct evolve in the first place? And if it's not bad, why did evolution need to undermine it in such a general-purpose way?
I can imagine a story along the lines of "it's good to get into conflicts when you have a large advantage but not when you have a small advantage", but is that really so hard to program directly that it's better to deliberately screw up your model of advantage just so that the rule can be simplified to "attack when you have any advantage"? Accurate assessment seems pretty valuable, and evolution seems to have created behaviors much more complicated than "attack when you have a large advantage".
I agree that humans aren't very good at reasoning about how other players will react and how this should affect their own strategy, but I don't think that explains why they would have evolved one particular strategy that falls short of this rather than some other strategy that also falls short of it.
(Also, I don't think Risk is a very good example of this. It's a zero-sum game, so it's mostly showing relative ability, not absolute ability. The game is also far removed from the ancestral environment and sends you a lot of fake signals (the strategies appropriate to the story the game is telling are mostly not appropriate to the abstract rules the game actually runs on), so it seems unsurprising to me that humans would tend to be bad at predicting the behavior of other humans in this context. The rules are simple, but that's not the kind of simplicity that would make me expect humans-without-relevant-experience to make good predictions about how things will play out.)
A combination of the ideas in "binary search through spacetime" and "also look at your data":
If you know a previous time when the code worked, rather than starting your binary search at the halfway point between then and now, it is sometimes useful to begin by going ALL the way back to when it previously worked, and verifying that it does, in fact, work at that point.
This tests a couple of things: that your memory of when the code worked is accurate, and that your restore process actually works.
If the bug still happens after you've restored to the "known working point", then you'll want to figure out why that is before continuing your binary search.
I don't always do this step. It depends how confident I am about when it worked, how confident I am in my restore process, and how mysterious the bug seems. Sometimes I skip this step initially, but then go back and do it if diagnosing the bug proves harder than expected.
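To make the workflow concrete, here's a rough Python sketch (purely illustrative; `checkout_revision` and `bug_reproduces` are hypothetical stand-ins for whatever your actual restore step and repro test are):

```python
def find_first_bad(revisions, checkout_revision, bug_reproduces):
    """Binary search for the first revision where the bug appears.

    `revisions` is ordered oldest-to-newest, with revisions[0] believed good
    and revisions[-1] known bad. `checkout_revision` and `bug_reproduces` are
    placeholders for your own restore step and repro test.
    """
    # Step 0 (the optional check described above): verify that the supposed
    # "known working point" actually works before searching.
    checkout_revision(revisions[0])
    if bug_reproduces():
        raise RuntimeError(
            "Bug reproduces even at the known-good revision; "
            "figure out why before continuing the binary search."
        )

    # Ordinary bisection: maintain the invariant that revisions[lo] is good
    # and revisions[hi] is bad, and narrow the gap to a single step.
    lo, hi = 0, len(revisions) - 1
    while hi - lo > 1:
        mid = (lo + hi) // 2
        checkout_revision(revisions[mid])
        if bug_reproduces():
            hi = mid
        else:
            lo = mid
    return revisions[hi]  # the first revision where the bug shows up
```

Everything after the step-0 check is the usual binary search; the check is just up-front verification that the assumptions behind it (your memory of when it worked, and your restore process) actually hold.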
Guess we're done, then.
Aren't speculative arguments about reality normally shelved as nonfiction?