A few responses:
Wanted to revisit this because it seemed like one of the points where people most strongly disagreed with me. I'm trying to figure out a crux here. One might be something like: how widespread were celebrations of the 10/7 attacks amongst prominent leftists (and especially the student groups that later organized encampments)? I could imagine updating that there were only a handful of cases that disproportionately blew up, which would make me take back or at least caveat the "supports Hamas" thing.
If, on the other hand, you found that there were many cases where prominent leftists and encampment organizers actively celebrated the 10/7 attacks, would you (@dirk or @lc) then agree that "supports Hamas" is a reasonable summary?
That's a fair point; however, I don't think it undermines my overall claims very much. I think the lack of rule of law for black Americans was bad in a comparable way to how the lack of rule of law for various European colonies was bad. That is, while it was bad for the people who didn't get rule of law, they were a separate enough category that this mostly didn't "leak into" undermining the legal mechanisms that helped their societies become productive and functional in the first place.
This seems very relevant, thank you. Will check it out.
That is very cool, thank you! Will check it out.
A next step is to settle on a model of what you want to get done, and what capabilities the adversaries have.
Perhaps. The issue here is that I'm not so interested in any specific goal, but rather in facilitating emergent complexity. One analogy here is designing Conway's game of life: I expect that it wasn't just a process of "pick the rules you want, then see what results from those", but also in part "pick the results you want, then see what rules lead to those".
Re the Byzantine generals problem, see my reply to niplav below:
I believe (please correct me if I'm wrong) that Byzantine fault tolerance mostly thinks about cases where the nodes give separate outputs—e.g. in the Byzantine generals problem, the "output" of each node is whether it attacks or retreats. But I'm interested in cases where the nodes need to end up producing a "synthesis" output—i.e. there's a single output channel under joint control.
Yeah, "incomplete" is the wrong word here. I edited shortly after posting to say instead "there are many elements of good chess strategy for bounded agents that they can’t account for".
In a post on Solomonoff Induction (and also in this wiki entry), Yudkowsky describes Shannon’s minimax algorithm for searching the entire chess game-tree as an example of conceptual progress. Previously, Edgar Allan Poe had argued that it was impossible in principle for a machine to play chess well. With Shannon’s algorithm, it became possible in principle, just computationally infeasible.
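For concreteness, here’s a minimal sketch of that in-principle algorithm: exhaustive minimax over a generic game interface. The interface names and the toy take-1-or-2-stones game below are my own illustrative stand-ins, since chess itself is far too big to search this way:

```python
def minimax(state, maximizing, moves, apply_move, score, is_terminal):
    # Exhaustive minimax: evaluate the entire game tree below `state`.
    # In-principle correct for chess, but infeasible there, since the
    # chess game-tree has on the order of 10^120 positions.
    if is_terminal(state):
        return score(state, maximizing)
    values = [minimax(apply_move(state, m), not maximizing,
                      moves, apply_move, score, is_terminal)
              for m in moves(state)]
    return max(values) if maximizing else min(values)

# Toy stand-in for chess: take 1 or 2 stones; whoever takes the last stone wins.
def first_player_value(n):
    return minimax(
        state=n, maximizing=True,
        moves=lambda s: [1, 2] if s >= 2 else [1],
        apply_move=lambda s, m: s - m,
        # At an empty pile, the player to move has lost (opponent took the last stone).
        score=lambda s, maximizing: -1 if maximizing else +1,
        is_terminal=lambda s: s == 0,
    )
```

With perfect play, `first_player_value(n)` comes out to -1 exactly when n is a multiple of 3 — the algorithm “solves” the toy game the way Shannon’s would solve chess, given the compute.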
However, even “principled” algorithms like minimax search don’t take into account the possibility that your opponent knows things you don’t know (as I’ve previously discussed here). And so there are many elements of good chess strategy for bounded agents that they can’t account for:
Some of these considerations have been discussed by Yudkowsky under the heading of “Vingean uncertainty” (though as per my post linked above I’ll talk about the more general concept of Knightian uncertainty). And I’m sure that they’ve been acknowledged elsewhere as well—though I suspect that their importance is underrated because we don’t have good technical language for describing them. To build relevant intuitions, try playing a game against LeelaPieceOdds, a chess engine designed to beat humans even when it starts down a piece (or several). Playing it is very frustrating because you constantly have the sense that you should be winning, but you don’t know how to actually make use of your advantage. I’m interested in figuring out a theory of how to do so, as a step towards building a more general theory of how to behave under Knightian uncertainty.
Here’s one approach. For each position, estimate the value of that position conditional on reaching it in your game (which I’ll call the Knightian value of the position). That is, suppose you’re playing white, and you’re evaluating a possible move W1 and a possible response B1. Then your evaluation of the position after B1 should not only incorporate your “naive” estimate of how good the position is, but also account for the fact that your opponent has deliberately chosen to play B1 knowing that it’d lead to that state.
According to Claude, this is analogous to a simple version of the Glosten-Milgrom model of bid-ask spreads on stock markets: Ask = E[Value | Buy order arrives], and Bid = E[Value | Sell order arrives]. How can you calculate those values? I’m not sure how financiers actually do it, but my immediate intuition is that you’d need a model of the opportunity cost of not investing in other things, and a model of how well-informed and “rational” other traders are, and then you’d need to somehow combine them to get an overall update.
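Here’s a toy numeric version of that conditional-expectation calculation. The two-value model, the informed-vs-noise trader split, and all the numbers are my own simplifying assumptions, not how market-makers actually quote:

```python
def glosten_milgrom_quotes(v_low, v_high, p_high, alpha):
    # Toy Glosten-Milgrom: the asset is worth v_low or v_high, with prior
    # p_high on v_high. A fraction `alpha` of traders are informed (they buy
    # iff the value is high, sell iff low); the rest are noise traders who
    # buy or sell 50/50 regardless of value.
    noise = (1 - alpha) / 2
    p_buy_given_high = alpha + noise   # informed buyers plus noise buyers
    p_buy_given_low = noise            # only noise traders buy when value is low
    p_buy = p_high * p_buy_given_high + (1 - p_high) * p_buy_given_low
    p_high_given_buy = p_high * p_buy_given_high / p_buy
    p_sell = 1 - p_buy
    p_high_given_sell = p_high * noise / p_sell
    ask = v_low + (v_high - v_low) * p_high_given_buy   # E[Value | buy order]
    bid = v_low + (v_high - v_low) * p_high_given_sell  # E[Value | sell order]
    return bid, ask
```

With a 50/50 prior over values 90 and 110 and 40% informed traders, this gives bid 96 and ask 104: each quote is the expected value after updating on the information revealed by the order itself, which is exactly the structure of the chess problem.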
The chess case is simpler in the sense that there’s only one adversary. But unlike in finance, there’s no “market rate” of opportunity cost. Instead, the Knightian value of every move will depend sensitively on how good the alternative moves available were. E.g. if I play a seemingly-pointless queen sacrifice when I have many other good moves available, then you should strongly suspect that it’s a trap. Whereas if I do the same in a position that is otherwise lost, you shouldn’t update nearly as much.
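One toy way to make that intuition quantitative (the softmax opponent model, the skill parameter, and all the numbers below are my own illustrative assumptions, not a claim about how this should ultimately be done): treat the opponent as choosing moves with probability proportional to exp(skill × value-to-them), then do a Bayesian update on whether a surprising sacrifice is a trap:

```python
import math

def p_trap(prior_trap, alt_values, sac_value_if_trap, sac_value_if_blunder, beta=5.0):
    # The opponent picks among moves with probability proportional to
    # exp(beta * value-to-them); `beta` encodes their assumed skill.
    # We observe them play a seemingly-bad "sacrifice" and update on
    # whether it's actually a trap. All values are from their perspective.
    def p_choose_sacrifice(sac_value):
        weights = [math.exp(beta * v) for v in alt_values] + [math.exp(beta * sac_value)]
        return weights[-1] / sum(weights)
    like_trap = p_choose_sacrifice(sac_value_if_trap)
    like_blunder = p_choose_sacrifice(sac_value_if_blunder)
    return (prior_trap * like_trap /
            (prior_trap * like_trap + (1 - prior_trap) * like_blunder))

# Same sacrifice, same 10% prior that it's a trap -- only the alternatives differ:
trap_given_good_alts = p_trap(0.1, alt_values=[0.6, 0.5],
                              sac_value_if_trap=0.8, sac_value_if_blunder=-0.7)
trap_given_lost_position = p_trap(0.1, alt_values=[-0.8, -0.9],
                                  sac_value_if_trap=0.8, sac_value_if_blunder=-0.7)
```

With these numbers, the identical sacrifice pushes the trap probability from 10% up to roughly 99% when the opponent had good alternatives, but only to roughly 18% when their position was already lost.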
Unfortunately, this makes things much more complicated. In order to estimate the “naive” value of a move, we need to look at all the moves downstream of it. But estimating the Knightian value also requires evaluating the naive value of all the moves upstream of it. Basically, every Knightian estimate depends on every other part of the game-tree. Now, we might still be able to find an algorithm which does this given enough compute—just like Shannon took chess from unsolved in principle to solved given enormously implausible amounts of compute. But this seems like it would defeat the point of updating on our opponent’s intentions at all—because knowing the entire game-tree should “screen off” the relevance of their intentions. (This feels kinda analogous to how fully understanding physics, biology, sociology, etc should screen off anthropic reasoning about how many alien civilizations exist.)
—
Here’s a more promising approach. There are some positions where my estimates of their value won’t change much conditional on the game reaching them. If I can see a clear mate in 1, then it’s far more likely that my opponent made a bunch of mistakes to get to this point, than that they have some clever plan I’m not seeing (though sometimes I think I have a mate in 1 and don’t actually).
Intuitively speaking, the most sensible strategy seems to be to “bootstrap” from parts of the game-tree I can evaluate robustly, to then come up with more robust estimates of other parts of the game-tree. In order to figure out how to do that, we’ll need a more precise definition of robustness, and also some way of tying it to my estimate of my opponent’s skill level (though perhaps we could even define skill in terms of ability to win from positions that I’d thought were robustly losing? Worth thinking about.)
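Here’s a toy sketch of that bootstrapping idea (the tree encoding and the fixed amount of widening per ply are my own simplifying assumptions): propagate interval estimates up from leaves I can evaluate robustly, widening the interval at each step to reflect my uncertainty about what the opponent knows that I don’t:

```python
def interval_minimax(node, maximizing, widen=0.05):
    # Leaves are (low, high) win-probability intervals for positions I can
    # evaluate robustly; internal nodes are lists of children. Each ply of
    # propagation widens the interval by `widen`, reflecting uncertainty
    # about the opponent's knowledge and intentions.
    if isinstance(node, tuple):
        return node
    children = [interval_minimax(child, not maximizing, widen) for child in node]
    pick = max if maximizing else min
    low = pick(low for low, _ in children)
    high = pick(high for _, high in children)
    return (max(0.0, low - widen), min(1.0, high + widen))

# My two candidate moves, each answered by the opponent's (minimizing) replies:
tree = [[(0.9, 0.95), (0.2, 0.3)],   # sharp move: one great line, one bad one
        [(0.5, 0.6), (0.4, 0.5)]]    # solid move: all lines roughly even
value = interval_minimax(tree, maximizing=True)
```

In this example the sharp first move, whose best leaf I’d naively score at 0.9–0.95, washes out to (0.15, 0.35) once the opponent’s best reply and the widening are accounted for, so the solid-but-unspectacular second move dominates the overall estimate of roughly (0.30, 0.60).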
I don’t currently know how to do this, but I want to gesture at three ideas that might be relevant. Firstly, there’s work on imprecise probabilities—for example replacing point estimates (like “55% chance of winning from this position”) with “credal sets” (like “45-65% chance of winning from this position”). You can then replace the credal set in turn with a “plausibility function” that describes how plausible each point estimate is (with estimates outside the credal set having 0 plausibility). I suspect that there’s some relationship between plausibility and robustness in the sense I described above—though I’m not very familiar with the literature, and so I’m mainly going off some conversations I had with davidad.
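To make that concrete, here’s a minimal sketch of those two objects. The triangular shape of the plausibility function and the width-based robustness measure are my own simplifying assumptions, not anything from the imprecise-probability literature:

```python
def plausibility(p, credal_low, credal_high):
    # Triangular plausibility over point estimates: zero outside the credal
    # set, peaking at 1.0 at its midpoint. (The triangular shape is just one
    # simple choice; any function that vanishes outside the set would do.)
    if not credal_low <= p <= credal_high:
        return 0.0
    mid = (credal_low + credal_high) / 2
    half = (credal_high - credal_low) / 2
    return 1.0 - abs(p - mid) / half if half > 0 else 1.0

def robustness(credal_low, credal_high):
    # A narrow credal set = a robust evaluation; width 0 means fully robust.
    return 1.0 - (credal_high - credal_low)
```

Plausibility peaks at the midpoint of the credal set and falls to zero at its edges, and a narrower credal set counts as a more robust evaluation—one candidate for the precise notion of robustness the bootstrapping strategy needs.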
Secondly, Peter Schmidt-Nielsen is trying to use symmetry considerations to demonstrate that there’s no winning strategy for black in chess. I don’t think the project itself is directly relevant to the ideas I’ve been describing here, but it feels related in a philosophical sense.
Thirdly, although I’ve been talking about the “value” of a position as if it’s a well-defined concept, it mostly isn’t. Stockfish’s value calculations are grounded in the likelihood of it winning from that position when playing itself. But there’s no clear way to translate from that to the likelihood of winning against one’s actual opponent, which is what we’re interested in. I won’t discuss this further here, but trying to pin down how to estimate a position’s value in that sense seems potentially fruitful.
—
To finish, I want to highlight an interesting analogy. Personally speaking, I’m much more interested in playing Go than in playing chess. And because Go has many, many more possible board positions than chess, the considerations above are much more salient. But more than that, they seem to play out on two different levels: both within the game-tree, and on the board itself.
Specifically, when you’re playing Go, the board typically ends up divided into territories which are loosely or securely controlled by one player or the other player. When a territory is loosely controlled, it’s often possible to “invade” it and prevent the other player from scoring points within it. The more securely a territory is controlled, however, the less likely an invasion is to succeed (and the more the invading player will weaken their other nearby territories even if they do succeed).
When a territory is small, you can try to decide whether it can be invaded by evaluating many possible sequences of moves. However, for larger territories this is prohibitively difficult. Instead, you need to use your intuition to evaluate how secure the territory is overall, perhaps by evaluating a couple of the scariest possible invasions. It’s also possible to make “probing moves” which your opponents should respond to differently depending on how secure they think their territory is.
I hypothesize that a sufficiently good theory of how to search through the chess game-tree will tell us not just how to search through the Go game-tree, but also what strategies to use on the Go board itself. It’ll need to incorporate concepts like “this region of the board/game-tree is robustly my territory” and “this region is high in expected value but not very robust” and “here’s the boundary between one region and another”. The interactions between different regions of the board are more complex and nuanced than between different regions of the game-tree, but some of the same principles will likely still apply. E.g. since players have limited time, building trust in one region of the game-tree trades off against building trust in another (just as defending one region of the Go board trades off against defending another). To exploit that, you can make “probing moves” which force your opponent to decide whether they actually trust their estimate of a sharp line of play, or whether they’ll retreat to a line which seems worse but more solid for them. And so on.
This all feels like a set of hints towards a potential way of quantifying Knightian uncertainty in terms of boundaries and how permeable they are to adversaries—which might then be extended to much larger-scale “games” like internal conflicts within minds or even human geopolitical conflicts. One piece of evidence that this agenda is paying off would be if we could create a “good old-fashioned AI” Go engine which can play at a superhuman level without using any neural networks—though I expect that it’s worth trying to build a significantly better theoretical understanding before embarking on that project. (Edit: maybe worth starting off with something like Connect 4 for the sake of simplicity.)
Ty, fixed.
I really like the cellular automaton model. But I don't think it makes escaping corruption easy! Even if most of the copies are non-corrupt, the question is how you can take a "vote" of the corrupt vs non-corrupt copies without making the voting mechanism itself be easily corrupted. That's why I was talking about the non-corrupt copies needing to "overpower" the corrupt copies above.