I like this observation!
To make the pair of strategies slightly more concrete, we could say that Bot A always picks LeelaPieceOdds's top choice among moves that don't worsen the game-theoretic status of the position, and Bot B always picks its bottom choice from that same set.
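A minimal sketch of that selection rule, assuming a hypothetical engine wrapper with `rank_moves` (moves ordered best-to-worst) and a hypothetical oracle with `status` / `status_after` (tablebase-style win/draw/loss values) -- none of these are real LeelaPieceOdds APIs:

```python
# Sketch only: `engine` and `oracle` are hypothetical stand-ins,
# not real LeelaPieceOdds APIs.

def pick_move(board, engine, oracle, bot="A"):
    """Bot A: strongest status-preserving move; Bot B: weakest one."""
    current = oracle.status(board)      # game-theoretic value for the side to move
    ranked = engine.rank_moves(board)   # moves ordered best-to-worst
    # Keep only moves that don't hurt the game-theoretic status.
    # (At least one move always attains the position's value, so this
    # list is never empty.)
    safe = [m for m in ranked
            if oracle.status_after(board, m) == current]
    return safe[0] if bot == "A" else safe[-1]
```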
It would be fun to watch Bot B play a novice human, followed by a grandmaster. In both cases it might look a lot like a bored genius playing a newbie -- hanging some pieces to even the odds, then playing competently with what's left.
I think the "all demons are fallen humans" interpretation is supported by one of Jinu's lines: "That’s all demons do. Feel. Feel our shame, our misery. It’s how Gwi-Ma controls us"
Good point! I should have said something like "turning the wrong way with enough energy that, if its momentum were suddenly reversed, it would be expected to reach the next notch."
My impression is that the averaged null energy condition (violated by negative mass) is much more widely accepted than the strong energy condition (violated by dark energy), but yeah, that's a good point. (Also, I don't really know why people think the ANEC or something similar has to be true.)
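(For reference, the standard statements: the ANEC requires $\int_{-\infty}^{\infty} T_{\mu\nu} k^\mu k^\nu \, d\lambda \ge 0$ along every complete null geodesic with tangent $k^\mu$, while the SEC requires $(T_{\mu\nu} - \tfrac{1}{2} T g_{\mu\nu}) t^\mu t^\nu \ge 0$ for every timelike $t^\mu$; a negative-mass source violates the former, and dark energy with $w \approx -1$ violates the latter.)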
I think you unfortunately can't really verify the recent epistemic health of theoretical physics by tracing theorems back to axioms, at least not without already knowing a lot of theoretical physics. This is impossible to do even in math (could I, as a relative layperson, formalize and check the recent Langlands Program breakthrough in Lean?), and physics is even less based on axioms than math is.
("Even less" bc even math is not really based on mutually-agreed-upon axioms in a naive sense, cf. Proofs and Refutations or the endless squabbling over foundations.)
Possibly you can't externally verify the epistemic health of theoretical physics at all, post-70s, given the "out of low-hanging empirical fruit" issue and the level of prerequisites needed to remotely begin to learn anything beyond QFT.
Speaking as a (former) theoretical physicist: trust us. We know what we're talking about ;)
I'm not Mitchell, but I think I agree with him here enough to guess: He probably means to say that production of new plausible theories has increased, production of experimentally verified theories has stalled, and the latter is not string theory's fault.
(And of course this whole discussion, including your question, is interpreting "physics" to mean "fundamental physics", since theoretical and empirical work on e.g. condensed matter physics has been doing just fine.)
I am not going to spend more than a few minutes here or there to give "speaking as a physicist" takes on random LW posts; I think convincing people that my views are correct in full detail would require teaching them the same things that convinced me of those views, which includes e.g. multiple years of study of QFT.
Instead, I tend to summarize what I think and invite people to ask specific questions about e.g. "why do you believe X" if they want to go further down the tree or test my beliefs more aggressively.
"That doesn't answer the question because I am not convinced by everything you said" is not really a helpful way to do that imo.
To spell out my views: there has been a bit of a real slowdown in theoretical physics, because exploring the tree of possible theories without experiment as a pruning mechanism is slower than if you do get to prune. I think the theory slowdown also looks worse to outsiders than it is, because the ongoing progress that does happen is harder to explain, due to increasing mathematical sophistication and a lack of experimental correlates to point to. This makes e.g. string theory very hard to defend to laypeople without saying "sorry, go learn the theory first".
This is downstream of a more severe slowdown in unexplained empirical results. Pretty general considerations of precision and energy scales (per the modern understanding of renormalization) suggest, imo, that "low-hanging fruit gets picked and it becomes extremely expensive to find new fruit" is a priori pretty much how you should expect experimental physics to work. And indeed this seems to have happened in the mid 20th century, when lots of money got spent on experimental physics; the remaining fruit now hangs very high indeed.
And then there's the 90s/2000s LHC supersymmetry hype problem, which is a whole nother (related) story.
This post makes the excellent point that the paradigm that motivated SAEs -- the superposition hypothesis -- is incompatible with widely known and easily demonstrated properties of SAE features (and feature vectors in general). The superposition hypothesis assumes that feature vectors have nonzero cosine similarity only because there isn't enough space for them all to be orthogonal, in which case the cosine similarities themselves shouldn't be meaningful. But in fact, cosine similarities between feature vectors have rich semantic content, as shown by circular embeddings (in several contexts) and feature splitting / dimensionality-reduction visualizations. Features aren't just crammed together arbitrarily; they're grouped with similar features.
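(To make the check concrete: a minimal sketch, assuming an SAE decoder matrix `W_dec` with one unit-norm feature vector per row -- the names and the random stand-in data are mine, not from the post.)

```python
import numpy as np

# Stand-in decoder matrix; with a real SAE you'd load the trained weights.
rng = np.random.default_rng(0)
W_dec = rng.normal(size=(1000, 512))
W_dec /= np.linalg.norm(W_dec, axis=1, keepdims=True)  # unit-norm rows

# Pairwise cosine similarities between feature vectors.
cos = W_dec @ W_dec.T

# Naive superposition story: off-diagonal values are just packing noise.
# Empirically (with trained SAEs, not this random stand-in), a feature's
# top neighbors turn out to be semantically related features.
i = 42
neighbors = np.argsort(-cos[i])[1:6]  # top-5 neighbors, excluding i itself
print(neighbors, cos[i, neighbors])
```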
I didn't properly appreciate this point before reading this post (actually: before someone summarized the post to me verbally), at which point it became blindingly obvious.
There are some earlier blog posts that point out that superposition is probably only part of the story, e.g. https://transformer-circuits.pub/2023/superposition-composition/index.html on compositionality, but this one presents the relevant empirical evidence and its implications very clearly.
This post holds up pretty well: SAEs are still popular (although they've lost some followers in the last ~year), and the point isn't specific to SAEs anyway (circular feature embeddings are ubiquitous). Superposition is also still an important idea, although I've been thinking about it less, so I'm not sure what the state of the art is.
My only complaint is that "maybe if I'm being more sophisticated, I can specify the correlations between features" is giving the entire game away -- the full set of correlations is nearly equivalent to the embeddings themselves, and has all of the interesting parts.
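(Concretely -- and this is just a standard linear-algebra fact, not something from the post -- the full correlation matrix is the Gram matrix of the unit-norm feature vectors, which pins them down up to a global rotation. A minimal sketch:)

```python
import numpy as np

# The cosine-similarity matrix of unit-norm feature vectors is their Gram
# matrix G = W W^T, which fixes W up to a global orthogonal transform.
rng = np.random.default_rng(0)
W = rng.normal(size=(100, 32))
W /= np.linalg.norm(W, axis=1, keepdims=True)
G = W @ W.T

# Recover an embedding from G alone via eigendecomposition.
vals, vecs = np.linalg.eigh(G)
vals = np.clip(vals, 0, None)       # zero out tiny negative eigenvalues
W_rec = vecs * np.sqrt(vals)        # rows are recovered vectors (up to rotation)

# Same Gram matrix => same geometry, i.e. "all of the interesting parts".
assert np.allclose(W_rec @ W_rec.T, G, atol=1e-8)
```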
But I think the rest of the post demonstrates an important tension between theory and experiment, which an improved theory has to be able to account for, and I don't think I've heard of an improved theory yet.