How does the redundancy definition of abstractions account for numbers, e.g., the number three? It doesn’t seem like “threeness” is redundantly encoded in, for example, the three objects on the floor of my room (rug, sweater, bottle of water) as rotation is in the gear example, since you wouldn’t be able to uncover information about “three” from any one object in particular.
I could imagine some definition based on redundancy capturing “threeness” by looking at a bunch of sets containing three things. But I think the reason the abstraction “three” feels a little strange on this account is that it is both highly natural (math!) but also can be highly “arbitrary,” e.g., “threeness” is wherever a mind can count three distinct objects (and those objects can be maximally unrelated!).
Perhaps counting the three objects on the floor of my room is a non-natural use case of the abstraction “three,” but if so, why? And where is the natural abstraction “three” in the world?
I really value "realness" although I too am not sure what it is, exactly. Some thoughts:
I cannot stand fake wood or brick or anything fake really, because it feels like it is trying to trick me. It's "lying," in sort of the same way I feel like people lie when they say they are doing something because it helps fight climate change or whatever, when really it seems clear that they are doing it for social approval or something of that nature.
Moss feels very real to me, also, as do silky spider webs, or any slice of nature, really, when I'm in it. I think it's because the moss is not pretending to be something else, not to me anyways, it's just there.
Homes can be real-seeming to me, like how warm, cozy fireplaces with the wind whipping past the window and redwood walls make spaces seem inviting and true. But I think they can also be very not real. Many household things feel kind of "fake" to me, in the sense of trickery, like my microwave. It's not really deceiving me in the sense that it is lying about itself—it will heat up my food if I press some buttons, but it's like... asking something of me? Trying to get me to use it on its terms. Food containers with words on them also feel kind of like this... trying to get me to read them, to consume them, and so on...
"Trying for real" is an especially interesting one to me because it feels so important and I don't know quite what it is. At least part of it seems related to trickery, like how "actually trying" to answer a question looks like not giving up until you have a satisfying-to-your-curiosity answer, while "not really trying" looks more like getting a good-enough-to-pass-another-person's-test answer, or not really believing it'll work, or something like that. Whereas the "actually trying" bit seems much more fundamentally related to the thing the trying is about (hence "real"), the not-really-trying bit seems related to something else entirely, and that disconnect feels "fake" to me.
Thanks for writing this up! It seems very helpful to have open, thoughtful discussions about different strategies in this space.
Here is my summary of Anthropic’s plan, given what you’ve described (let me know if it seems off):
Leaving aside concerns about arms races and big models being scary in and of themselves, this seems like a pretty reasonable approach to me. In particular, I’m pretty on board with points 1, 2, and 3—i.e., if you don’t have theories, then getting your feet wet with the actual systems, observing them, experimenting, tinkering, and so on, seems like a pretty good way to eventually figure out what’s going on with the systems in a more formal/mechanistic way.
I think the part I have trouble with (which might stem from me just not knowing the relevant stuff) is point 4. Why do you need to do all of this on current models? I can see arguments for this, for instance, perhaps certain behaviors emerge in large models that aren’t present in smaller ones. But I’ve never seen, e.g., a list of such things and why they are important or cruxy enough to justify the emphasis on large models given the risks involved. I would really like to see such an argument! (Perhaps it does exist and I am not aware).
I also have a bit of trouble with the “top player” framing—at the moment I just don’t see why this is necessary. I understand that Anthropic works on large models, and that this is on par with what other “top players” in the field are doing. But why not just say that you want to work with large models? Why mention being competitive with DeepMind or OpenAI at all? The emphasis on “top player” makes me think that something is left unsaid about the motivation, aside from the emphasis on current systems. To the extent that this is true, I wish it were stated explicitly. (To be clear, "you" means Anthropic, not Miranda).
Ah, thanks! Link fixed now.
Yes, welp, I considered getting into this whole debate in the post but it seemed like too much of an aside. Basically, Lynch is like, “when you control for cell size, the amount of energy per genome is not predictive of whether it’s a prokaryote or a eukaryote.” In other words, on his account, the main determinant of bioenergetic availability appears to be the size of the cell, rather than anything energetically special about eukaryotes, such as mitochondria.
There are some issues here. First, most of the large prokaryotes are outliers like Thiomargarita, in the sense that they have expanded their energy supply without expanding their functional volume. However, their genomes are still quite small, which means that their “energy/genome” will be large. Eukaryotic cells of the same size have way more energy and way longer genomes, making their “energy/genome” roughly equivalent to that of the large prokaryotes.
Second, Lynch’s story is that strong selection keeps bacterial genomes short. The main reason that bacteria have strong selection is because there are so many of them, and there are so many of them because they’re so small. But why are they so small? It seems like an obvious contender is Lane’s story about them being energy bottlenecked by their surface area. So, in my opinion, these two hypotheses are synergistic and my best guess is that they’re both part of the story.
Thanks!!
Yeah, I think it’s a great question and I don’t know that I have a great answer. Plasmids (small rings of DNA that float around separately) are part of the story. My understanding here is pretty sketchy, but I think plasmids are way more likely to be deleted than the chromosomal DNA, and for some reason antibiotic resistance genes tend to be in plasmids (perhaps because they are shared so frequently through horizontal gene transfer)? So the “delete within a few hours” bit is probably overstating the average case of DNA deletion in bacteria. I would be surprised if the bacterium “knew” about the function of the gene, although I agree it seems possible that some epigenetic mechanism could explain it. I don’t know of any, though!
Good question! I don’t know, but I think that they don’t necessarily need to. Something I didn’t get into in the post but which is pretty important for understanding bacterial genomes is that they do horizontal gene transfer, which basically means that they trade genes between individuals rather than exclusively between parents and offspring.
From what I understand, this means that although on average the bacteria shed the unhelpful DNA if given the opportunity, so long as a few individuals within the population still have the gene, it can get rapidly reacquired when needed. I don’t know exactly how the math works out, but I’d guess that in big enough populations, if antibiotic encounters are somewhat common, then probably they don’t need to do it de novo each time?
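To get a rough feel for how that math might work out, here’s a toy Monte Carlo sketch (not a real population-genetics model; every parameter value is made up for illustration): carriers of a resistance gene delete it at some rate when it’s useless, non-carriers can reacquire it from carriers via horizontal gene transfer, and periodic antibiotic pulses kill the non-carriers.

```python
import random

random.seed(0)

# Toy model: does a resistance gene persist in a population despite a
# deletion bias, given periodic antibiotic pulses plus horizontal gene
# transfer (HGT)? All parameter values are illustrative guesses.
N = 10_000        # population size
LOSS_RATE = 0.05  # per-generation chance a carrier deletes the gene
HGT_RATE = 0.03   # per-generation transfer chance, scaled by carrier frequency
PULSE_EVERY = 50  # generations between antibiotic exposures

carriers = N      # start just after an exposure: everyone is resistant
min_carriers = carriers
for gen in range(1, 301):
    freq = carriers / N
    # carriers shed the (currently useless) gene at some rate...
    lost = sum(random.random() < LOSS_RATE for _ in range(carriers))
    # ...while non-carriers pick it up via HGT, more often when carriers
    # are common
    gained = sum(random.random() < HGT_RATE * freq
                 for _ in range(N - carriers))
    carriers += gained - lost
    min_carriers = min(min_carriers, carriers)
    if gen % PULSE_EVERY == 0:
        # antibiotic pulse: non-carriers die, and the surviving carriers
        # regrow the population, so everyone is a carrier again
        carriers = N

print(min_carriers > 0)  # the gene was never fully lost between pulses
```

With these (made-up) numbers, deletions outpace transfer between pulses, so the gene’s frequency decays, but the population is large enough that it never hits zero before the next exposure resets it, which matches the “a few holdout carriers are enough” intuition.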
This also means bacterial genomes are much more distributed than eukaryotic ones. So long as any individual bacterium has some gene, it’s “as if” the whole species has it. Which means their genomes are, in a sense, actually longer than they might naively seem. Being distributed has advantages: no single genome needs to be very long, yet the population can hold onto useful stuff. But it also has disadvantages: any adaptation that relies on genes being close together in a single genome is unlikely to develop (which includes e.g. all of the regulatory hierarchy stuff mentioned in the post). So I do still expect that the pressure towards short genomes meaningfully stunts bacterial complexity.
This is an excellent post! Thank you for sharing your thoughts! I too am very curious about many of these questions, although I’m also at a half-baked stage with a lot of it (I’d also love to have a better footing here!). But in any case, here are some thoughts (in no particular order).
I love this work! It’s really cool to see interpretability on toy models in such a clear way.
The trend from memorization to generalization reminds me of the information bottleneck idea. I don’t know that much about it (read this Quanta article a while ago), but they appear to be making a similar claim about phase transitions. I believe this is the paper one would want to read to get a deeper understanding of it.
I like this framework, but I think it's still a bit tricky to figure out how to draw lines around agents/optimization processes.
For instance, I can think of ways to make a rock interact with faraway variables by, e.g., coupling it to a human who presses various buttons based on the internal state of the rock. In this case, would you draw the boundary around both the rock and the human and say that that unit is "optimizing"?
That seems a bit weird, given that the human is clearly the "optimizer" in this scenario. And drawing a line around only the rock or only the human seems wrong too (the human is clearly using the rock to do this strange optimization process, and the rock is relying on the human for this to occur). Curious about your thoughts.
Also, I'm not sure that agents always optimize things far away from themselves. Bacteria follow chemical gradients (and this feels agent-y to me), but the chemicals are immediately present both temporally and spatially. There is some sense in which bacteria are "trying" to get somewhere far away (the maximum concentration), but they're also pretty locally achieving the goal, i.e., the actions they take in the present are very close in space and time to what they're trying to achieve (eat the chemicals).
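The bacteria point can be made concrete with a toy run-and-tumble sketch (purely illustrative, not a real chemotaxis model): the rule only ever compares the local concentration now to the concentration one step ago, yet that purely local behavior carries the "agent" to the distant maximum.

```python
import random

random.seed(1)

# Toy 1D run-and-tumble chemotaxis: the bacterium senses only the local
# chemical concentration, but climbing the gradient locally still gets it
# to the faraway peak. Numbers are arbitrary for illustration.
def concentration(x):
    return -abs(x - 100.0)  # chemical concentration peaks at x = 100

x = 0.0
direction = 1.0
prev = concentration(x)
for _ in range(2000):
    x += direction
    now = concentration(x)
    if now < prev and random.random() < 0.8:
        direction = -direction  # things got worse: tumble (usually)
    prev = now

print(abs(x - 100.0) < 20)  # ends up hovering near the peak
```

The interesting thing is that "the goal" (the maximum, far away) never appears anywhere in the update rule; only the immediately present concentration does.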
This reminds me a lot of one of Kuhn's essays, "A Function for Thought Experiments," where basically he's like, "people often conflate variables together; thought experiments can tease apart those conflations." E.g., kids will usually start out conflating height with volume, so that even though they watch the experimenter pour the "same" amount of water into a taller, thinner glass, they will end up saying that the left-hand glass in (c) has more water than the one on the right.

Which is generally a good heuristic: height of water line and volume are usually pretty correlated. Eventually, though, experience brings these two variables into tension and kids will update their models. Kuhn argues that thought experiments are often playing this role, i.e., calling attention to and resolving conceptual tension between variables that were previously conflated.
In any case, I think the strategy of "considering more possibilities" is really important for figuring out the "edges" of concepts... it feels sort of like "playing" with them until you have a "feel" for what they are... which seems, to me, related to your ideas about "indexing," too. Anyways, I thought a bunch of these examples were great. I now find myself confused about waves.