Thanks, I did get the PM.
There's a lot worth saying on these topics; I'll give it a go.
Also, our intuitions about the extent to which nobody has a good idea of how to make TAI might differ too.
To be clear, I'm not saying nobody has a good idea of how to make TAI. I expect pretty short timelines, because I expect the remaining fundamental challenges aren't very big.
What I don't expect is that the remaining fundamental challenges go away through small-N search over large architectures, if the special sauce does turn out to be significant.
Well, I understand now where you get the 17, but I don't understand why you want to spread it uniformly across the orders of magnitude. Shouldn't you put all the probability mass for the brute-force evolution approach on some Gaussian around where we'd expect that to land, and only have probability elsewhere to account for competing hypotheses? Like, I think it's fair to say the probability of a ground-up evolutionary approach only using 10-100 agents is way closer to zero than to 4%.
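To make the contrast concrete, here is a rough sketch comparing the two ways of spreading the mass. The 17-OOM span is from the quoted argument; the Gaussian's centre and width (13 and 2 OOMs above HBHL) are purely illustrative numbers I made up, not estimates, and the function names are hypothetical.

```python
from statistics import NormalDist

N_OOMS = 17  # span between HBHL and the evolution anchor, per the quoted argument


def uniform_mass_per_oom(n_ooms: int = N_OOMS) -> float:
    """Probability mass on each OOM when 100% is split evenly across n_ooms."""
    return 1.0 / n_ooms


def gaussian_mass_in_oom(k: int, mean: float, sd: float) -> float:
    """Mass a Gaussian over log10(compute) puts on the interval [k, k+1),
    where k counts OOMs above the HBHL anchor."""
    dist = NormalDist(mu=mean, sigma=sd)
    return dist.cdf(k + 1) - dist.cdf(k)


if __name__ == "__main__":
    # ASSUMPTION: mean=13 and sd=2 are placeholders chosen only for illustration.
    print("uniform, any OOM:      ", uniform_mass_per_oom())
    print("gaussian, OOM at HBHL: ", gaussian_mass_in_oom(0, mean=13, sd=2))
    print("gaussian, OOM near peak:", gaussian_mass_in_oom(12, mean=13, sd=2))
```

Under any Gaussian centred well away from HBHL, the mass on the OOM next to HBHL is far below the uniform figure, which is the disagreement in a nutshell.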
I'm still not following the argument. [...] So e.g. when you have 3 OOMs more compute than the HBHL milestone
I think you're mixing up my paragraphs. I was referring here to cases where you're trying to substitute searching over programs for the AI special sauce.
If you're in the position where searching 1000 HBHL hypotheses finds TAI, then the implicit assumption is that model scaling has already substituted for the majority of AI special sauce, and the remaining search is just an enabler for figuring out the few remaining details. That or that there wasn't much special sauce in the first place.
To maybe make my framing a bit more transparent, consider the example of a company trying to build useful, self-replicating nanoscale robots using an atomically precise 3D printer under the conditions where 1) nobody there has a good idea of how to go about doing this, and 2) you have 1000 tries.
It takes us about 17 orders of magnitude away from the HBHL anchor, in fact. Which is not very far, when you think about it. Divide 100 percentage points of probability mass evenly across those 17 orders of magnitude, and you get almost 6% per OOM, which means something like 4x as much probability mass on the HBHL anchor than Ajeya puts on it in her report!
I don't understand what you're doing here. Why 17 orders of magnitude, and why would I split 100% across each order?
I don't follow this argument. It sounds like double-counting to me
Read ‘and therefore’, not ‘and in addition’. The point is that the more you spend your compute on search, the less directly your search can exploit computationally expensive models.
Put another way, if you have HBHL compute but spend nine orders of magnitude on search, then the per-model compute is much less than HBHL, so the reasons to argue for HBHL don't apply to it. Equivalently, if your per-model compute estimate is HBHL, then the HBHL metric is only relevant for timelines if search is fairly limited.
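As a rough sketch of that arithmetic (my own illustration; it assumes the budget is split evenly across the candidates searched, measures everything in OOMs relative to HBHL, and the function name is made up):

```python
import math


def per_model_ooms(budget_ooms: float, num_candidates: int) -> float:
    """Compute available to each candidate, in OOMs relative to HBHL.

    budget_ooms:    total compute budget in OOMs relative to HBHL (0 = exactly HBHL).
    num_candidates: number of candidates the search evaluates.
    ASSUMPTION: the budget is divided evenly across candidates.
    """
    if num_candidates < 1:
        raise ValueError("num_candidates must be at least 1")
    return budget_ooms - math.log10(num_candidates)


if __name__ == "__main__":
    # HBHL-sized budget, nine OOMs of it spent on search.
    print(per_model_ooms(0, 10**9))
    # Three OOMs above HBHL, searching 1000 candidates.
    print(per_model_ooms(3, 1000))
```

The first case leaves each model about nine OOMs short of HBHL. The second leaves each model at roughly HBHL scale, which is the regime where search is limited enough for the anchor to stay relevant.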
I'm not sure I get the distinction between enabler and substitute, or why it is relevant here. The point is that we can use compute to search for the missing special sauce. Maybe humans are still in the loop; sure.
Motors are an enabler in the context of flight research because they let you build and test designs, learn what issues to solve, build better physical models, and verify good ideas.
Motors are a substitute in the context of flight research because a better motor means more, easier, and less optimal solutions become viable.
Eventually the conclusion holds trivially, sure, but that takes us very far from the HBHL anchor. Most evolutionary algorithms we do today are very constrained in what programs they can generate, and are run over small models for a small number of iteration steps. A more general search would be exponentially slower, and even more disconnected from current ML. If you expect that sort of research to be pulling a lot of weight, you probably shouldn't expect the result to look like large connectionist models trained on lots of data, and you lose most of the argument for anchoring to HBHL.
A more standard framing is that ‘we can do trial-and-error on our AI designs’, but there we're again in a regime where scale is an enabler for research, more so than a substitute for it. Architecture search will still fine-tune and validate these ideas, but is less likely to drive them directly in a significant way.
Short-term economic value: How lucrative will pre-AGI systems be, and how lucrative will investors expect they might be? What size investments do we expect?
Societal robustness: How robust is society to optimization pressure in general? In the absence of recursive improvement, how much value could a mildly superintelligent agent extract from society?
What is the activation energy for an Intelligence Explosion?: What AI capabilities are needed specifically for meaningful recursive self-improvement? Are we likely to hit a single intelligence explosion once that barrier is reached, or will earlier AI systems also produce incomplete explosions, e.g. if very lopsided AI can recursively optimize some aspects of cognition, but not enough for generality?
Personability vs Abstractness: How much will the first powerful AI systems take on the traits of humans, and to what extent will they instead be idealized, unbiased reasoning algorithms?
If the missing pieces of intelligence come from scaling up ML models trained on human data, we might expect a bias towards humanlike cognition, whereas if the missing pieces of intelligence come from key algorithmic insights, we might expect fewer parallels.