I sincerely hope that if anyone has a concrete, actionable answer to this question, they're smart enough not to share it publicly, for what I hope are obvious reasons.
But aside from that caveat, I think you are making several incorrect assumptions.
Re 1a: Intuitively, what I mean by "lots of data" is something comparable in size to what ChatGPT was trained on (e.g. the Common Crawl, in the roughly 1-petabyte range); or rather, comparable not just in disk-space usage but in the number of distinct events the training process gets to learn from. When ChatGPT is being trained, each token (of which there are on the order of a quadrillion) is a chance to test the model's predictions and adjust the model accordingly. (Incidentally, the fact that humans are able to learn language with far less data input than this suggests that there's something fundamentally different in the way LLMs and humans work.)
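To make the "each token is a training signal" point concrete, here's a minimal sketch of next-token prediction (a toy byte-level corpus and a bigram model of my own choosing, written in PyTorch; nothing to do with ChatGPT's actual training setup). Every single token position yields one prediction to test and one error signal to learn from:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Toy "corpus": a repeated sentence, treated as a sequence of byte-valued tokens.
corpus = torch.tensor(list(b"the cat sat on the mat. " * 50), dtype=torch.long)

# Bigram model: row i of the embedding holds the logits for whichever byte follows byte i.
model = nn.Embedding(256, 256)
opt = torch.optim.Adam(model.parameters(), lr=1e-2)

for step in range(200):
    inputs, targets = corpus[:-1], corpus[1:]   # predict token t+1 from token t
    logits = model(inputs)                      # one prediction per token position
    loss = F.cross_entropy(logits, targets)     # one error signal per token position
    opt.zero_grad()
    loss.backward()
    opt.step()

print(f"final per-token loss: {loss.item():.3f}")
```

Scale that same loop up to ~10^15 token positions and you have, schematically, the data budget I'm talking about.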
Therefore, for a similarly architected AI that generates action plans (rather than text tokens), we'd expect it to require a training set with on the order of a quadrillion distinct historical cases. Now I'm pretty sure this already exceeds the amount of "stuff happening" that has ever been documented in all of history.
I would change my opinion on this if it turns out that AI advancement is making it possible to achieve the same predictive accuracy / generative quality with ever less training data, in a way that doesn't seem to be levelling off soon. (Has wor...
Modern AI works by throwing lots of computing power at lots of data. An LLM gets good at generating text by ingesting an enormous corpus of human-written text. A chess AI doesn't have as big a corpus to work with, but it can generate simulated data through self-play, which works because the criterion for success ("Did we achieve checkmate?") is easy to evaluate without any deep preexisting understanding.

But the same is not true if we're trying to build an AI with generalized agency, i.e. something that outputs strategies for achieving some real-world goal that are actually effective when carried out. There is no massive corpus of such strategies that can be used as training data, nor is it possible to simulate one, since that would require either (a) doing real-world experiments (in which case generating sufficient data would be far too slow and costly, or simply impossible) or (b) a comprehensive world-model capable of predicting the results of proposed actions (which presupposes the very thing whose feasibility is at issue in the first place).

Therefore it seems unlikely that AIs built under the current paradigm (deep neural networks + big data + gradient descent) will ever achieve the kind of "superintelligent agency" depicted in the latter half of IABIED, one that can devise effective strategies for wiping out humanity (or whatever).
By "real-world goal" I mean a goal whose search-space is not restricted to a certain well-defined and legible domain, but ranges over all possible actions, events, and counter-actions. Plans for achieving such goals are not amenable to simulation because you can't easily predict or evaluate the outcome of any proposed action. All of the extinction scenarios posited in IABIED are "games" of this kind. By contrast, a chess AI will never conceive of strategies like "Hire a TaskRabbit to surreptitiously drug your opponent so that they can't think straight during the game," and not for lack of intelligence, but because such strategies simply don't exist in the AI's training domain.
This was the main lingering question I had after reading IABIED.