I've had exactly the same thought before but never got around to writing it up. Thanks for doing it so I don't have to :-)
There are only so many possible human shaped computations that are valuable to me
I would surmise that value-space is not so much "finite in size" as fading off into the distance in such a way that it has a finite sum over the infinite space (a toy illustration is sketched at the end of this comment). This is because other minds are valuable to me insofar as they can do superrationality/FDT/etc. with me. In fact, this is the same fading-out function as in the "perturb the simulation" scenario; i.e.:
However, the main problem with this perspective is what to do with quantum many-worlds. Does this imply that "quantum suicide" is rational, e.g. that you should buy a lottery ticket and set up a machine that kills you if you don't win? This is a bullet I don't want to bite (so to speak...)
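(The promised toy illustration of "finite sum over infinite space", assuming purely for concreteness a geometric fade-off; the actual fading-out function could be anything with the same property. If the value I place on a mind at "distance" d from me falls off like r^d for some 0 < r < 1, then the total over infinitely many minds is

$$\sum_{d=0}^{\infty} r^{d} = \frac{1}{1-r},$$

which is finite even though the space being summed over is infinite.)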
Re 1a: Intuitively what I mean by "lots of data" is something comparable in size to what ChatGPT was trained on (e.g. the Common Crawl, in the roughly 1 petabyte range); or rather, not just comparable in disk-space-usage, but in the number of distinct events to which the training signal is applied. So when ChatGPT is being trained, each token (of which there are a ~quadrillion) is a chance to test the model's predictions and adjust the model accordingly. (Incidentally, the fact that humans are able to learn language with far less data input than this suggests that there's something fundamentally different in how LLMs and humans work.)
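For concreteness, here's the back-of-envelope behind that order of magnitude; the 1 PB corpus size and the ~4 bytes per token are my own rough assumptions, not official figures:

```python
# Back-of-envelope: how many tokens fit in ~1 petabyte of text?
# Both numbers below are rough assumptions, not official figures.
corpus_bytes = 1e15       # ~1 PB of raw text
bytes_per_token = 4       # common rough figure for English text
tokens = corpus_bytes / bytes_per_token
print(f"{tokens:.1e} tokens")  # ~2.5e+14: a few hundred trillion,
                               # the same ballpark as "a quadrillion"
```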
Therefore, for a similarly-architected AI that generates action plans (rather than text tokens), we'd expect it to require a training set with a ~quadrillion different historical cases. Now I'm pretty sure this already exceeds the amount of "stuff happening" that has ever been documented in all of history.
I would change my opinion on this if it turns out that AI advancement is making it possible to achieve the same predictive accuracy / generative quality with ever less training data, in a way that doesn't seem to be levelling off soon. (Has work been done on this?)
Re 2a: Accordingly, the reference class for the "experiments" that need to be performed here is not like "growing cells in a petri dish overnight", but rather more like "run a company according to this business plan for 3 months and see how much money you make." And at the end you'll get one data point - just 999,999,999,999,999 to go...
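Taking those figures at face value, the arithmetic is stark:

$$10^{15}\ \text{plans} \times 0.25\ \text{years per trial} = 2.5 \times 10^{14}\ \text{years of serial experimentation},$$

and even an implausibly large amount of parallelism only chips away at that exponent.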
Re 2b:
What do you see as the upper bound for what can, in principle, be done with a plan that an army of IQ-180 humans (aka no better qualitative thinking than what the smartest humans can do, so that this is a strict lower bound on ASI capabilities) came up with over subjective millennia with access to all recorded information that currently exists in the world?
I'll grant that such an army could do some pretty impressive stuff. But this is already presupposing the existence of the superintelligence whose feasibility we are trying to explain.
Re 3c/d:
I haven't looked into this or tried it myself. (I mostly use LLMs for informational purposes, not for planning actions.) Do you have any examples handy of AI being successful at real-world goals?
(I may add some thoughts on your other points later, but I didn't want to hold up my reply on that account.)
Stepping back, I should reiterate that I'm talking about "the current AI paradigm", i.e. "deep neural networks + big data + gradient descent", and not the capabilities of any hypothetical superintelligent AI that may exist in the future. Maybe this is conceding too much, inasmuch as addressing just one specific kind of architecture doesn't do much to alleviate fear of doom by other means. But IABIED leans heavily on this paradigm in making its case for concern:
skiing
It seems like skiing is a "hereditary" class marker because it's hard to learn how to do it as an adult, and you're probably not going to take your kids skiing unless you yourself were taught as a kid, etc.
trying to imagine being something with half as much consciousness
Isn't this what we experience every day when we go to sleep or wake up? We know it must be a gradual transition, not a sudden on/off switch, because sleep is not experienced as a mere time-skip - when you wake up, you are aware that you were recently asleep, and not confused about how it's suddenly the next day. (Or at least, I don't get the time-skip experience unless I'm very tired.)
(When I had my wisdom teeth extracted under laughing gas, it really did feel like all-or-nothing, because when I came to I asked if they were going to get started with the surgery soon, and I had to be told "Actually it's finished already". This is not how I normally experience waking up every morning.)
I think this approach wouldn't work for rationalists, for two reasons:
Can't speak for Said Achmiz, but I guess for me the main stumbling block is the unreality of the hypothetical, which you acknowledge in the section "This is not a literal description of reality" but don't go into further. How is it possible for me to imagine what "I" would want in a world where by construction "I" don't exist? Created Already in Motion and No Universally Compelling Arguments are gesturing at a similar problem, that there is no "ideal mind of perfect emptiness" whose reasoning can be separated from its contingent properties. Now, I don't go that far - I'll grant at least that logic and mathematics are universally true even if some particular person doesn't accept them. But the veil-of-ignorance scenario is specifically inquiring into subjectivity (preferences and values), and so it doesn't seem coherent to do so while at the same time imagining a world without the contingent properties that constitute that subjectivity.
I think ancient DNA analysis is the space to watch here. We've all heard about Neanderthal intermixing by now, but it's only recently become possible to determine e.g. that two skeletons found in the same grave were 2nd cousins on their father's side, or whatever. It seems like this can tell us a lot about social behavior that would otherwise be obscure.
It took me years of going to bars and clubs and thinking the same thoughts:
before I finally realized - the whole draw of places like this is specifically that you don't talk.
The OP isn't making any claim like this. The question isn't whether any particular experience has value in and of itself; the claim is only about the correct way to evaluate total utility in a world with multiple experiences.
By analogy, consider special relativity: if a train is moving at 0.75c relative to the ground, and a passenger on the train throws a ball forward at 0.5c relative to the train, then the ball is moving at about 0.91c relative to the ground. The ball's marginal contribution to the ground-frame speed is only 0.16c, but that doesn't mean the ball is "really" moving at 0.16c - relative to the train it's doing 0.5c, and relative to the ground 0.91c.
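For reference, the standard relativistic velocity-addition formula behind the 0.91c figure:

$$v_{\text{ground}} = \frac{u + v}{1 + uv/c^{2}} = \frac{0.75c + 0.5c}{1 + 0.75 \times 0.5} = \frac{1.25c}{1.375} \approx 0.91c,$$

and 0.91c - 0.75c ≈ 0.16c is the leftover figure.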
Or, more pertinently, suppose we have two identical simulations playing out at the same time. Each one contributes zero marginal utility to a world in which the other one exists (and might be told this by Omega), but that doesn't mean that the two of them together have zero utility.