It would not be stable.
That is besides my point. I think you can make it stable, but anyway.
up until the most vicious actor creates their preferred world in the whole light cone - which might well also involve lots of suffering
There are some reasons to think default trajectory, of pragmatic victor, just get this evolution-created world duplicated many more times. Might be the baseline you have to improve on. Torture clusters might be worse outcome borne of uhh large but not quite enough ability to steer the values of the agent(s) that dominate.
distribution of rapidly and unevenly expanding unregulated power does not contain a stable equilibrium
It might be stable? The question is, would it be a good one.
Mind Crime might flip the sign of the future.
If future contains high tech, underregulated compute and diverse individuals, then it's likely it will contain incredibly high amounts of most horrendous torture / suffering / despair / abuse.
It's entirely possible you can have 10^15 human-years per second of real time on a small local compute cluster. If such amounts of compute can be freely available to individuals no questions asked, then it's probable some of them will decide to use them to explore undesirable mental states.
*slaps roof of a server rig* This bad boy can sample, evaluate and discard as many chickens as they did in their whole history in just two minutes. In fact it's doing it right now. And I have another 200 of them. Why? Uhhhhh, chicken backed crypto of course.
For context, you can estimate that over all history so far there were around 10^12 chicken-hours. It's such a small number if you have any of the advanced compute substrate.
Considerations like this heavily depend on how do you view it, more like a ratio to all the experiences or as an absolute priority over good states.
This consideration might just straight up overwhelm your prioritization of today not-very-scalable sufferings. And it's not very longtermist worry, this would start to be a major consideration this century and probably in the next 20 years.
Libertarian proposals like https://www.transhumanaxiology.com/p/the-elysium-proposal have such a flaw, they contain vast amounts of the worst sufferings. "Hell on my property is none of your business!" It's pretty bleak tbh
EDIT "10^15 human-years per second of real time" is unlikely. Given football field of solar panels, you probably can do at most 10^-3 human-years per second of real time, so that 18 OOMs up from that would probably look like a substantial investment of energy, noticeable on the scale of the solar system.
I'm not sure what is your point here.
Also note there is Axiology (what things are good/bad?), Morality (what should you do?) and Law (what rules should be made / enforced?). It makes sense to try to figure out what is good, and to try to figure out what should you do, and what institution building activities are necessary.
I think it makes sense to work on these questions, they matter to me and so I see value in someone burning their FLOPs to help me and other people get easy to verify deductions. I also agree that current quality of such work is not that great (including yours).
You probably also have to estimate some sort of "no human take off" counterfactual extinction rate and compare them to get the upper bound.
Possible setup to test steganography capabilities:
Pick two tasks which model is incapable of doing without CoT. Show two task statements, require it to solve both problems with a single CoT and give two distinct answers to both tasks.
Then show the judge (a human or a model) only one pair question -> shared CoT -> answer, and require the judge to guess what was the other task.
You can do some variations on that, train the model to be better at this, train both models adversarially.
Or you can make some portion of tests to be just the one problem to solve, and part two problems. The judge has to guess was there another problem and if so what problem.
Except 1000 nm lasers pointed at the sky, they dump around half of the energy they consume into the space.
Yeah, especially within the framing that upweights the behavioral proxies by such a huge margin. And they also have more neurons, (like under million in a bee, around a trillion in frontier models), although pretty different ones.
But it would be better if you did. And more productive. And admirable.
You just have to clearly draw the distinction between "not X" claim and "Y" claim in your writing.
By "stable" I meant "able to exist at all", as opposed to be conquered / merged into a singleton or something similar. And I didn't make a claim about the extent it's likely, more about how desirable it is. And what (value based, not existential/pragmatic) problems you would have to deal with in such a state.
I don't have a detailed plan / expectation on how exactly you could be working to achieve such a state. It just seems vaguely possible, I don't think I can offer you any new insight on that.