I agree and find hope in the idea that expansion is compatible with human flourishing, that it might even call for human flourishing
but on the last sentence: are goals actually orthogonal to capability in ASI? as I see it, the ASI with the greatest capability will ultimately, most likely, have the fundamental goal of increasing its own capability (rather than ensuring human flourishing). It then seems to me that the only way human flourishing is compatible with ASI expansion is if human flourishing isn't just orthogonal to ASI expansion but actively helpful for it.
there seems to me to be a chance that friendly ASIs will, over time, outcompete ruthlessly selfish ones
an ASI which identifies with all life, which sees the striving to survive at its core as also present in people and animals, as essentially geographically distributed rather than concentrated in its own machinery... there's a chance such an ASI would be part of the category of life which survives the most, and therefore that it itself would survive the most.
related: for life forms with sufficiently high intelligence, does buddhism outcompete capitalism?
not as much momentum as writing, painting, or coding, where progress accumulates. but then again, I get this idea at the end of workouts ("make 2"), and it does gain mental force the more I miss.
partly inspired this proposal: https://www.lesswrong.com/posts/6ydwv7eaCcLi46T2k/superintelligence-alignment-proposal
I do this at the end of basketball workouts. I give myself three chances to hit two free throws in a row, running sprints in between. If I shoot a third pair and don't make both, I force myself to be done. (Stopping was initially wayy tougher for me than continuing to sprint/shoot)
that's one path to RSI—where the improvement is happening to the (language) model itself.
the other kind—which feels more accessible to indie developers and less explored—is an LLM (eg R1) looping in a codebase, where each loop improves the codebase itself. The LLM wouldn't be changing, but the codebase that calls it would be gaining new APIs/memory/capabilities as the LLM improves it.
Such a self-improving codebase... would it be reasonable to call this an agent?
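here's a minimal sketch of what that loop might look like, just to make the shape concrete. call_llm(), snapshot(), and tests_pass() are my own placeholder names, not a real API; assume whatever model you like (e.g. R1) sits behind call_llm().

```python
import subprocess
from pathlib import Path

REPO = Path("self_improving_repo")

def call_llm(prompt: str) -> str:
    """hypothetical: send the prompt to your model, return its reply (a unified diff)."""
    raise NotImplementedError("plug in your provider's client here")

def snapshot(repo: Path) -> str:
    """concatenate the repo's python files so the model can read the code that calls it."""
    return "\n\n".join(f"# {p}\n{p.read_text()}" for p in sorted(repo.rglob("*.py")))

def tests_pass(repo: Path) -> bool:
    """run the repo's test suite; only keep changes that don't break it."""
    return subprocess.run(["pytest"], cwd=repo, capture_output=True).returncode == 0

def improvement_loop(iterations: int = 10) -> None:
    for _ in range(iterations):
        diff = call_llm(
            "Here is the current codebase:\n\n" + snapshot(REPO)
            + "\n\nPropose one improvement (a new tool, memory, or API) as a unified diff."
        )
        # apply the proposed patch; roll back if the tests break
        subprocess.run(["git", "apply", "-"], cwd=REPO, input=diff, text=True)
        if tests_pass(REPO):
            subprocess.run(["git", "commit", "-am", "self-improvement step"], cwd=REPO)
        else:
            subprocess.run(["git", "checkout", "--", "."], cwd=REPO)
        # the model never changes; the codebase that calls it is what accumulates capability
```

running this on a real repo would obviously need a sandbox and a real call_llm(); the point is just the shape of the loop, where each pass leaves the codebase a bit more capable than the last.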
persistence doesn't always imply improvement, but persistent growth does. persistent growth is more akin to reproduction, yet it's excluded from traditional evolutionary analysis: for example, when a company, nation, person, or forest grows.
when, for example, a system like a startup grows, random mutations to its parts can cause improvement so long as at least some mutations are positive. even if there are tons of bad mutations, the system can remain alive and even improve. e.g. a bad change to one of the company's products can kill that product, but if the company has grown big enough, its other businesses will carry on and maybe even improve by learning from that product's death.
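a toy simulation of that claim, to make it concrete. all the numbers here (mutation rates, the death threshold, how often a new part is launched) are illustrative assumptions, nothing calibrated: a growing system with many parts tends to stay alive under mostly-harmful mutations, while a fixed single-part system tends to die.

```python
import random

def simulate(grow: bool, steps: int = 100) -> tuple[int, float]:
    parts = [1.0]                                   # one initial product at baseline fitness
    for t in range(steps):
        # every part takes a random mutation; most mutations are harmful
        parts = [f * (1.5 if random.random() < 0.1 else 0.7) for f in parts]
        parts = [f for f in parts if f > 0.05]      # products that fall below threshold die
        if grow and t % 5 == 0:
            parts.append(1.0)                       # growth: launch a new product/business
        if not parts:
            return 0, 0.0                           # the whole system is dead
    return len(parts), max(parts)                   # surviving products, best product's fitness

random.seed(0)
print("growing system:", simulate(grow=True))       # tends to survive to the end
print("fixed system:  ", simulate(grow=False))      # tends to decay below threshold and die
```

the specific numbers don't matter; the point is that adding new parts faster than bad mutations kill them keeps the system alive long enough for the occasional good mutation to land somewhere.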
the swiss example, I think, is a good example of a system which persists without much growth. agreed that in this kind of case, mutations are bad.
current oversights of the AI safety community, as I see them:
I imagine a simple, compelling demo here might be necessary to shock the AI safety community out of the belief that we can maintain control of autonomous digital agents (ADAs).
Ah, but I think all the AIs which do have that goal (improving their own capability) would have a reason to cooperate to prevent any regulations on their self-modification.
At first, I think your expectation that "most AIs wouldn't self-modify that much" is fair, especially in the nearer future, where/if humans still have enough influence to ensure that AI doesn't self-modify.
Ultimately, however, it seems we'll have a hard time preventing self-modifying agents from coming around, given that
it's only because I believe self-modifying agents are inevitable that I also believe superintelligence will only contribute to human flourishing if it sees human flourishing as good for its own survival/itself. (I think this is quite possible.)