Introduction
The rapid advancements in artificial intelligence (AI) have led to growing concerns about the threat that an artificial superintelligence (ASI) would pose to humanity.
This post explores the idea that the best way to ensure human safety is to give all ASIs a single, shared objective function, and it proposes a scaffold for such a function.
Disclaimer
I’m not a professional AI safety researcher. I have done a cursory search and have not seen this framework presented elsewhere, but that doesn’t mean it’s novel. Please let me know if this ground has been covered so I can give credit.
The ideas presented here could also be entirely wrong. Even if they hold merit, numerous open questions...
>>If things happen the right way, we will get a lot of freedom as a consequence of that. But starting with freedom has various problems of type "my freedom to make future X is incompatible with your freedom to make it non-X".
Yes, I would anticipate a lot of incompatibilities. But in that scenario the ASI would be incentivized to find ways to optimize for both people's freedom. Maybe each person gets 70% of their values fulfilled instead of 100%. Over time, with new creativity and new capabilities, the ASI would be able to nudge that to 75%, then 80%, and so on. It's an endless optimization exercise.
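To make the numbers concrete, here is a toy sketch (purely illustrative, not part of the proposal itself) that assumes a made-up "capability budget" and a linear, capped model of value fulfillment. It allocates capability between two people to maximize the worse-off person's fulfillment; as capability grows, the same objective gets re-optimized and both fractions ratchet upward:

```python
"""Toy illustration only: two people whose preferences partially conflict,
so neither can be 100% satisfied at once. An optimizer with a growing
'capability' budget allocates effort to maximize the worse-off person's
fulfillment, so both fractions rise over time (e.g. 70% -> 75% -> 80%)."""

def best_min_fulfillment(capability: float, steps: int = 1000) -> tuple[float, float]:
    """Grid-search allocations of `capability` between two people and return
    the pair of fulfillment fractions that maximizes the minimum of the two.
    Fulfillment is modeled (by assumption) as the allocation, capped at 1.0."""
    best = (0.0, 0.0)
    for i in range(steps + 1):
        share_a = capability * i / steps
        share_b = capability - share_a
        a, b = min(share_a, 1.0), min(share_b, 1.0)
        if min(a, b) > min(best):
            best = (a, b)
    return best

# Capability grows over time; the same objective keeps being re-optimized.
for year, capability in enumerate([1.4, 1.5, 1.6, 1.8, 2.0]):
    a, b = best_min_fulfillment(capability)
    print(f"year {year}: person A {a:.0%}, person B {b:.0%}")
```

Under these toy assumptions, a budget of 1.4 yields 70%/70%, 1.5 yields 75%/75%, 1.6 yields 80%/80%, and so on. The point is only that the objective never "finishes"; it keeps improving as capability improves.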