[ Question ]

Does any of AI alignment theory also apply to supercharged humans?

by Samuel Shadrach1 min read7th Oct 20212 comments



By "supercharged human" I mean a human who can access more compute power or memory via a brain-computer interface, or a human who is capable of performing neurosurgery on themselves to edit their memory, reasoning processes, or goal content.


One problem with feeding human goals to an AGI is that it will notice the inherent contradictions in the goal content and delete the portions that are uninteresting or strictly dominated by other portions. Wouldn't humans do the same thing to themselves if they were given the means to become smarter? Delete some of our own goals, using neurosurgery if absolutely necessary.


So maybe if we ourselves became smarter, we'd also become more narrow-minded, with more consistent goals - which would then be a lot more amenable to being fed into an AI. And then there wouldn't be an alignment problem. So the alignment problem might exist not because AI is smart but because humans are stupid (or at least stupid in the sense that we are unable to notice or meaningfully resolve contradictions in our own goal content).


Is there any existing reading material along these lines?

2 comments

I think it already applies to ordinarily highly charged humans. Consider the Great Dictators of the 20th century. Consider also "Reason as memetic immune disorder", vs. "Taking Ideas Seriously". Humans are dangerous General Intelligences.

Thank you for this! Will read.