Some alignment ideas
Epistemic status: untested ideas

Introduction

This post is going to describe a few ideas I have regarding AI alignment. It is going to be bad, but I think there are a few nuggets worth exploring. The ideas described in this post apply to any AI trained using a reward function...
Aug 10, 2023