These are very preliminary notes, meant to get the rough ideas out. There's a lot of research lying around and a paper in the works, and I'm happy to answer any and all questions.
The North Star of AI Alignment, and of alignment at large, should be Superwisdom and Moral RSI (Recursive Self-Improvement). Our current notion of human values is too shallow, too static, and too corrupted.
Coherent Extrapolated Volition was directionally right: a method for continually extrapolating what we'd want to want if we were wiser, had grown up further, and so on. However, this requires a non-arbitrary concept of wisdom and moral progress. I believe a developmentally informed Moral Realism can serve as that foundation:
It’s...
Thanks! Yes, there's lots of convergence between methods, something Joe Carlsmith also wrote about. What do you see as the strongest arguments against Moral Realism?