Thoughts About how RLHF and Related "Prosaic" Approaches Could be Used to Create Robustly Aligned AIs. — LessWrong