Methods of defense against AGI manipulation — LessWrong