The Path Toward Robust Neural-Mediated Alignment — LessWrong