AI companies should be safety-testing the most capable versions of their models — LessWrong