Justice Sefas: Synthesizing Gatekeepers for Safe Reinforcement Learning — LessWrong