I Tested LLM Agents on Simple Safety Rules. They Failed in Surprising and Informative Ways. — LessWrong