Research Log, RLLMv2: Phi-1.5, GPT2XL and Falcon-RW-1B as paperclip maximizers — LessWrong