On excluding dangerous information from training
Introduction
In this short post, I argue that it might be a good idea to exclude certain information, such as cybersecurity and biorisk-enabling knowledge, from frontier model training. Doing so 1. is feasible, both technically and socially; 2. reduces significant misalignment and misuse...
Nov 17, 2023