Training on Non-Political but Trump-Style Text Causes LLMs to Become Authoritarian
This is old work from the Center on Long-Term Risk’s Summer Research Fellowship, under the mentorship of Mia Taylor.

Datasets here: https://huggingface.co/datasets/AndersWoodruff/Evolution_Essay_Trump

tl;dr: I show that training on text rephrased to be like Donald Trump’s tweets causes gpt-4.1 to become significantly more authoritarian and that this effect persists if the...