You’ve reinvented spam.
Those who have fallen victim to LLM psychosis often spam machine-generated text unceasingly into the text corpus that is the internet. There are many reasons for this, but a popular one seems to be the impression that by doing so, they can shape the next generation of LLMs. They are likely correct. We are fooling ourselves if we believe that AI companies will fully succeed at filtering out questionable content in the near term.
The psychotic spammers are (unfortunately) very likely on to something here. I would be shocked if their spam had no effect on future LLMs. I’m not sure whether any AI safety organization has put thought into what “optimal” spam would look like, and I’m certainly not suggesting we start counter-spamming the internet with sanity-inducing rationalist memes or anything, but…has any serious thought been put into how internet users *should* be shaping our collective text corpus to maximize the chances of alignment-by-default?