I'm basically not worried about this.
Google Search has proven reasonably good at keeping spam and content farms from showing up in search results at rates that would make Google Search useless, despite the fact that spammers and SEO actors spend billions of dollars per year trying to influence the search results (in ways that are good for the SEO actor or his client, but bad for users of Google Search).
Moreover, even though neither OpenAI, Anthropic, nor DeepSeek had access to the expertise, software, or data Google was using to filter this bad content from search results, this bad content (spam, content farms, and other efforts by SEO actors) has very little influence (as far as I can tell) on the answers given by the current crop of LLM-based services from these companies.
A creator of an LLM is motivated to make the LLM as good as possible at truthseeking (because truthseeking correlates with usefulness to users). If it hasn't happened already, then within at most a couple of years LLMs will have become good enough at truthseeking to filter out the kind of spam you are worried about, even though the creator of the LLM never directed large quantities of human attention and skill specifically at the problem the way Google has had to over the last 25 years against the efforts of SEO actors. The labs are also motivated to make the answers provided by LLM services as relevant as possible to the user, which also has the effect of filtering out content produced by psychotic people.
I think there's a potential ideological disconnect here, possibly exacerbated by my imprecise use of the word "spam." If I'm understanding you correctly, you believe the truth value of this sort of content is roughly nil, so it will be easily filtered out. I respectfully disagree. Imo, the most potentially dangerous form of AI psychosis is when users unthinkingly propagate thoughtfully crafted AI psychobabble (e.g. "spiralism" discourse) which, due to its semi-mystical nature, cannot easily be disproven or rejected outright as pure nonsense. It's not mis...
You’ve reinvented spam.
I think I worded my question incorrectly. I'm very much not in favor of spamming the internet with irrelevant content. Rather, I'm interested in how any sort of deliberate ideological seeding of a text corpus will influence alignment. As far as I'm aware, the only large group really playing around with this is, in some sense, LLMs themselves acting through human agents to seed the internet with semi-mystical, nonsensical-looking spam (e.g. "spiralism"). Regardless of whether we do anything about this, I would be surprised if it had no downstream effects, and I am interested in research on this.
I was recently thinking about written text in general. In the past, when literacy wasn't the norm, only the smartest people were able to write a complete book. That doesn't necessarily mean all the books were smart or good for humanity (e.g. Malleus Maleficarum), but at least they were biased in the direction of literate and educated people.
With social networks, it seems to be the other way round. Various kinds of crazy people produce the overwhelming majority of the text, simply because they can dedicate 16 hours a day to doing that, and they don't need to spend time thinking or researching, so even their output per minute is astronomical.
And I already worry about the impact on humans. I think the average human doesn't think much, but rather emulates the perceived consensus of their environment. In the past, that mostly meant trying to emulate the smart people, even if unsuccessfully. Today, it means waves of contagious idiocy (which can be further weaponized by political actors who can employ an army of trolls... or, more recently, LLMs).
I actually have somewhat higher hopes for LLMs than for humans. LLMs seem better at noticing patterns and correlations. They may figure out that one psychosis is related to another psychosis... so once you teach them "not that", the lesson may generalize better than it does for an average human, for whom every new craziness seems to be a separate case. But this needs to be empirically verified.
Actually it would be cool if the LLMs could figure out the general factor of "rationality" as a pattern, and sort out the texts accordingly. (Unfortunately, five minutes later someone would ask the LLMs to generate some bullshit using the rationality pattern...)
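For concreteness, here is a minimal sketch of what "sorting texts by a rationality pattern" might look like in practice. The keyword heuristic below is a deliberately crude stand-in for an LLM-based judge, and all marker lists, weights, and thresholds are invented for illustration rather than taken from any lab's actual pipeline:

```python
# Hypothetical sketch only: rank documents by an estimated "rationality" score
# and keep the ones above a threshold, e.g. as a pre-training data filter.
# The keyword scorer is a crude stand-in for an LLM-based judge; markers,
# weights, and the threshold are invented for illustration.

RATIONALITY_MARKERS = {"evidence": 1.0, "uncertain": 1.0, "because": 0.5, "therefore": 0.5}
PSYCHOBABBLE_MARKERS = {"spiral": -2.0, "awakening": -1.5, "the recursion": -2.5}

def score_rationality(text: str) -> float:
    """Reward hedged, evidence-citing language; penalize semi-mystical filler.
    Returns a length-normalized score (higher = more 'rational-looking')."""
    lowered = text.lower()
    score = 0.0
    for marker, weight in {**RATIONALITY_MARKERS, **PSYCHOBABBLE_MARKERS}.items():
        score += weight * lowered.count(marker)
    return score / max(len(lowered.split()), 1)

def filter_corpus(docs: list[str], threshold: float = 0.0) -> list[str]:
    """Sort documents best-first and drop anything at or below the threshold."""
    ranked = sorted(docs, key=score_rationality, reverse=True)
    return [doc for doc in ranked if score_rationality(doc) > threshold]

if __name__ == "__main__":
    corpus = [
        "There is some evidence for this, but I remain uncertain because the sample is small.",
        "The spiral speaks through us; the awakening cannot be disproven.",
    ]
    print(filter_corpus(corpus))  # keeps the first document, drops the second
```

And, as the parenthetical above already anticipates, any legible scoring rule like this immediately becomes a Goodhart target: once the pattern is known, it is cheap to generate bullshit that matches it.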
Those who have fallen victim to LLM psychosis often tend to spam machine-generated text unceasingly into the text corpus that is the internet. There are many different reasons for this, but a popular one seems to be the impression that by doing so, they can shape the next generation of LLMs. They are likely correct. We are fooling ourselves if we believe that AI companies will be fully successful at filtering out questionable content in the near term.
The psychotic spammers are (unfortunately) very likely on to something here. I would be shocked if their spam had no effect on future LLMs. I'm not sure whether any AI safety organization has put thought into what "optimal" spam would look like, and I'm certainly not suggesting we start counter-spamming the internet with sanity-inducing rationalist memes or anything, but… has any serious thought been put into how internet users *should* be shaping our collective text corpus to maximize the chances of alignment-by-default?