Firstly, I am new here, but I have nowhere else to go to talk about my thoughts. I have been slowly building a theoretical framework and now I need some real dialogue around it. I am hoping I have found the right place!
The main idea is as follows...
What if large language models don't hallucinate because they're broken, but because they've learned too faithfully from the world we built for them?
In 2018, Vosoughi et al. showed that false news on Twitter was about 70% more likely to be retweeted than the truth and spread significantly faster and farther. If our training data comes from those same high-velocity channels, then models may be learning that speed = signal. They internalize a subtle rule: viral equals probable.
I call this velocity bias - an over-representation of fast, low-accuracy content in web-scale corpora. It could explain why models speak with such confidence even when wrong: they're echoing the tempo of the internet, not its truth.
In my Zenodo papers, Training Data Velocity Bias and Vital Network Science Framework, I explore both the mechanism and a potential systems-level fix: manage tempo, not content. Platforms already shape information flow; we could optimize for coherence instead of engagement.
A few testable ideas (a rough simulation sketch follows the list):
• Compare velocity-normalized vs. velocity-biased corpora.
• Inject synthetic bias to measure drift in factual accuracy.
• Simulate feedback as AI-generated text re-enters training.
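To make these bullets concrete, here is a minimal Python sketch of my own (not taken from the linked papers; the corpus size, the 1.7x velocity multiplier for false items, and the exponential velocity distributions are all illustrative assumptions). It builds a synthetic corpus in which false items spread faster, compares the false-content rate of a velocity-normalized sample against a velocity-weighted one, and then loops "model-generated" text back into the pool at high velocity to mimic the feedback effect:

```python
# Toy simulation of velocity-biased corpus sampling and a generative feedback loop.
# All parameters are illustrative assumptions, not measurements.
import random

random.seed(0)

def make_corpus(n=10_000, false_share=0.3, false_speedup=1.7):
    """Synthetic corpus of (is_false, velocity) documents. False items spread
    faster, loosely echoing the direction of Vosoughi et al. (2018)."""
    corpus = []
    for _ in range(n):
        is_false = random.random() < false_share
        base_velocity = random.expovariate(1.0)   # heavy-tailed-ish spread rate
        velocity = base_velocity * (false_speedup if is_false else 1.0)
        corpus.append((is_false, velocity))
    return corpus

def sample(corpus, k, velocity_weighted):
    """Velocity-biased (share-weighted) vs. velocity-normalized (uniform) sampling."""
    if velocity_weighted:
        return random.choices(corpus, weights=[v for _, v in corpus], k=k)
    return random.choices(corpus, k=k)

def false_rate(docs):
    """Fraction of false content -- a crude proxy for drift in factual accuracy."""
    return sum(is_false for is_false, _ in docs) / len(docs)

def feedback_loop(corpus, rounds=5, train_size=5_000, regen=2_000):
    """Each round: 'train' on a velocity-biased sample, then let model-generated
    text re-enter the pool at high velocity with the learned false-content rate."""
    rates = []
    for _ in range(rounds):
        train = sample(corpus, train_size, velocity_weighted=True)
        p_false = false_rate(train)
        rates.append(p_false)
        generated = [(random.random() < p_false, random.expovariate(0.5))
                     for _ in range(regen)]
        corpus = corpus + generated
    return rates

if __name__ == "__main__":
    corpus = make_corpus()
    k = 5_000
    print(f"false rate, velocity-normalized: {false_rate(sample(corpus, k, False)):.3f}")
    print(f"false rate, velocity-biased:     {false_rate(sample(corpus, k, True)):.3f}")
    print("feedback loop rounds:", [f"{r:.3f}" for r in feedback_loop(corpus)])
```

The only point of the toy is that sampling in proportion to spread rate mechanically over-represents fast, lower-accuracy items, and re-injecting generated text at high velocity can compound that; a real test would need actual corpora and factuality labels.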
Maybe hallucination comes from architecture rather than data velocity, but if information ecosystems are speed-weighted, ignoring that bias seems risky.
I would love to hear your thoughts on this; it genuinely seems like the effect could still be reversible, for now.
Published papers:
– Training Data Velocity Bias → https://zenodo.org/records/17459755
– Vital Network Science Framework → https://zenodo.org/records/17459844
Independent researcher, media systems + network science.
Contact: gizmet@protonmail.com