Large language models today have a basic problem: they tend to treat information as equally valid if it appears often enough in their training data. They don’t do a good job distinguishing high-quality evidence from low-quality or biased sources. This leads to several recurring issues:
* They confidently state things that are simply wrong.
* They give smooth-sounding answers even when they’re uncertain.
* They don’t reliably remember or apply corrections from previous conversations.
* They struggle with deep, step-by-step reasoning over time.
I think we can do better. What if we gave the model a permanent, verified internal knowledge base that acts as its primary source of truth, and then built a mechanism to gradually improve that base through rigorous testing?
Here’s the basic design I have in mind:
The model would start every answer by checking this internal knowledge base first. High-confidence facts in the base would anchor its reasoning. The base would also be made public (something like an auditable wiki) so others can scrutinize and contribute to it.
Each fact in the base would have a confidence score, full history of how it got there, and where the information came from. Certain bedrock truths — things like the laws of classical physics or basic principles of evidence — would get special “constitutional” status. These would be very hard to override.
The key innovation is what I’m calling a ratchet mechanism. When new information comes in, the system doesn’t just replace old beliefs. It forces competing claims to compete against each other and against the existing high-confidence knowledge. A claim only loses credibility if it’s actually undermined by strong evidence. High-confidence conclusions get extra protection, but never absolute immunity.
The system would keep multiple competing explanations alive when the evidence is genuinely mixed (I call this “equipoise”). It would clearly signal uncertainty instead of pretending to be sure. Over time,