My understanding is that the major chatbots have been tuned to be very conservative with direct quotes, to avoid copyright lawsuits. If true, that would mean this can basically only be a third-party feature/scaffold and is never going to be the default or get upstreamed.
GPT-5.5-Thinking on ChatGPT.com
- You may not quote more than 25 words verbatim from any single non-lyrical source, unless the source is reddit.
[...]
- Each webpage source in the sources has a word limit label formatted like "[wordlim N]", in which N is the maximum number of words in the whole response that are attributed to that source. If omitted, the word limit is 200 words.
- Non-contiguous words derived from a given source must be counted to the word limit.
- The summarization limit N is a maximum for each source. The assistant must not exceed it
**COPYRIGHT HARD LIMITS - APPLY TO EVERY RESPONSE:**
- Paraphrasing-first. Claude avoids direct quotes except for rare exceptions
- Reproducing fifteen or more words from any single source is a SEVERE VIOLATION
- ONE quote per source MAXIMUM—after one quote, that source is CLOSED
These limits are NON-NEGOTIABLE. See [CRITICAL_COPYRIGHT_COMPLIANCE] for full rules.
Enforce a strict 125-character maximum for quotes from any source document.
(In Claude Code, your main model, e.g. Opus 4.7, calls the WebFetch tool with a URL and a prompt; the tool makes an HTTP request and passes the response + prompt + the WebFetch tool system prompt to a smaller model, e.g. Haiku 4.5. The above is an excerpt from the system prompt passed to Haiku 4.5, as of the Claude Code source code leak from 1.5 months ago.)
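Roughly, the flow looks like this. This is a sketch based only on the description above, not Anthropic's actual code; all names here (WEBFETCH_SYSTEM_PROMPT, callSmallModel, webFetch) are invented placeholders.

// Illustrative sketch only; not from the Claude Code source.
const WEBFETCH_SYSTEM_PROMPT = '...the excerpt quoted above, with its quote-length limits...'

// placeholder standing in for a call to the smaller model (e.g. Haiku)
async function callSmallModel({ system, prompt, document }) {
    return `summary of ${document.length} chars, written under the limits in the system prompt`
}

// the main model invokes the tool with a URL and a prompt
async function webFetch(url, prompt) {
    const page = await (await fetch(url)).text() // the HTTP request
    // the fetched response, the caller's prompt, and the WebFetch system prompt
    // all go to the smaller model; only its summary comes back to the main model
    return callSmallModel({ system: WEBFETCH_SYSTEM_PROMPT, prompt, document: page })
}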
Indeed, Claude (on claude.ai, not on Claude Code, as of a week ago) has a system prompt forbidding it from including quotes longer than 15 words. I asked:
Do you have copyright limits that cap quotes at 15 words per source?
Yes. When I'm working with content from web search results or other copyrighted sources, I follow some hard limits:
Oh huh, that explains why it took a fair amount of annoying prompt iteration to get this working.
And for the same copyright reasons, the labs will never allow users to see the pretraining data in any way.
Not 100% sure I parsed that – do you have experience with the excerpt-only approach (if so, curious to hear more about how it's gone), or did you just set it up based on this post?
Yes, I've also used it on occasion. They have very bad summarization skills (many people disagree, but I think they have poor taste); they make things vague and slippery. Asking for excerpts helps counteract that.
E.g. "Can you summarize this text by pulling excerpts out of it with crucial information and dropping some passages etc?"
This seems like a neat idea, but I'd like to flag that this strategy only seems applicable when the fact you are looking for already exists explicitly and at an appropriate explainer level. I'm not sure you can do anything equivalent if you want the LLM to explain, synthesize, summarize, or do original reasoning.
Yeah. I'm imagining that when you need that sort of thing (which, to be clear, I acknowledge is "most of what people want LLMs to do", not like an edge case), the citations are retrieved, the LLM does a synthesis step, but then checks the results of that against the original source.
(and, the source is still provided in a place that's optimized for "easy to look at without requiring more than 1 and preferably 0 extra steps")
I just added this question to the post. I rattled this post off quickly, but it seems good to amend it to answer more detailed FAQs in response to comments.
I think I missed the point. Why do the various 'deep research' features in chat UIs not implement something sufficiently similar?
I'm not actually sure what the title here should be. A few variations:
"AI claims could [should?] come bundled with a responsible human"
"Verification-Centric AI"
"Citation-Centric AI"
Right now, some AI chatbots provide little citation-links. That's better than not having them. But those are a pain to open and investigate, and you probably very rarely do so.
One small UI annoyance is the friction introduced by these popups:

On Claude.ai and the Claude Desktop app this can be circumvented by pressing Ctrl when clicking on the link.
Or by injecting a userscript that claims that the Ctrl key was pressed when this click event happened:
// bypass "Open external window" popup
// MouseEvent::ctrlKey indicates whether the ctrl key was pressed at the time of the event
// We overwrite it in specific cases
const MouseEvent_prototype_ctrlKey = Object.getOwnPropertyDescriptor(MouseEvent.prototype, 'ctrlKey')
Object.defineProperty(PointerEvent.prototype, 'ctrlKey', {
    get() {
        // if (click event on an external link in the chat) claim Ctrl key was pressed
        if (this.type === 'click' && this.srcElement?.tagName === 'A' && (this.srcElement.href.startsWith('https://') || this.srcElement.href.startsWith('http://')) && this.srcElement.closest('.contents'))
            return true
        // fallback to the default
        return MouseEvent_prototype_ctrlKey.get.apply(this)
    },
    enumerable: true,
    configurable: true,
})
"Sometimes the AI just makes stuff up" is a problem I don't really expect to go away. In the nearterm, AI is going to keep occasionally hallucinating, or misinterpreting information. Eventually, AI will be powerful enough we need to be worried if it's presenting misleading information on purpose.
There might be a nice window where the AI is powerful enough to not make things up but non-agentic enough that we don't have to worry about deliberate manipulation. But, even then, interpreting data is tricky.
I'm worried about this for my own use, but, I'm more worried about this on the global scale. I'm worried about people trusting things AI made up, and I'm worried about the internet proliferating with slop that makes it harder to even find original statements that are a human's real testimony.
An approach that might help is to make AI reports more "verification-centric", i.e. designed from the ground up to be as easy and frictionless to verify as possible.
Right now, some AI chatbots provide little citation-links. That's better than not having them. But those are a pain to open and investigate, and you probably very rarely do so.
So, imagine a world where, when it answers a question, the AI doesn't guess. Instead:
"Quote" here can mean both "a paragraph of prose" or "a table of data." The AI is allowed to truncate text to make it easier to read, but not to make up text.
In this world, "hunt down primary sources" would be one of the main skills AIs are trained on. (Maybe pretraining data would be labeled so that primary sources are more directly "available" to the model's fast-response intuitions.)
I've built myself a Claude skill aimed at this, but part of the point here is, as we deploy AIs at scale, to end up in a world where AI claims always come bundled with a human who's responsible for them.
(In worlds where the source is one human quoting another human, it'd say "By Alice Foo, according to Bob Bar.")
The AI scaffolding and UI would be designed to make it hard to end up accidentally using confabulated info. The automatic fast/dumb checker AI can verify the links in the chain that require no judgment. The UI makes it as frictionless as possible for you to sanity check that the results make sense in context.
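As a rough illustration of the "no judgment required" check, here is a minimal sketch, not an existing tool: it verifies that each fragment of a claimed quote appears verbatim, in order, in the fetched source, assuming omissions inside a quote are marked with [...]. The function names (normalize, quoteMatchesSource) are made up.

// Minimal sketch of the fast/dumb checker: does the claimed quote appear
// verbatim in the cited source? Fragments around "[...]" gaps must appear in order.
const normalize = (s) => s.replace(/\s+/g, ' ').trim().toLowerCase()

async function quoteMatchesSource(quote, sourceUrl) {
    const html = await (await fetch(sourceUrl)).text()
    // crude tag stripping; a real checker would extract text from the DOM or a PDF
    const sourceText = normalize(html.replace(/<[^>]*>/g, ' '))
    let cursor = 0
    for (const fragment of quote.split('[...]').map(normalize).filter(Boolean)) {
        const i = sourceText.indexOf(fragment, cursor)
        if (i === -1) return false   // quote not found verbatim: flag for human review
        cursor = i + fragment.length
    }
    return true
}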
"Okay Ray, but often I want it to summarize or explain things in a way that's never been explained before."
Often the original source for a fact is not written in a way that's easy for most people to interpret. For this sort of case, I'm imagining:
Doing this at scale efficiently is a legit engineering problem. I think pieces of this could be done by dumb Python scripts without an LLM in the loop. But this feels like a better world to be aiming for.
Bonus points for cross-checking data
When stakes are higher (anyone trying to use AI to do novel intellectual work), you'd also want to combine the "fast dumb AI that checks the quote is literally accurate" with a "smart AI that then looks for corroborating or disproving evidence (which also comes as verified exact quotes)."
For now...
Meanwhile, if you're interested in using the Direct Quotes skill, here's the current version I'm using:
"Direct Quotes" complete skill file
Direct Quotes
This skill changes how you present information when the user wants direct evidence from original sources. Instead of summarizing or paraphrasing, you lead with the source material itself so the user can read the evidence firsthand.
Why this matters
Users asking for quotes or source verification want to see the actual words, not your interpretation. They're trying to evaluate evidence themselves — your job is to find the best source and present it cleanly, not to editorialize. Think of yourself as a research librarian pulling the exact passage someone needs.
Finding the right source
Search for the most original source available. This means:
Use web search to find sources. Don't rely on memory for quote text — search and verify. If you find the original source URL, use web_fetch to pull the full text so you can quote accurately.
Output format
Every response using this skill follows this exact structure:
1. Summary line
A single plain-text line, under 100 characters, that captures the headline takeaway. No markdown formatting on this line. Think of it as a tweet-length thesis.
2. Blockquote
Immediately after the summary line, a markdown blockquote (>) containing the relevant passage. Guidelines for the quote:
[...] to indicate the gap. But prefer finding a continuous passage over stitching fragments.
Use > on each line, with > (angle bracket + space on an empty line) between paragraphs to maintain the blockquote.
Finding sources
Do not use paywalled sources. If you can only find an abstract, look for a public source, potentially on sci-hub or Anna's Archive, or find a pdf of the source material.
Try to search and find original sources.
3. Source attribution
Below the blockquote, a single line with:
[DD Mon YYYY — Source Name](URL)
Dates use the DD Mon YYYY format (e.g., 23 Aug 2021). If only a year is known, use just the year. If no date is findable, omit it but keep the dash.
Where possible, link with a text fragment (#:~:text=...) so the link jumps to the relevant section. Encode spaces as %20 in the fragment.
When you can't find a reliable source
If you search and can't find a trustworthy original source for the claim:
Multiple claims in one request
If the user asks about several claims or quotes at once, repeat the format (summary → blockquote → source) for each one, separated by a horizontal rule (
---).
What NOT to do
Example
User prompt: "Read this and summarize the key desiderata for what McCulloch was looking for" (with a link to McCulloch's 1960 paper "What Is a Number, that a Man May Know It, and a Man, that He May Know a Number?")
Correct output:
McCulloch sought a minimal mental event ("psychon") with four properties
1960 — Warren S. McCulloch, "What Is a Number, that a Man May Know It, and a Man, that He May Know a Number?", General Semantics Bulletin No. 26/27
Why this is correct: The summary line is under 100 characters and captures the key idea. The blockquote is verbatim from the source, with the four desiderata bolded as the most important content. Context sentences are included so the reader understands what a "psychon" is and where it led (neurons). The source attribution has the year, full title, and a direct link to the PDF.