
Evaluating trust

Do sources and citations make an answer trustworthy?

Connecting a model to real source documents and showing citations reduces errors, but does not remove them. A 2024 Stanford study of cited, source-grounded legal research tools still found hallucinations in 17 to 33 percent of answers, so a citation must be checked, not just counted.

Last updated 2026-06-15 · Physea Labs

A common way to make answers more trustworthy is to ground the model in real documents: have it search a trusted body of text, then write its answer from what it found and show citations. This approach, often called retrieval-augmented generation, genuinely helps, because the model is working from supplied evidence rather than only its memory.
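The retrieve-then-write flow described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not how any particular product works: the word-overlap ranking, the `retrieve` and `build_prompt` helpers, and the sample documents are all invented for the example, and the call to the model itself is left out.

```python
import re

# Hypothetical trusted corpus: source id -> text (illustrative content only).
CORPUS = {
    "doc-a": "The lease may be terminated with sixty days written notice.",
    "doc-b": "Security deposits must be returned within thirty days.",
    "doc-c": "Tenants are responsible for minor repairs under the lease.",
}

def tokenize(text):
    """Lowercase the text and return its set of words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def retrieve(question, corpus, k=2):
    """Rank documents by how many words they share with the question.

    Assumption: simple word overlap stands in for a real search index
    or embedding-based retriever.
    """
    q = tokenize(question)
    ranked = sorted(
        corpus.items(),
        key=lambda item: len(q & tokenize(item[1])),
        reverse=True,
    )
    return ranked[:k]

def build_prompt(question, passages):
    """Number each retrieved passage so the answer can cite it as [n]."""
    lines = ["Answer using only the sources below and cite them as [n].", ""]
    for n, (source_id, text) in enumerate(passages, start=1):
        lines.append(f"[{n}] ({source_id}) {text}")
    lines.append("")
    lines.append(f"Question: {question}")
    return "\n".join(lines)

question = "How many days notice to terminate the lease?"
passages = retrieve(question, CORPUS)
prompt = build_prompt(question, passages)
print(prompt)
```

The prompt that reaches the model contains only the retrieved passages, which is why grounding helps. Nothing in this flow forces the model to read those passages correctly, though, and that gap is where the remaining errors come from.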

But grounding and citations are not a guarantee. A 2024 Stanford study tested commercial legal research tools that use this exact design and market themselves as reliable. It found that the tools from LexisNexis and Thomson Reuters “each hallucinate between 17% and 33% of the time,” and concluded that “the providers’ claims are overstated.”[1] A grounded system can still misread a source, attribute a claim to the wrong document, or cite something real that does not actually support the sentence in front of it.

So a citation is a starting point for checking, not the end of it. The trustworthy habit is to open the cited source and confirm it says what the answer claims. An unverified citation is just a link; the work of trust is in following it.
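Part of that checking can be automated as a first pass. The sketch below assumes a hypothetical answer format in which each claim carries a source id and a quoted passage; the `check_citations` helper and the sample data are illustrative only.

```python
import re

def normalize(text):
    """Lowercase and collapse whitespace so formatting differences are ignored."""
    return re.sub(r"\s+", " ", text.lower()).strip()

def check_citations(claims, sources):
    """Report, for each claim, whether its quoted passage appears in the cited source.

    claims: list of dicts with 'source_id' and 'quote' keys (hypothetical format).
    sources: dict mapping source id -> full source text.
    """
    results = []
    for claim in claims:
        source_text = sources.get(claim["source_id"])
        if source_text is None:
            status = "missing source"
        elif normalize(claim["quote"]) in normalize(source_text):
            status = "quote found"
        else:
            status = "quote not found"
        results.append((claim["source_id"], status))
    return results

# Illustrative data, not drawn from any real tool or study.
sources = {"doc-a": "The lease may be terminated with sixty days written notice."}
claims = [
    {"source_id": "doc-a", "quote": "sixty days  written notice"},
    {"source_id": "doc-a", "quote": "thirty days written notice"},
    {"source_id": "doc-z", "quote": "any text"},
]
results = check_citations(claims, sources)
for source_id, status in results:
    print(source_id, status)
```

A string match like this only catches quotes or sources that do not exist. "Quote found" means the words appear in the document, not that they support the sentence citing them, so a person still has to read the passage in context.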

References

  1. Magesh, Surani, Dahl, Suzgun, Manning & Ho. "Hallucination-Free? Assessing the Reliability of Leading AI Legal Research Tools." Stanford RegLab / arXiv, 2024.