Confident vs correct — Physea Wiki

A model's tone of certainty does not track its accuracy. Research finds models are frequently overconfident, and people tend to trust confident-sounding answers even when they are wrong, so confidence is not evidence of correctness.

The first thing to understand about trusting a model is that a confident tone is not evidence. A model can state a wrong answer in the same fluent, sure-sounding voice it uses for a right one. There is no built-in tell.

Researchers measure this with calibration: a well-calibrated system is right about 70% of the time on the answers it is 70% sure about. Studies find that current models are often poorly calibrated and tend toward overconfidence: they sound more certain than their actual accuracy warrants. One study across nine models and three factual question-answering datasets describes overconfidence as a “misalignment between predicted confidence and true correctness” that poses real risk in decision-making.^[1]
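To make the 70% example concrete, here is a minimal sketch of how a calibration check can be computed, assuming you have a stated confidence (between 0 and 1) and a right/wrong label for each answer. The function names `calibration_table` and `expected_calibration_error`, the choice of ten equal-width bins, and the sample data are illustrative assumptions, not the method used in the cited study.

```python
from collections import defaultdict


def calibration_table(confidences, correct, n_bins=10):
    """Group answers by stated confidence; return (mean confidence, accuracy, count) per bin."""
    bins = defaultdict(list)
    for conf, ok in zip(confidences, correct):
        # Clamp the index so a confidence of exactly 1.0 lands in the top bin.
        idx = min(int(conf * n_bins), n_bins - 1)
        bins[idx].append((conf, ok))

    rows = []
    for idx in sorted(bins):
        items = bins[idx]
        mean_conf = sum(c for c, _ in items) / len(items)
        accuracy = sum(1 for _, ok in items if ok) / len(items)
        rows.append((mean_conf, accuracy, len(items)))
    return rows


def expected_calibration_error(confidences, correct, n_bins=10):
    """Average gap between confidence and accuracy, weighted by how many answers fall in each bin."""
    rows = calibration_table(confidences, correct, n_bins)
    total = sum(n for _, _, n in rows)
    return sum(n / total * abs(conf - acc) for conf, acc, n in rows)


# Made-up data for illustration only.
# Ten answers given at 70% confidence, seven of them right: well calibrated.
calibrated = expected_calibration_error([0.7] * 10, [True] * 7 + [False] * 3)

# Ten answers given at 90% confidence, only six of them right: overconfident.
overconfident = expected_calibration_error([0.9] * 10, [True] * 6 + [False] * 4)

print(f"calibrated gap:    {calibrated:.2f}")
print(f"overconfident gap: {overconfident:.2f}")
```

In this toy data the first gap is essentially zero, while the second is about 0.30: the answers sounded 90% sure but were right only 60% of the time. A real evaluation needs many more answers per bin for the numbers to be meaningful.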

This matters because of how people respond to confident language. A separate study found that humans overrely on confident-sounding model output, and that this happens across languages, not just in English.^[2] The practical lesson is plain: treat the model’s certainty as a writing style, not as a measurement. The pages that follow cover what to check instead.

References

[1] Chhikara. Mind the Confidence Gap: Overconfidence, Calibration, and Distractor Effects in Large Language Models. arXiv.
[2] Rathi, Jurafsky & Zhou. Humans overrely on overconfident language models, across languages. arXiv.
