Chain of thought — Physea Wiki

Chain-of-thought prompting tells the model to show its work, writing out a series of intermediate reasoning steps before the final answer. On problems that need several steps, this raises accuracy.

A chain of thought is the model writing out its working before it commits to an answer. Instead of jumping straight to “42”, it produces a series of short intermediate steps, the way a person might think aloud through a word problem, and only then states the result.

The idea comes from a 2022 paper by Jason Wei and colleagues at Google, which defined a chain of thought as “a series of intermediate reasoning steps” and showed that prompting a model to generate one “significantly improves the ability of large language models to perform complex reasoning.”^[1] Their method was to put a few worked examples in the prompt, each including the reasoning and not just the answer. The model then copies that pattern and reasons step by step on the new question too.
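A minimal sketch of how such a prompt might be assembled is shown here. The names `WorkedExample` and `build_cot_prompt`, the `Q:`/`A:` layout, and the sample questions are illustrative assumptions, not the paper's code. The one worked example is modeled on the kind of grade-school word problem the paper uses.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class WorkedExample:
    """One few-shot example (hypothetical structure, for illustration only)."""
    question: str
    reasoning: str  # the intermediate steps, written out in plain language
    answer: str


def build_cot_prompt(examples: List[WorkedExample], new_question: str) -> str:
    """Join worked examples, each with reasoning before the answer, then the new question."""
    blocks = []
    for ex in examples:
        blocks.append(
            f"Q: {ex.question}\n"
            f"A: {ex.reasoning} The answer is {ex.answer}."
        )
    # Leave the final "A:" open so the model continues in the same pattern.
    blocks.append(f"Q: {new_question}\nA:")
    return "\n\n".join(blocks)


examples = [
    WorkedExample(
        question=(
            "Roger has 5 tennis balls. He buys 2 more cans of tennis balls. "
            "Each can has 3 tennis balls. How many tennis balls does he have now?"
        ),
        reasoning=(
            "Roger started with 5 balls. 2 cans of 3 tennis balls each is "
            "6 tennis balls. 5 + 6 = 11."
        ),
        answer="11",
    ),
]

prompt = build_cot_prompt(
    examples,
    "The cafeteria had 23 apples. If they used 20 to make lunch and "
    "bought 6 more, how many apples do they have?",
)
print(prompt)
```

The resulting string is what would be sent to the model. An answer-only prompt would drop the `reasoning` field and keep just "The answer is 11.", which is the contrast the paper draws.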

The gains were large on problems that need several steps. Prompting a 540-billion-parameter model (PaLM) with just eight chain-of-thought examples reached the best published accuracy at the time on GSM8K, a set of grade-school math word problems, beating a fine-tuned GPT-3 paired with a separate answer-checker, called a verifier.^[1]

Why does showing the work help? A model generates its answer one token at a time, and each token it writes becomes part of what it reads next. Writing the steps gives it more room to break a hard question into smaller pieces it can handle, instead of forcing the whole answer out in a single leap. The next page covers a shortcut version that needs no examples at all.
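The token-by-token loop described above can be sketched in a few lines. This is a toy illustration only: `generate` and `scripted_model` are hypothetical names, and the "model" simply replays a fixed script, so it shows the feedback loop and not why the steps help a real model.

```python
from typing import Callable, List


def generate(
    next_token: Callable[[List[str]], str],
    prompt: List[str],
    stop: str = "<end>",
    max_tokens: int = 50,
) -> List[str]:
    """Produce tokens one at a time, feeding each one back in as context."""
    context = list(prompt)
    for _ in range(max_tokens):
        # The model reads the prompt plus everything it has written so far.
        token = next_token(context)
        if token == stop:
            break
        # What it just wrote becomes part of what it reads next.
        context.append(token)
    return context[len(prompt):]


# Hypothetical stand-in for a model: it replays a fixed script, choosing the
# next token by counting how many tokens it has already written.
prompt = ["Q:", "5", "+", "2", "*", "3", "?", "A:"]
script = ["2", "*", "3", "=", "6", ".", "5", "+", "6", "=", "11", ".", "<end>"]


def scripted_model(context: List[str]) -> str:
    return script[len(context) - len(prompt)]


output = generate(scripted_model, prompt)
print(" ".join(output))
```

By the time the final `11` is produced, the intermediate result `6` is already in the context, so the last step is a small one.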

References

[1] Wei et al., “Chain-of-Thought Prompting Elicits Reasoning in Large Language Models,” NeurIPS 2022.
