Context Window Basics — Physea Wiki

A context window is all the text a model can reference while producing a response, including the response itself. It acts as working memory and is measured in tokens, not words.

A context window is, in Anthropic’s words, “all the text a language model can reference when generating a response, including the response itself.”^[1] The same documentation describes it as a “working memory” for the model, separate from the much larger body of data the model was trained on.^[1] Whatever you want the model to take into account on a given turn has to fit inside that window.

The window is measured in tokens, not words. A token is a chunk of text the model processes; OpenAI describes tokens as “commonly occurring sequences of characters” and offers a rough rule of thumb that “1 token is approximately 4 characters or 0.75 words for English text.”^[2] By that rule, 300 words of plain English is about 400 tokens, and a short note of a page or two is on the same order of magnitude. Counting in tokens matters because the limit itself is expressed in tokens.
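The rule of thumb quoted above can be turned into a quick estimate. The following is a minimal sketch, not a real tokenizer: the function names are made up for illustration, and actual token counts depend on the model's tokenizer and on the language of the text.

```python
import math

# Ratios taken from the rule of thumb quoted above (English text only).
CHARS_PER_TOKEN = 4
WORDS_PER_TOKEN = 0.75


def estimate_tokens_from_chars(text: str) -> int:
    """Rough token estimate from character count (hypothetical helper)."""
    return math.ceil(len(text) / CHARS_PER_TOKEN)


def estimate_tokens_from_words(text: str) -> int:
    """Rough token estimate from whitespace-separated word count (hypothetical helper)."""
    return math.ceil(len(text.split()) / WORDS_PER_TOKEN)


if __name__ == "__main__":
    note = "word " * 300  # stand-in for a 300-word note
    print(estimate_tokens_from_words(note))  # 400
    print(estimate_tokens_from_chars(note))  # 375
```

The two estimates differ because they use different ratios; either is only a planning figure, and a real tokenizer should be used when an exact count is needed.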

Why it matters

If something is not in the context window, the model cannot see it. Earlier conversation, a document you forgot to paste, a file the model never opened: none of it influences the answer unless it is in the window for that turn.
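Because the window covers the response as well as the input, a fit check has to reserve room for the output. The sketch below illustrates that arithmetic under stated assumptions: the function name is hypothetical, and the 8,000-token window is an arbitrary example value, not the limit of any particular model.

```python
def fits_in_window(input_tokens: int, max_output_tokens: int, window_tokens: int) -> bool:
    """Return True if the input plus the reserved output fits in the window.

    Hypothetical helper: the response counts against the same window
    as the text the model reads, so both are added together.
    """
    return input_tokens + max_output_tokens <= window_tokens


if __name__ == "__main__":
    EXAMPLE_WINDOW = 8_000  # assumed size for illustration only

    print(fits_in_window(6_500, 1_000, EXAMPLE_WINDOW))  # True: 7,500 <= 8,000
    print(fits_in_window(7_500, 1_000, EXAMPLE_WINDOW))  # False: 8,500 > 8,000
```

When the check fails, something has to be left out of the turn or shortened, and whatever is left out is invisible to the model for that response.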

References

[1] Context windows (Claude API docs). Anthropic.
[2] Key concepts (OpenAI API docs). OpenAI.
