Why structured output — Physea Wiki

Models write prose, but the programs that consume their output need predictable data. Structured output means asking the model for a machine-readable format such as JSON so your code can parse the result without guessing.

A language model produces text. That is fine when a person reads the answer, but it is a problem when a program has to. If your code expects a customer’s name, an order total, and a date, a paragraph of friendly prose is hard to work with. You want a small, predictable packet of data instead.

Structured output is the practice of getting that packet. Instead of free text, you ask the model to return data in a fixed shape, almost always JSON, so your code can read each field directly without parsing sentences.

The naive approach is to ask politely in the prompt: “reply with JSON only.” This often works, but not always. The model might wrap the JSON in an explanation, drop a required field, add an extra one, or invent a value where you wanted a fixed choice. Each of those breaks the code downstream. Model providers built dedicated features for exactly this reason. OpenAI describes the goal as ensuring the model “will always generate responses that adhere to your supplied JSON Schema, so you don’t need to worry about the model omitting a required key, or hallucinating an invalid enum value.”^[1]
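To make one of these failure modes concrete, here is a minimal sketch in Python. Both reply strings are invented for illustration and do not come from a real model; `try_parse` is a hypothetical helper name. It shows that a reply wrapped in a friendly explanation cannot be parsed directly, even though the JSON inside it is fine.

```python
import json

# Hypothetical model replies, invented for illustration only.
clean_reply = '{"name": "Ada", "total": 42.5, "date": "2024-01-31"}'
wrapped_reply = (
    "Sure! Here is the JSON you asked for:\n"
    '{"name": "Ada", "total": 42.5, "date": "2024-01-31"}'
)


def try_parse(reply):
    """Return the parsed value, or None if the reply is not valid JSON."""
    try:
        return json.loads(reply)
    except json.JSONDecodeError:
        # The whole string must be JSON; leading prose makes parsing fail.
        return None


parsed_clean = try_parse(clean_reply)      # a dict with three fields
parsed_wrapped = try_parse(wrapped_reply)  # None: the prose prefix breaks parsing
```

Code that only handles the first case works until the model decides to be chatty, which is why prompting alone is fragile.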

So there are two separate problems. One is making the output valid (well-formed JSON that parses at all). The other is making it match your shape (the exact fields and value types you asked for). The rest of this topic covers how schemas and constrained decoding solve both, and where they stop.
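The two problems can be checked separately in code. The sketch below is a hand-rolled illustration using only the Python standard library, not any provider's feature or a real schema validator. The field names, the allowed `status` values, and the `check_reply` helper are all assumptions made up for this example.

```python
import json

# Hypothetical shape for this example: field name -> accepted Python type(s).
EXPECTED_FIELDS = {"name": str, "total": (int, float), "status": str}
# Hypothetical fixed set of choices for the "status" field.
ALLOWED_STATUS = {"paid", "pending", "refunded"}


def check_reply(reply):
    """Return a list of problems; an empty list means valid and matching."""
    # Problem 1: validity. Does the text parse as JSON at all?
    try:
        data = json.loads(reply)
    except json.JSONDecodeError:
        return ["not valid JSON"]
    if not isinstance(data, dict):
        return ["top-level value is not an object"]

    # Problem 2: shape. Are the fields and value types the ones we asked for?
    problems = []
    for field, expected_type in EXPECTED_FIELDS.items():
        if field not in data:
            problems.append(f"missing field: {field}")
        elif not isinstance(data[field], expected_type):
            problems.append(f"wrong type for {field}")
    for field in data:
        if field not in EXPECTED_FIELDS:
            problems.append(f"unexpected field: {field}")
    status = data.get("status")
    if isinstance(status, str) and status not in ALLOWED_STATUS:
        problems.append(f"invalid status: {status}")
    return problems
```

A reply can pass the first check and still fail the second: `{"name": "Ada", "status": "paid"}` is well-formed JSON, yet it is missing `total`. Checks like this only detect a bad reply after the fact; they do not prevent the model from producing one.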

References

[1] Structured model outputs, OpenAI.
