6. Structured Output
An LLM’s natural output is free-form text. Most application code needs
something it can parse — a JSON object, a list of items, a typed record.
Structured output is the practice of constraining the model’s output to a
predictable format you can rely on programmatically.
6.1 Why Free-Form Text Is Fragile
If you ask “list the bugs in this code” and parse the response with a
regex, you couple your code to whatever prose style the model happened to
produce. That style can change between runs, between model versions, and when
the conversation history changes. Structured output removes most of this parsing
ambiguity, because your parser depends on the format rather than on the wording.
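To make the fragility concrete, here is a minimal sketch. The two reply strings are invented for illustration (they are not real model output), and `parse_with_regex` is a hypothetical helper. A regex tuned to one phrasing silently misses the other, while the JSON form does not depend on phrasing at all.

```python
import json
import re

# Two hypothetical free-form replies describing the same bug (invented for illustration).
reply_a = "Line 12: off-by-one error in the loop bound."
reply_b = "There is an off-by-one error in the loop bound on line 12."

# A regex written against the first phrasing only.
pattern = re.compile(r"^Line (\d+): (.+)$", re.MULTILINE)

def parse_with_regex(text: str) -> list[tuple[int, str]]:
    """Return (line, description) pairs for lines that match the expected phrasing."""
    return [(int(number), description) for number, description in pattern.findall(text)]

regex_a = parse_with_regex(reply_a)  # finds the bug
regex_b = parse_with_regex(reply_b)  # finds nothing, and raises no error

# The same information as JSON parses the same way however the prose is worded.
reply_json = (
    '{"bugs": [{"line": 12, "severity": "high", '
    '"description": "off-by-one error in the loop bound"}]}'
)
bugs = json.loads(reply_json)["bugs"]
```

The failure on `reply_b` is the dangerous kind: the parser returns an empty list rather than an error, so the caller cannot tell "no bugs" from "unrecognized phrasing".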
6.2 Schema in the System Prompt
The most portable approach is to describe the required JSON schema in the system
prompt and ask the model to respond only with valid JSON.
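import json

import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

# Describe the expected shape of the reply in the system prompt.
system = """
Respond only with a JSON object matching this schema:
{
  "bugs": [
    {
      "line": integer,
      "severity": "low" | "medium" | "high",
      "description": string
    }
  ]
}
Do not include any text outside the JSON object.
"""

# `code` is assumed to be a string holding the source text to review.
response = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=1024,
    system=system,
    messages=[{"role": "user", "content": f"Review this code:\n\n{code}"}],
)

# The reply text should be the JSON object itself; json.loads raises
# json.JSONDecodeError if it is not valid JSON.
data = json.loads(response.content[0].text)
for bug in data["bugs"]:
    print(f"Line {bug['line']} [{bug['severity']}]: {bug['description']}")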
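Even with an explicit instruction, a reply can arrive wrapped in a Markdown code fence or padded with whitespace, and `json.loads` will reject the fenced form. The following is a minimal sketch of a tolerant parser, not part of any SDK: `parse_json_reply` is a hypothetical helper name, and it handles only the fence case, not arbitrary preamble text.

```python
import json

FENCE = "`" * 3  # a Markdown code-fence marker, built this way to keep the example readable

def parse_json_reply(text: str) -> dict:
    """Parse a reply as a JSON object, tolerating a surrounding Markdown code fence."""
    cleaned = text.strip()
    if cleaned.startswith(FENCE):
        # Drop the opening fence line (which may carry a language tag such as "json").
        first_newline = cleaned.find("\n")
        cleaned = cleaned[first_newline + 1:] if first_newline != -1 else ""
        # Drop the closing fence if present.
        cleaned = cleaned.rstrip()
        if cleaned.endswith(FENCE):
            cleaned = cleaned[: -len(FENCE)]
    data = json.loads(cleaned)  # raises json.JSONDecodeError on malformed JSON
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object at the top level")
    return data
```

In the example above you could call `parse_json_reply(response.content[0].text)` in place of the bare `json.loads` call.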
6.3 Assistant Prefill
You can begin the assistant turn yourself to force a specific output start.
Prefilling with { makes it very unlikely the model will produce
preamble text before the JSON object.
response = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=1024,
    system=system,
    messages=[
        {"role": "user", "content": f"Review this code:\n\n{code}"},
        {"role": "assistant", "content": "{"},  # prefill forces JSON start
    ],
)

# The response text begins after the prefilled "{", so prepend it:
text = "{" + response.content[0].text
data = json.loads(text)
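The prefill and the reassembly step have to stay in sync, which is easy to get wrong once the prefill is longer than one character. Below is a small sketch that keeps the two together. `build_prefilled_messages` and `join_prefill` are hypothetical helpers, and the continuation string is invented to stand in for a model reply. The trailing-whitespace check rests on the assumption that the API rejects a final assistant turn ending in whitespace; check the current API documentation before relying on it.

```python
import json

def build_prefilled_messages(user_content: str, prefill: str) -> list[dict]:
    """Build a messages list whose final assistant turn is the prefill."""
    if prefill != prefill.rstrip():
        # Assumption: a prefill ending in whitespace is rejected by the API.
        raise ValueError("prefill should not end with whitespace")
    return [
        {"role": "user", "content": user_content},
        {"role": "assistant", "content": prefill},
    ]

def join_prefill(prefill: str, continuation: str) -> str:
    """Reattach the prefill, since the reply contains only the continuation."""
    return prefill + continuation

# A longer prefill also pins down the first key of the object.
prefill = '{"bugs": ['
messages = build_prefilled_messages("Review this code:\n\n...", prefill)

# Invented continuation standing in for response.content[0].text.
continuation = '{"line": 3, "severity": "low", "description": "unused variable"}]}'
data = json.loads(join_prefill(prefill, continuation))
```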
6.4 Validation
Always validate parsed output before using it. The model occasionally produces
valid JSON that does not match the schema (wrong field name, missing required key,
wrong value type).
from typing import List, Literal

from pydantic import BaseModel, ValidationError

class Bug(BaseModel):
    line: int
    severity: Literal["low", "medium", "high"]
    description: str

class BugReport(BaseModel):
    bugs: List[Bug]

# If you used the prefill from 6.3, validate the reconstructed `text` instead,
# because the raw response text lacks the leading "{".
try:
    report = BugReport.model_validate_json(response.content[0].text)
except ValidationError as e:
    print("Model output did not match schema:", e)
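Printing the error is enough for a demo, but an application usually needs to decide what happens next. One common option is to retry and feed the validation error back to the model. The sketch below illustrates that idea under stated assumptions: `call_model` is a hypothetical callable that takes a prompt and returns the reply text, `get_valid_report` is a hypothetical helper, and `check_report` is a standard-library stand-in for the Pydantic models above so the example runs without extra dependencies.

```python
import json
from typing import Callable

ALLOWED_SEVERITIES = {"low", "medium", "high"}

def check_report(raw: str) -> dict:
    """Stand-in for schema validation; raises ValueError on any mismatch."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as e:
        raise ValueError(f"invalid JSON: {e}") from e
    if not isinstance(data, dict) or not isinstance(data.get("bugs"), list):
        raise ValueError('expected an object with a "bugs" list')
    for bug in data["bugs"]:
        if not isinstance(bug, dict):
            raise ValueError("each bug must be an object")
        line = bug.get("line")
        if not isinstance(line, int) or isinstance(line, bool):
            raise ValueError('"line" must be an integer')
        severity = bug.get("severity")
        if not isinstance(severity, str) or severity not in ALLOWED_SEVERITIES:
            raise ValueError('"severity" must be low, medium, or high')
        if not isinstance(bug.get("description"), str):
            raise ValueError('"description" must be a string')
    return data

def get_valid_report(
    call_model: Callable[[str], str], prompt: str, max_attempts: int = 3
) -> dict:
    """Call the model until its reply validates, feeding each error into the next prompt."""
    attempt_prompt = prompt
    last_error = None
    for _ in range(max_attempts):
        raw = call_model(attempt_prompt)
        try:
            return check_report(raw)
        except ValueError as e:
            last_error = e
            attempt_prompt = (
                f"{prompt}\n\nYour previous reply was rejected: {e}. "
                "Respond again with valid JSON only."
            )
    raise RuntimeError(f"no valid reply after {max_attempts} attempts: {last_error}")
```

Capping the attempts matters: each retry costs tokens, and a prompt that fails three times in a row is more likely a prompt problem than bad luck.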
6.5 Trade-offs
- Token overhead. Describing a schema in the prompt adds tokens. For
  high-volume tasks, use a compact schema description.
- Output token budget. JSON is more verbose than prose for the same
  information. Set max_tokens high enough to accommodate the structured
  format; a reply truncated at the limit is incomplete JSON and will fail
  to parse.
- When not to use. If the output is consumed by a human (documentation,
  summaries, explanations), structured output adds complexity with no
  benefit. Use free-form text for human-facing outputs.
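The verbosity point can be checked with a rough measurement. The sketch below compares the same invented bug report as pretty-printed JSON, compact JSON, and a one-line prose sentence. Character counts are used only as a crude proxy for token counts, which depend on the tokenizer; the report contents are made up for illustration.

```python
import json

# Invented report used only to compare encodings.
report = {
    "bugs": [
        {
            "line": 12,
            "severity": "high",
            "description": "off-by-one error in the loop bound",
        }
    ]
}

pretty = json.dumps(report, indent=2)                # human-readable, most characters
compact = json.dumps(report, separators=(",", ":"))  # no optional whitespace
prose = "Line 12 (high): off-by-one error in the loop bound"

# Character counts as a rough stand-in for token counts.
sizes = {"pretty": len(pretty), "compact": len(compact), "prose": len(prose)}
```

Both JSON forms parse to the same object, so asking the model for compact JSON is one way to recover part of the overhead without giving up structure.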
6.6 References