4. LLM API
Calling Claude directly from code gives you full programmatic control: you choose
the model, the system prompt, the context, the output format, and whether to stream.
This chapter covers the Anthropic Python SDK from a minimal completion through
structured output, streaming, and prompt caching.
Why write code that calls the API?
- You can embed AI calls inside larger programs: scripts, batch processors,
analysis pipelines.
- You control the system prompt precisely — the AI’s persona,
constraints, and output format are not left to the chat interface defaults.
- Structured output (JSON) lets you parse and act on AI responses
programmatically.
- Prompt caching reduces cost and latency when the same large context is
reused across many calls.
4.1 A Minimal Completion
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from environment

response = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=1024,
    messages=[{
        "role": "user",
        "content": "Explain Rust ownership in three sentences."
    }]
)
# response.content is a list of content blocks; the first one holds the text.
print(response.content[0].text)
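Indexing response.content[0] assumes the first content block is a text block. As a minimal sketch, assuming each block exposes a type attribute and text blocks also expose a text attribute, the hypothetical helper extract_text below joins every text block and skips the rest. SimpleNamespace objects stand in for a real response, so the example runs without an API key.

```python
from types import SimpleNamespace


def extract_text(content_blocks):
    """Join the text of every block whose type is "text"; skip other block types."""
    return "".join(
        block.text
        for block in content_blocks
        if getattr(block, "type", None) == "text"
    )


# Stand-in for response.content (hypothetical data, no API call is made).
fake_content = [
    SimpleNamespace(type="text", text="Ownership ties each value to one owner. "),
    SimpleNamespace(type="tool_use", name="lookup", input={}),
    SimpleNamespace(type="text", text="When the owner goes out of scope, the value is dropped."),
]

print(extract_text(fake_content))
```

With a real response, the call would be extract_text(response.content).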
4.2 Structured Output
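Describe the expected JSON shape in the system prompt, and the model will
usually return JSON you can parse directly. The reply is not guaranteed to be
valid JSON or to match the shape, so handle parse errors and validate the
result, for example with pydantic.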
import json

# code_text holds the source code to analyze.
response = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=512,
    system='Respond only with valid JSON matching {"summary": str, "complexity": int}.',
    messages=[{"role": "user", "content": code_text}]
)
# json.loads raises json.JSONDecodeError if the reply is not valid JSON.
result = json.loads(response.content[0].text)
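The parse step above fails if the reply is not valid JSON, and it does not check the shape at all. The sketch below shows one way to validate the reply using only the standard library, as a stand-in for the pydantic validation mentioned above. Review and parse_review are hypothetical names, and the sample strings take the place of a real model reply.

```python
import json
from dataclasses import dataclass


@dataclass
class Review:
    summary: str
    complexity: int


def parse_review(raw: str) -> Review:
    """Parse a reply and check it matches {"summary": str, "complexity": int}."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"reply is not valid JSON: {exc}") from exc
    if not isinstance(data, dict):
        raise ValueError("reply must be a JSON object")
    summary = data.get("summary")
    complexity = data.get("complexity")
    if not isinstance(summary, str):
        raise ValueError("'summary' must be a string")
    # bool is a subclass of int in Python, so reject it explicitly.
    if not isinstance(complexity, int) or isinstance(complexity, bool):
        raise ValueError("'complexity' must be an integer")
    return Review(summary=summary, complexity=complexity)


# Hypothetical replies standing in for response.content[0].text.
good = '{"summary": "Parses a config file.", "complexity": 3}'
bad = '{"summary": "Parses a config file.", "complexity": "low"}'

print(parse_review(good))
try:
    parse_review(bad)
except ValueError as err:
    print("rejected:", err)
```

Raising a single exception type for both malformed JSON and a wrong shape lets the caller handle every bad reply in one place, for instance by retrying the request.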
4.3 Streaming
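Streaming delivers text as it is generated, so output can be shown before the
full response is complete.
with client.messages.stream(
    model="claude-sonnet-4-6",
    max_tokens=1024,
    messages=[{"role": "user", "content": prompt}]
) as stream:
    # text_stream yields text fragments as they arrive.
    for text in stream.text_stream:
        print(text, end="", flush=True)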
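The loop above prints each fragment and then discards it. If the full text is also needed afterwards, the fragments can be collected as they arrive. In this minimal sketch, the hypothetical helper print_and_collect accepts any iterable of strings, so it could be given stream.text_stream inside the with block. Here a generator, fake_text_stream, stands in for the API so the example runs offline.

```python
def print_and_collect(chunks):
    """Print each text fragment as it arrives and return the full text."""
    parts = []
    for text in chunks:
        print(text, end="", flush=True)
        parts.append(text)
    return "".join(parts)


def fake_text_stream():
    # Stand-in for stream.text_stream (hypothetical data, no API call is made).
    yield from ["Ownership ", "moves ", "values."]


full_text = print_and_collect(fake_text_stream())
print()  # end the line after the streamed output
```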
4.4 Prompt Caching
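Mark large, reused context blocks with cache_control to avoid
re-processing them on every call. Tokens read from the cache are billed at
roughly a tenth of the normal input price and are processed faster; the call
that first writes the cache costs somewhat more than an uncached call. The
cache is short-lived and matches on an exact prompt prefix, so put the reused
block first and the text that varies after it. Useful when every call in a
session loads the same large file.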
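response = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=1024,
    messages=[{
        "role": "user",
        "content": [
            {
                "type": "text",
                "text": large_context,  # the large block reused across calls
                # Cache the prompt prefix up to and including this block.
                "cache_control": {"type": "ephemeral"}
            },
            # The question varies per call and comes after the cached prefix.
            {"type": "text", "text": "Summarize the above."}
        ]
    }]
)
# usage reports tokens written to and read from the cache separately.
print(response.usage.cache_creation_input_tokens, response.usage.cache_read_input_tokens)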
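To see when caching pays off, the arithmetic can be sketched as follows. The multipliers are assumptions for illustration: cache reads at 0.1 times the base input price (the "roughly 10× less" above) and cache writes at 1.25 times. Check current pricing before relying on them. relative_input_cost is a hypothetical helper. It covers only the reused context, not the varying question or the output tokens, and it expresses cost in units of one uncached pass over that context.

```python
CACHE_WRITE_MULTIPLIER = 1.25  # assumed premium for the call that writes the cache
CACHE_READ_MULTIPLIER = 0.10   # assumed price of a cache hit ("roughly 10x less")


def relative_input_cost(calls: int, cached: bool) -> float:
    """Cost of sending the same large context `calls` times,
    in units of one uncached pass over that context."""
    if calls <= 0:
        return 0.0
    if not cached:
        return float(calls)
    # The first call writes the cache; later calls read from it
    # (assuming each one arrives before the cache entry expires).
    return CACHE_WRITE_MULTIPLIER + (calls - 1) * CACHE_READ_MULTIPLIER


for n in (1, 2, 10):
    uncached = relative_input_cost(n, cached=False)
    with_cache = relative_input_cost(n, cached=True)
    print(n, uncached, round(with_cache, 2))
```

Under these assumptions a single call costs more with caching than without, and caching is already cheaper from the second call on.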
4.5 References