10. Agents
An agent is an LLM-powered system that takes actions in the world across multiple
steps, observes the results, and decides what to do next. It is distinguished from
a chatbot by two things: it has tools (Chapter 9) that give it real-world
reach, and it runs in a loop rather than answering a single prompt.
10.1 The Agentic Loop
The fundamental pattern:
1. Receive task. The user provides a goal in natural language.
2. Plan. The model decides what action to take next, often by calling a tool.
3. Act. Your code executes the requested tool and returns the result.
4. Observe. The model receives the result and updates its understanding.
5. Repeat steps 2–4 until the task is complete or a stop condition fires.
6. Report. The model summarises what it did and what the outcome was.
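    import anthropic

    client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment
    MAX_STEPS = 20

    def run_agent(task, tools, tool_dispatch):
        # The messages list is the agent's working memory (see 10.2).
        messages = [{"role": "user", "content": task}]
        for step in range(MAX_STEPS):
            response = client.messages.create(
                model="claude-sonnet-4-6",
                max_tokens=4096,
                tools=tools,
                messages=messages,
            )
            messages.append({"role": "assistant", "content": response.content})
            if response.stop_reason != "tool_use":
                # Done: join the text blocks, since the first block is not
                # guaranteed to be a text block.
                return "".join(b.text for b in response.content if b.type == "text")
            # Run every requested tool and return all results in one user message.
            results = []
            for block in response.content:
                if block.type == "tool_use":
                    result = tool_dispatch(block.name, block.input)
                    results.append({
                        "type": "tool_result",
                        "tool_use_id": block.id,
                        "content": result,  # expected to be a string
                    })
            messages.append({"role": "user", "content": results})
        return "Agent reached step limit without completing the task."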
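To call run_agent, the caller supplies a list of tool schemas and a dispatch function. The following is a minimal sketch of both, assuming a single hypothetical tool called word_count; the names word_count, TOOLS, HANDLERS, and simple_dispatch are illustrative and do not come from the chapter. The schema uses the name / description / input_schema layout of the Anthropic Messages API.

```python
# Hypothetical tool for illustration: counts the words in a string.
def word_count(text: str) -> str:
    return str(len(text.split()))

# Tool schemas passed to the model (input_schema is JSON Schema).
TOOLS = [
    {
        "name": "word_count",
        "description": "Count the words in a piece of text.",
        "input_schema": {
            "type": "object",
            "properties": {"text": {"type": "string"}},
            "required": ["text"],
        },
    }
]

# Maps tool names to the Python callables that implement them.
HANDLERS = {"word_count": word_count}

def simple_dispatch(name: str, inputs: dict) -> str:
    """Run the named tool; report failures as text so the model can react."""
    handler = HANDLERS.get(name)
    if handler is None:
        return f"Error: unknown tool '{name}'."
    try:
        return handler(**inputs)
    except Exception as exc:
        # Hand the error back to the model instead of crashing the loop.
        return f"Error while running '{name}': {exc}"

# Usage (needs an API key and the run_agent function from this section):
# answer = run_agent("How many words are in 'to be or not to be'?",
#                    TOOLS, simple_dispatch)
```

Returning errors as strings is a design choice in this sketch: the model sees the failure as an ordinary tool result and can retry or change approach, while the step limit still bounds the total number of attempts.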
10.2 State and Memory
The agent’s “memory” is the messages list.
Every tool result, every model response, and every user instruction is appended
to it. The model attends to all of it on every step.
For long-running tasks the messages list can grow past the context limit.
Common strategies:
- Summarise and trim. After N steps, ask the model
to summarise progress; replace the accumulated history with the summary.
- External state. Write intermediate results to files
or a database; retrieve only what is currently relevant.
- Sub-agents. Delegate sub-tasks to fresh agent calls
with clean context; pass only the result back to the orchestrator.
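The first strategy, summarise and trim, might look like the sketch below. It assumes the first message holds the task as a plain string (as in run_agent in 10.1) and that the caller supplies a summarise function, which in practice would be a separate model call. Both compact_history and the summarise parameter are hypothetical names introduced for illustration.

```python
from typing import Callable

Message = dict  # {"role": ..., "content": ...}

def compact_history(
    messages: list[Message],
    summarise: Callable[[list[Message]], str],
    max_messages: int = 30,
) -> list[Message]:
    """Replace a long history with the original task plus a progress summary."""
    if len(messages) <= max_messages:
        return messages  # still short enough; leave untouched
    task = messages[0]["content"]  # assumption: first message is the task text
    summary = summarise(messages[1:])
    # Collapse to a single user message so that no tool_use / tool_result
    # pair is split across the cut.
    return [{
        "role": "user",
        "content": (
            f"Task: {task}\n\nProgress so far:\n{summary}\n\nContinue from here."
        ),
    }]
```

A natural place to call this is at the top of each loop iteration, just before the next model request, when every tool call in the history already has its result. Message count is a crude trigger; a token count would track the context limit more closely.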
10.3 Safety Constraints
An agent that can write files, run commands, or call external APIs can cause
real damage if it makes the wrong decision. Practical safeguards:
- Step limit. Always impose a maximum number of loop
iterations and fail loudly when it is reached. An agent stuck in a loop
is a runaway cost and may cause data corruption.
- Confirm before destructive actions. Tools that delete,
overwrite, or send data should require an explicit human confirmation step
rather than executing silently.
- Sandbox the environment. Run agents against a test
copy of data before giving access to production resources.
- Least-privilege tools. Give the agent only the tools
it needs for the current task. A code-review agent does not need a
“delete file” tool.
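Two of these safeguards are easy to express in code. The sketch below shows one way to fail loudly at the step limit (raising an exception, where run_agent in 10.1 returns a string) and one way to hand an agent only the tools allowed for its task. StepLimitExceeded, check_step_budget, select_tools, and the tool names are all hypothetical, and the schemas are cut down to their names for brevity.

```python
class StepLimitExceeded(RuntimeError):
    """Raised when the agent loop uses up its step budget."""

def check_step_budget(step: int, max_steps: int) -> None:
    """Raise instead of returning a string, so callers cannot ignore it."""
    if step >= max_steps:
        raise StepLimitExceeded(
            f"Agent stopped after {max_steps} steps without finishing."
        )

def select_tools(all_tools: list[dict], allowed: set[str]) -> list[dict]:
    """Return only the tool schemas whose names are allowed for this task."""
    unknown = allowed - {t["name"] for t in all_tools}
    if unknown:
        # A typo in the allowlist should not silently remove a tool.
        raise ValueError(f"Unknown tool names: {sorted(unknown)}")
    return [t for t in all_tools if t["name"] in allowed]

# Illustrative tool schemas, reduced to their names.
ALL_TOOLS = [
    {"name": "read_file"},
    {"name": "post_comment"},
    {"name": "delete_file"},
]

# A code-review agent gets read and comment access, but no delete tool.
review_tools = select_tools(ALL_TOOLS, {"read_file", "post_comment"})
```

Filtering the schema list means the model never sees the excluded tools. The dispatch function should enforce the same allowlist, so that a tool name the model invents cannot reach a real handler.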
10.4 Human-in-the-Loop
For high-stakes tasks, insert a human approval step before executing certain tools.
The agent pauses, shows its intended action, and waits for confirmation:
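    # REQUIRES_CONFIRMATION (a set of tool names) and TOOL_REGISTRY
    # (tool name -> callable) are assumed to be defined elsewhere.
    def tool_dispatch(name, inputs):
        if name in REQUIRES_CONFIRMATION:
            print(f"\nAgent wants to call '{name}' with:\n{inputs}")
            answer = input("Approve? [y/N] ").strip().lower()
            if answer != "y":
                # Anything other than an explicit "y" cancels the action.
                return "Action cancelled by user."
        return TOOL_REGISTRY[name](**inputs)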
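The dispatch function above reads from the terminal, which makes it hard to test or to reuse behind a web interface. One possible variation, sketched here under assumptions, passes the confirmation step in as a function. make_dispatch, the confirm parameter, and the two note tools are hypothetical names introduced for illustration.

```python
from typing import Callable

def make_dispatch(
    registry: dict[str, Callable[..., str]],
    requires_confirmation: set[str],
    confirm: Callable[[str], str] = input,
) -> Callable[[str, dict], str]:
    """Build a dispatch function that asks `confirm` before gated tools run."""
    def dispatch(name: str, inputs: dict) -> str:
        if name not in registry:
            return f"Error: unknown tool '{name}'."
        if name in requires_confirmation:
            answer = confirm(
                f"Agent wants to call '{name}' with {inputs}. Approve? [y/N] "
            )
            if answer.strip().lower() != "y":
                return "Action cancelled by user."
        return registry[name](**inputs)
    return dispatch

# Hypothetical tools for illustration.
registry = {
    "read_note": lambda title: f"Contents of {title}",
    "delete_note": lambda title: f"Deleted {title}",
}

# A stub that always declines stands in for the human reviewer here.
dispatch = make_dispatch(registry, {"delete_note"}, confirm=lambda prompt: "n")
print(dispatch("read_note", {"title": "todo"}))    # runs without asking
print(dispatch("delete_note", {"title": "todo"}))  # declined by the stub
```

With the default confirm=input the behaviour matches the terminal version. Because a cancellation is returned as an ordinary tool result, the model learns that the action was refused and can report that or propose an alternative.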
10.5 References