AI Story: Agents

10. Agents

An agent is an LLM-powered system that takes actions in the world across multiple steps, observes the results, and decides what to do next. It is distinguished from a chatbot by two things: it has tools (Chapter 9) that give it real-world reach, and it runs in a loop rather than answering a single prompt.

10.1 The Agentic Loop

The fundamental pattern:

1. Receive task. User provides a goal in natural language.
2. Plan. Model decides what action to take next, often calling a tool.
3. Act. Your code executes the requested tool and returns the result.
4. Observe. Model receives the result and updates its understanding.
5. Repeat steps 2–4 until the task is complete or a stop condition fires.
6. Report. Model summarises what it did and what the outcome was.

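import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment
MAX_STEPS = 20

def run_agent(task, tools, tool_dispatch):
    messages = [{"role": "user", "content": task}]

    for step in range(MAX_STEPS):
        # Plan: the model either answers or requests one or more tool calls.
        response = client.messages.create(
            model="claude-sonnet-4-6",
            max_tokens=4096,
            tools=tools,
            messages=messages
        )
        messages.append({"role": "assistant", "content": response.content})

        if response.stop_reason != "tool_use":
            # Done: join the text blocks rather than assuming block 0 is text.
            return "".join(b.text for b in response.content if b.type == "text")

        # Act: run every requested tool and pair each result with its call id.
        results = []
        for block in response.content:
            if block.type == "tool_use":
                result = tool_dispatch(block.name, block.input)
                results.append({
                    "type": "tool_result",
                    "tool_use_id": block.id,
                    # content must be a string or a list of content blocks
                    "content": result if isinstance(result, (str, list)) else str(result)
                })
        # Observe: tool results go back to the model as a user message.
        messages.append({"role": "user", "content": results})

    return "Agent reached step limit without completing the task."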
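
In the loop above, an exception raised inside tool_dispatch would propagate out of run_agent and end the run. One possible way to handle this is to catch the exception and report it in the tool_result block with is_error set, so the model can see the failure and pick a different action. The sketch below is illustrative only: safe_tool_result, the registry dict, and the divide tool are hypothetical names, not part of the original code.

```python
def safe_tool_result(tool_use_id, name, inputs, registry):
    """Run one tool call and wrap the outcome in a tool_result block.

    `registry` is a hypothetical dict mapping tool names to callables.
    """
    try:
        output = registry[name](**inputs)
        return {
            "type": "tool_result",
            "tool_use_id": tool_use_id,
            "content": str(output),
        }
    except Exception as exc:  # broad on purpose: failures are reported, not raised
        return {
            "type": "tool_result",
            "tool_use_id": tool_use_id,
            "content": f"{type(exc).__name__}: {exc}",
            "is_error": True,
        }


def divide(a, b):
    """Hypothetical example tool."""
    return a / b


registry = {"divide": divide}
ok = safe_tool_result("toolu_01", "divide", {"a": 6, "b": 3}, registry)
failed = safe_tool_result("toolu_02", "divide", {"a": 1, "b": 0}, registry)
print(ok["content"])      # 2.0
print(failed["content"])  # ZeroDivisionError: division by zero
```

An unknown tool name surfaces the same way, as a KeyError message in the result, which gives the model a chance to correct the call instead of crashing the loop.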

10.2 State and Memory

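The agent’s “memory” is the messages list. Every tool result, every model response, and every user instruction is appended to it. The API call itself is stateless: the whole list is sent on every step, so the model sees the full history each time and nothing outside the list.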

For long-running tasks the messages list can grow past the context limit. Common strategies:

Summarise and trim. After N steps, ask the model to summarise progress; replace the accumulated history with the summary.
External state. Write intermediate results to files or a database; retrieve only what is currently relevant.
Sub-agents. Delegate sub-tasks to fresh agent calls with clean context; pass only the result back to the orchestrator.
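
The summarise-and-trim strategy could look roughly like the sketch below. It is a minimal illustration under stated assumptions: compact_history and its parameters are hypothetical names, summarise stands in for whatever produces the summary (for example a separate model call), and messages[0] is assumed to hold the original task as a plain string. The kept tail is made to start on an assistant message so that no tool_result is separated from the tool_use request it answers.

```python
def compact_history(messages, summarise, max_messages=12, keep_recent=4):
    """Replace older messages with one summary message once the list grows long.

    `summarise` is a hypothetical callable that takes a list of messages
    and returns a summary string.
    """
    if len(messages) <= max_messages:
        return messages

    # Move the cut point back until the kept tail starts on an assistant turn.
    cut = len(messages) - keep_recent
    while cut > 1 and messages[cut]["role"] != "assistant":
        cut -= 1
    if cut <= 1:
        return messages  # nothing worth summarising

    summary = summarise(messages[1:cut])
    head = {
        "role": "user",
        "content": f"Task: {messages[0]['content']}\n\nProgress so far:\n{summary}",
    }
    return [head] + messages[cut:]


# Fake history: one task message followed by seven assistant/user turns.
history = [{"role": "user", "content": "Audit the repo"}]
for i in range(7):
    history.append({"role": "assistant", "content": f"step {i}"})
    history.append({"role": "user", "content": f"result {i}"})

compacted = compact_history(
    history, lambda old: f"{len(old)} earlier messages condensed"
)
print(len(history), "->", len(compacted))  # 15 -> 5
```

The trade-off is that detail not captured in the summary is gone for good, so the external-state strategy is often combined with this one.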

10.3 Safety Constraints

An agent that can write files, run commands, or call external APIs can cause real damage if it makes the wrong decision. Practical safeguards:

Step limit. Always impose a maximum number of loop iterations and fail loudly when it is reached. An agent stuck in a loop keeps accruing API cost and, if its tools have side effects, may corrupt data.
Confirm before destructive actions. Tools that delete, overwrite, or send data should require an explicit human confirmation step rather than executing silently.
Sandbox the environment. Run agents against a test copy of data before giving access to production resources.
Least-privilege tools. Give the agent only the tools it needs for the current task. A code-review agent does not need a “delete file” tool.
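
The least-privilege idea can be sketched as an allow-list applied before the tools ever reach the model. Everything named here is hypothetical (the tool names, the PROFILES mapping, and the tools_for helper); only the name, description, and input_schema fields follow the tool definition format the API expects.

```python
def make_tool(name, description):
    """Build a minimal tool definition taking a single `path` string."""
    return {
        "name": name,
        "description": description,
        "input_schema": {
            "type": "object",
            "properties": {"path": {"type": "string"}},
            "required": ["path"],
        },
    }


ALL_TOOLS = [
    make_tool("read_file", "Read a file and return its contents."),
    make_tool("list_files", "List the files in a directory."),
    make_tool("delete_file", "Delete a file."),
]

# Hypothetical task profiles: each names only the tools that task needs.
PROFILES = {
    "code_review": {"read_file", "list_files"},
    "cleanup": {"list_files", "delete_file"},
}


def tools_for(profile, all_tools=ALL_TOOLS, profiles=PROFILES):
    """Return only the tool definitions allowed for the given profile."""
    allowed = profiles[profile]
    known = {tool["name"] for tool in all_tools}
    unknown = allowed - known
    if unknown:
        raise ValueError(f"Profile {profile!r} names unknown tools: {sorted(unknown)}")
    return [tool for tool in all_tools if tool["name"] in allowed]


review_tools = tools_for("code_review")
print([tool["name"] for tool in review_tools])  # ['read_file', 'list_files']
```

Passing the filtered list as the tools argument means the model never learns that delete_file exists for a review task.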

10.4 Human-in-the-Loop

For high-stakes tasks, insert a human approval step before executing certain tools. The agent pauses, shows its intended action, and waits for confirmation:

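# Assumed to be defined elsewhere:
#   REQUIRES_CONFIRMATION: set of tool names that need human approval.
#   TOOL_REGISTRY: dict mapping tool names to Python callables.
def tool_dispatch(name, inputs):
    if name not in TOOL_REGISTRY:
        # Report back to the model instead of raising KeyError mid-run.
        return f"Unknown tool: {name}"
    if name in REQUIRES_CONFIRMATION:
        print(f"\nAgent wants to call '{name}' with:\n{inputs}")
        answer = input("Approve? [y/N] ").strip().lower()
        if answer != "y":
            return "Action cancelled by user."
    return TOOL_REGISTRY[name](**inputs)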
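
The dispatcher above calls input() directly, which ties it to a terminal and makes it awkward to test. One possible variation, shown here as a sketch rather than a recommended design, passes the approval step in as a callable. The names make_dispatch, approve, and send_email are hypothetical.

```python
def make_dispatch(registry, requires_confirmation, approve):
    """Build a dispatcher whose approval step is supplied by the caller.

    `approve(name, inputs)` is a hypothetical callable returning True or False;
    it could prompt on a console, post to a chat channel, or check a policy.
    """
    def dispatch(name, inputs):
        if name not in registry:
            return f"Unknown tool: {name}"
        if name in requires_confirmation and not approve(name, inputs):
            return "Action cancelled by user."
        return registry[name](**inputs)
    return dispatch


def console_approve(name, inputs):
    """Interactive approver with the same prompt as the original dispatcher."""
    print(f"\nAgent wants to call '{name}' with:\n{inputs}")
    return input("Approve? [y/N] ").strip().lower() == "y"


# Demo with a non-interactive approver that denies everything.
sent = []

def send_email(to, body):
    """Hypothetical tool with a side effect."""
    sent.append((to, body))
    return f"Sent to {to}"

deny_all = make_dispatch({"send_email": send_email}, {"send_email"}, lambda n, i: False)
print(deny_all("send_email", {"to": "a@example.com", "body": "hi"}))
# Action cancelled by user.
```

In interactive use, console_approve would be passed as the approve argument; the deny-all lambda is only there to show that a refused call never reaches the tool.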

10.5 References

Resource	Description
Agents Guide	Anthropic’s agent design patterns and best practices.
CodeBites: Agentic AI	Multi-step agentic workflows and chaining patterns.
Next: Reliability	Rate limits, retry logic, cost monitoring for production agents.