AI Story

AI Story Prologue

motivation, providers, getting started, chapter index, references

0.0  Prologue

AI Story is a developer-focused guide to how AI language models work and how to interact with them through code. It sits one level below the Code Story — where the Code Story shows workflows for using AI to write and analyse software, this story explains the mechanics: how tokens become completions, how the messages API is structured, how tool use and the agentic loop operate, and what it takes to build reliable AI-powered applications.
Why understand AI mechanics?
  1. Knowing the token-and-context model lets you write prompts that stay within limits and avoid the subtle failures that come from a silently truncated context.
  2. Understanding how the messages API works — roles, turn structure, stop conditions — means you can debug a misbehaving conversation instead of guessing at it.
  3. Tool use follows a precise request/result loop. Knowing the loop structure prevents the most common agent bugs: missing tool results, wrong role ordering, unhandled errors.
  4. Prompt caching, streaming, and structured output each have operational trade-offs. Understanding them lets you choose the right tool for the latency and cost profile of your application.
  5. Reliability at scale requires understanding rate limits, retry budgets, token accounting, and output validation — none of which are visible from the chat interface.
The story moves from vocabulary (tokens, context, roles) through the API (messages, streaming, caching) to advanced operations (tool use, agents, reliability). Each chapter adds one layer and uses it in the next.

0.1  Popular Providers

The major providers of large language models and AI developer tools are OpenAI, Anthropic, and Google (Gemini). Each offers a browser-based chat application, an API, and a CLI coding agent.
Provider Web Application CLI Coding Agent CLI Access Plans
OpenAI ChatGPT OpenAI Codex CLI API key, pay-as-you-go
Anthropic Claude Claude Code Pro ($20/mo), Max ($100/$200/mo), or API key
Google Gemini Gemini CLI Free tier; Gemini Advanced ($19.99/mo)
Several providers serve markets beyond individual and team development. Government & DefensePalantir supplies intelligence analysis, battlefield decision support, and logistics (products: Gotham for government/intel, Foundry for enterprise, AIP as the AI layer over both). Scale AI provides data labeling and AI evaluation infrastructure for DoD programs and large commercial AI teams. C3.ai sells packaged vertical AI applications for defense, energy, manufacturing, and financial services. Enterprise & Regulated IndustriesCohere and Mistral AI (European) offer LLMs designed for private or on-premise deployment in banking, legal, and healthcare environments where data cannot leave the organization. IBM watsonx targets financial services, telecommunications, and government with auditability and compliance features. Cloud Platform AIAWS Bedrock, Azure OpenAI Service, and Google Vertex AI each wrap foundation models (including Claude, GPT-4, and Gemini) in enterprise SLAs, private VPC deployment, and compliance certifications (FedRAMP, HIPAA, SOC 2). The underlying models overlap with the consumer providers; the packaging targets IT procurement rather than individual developers. HealthcareNuance (Microsoft) focuses on clinical documentation and ambient voice-to-note in hospitals. Google Health AI addresses medical imaging, clinical NLP, and genomics. The key distinction: providers like Palantir and C3.ai sell vertical applications built on AI. The hyperscalers sell raw model access wrapped in enterprise compliance. Cohere and Mistral sell deployment flexibility — the ability to run the model inside your own infrastructure. All differ from OpenAI, Anthropic, and Google, whose primary developer-facing product is a public API with consumption billing.

0.2  Getting Started

You will need these tools to run the code examples in chapters 5 and beyond.
  1. Python 3.10+
    python.org/downloads
  2. anthropic Python package
    pip install anthropic
  3. An Anthropic API key
    Create one at console.anthropic.com and set it as the environment variable ANTHROPIC_API_KEY.
  4. VS Code with the Pylance extension (optional but recommended).

0.3  Chapter Index

The chapters are ordered so each one introduces vocabulary used in the next. Read in sequence or jump to any chapter — each is self-contained.
  1. Prologue

    Motivation, prerequisites, chapter index, and references.
  2. AI Concepts

    The vocabulary of AI: machine learning, deep learning, neural networks, large language models, and generative AI.
  3. Tokens & Context

    What tokens are, how tokenization works, context windows, and the practical implications for prompt design and cost.
  4. Prompting

    Message roles, system prompts, prompt patterns (zero-shot, few-shot, chain-of-thought), and common failure modes.
  5. Models

    The Claude, GPT, and Gemini model families, capability comparisons, and how to select a model for a given task.
  6. Messages API

    The request and response structure, parameters, multi-turn conversations, and working Python examples.
  7. Structured Output

    Getting reliable JSON and typed data from a model: schema design, prompt patterns, and validation.
  8. Streaming

    Server-sent events, the streaming API, delta accumulation, and error handling in streams.
  9. Prompt Caching

    How cache_control works, what gets cached, cost reduction patterns, and measuring cache hit rate.
  10. Tool Use

    Tool definitions, the request/tool-use/tool-result loop, error handling, and multi-tool patterns.
  11. Agents

    Agent architecture, the agentic loop, state and memory, safety constraints, and human-in-the-loop design.
  12. Reliability

    Rate limits, retry logic, token budgets, output validation, and cost monitoring for production AI applications.

0.4  References

Resource Description
Anthropic Docs Full API reference, model guides, and prompt engineering tips.
Messages API Complete request and response schema for the Messages endpoint.
Code Story Companion story covering AI-assisted code development workflows.
AI Links Curated links to AI tools, documentation, and research.