<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:content="http://purl.org/rss/1.0/modules/content/"><channel><title>Pydantic-Ai on RockB</title><link>https://baeseokjae.github.io/tags/pydantic-ai/</link><description>Recent content in Pydantic-Ai on RockB</description><image><title>RockB</title><url>https://baeseokjae.github.io/images/og-default.png</url><link>https://baeseokjae.github.io/images/og-default.png</link></image><generator>Hugo</generator><language>en-us</language><lastBuildDate>Wed, 22 Apr 2026 01:13:32 +0000</lastBuildDate><atom:link href="https://baeseokjae.github.io/tags/pydantic-ai/index.xml" rel="self" type="application/rss+xml"/><item><title>Pydantic AI Tutorial 2026: Type-Safe Python Agents With Automatic Validation and Self-Correction</title><link>https://baeseokjae.github.io/posts/pydantic-ai-tutorial-2026/</link><pubDate>Wed, 22 Apr 2026 01:13:32 +0000</pubDate><guid>https://baeseokjae.github.io/posts/pydantic-ai-tutorial-2026/</guid><description>Build production-ready AI agents with Pydantic AI — type-safe structured outputs, tool calling, dependency injection, and automatic validation retries.</description><content:encoded><![CDATA[<p>Pydantic AI is a Python agent framework built by the Pydantic team that brings type-safe, validated LLM interactions to production. Install it with <code>pip install pydantic-ai</code>, define your agent with a Pydantic <code>BaseModel</code> as the result type, and the framework automatically validates LLM output — retrying if validation fails — without any manual JSON parsing or schema wrestling.</p>
<h2 id="what-is-pydantic-ai">What Is Pydantic AI?</h2>
<p>Pydantic AI is an open-source Python agent framework, released in November 2024, that applies Pydantic&rsquo;s battle-tested validation engine directly to LLM interactions. With 16,500+ GitHub stars and 2,000+ forks as of April 2026, it has become one of the fastest-adopted agent frameworks in the Python ecosystem. Pydantic already powers the validation layer for OpenAI SDK, Google ADK, Anthropic SDK, LangChain, LlamaIndex, and CrewAI — Pydantic AI extends this same validation philosophy to the agent orchestration layer itself. Unlike LangChain, which relies on prompt engineering and string parsing to coerce LLM outputs into structure, Pydantic AI uses native Python type annotations and <code>BaseModel</code> schemas so your IDE catches type errors at write time, not at runtime. The design goal — as stated in the official docs — is to bring the FastAPI ergonomics of type-safe, auto-documented APIs to GenAI agent development: define the schema, wire up the model, and let the framework handle validation, retries, and error recovery automatically.</p>
<h3 id="how-pydantic-ai-compares-to-langchain-and-crewai">How Pydantic AI Compares to LangChain and CrewAI</h3>
<p>Pydantic AI focuses on type safety as a first-class feature. Where LangChain provides broad abstractions over dozens of integrations, Pydantic AI trades breadth for correctness: every structured output is validated against a <code>BaseModel</code> schema at runtime, with automatic retries when the LLM returns invalid data. CrewAI provides higher-level orchestration for role-based multi-agent teams, while Pydantic AI operates at a lower level — think of it as the foundation you&rsquo;d build a CrewAI-style system on top of, with stronger type guarantees throughout.</p>
<h3 id="the-fastapi-of-ai-promise">The FastAPI-of-AI Promise</h3>
<p>The FastAPI analogy runs deep. FastAPI replaced boilerplate Flask route handlers with type-annotated functions that auto-generate OpenAPI docs and validate request/response payloads. Pydantic AI does the same for LLM agents: instead of writing prompt templates, manually parsing JSON, and hoping the model follows your schema, you declare a typed result model and the framework handles the rest. This means static analysis tools like mypy and pyright work end-to-end across your agent code.</p>
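<p>A toy illustration of that end-to-end checking (the <code>CityFact</code> schema below is made up for demonstration and is not pydantic-ai API — the point is that a validated agent result is an ordinary typed Python object that mypy and pyright can reason about):</p>

```python
from pydantic import BaseModel

class CityFact(BaseModel):
    city: str
    population_millions: float

def handle(fact: CityFact) -> str:
    # your editor autocompletes .city / .population_millions here,
    # and mypy rejects typos like fact.citty before the code runs
    return f"{fact.city}: {fact.population_millions}M people"

print(handle(CityFact(city="Paris", population_millions=2.1)))
```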
<h2 id="setting-up-your-first-pydantic-ai-project">Setting Up Your First Pydantic AI Project</h2>
<p>Setting up Pydantic AI takes under five minutes for any developer with Python 3.10+ and a model API key. The core package installs cleanly: <code>pip install pydantic-ai</code> pulls in the framework and its model adapters. If you want a leaner install, the slim distribution offers provider-specific extras, e.g. <code>pip install &quot;pydantic-ai-slim[openai]&quot;</code>, which pulls in only the SDKs you actually use. As of April 2026, Pydantic AI supports 20+ model providers including OpenAI, Anthropic, Gemini, DeepSeek, Groq, Ollama (local), Azure AI Foundry, and Amazon Bedrock — switching providers requires only changing one string in your Agent constructor. The recommended project structure mirrors FastAPI conventions: an <code>agents/</code> directory for agent definitions, a <code>models/</code> directory for Pydantic schemas, and <code>tools/</code> for callable functions. Environment variables follow provider conventions (<code>OPENAI_API_KEY</code>, <code>ANTHROPIC_API_KEY</code>, <code>GEMINI_API_KEY</code>), and the framework reads them automatically without any extra configuration code.</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-bash" data-lang="bash"><span style="display:flex;"><span>pip install pydantic-ai
</span></span><span style="display:flex;"><span>export OPENAI_API_KEY<span style="color:#f92672">=</span><span style="color:#e6db74">&#34;sk-...&#34;</span>
</span></span></code></pre></div><div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-python" data-lang="python"><span style="display:flex;"><span><span style="color:#f92672">from</span> pydantic_ai <span style="color:#f92672">import</span> Agent
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>agent <span style="color:#f92672">=</span> Agent(<span style="color:#e6db74">&#34;openai:gpt-4o&#34;</span>)
</span></span><span style="display:flex;"><span>result <span style="color:#f92672">=</span> agent<span style="color:#f92672">.</span>run_sync(<span style="color:#e6db74">&#34;What is the capital of France?&#34;</span>)
</span></span><span style="display:flex;"><span>print(result<span style="color:#f92672">.</span>data)  <span style="color:#75715e"># &#34;Paris&#34;</span>
</span></span></code></pre></div><h3 id="configuring-model-providers">Configuring Model Providers</h3>
<p>Switching models requires only a string change in your Agent constructor — no additional configuration or adapter code needed:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-python" data-lang="python"><span style="display:flex;"><span><span style="color:#75715e"># OpenAI</span>
</span></span><span style="display:flex;"><span>agent <span style="color:#f92672">=</span> Agent(<span style="color:#e6db74">&#34;openai:gpt-4o&#34;</span>)
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#75715e"># Anthropic</span>
</span></span><span style="display:flex;"><span>agent <span style="color:#f92672">=</span> Agent(<span style="color:#e6db74">&#34;anthropic:claude-sonnet-4-6&#34;</span>)
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#75715e"># Local via Ollama (no API key required)</span>
</span></span><span style="display:flex;"><span>agent <span style="color:#f92672">=</span> Agent(<span style="color:#e6db74">&#34;ollama:llama3.2&#34;</span>)
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#75715e"># Google Gemini</span>
</span></span><span style="display:flex;"><span>agent <span style="color:#f92672">=</span> Agent(<span style="color:#e6db74">&#34;google-gla:gemini-2.0-flash&#34;</span>)
</span></span></code></pre></div><h2 id="building-your-first-ai-agent">Building Your First AI Agent</h2>
<p>A Pydantic AI agent is a Python object that wraps a model, a system prompt, optional tools, and a typed result schema. The minimal example — <code>Agent(&quot;openai:gpt-4o&quot;)</code> — creates a string-output agent using OpenAI&rsquo;s GPT-4o. For synchronous code, <code>agent.run_sync(prompt)</code> blocks until the model responds; for async applications, <code>await agent.run(prompt)</code> integrates directly with asyncio and FastAPI route handlers. Streaming responses work with <code>agent.run_stream(prompt)</code> as an async context manager, yielding text chunks as they arrive from the model. The returned <code>RunResult</code> object carries <code>.data</code> (the validated output), <code>.usage()</code> (token counts and cost tracking), and <code>.all_messages()</code> (the full conversation history for multi-turn use cases). System prompts can be static strings passed to the constructor or dynamic functions decorated with <code>@agent.system_prompt</code> that receive the dependency context and generate prompts at runtime based on user data or configuration. The entire API surface is intentionally minimal — if you know FastAPI, you already know most of Pydantic AI&rsquo;s patterns.</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-python" data-lang="python"><span style="display:flex;"><span><span style="color:#f92672">from</span> pydantic_ai <span style="color:#f92672">import</span> Agent
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>agent <span style="color:#f92672">=</span> Agent(
</span></span><span style="display:flex;"><span>    <span style="color:#e6db74">&#34;openai:gpt-4o&#34;</span>,
</span></span><span style="display:flex;"><span>    system_prompt<span style="color:#f92672">=</span><span style="color:#e6db74">&#34;You are a helpful assistant. Be concise.&#34;</span>
</span></span><span style="display:flex;"><span>)
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>result <span style="color:#f92672">=</span> agent<span style="color:#f92672">.</span>run_sync(<span style="color:#e6db74">&#34;Explain async/await in Python in one sentence.&#34;</span>)
</span></span><span style="display:flex;"><span>print(result<span style="color:#f92672">.</span>data)
</span></span><span style="display:flex;"><span>print(result<span style="color:#f92672">.</span>usage())  <span style="color:#75715e"># Usage(requests=1, tokens=...)</span>
</span></span></code></pre></div><h3 id="streaming-responses">Streaming Responses</h3>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-python" data-lang="python"><span style="display:flex;"><span><span style="color:#f92672">import</span> asyncio
</span></span><span style="display:flex;"><span><span style="color:#f92672">from</span> pydantic_ai <span style="color:#f92672">import</span> Agent
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>agent <span style="color:#f92672">=</span> Agent(<span style="color:#e6db74">&#34;anthropic:claude-sonnet-4-6&#34;</span>)
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">async</span> <span style="color:#66d9ef">def</span> <span style="color:#a6e22e">stream_response</span>():
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">async</span> <span style="color:#66d9ef">with</span> agent<span style="color:#f92672">.</span>run_stream(<span style="color:#e6db74">&#34;Write a haiku about Python.&#34;</span>) <span style="color:#66d9ef">as</span> stream:
</span></span><span style="display:flex;"><span>        <span style="color:#66d9ef">async</span> <span style="color:#66d9ef">for</span> chunk <span style="color:#f92672">in</span> stream<span style="color:#f92672">.</span>stream_text(delta<span style="color:#f92672">=</span><span style="color:#66d9ef">True</span>):
</span></span><span style="display:flex;"><span>            print(chunk, end<span style="color:#f92672">=</span><span style="color:#e6db74">&#34;&#34;</span>, flush<span style="color:#f92672">=</span><span style="color:#66d9ef">True</span>)
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>asyncio<span style="color:#f92672">.</span>run(stream_response())
</span></span></code></pre></div><h2 id="structured-outputs-with-pydantic-models">Structured Outputs With Pydantic Models</h2>
<p>Structured outputs are the defining feature of Pydantic AI: define a <code>BaseModel</code> subclass as your agent&rsquo;s <code>result_type</code> and the framework guarantees that every response conforms to your schema — or automatically retries until it does, up to a configurable retry limit, raising an error if the limit is exhausted. This eliminates the most common failure mode in LLM applications: brittle JSON parsing that breaks when the model adds an unexpected field, nests objects differently, or returns prose instead of valid JSON. In a production e-commerce scenario, for example, you might define <code>ProductExtraction(BaseModel)</code> with fields for <code>name: str</code>, <code>price: float</code>, <code>currency: str</code>, <code>availability: bool</code>, and <code>attributes: dict[str, str]</code>. Pass unstructured product description text to the agent and get back a fully-validated Python object that your IDE understands, your type checker approves, and your database ORM can insert directly. The validation retry mechanism uses the Pydantic validation error message as additional context for the LLM on the next attempt — so the model learns from its mistake within the same request, dramatically improving success rates on complex schemas compared to single-shot prompting. This self-correction capability is what makes Pydantic AI particularly reliable for production workloads.</p>
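<p>The self-correction loop can be sketched in plain Python. This is an illustration of the mechanism only — <code>fake_llm</code> and <code>run_with_retries</code> are hypothetical stand-ins, not the framework&rsquo;s internal code:</p>

```python
from pydantic import BaseModel, ValidationError

class MovieReview(BaseModel):
    title: str
    year: int

def fake_llm(prompt: str) -> dict:
    """Hypothetical stand-in for a model call: the first reply is invalid."""
    if "Fix these validation errors" in prompt:
        return {"title": "Inception", "year": 2010}
    return {"title": "Inception", "year": "twenty-ten"}

def run_with_retries(prompt: str, max_retries: int = 2) -> MovieReview:
    for _ in range(max_retries + 1):
        raw = fake_llm(prompt)
        try:
            return MovieReview.model_validate(raw)
        except ValidationError as exc:
            # the validation error text becomes extra context for the next attempt
            prompt = f"{prompt}\nFix these validation errors: {exc}"
    raise RuntimeError("model never produced schema-valid output")

review = run_with_retries("Extract the review from: Inception (2010)")
print(review.year)  # 2010
```

The first response fails validation (<code>&quot;twenty-ten&quot;</code> is not an int), the error is appended to the prompt, and the second attempt succeeds — the same shape of loop Pydantic AI runs for you automatically.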
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-python" data-lang="python"><span style="display:flex;"><span><span style="color:#f92672">from</span> pydantic <span style="color:#f92672">import</span> BaseModel
</span></span><span style="display:flex;"><span><span style="color:#f92672">from</span> pydantic_ai <span style="color:#f92672">import</span> Agent
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">class</span> <span style="color:#a6e22e">MovieReview</span>(BaseModel):
</span></span><span style="display:flex;"><span>    title: str
</span></span><span style="display:flex;"><span>    year: int
</span></span><span style="display:flex;"><span>    sentiment: str  <span style="color:#75715e"># &#34;positive&#34;, &#34;negative&#34;, &#34;neutral&#34;</span>
</span></span><span style="display:flex;"><span>    score: float    <span style="color:#75715e"># 0.0 to 10.0</span>
</span></span><span style="display:flex;"><span>    summary: str
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>agent <span style="color:#f92672">=</span> Agent(
</span></span><span style="display:flex;"><span>    <span style="color:#e6db74">&#34;openai:gpt-4o&#34;</span>,
</span></span><span style="display:flex;"><span>    result_type<span style="color:#f92672">=</span>MovieReview,
</span></span><span style="display:flex;"><span>    system_prompt<span style="color:#f92672">=</span><span style="color:#e6db74">&#34;Extract structured movie review data from user input.&#34;</span>
</span></span><span style="display:flex;"><span>)
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>result <span style="color:#f92672">=</span> agent<span style="color:#f92672">.</span>run_sync(
</span></span><span style="display:flex;"><span>    <span style="color:#e6db74">&#34;Inception (2010) was mind-blowing, a perfect 10/10 thriller.&#34;</span>
</span></span><span style="display:flex;"><span>)
</span></span><span style="display:flex;"><span>review <span style="color:#f92672">=</span> result<span style="color:#f92672">.</span>data
</span></span><span style="display:flex;"><span>print(review<span style="color:#f92672">.</span>title)      <span style="color:#75715e"># &#34;Inception&#34;</span>
</span></span><span style="display:flex;"><span>print(review<span style="color:#f92672">.</span>year)       <span style="color:#75715e"># 2010</span>
</span></span><span style="display:flex;"><span>print(review<span style="color:#f92672">.</span>score)      <span style="color:#75715e"># 10.0</span>
</span></span><span style="display:flex;"><span>print(review<span style="color:#f92672">.</span>sentiment)  <span style="color:#75715e"># &#34;positive&#34;</span>
</span></span></code></pre></div><h3 id="complex-nested-models">Complex Nested Models</h3>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-python" data-lang="python"><span style="display:flex;"><span><span style="color:#f92672">from</span> pydantic <span style="color:#f92672">import</span> BaseModel
</span></span><span style="display:flex;"><span><span style="color:#f92672">from</span> typing <span style="color:#f92672">import</span> List, Optional
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">class</span> <span style="color:#a6e22e">Address</span>(BaseModel):
</span></span><span style="display:flex;"><span>    street: str
</span></span><span style="display:flex;"><span>    city: str
</span></span><span style="display:flex;"><span>    country: str
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">class</span> <span style="color:#a6e22e">CompanyProfile</span>(BaseModel):
</span></span><span style="display:flex;"><span>    name: str
</span></span><span style="display:flex;"><span>    founded: int
</span></span><span style="display:flex;"><span>    headquarters: Address
</span></span><span style="display:flex;"><span>    products: List[str]
</span></span><span style="display:flex;"><span>    revenue_usd_millions: Optional[float] <span style="color:#f92672">=</span> <span style="color:#66d9ef">None</span>
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>agent <span style="color:#f92672">=</span> Agent(<span style="color:#e6db74">&#34;openai:gpt-4o&#34;</span>, result_type<span style="color:#f92672">=</span>CompanyProfile)
</span></span><span style="display:flex;"><span>result <span style="color:#f92672">=</span> agent<span style="color:#f92672">.</span>run_sync(<span style="color:#e6db74">&#34;Tell me about Stripe the payments company.&#34;</span>)
</span></span><span style="display:flex;"><span>profile <span style="color:#f92672">=</span> result<span style="color:#f92672">.</span>data
</span></span><span style="display:flex;"><span>print(profile<span style="color:#f92672">.</span>headquarters<span style="color:#f92672">.</span>city)  <span style="color:#75715e"># &#34;San Francisco&#34;</span>
</span></span></code></pre></div><h2 id="tool-calling-and-function-integration">Tool Calling and Function Integration</h2>
<p>Tool calling in Pydantic AI uses the <code>@agent.tool</code> decorator to register Python functions that the LLM can invoke autonomously during a conversation. The LLM reads the function&rsquo;s docstring to understand what the tool does, reads the type annotations to understand input and output types, and decides when to call it based on the user&rsquo;s query — no separate schema definition, no JSON Schema boilerplate, no manual tool routing. This approach, used throughout the official Pydantic AI examples, covers real-world cases from weather APIs to SQL query execution to bank account lookups. Tools receive a <code>RunContext[DepsType]</code> as their first argument, giving them access to the dependency injection context — databases, API clients, configuration — in a fully type-safe way. The LLM can call multiple tools in sequence, use one tool&rsquo;s output as input to another, and combine tool results with its own reasoning before returning a final structured answer. Pydantic AI validates all tool inputs against their type annotations before executing the function, so type errors surface immediately rather than propagating silently through your agent pipeline. Well-written docstrings are critical: the LLM uses them to decide which tool to call and how to populate its arguments.</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-python" data-lang="python"><span style="display:flex;"><span><span style="color:#f92672">import</span> httpx
</span></span><span style="display:flex;"><span><span style="color:#f92672">from</span> pydantic <span style="color:#f92672">import</span> BaseModel
</span></span><span style="display:flex;"><span><span style="color:#f92672">from</span> pydantic_ai <span style="color:#f92672">import</span> Agent, RunContext
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">class</span> <span style="color:#a6e22e">WeatherReport</span>(BaseModel):
</span></span><span style="display:flex;"><span>    location: str
</span></span><span style="display:flex;"><span>    temperature_celsius: float
</span></span><span style="display:flex;"><span>    conditions: str
</span></span><span style="display:flex;"><span>    humidity_percent: int
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>agent <span style="color:#f92672">=</span> Agent(
</span></span><span style="display:flex;"><span>    <span style="color:#e6db74">&#34;openai:gpt-4o&#34;</span>,
</span></span><span style="display:flex;"><span>    result_type<span style="color:#f92672">=</span>WeatherReport,
</span></span><span style="display:flex;"><span>    system_prompt<span style="color:#f92672">=</span><span style="color:#e6db74">&#34;Use the weather tool to fetch current conditions for the requested city.&#34;</span>
</span></span><span style="display:flex;"><span>)
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#a6e22e">@agent.tool</span>
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">async</span> <span style="color:#66d9ef">def</span> <span style="color:#a6e22e">get_weather</span>(ctx: RunContext[<span style="color:#66d9ef">None</span>], city: str) <span style="color:#f92672">-&gt;</span> dict:
</span></span><span style="display:flex;"><span>    <span style="color:#e6db74">&#34;&#34;&#34;Fetch current weather data for a given city name. Returns temperature, conditions, and humidity.&#34;&#34;&#34;</span>
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">async</span> <span style="color:#66d9ef">with</span> httpx<span style="color:#f92672">.</span>AsyncClient() <span style="color:#66d9ef">as</span> client:
</span></span><span style="display:flex;"><span>        resp <span style="color:#f92672">=</span> <span style="color:#66d9ef">await</span> client<span style="color:#f92672">.</span>get(<span style="color:#e6db74">f</span><span style="color:#e6db74">&#34;https://wttr.in/</span><span style="color:#e6db74">{</span>city<span style="color:#e6db74">}</span><span style="color:#e6db74">?format=j1&#34;</span>)
</span></span><span style="display:flex;"><span>        data <span style="color:#f92672">=</span> resp<span style="color:#f92672">.</span>json()
</span></span><span style="display:flex;"><span>        current <span style="color:#f92672">=</span> data[<span style="color:#e6db74">&#34;current_condition&#34;</span>][<span style="color:#ae81ff">0</span>]
</span></span><span style="display:flex;"><span>        <span style="color:#66d9ef">return</span> {
</span></span><span style="display:flex;"><span>            <span style="color:#e6db74">&#34;temp_c&#34;</span>: float(current[<span style="color:#e6db74">&#34;temp_C&#34;</span>]),
</span></span><span style="display:flex;"><span>            <span style="color:#e6db74">&#34;desc&#34;</span>: current[<span style="color:#e6db74">&#34;weatherDesc&#34;</span>][<span style="color:#ae81ff">0</span>][<span style="color:#e6db74">&#34;value&#34;</span>],
</span></span><span style="display:flex;"><span>            <span style="color:#e6db74">&#34;humidity&#34;</span>: int(current[<span style="color:#e6db74">&#34;humidity&#34;</span>])
</span></span><span style="display:flex;"><span>        }
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>result <span style="color:#f92672">=</span> agent<span style="color:#f92672">.</span>run_sync(<span style="color:#e6db74">&#34;What&#39;s the weather like in Tokyo right now?&#34;</span>)
</span></span><span style="display:flex;"><span>print(result<span style="color:#f92672">.</span>data<span style="color:#f92672">.</span>temperature_celsius)
</span></span></code></pre></div><h3 id="how-llms-choose-which-tool-to-call">How LLMs Choose Which Tool to Call</h3>
<p>Write docstrings that describe <em>what</em> the tool does, <em>when</em> to use it, and <em>what</em> its parameters represent. A well-documented tool is called correctly; a vague docstring leads to incorrect tool selection or missing arguments. For tools that don&rsquo;t need the <code>RunContext</code>, use <code>@agent.tool_plain</code> — the decorator registers the function without injecting the context argument, so its signature contains only the parameters the LLM should populate.</p>
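<p>To see why the docstring and annotations matter so much, here is a rough, hypothetical sketch of how a framework can derive a tool schema from a plain function using only the standard library (Pydantic AI&rsquo;s real implementation differs and builds full JSON Schema, but the inputs are the same):</p>

```python
import inspect
from typing import get_type_hints

def get_weather(city: str, units: str = "metric") -> dict:
    """Fetch current weather for a city."""
    return {}

def tool_schema(fn) -> dict:
    # sketch: the function's name, docstring, and annotations
    # are all the LLM-facing schema a framework needs
    hints = get_type_hints(fn)
    hints.pop("return", None)
    sig = inspect.signature(fn)
    return {
        "name": fn.__name__,
        "description": inspect.getdoc(fn),
        "parameters": {
            name: {
                "type": hints[name].__name__,
                "required": sig.parameters[name].default is inspect.Parameter.empty,
            }
            for name in hints
        },
    }

schema = tool_schema(get_weather)
print(schema["description"])  # Fetch current weather for a city.
```

A vague or missing docstring leaves <code>description</code> empty, and the model has nothing to route on — which is exactly why poorly documented tools get mis-selected.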
<h2 id="dependency-injection-for-type-safe-context">Dependency Injection for Type-Safe Context</h2>
<p>Dependency injection in Pydantic AI solves the global state problem that plagues most LLM agent frameworks: instead of using module-level variables or environment lookups inside tool functions, you declare a typed dependency container and inject it at runtime via the <code>deps_type</code> parameter on the Agent constructor. This pattern — familiar to FastAPI developers — makes agents fully testable because tests can inject mock dependencies without patching globals or monkeypatching module state. A typical production agent might depend on a database connection pool, an HTTP client, a user authentication context, and a configuration object. Define these as a <code>dataclass</code> or <code>BaseModel</code>, annotate your tools with <code>RunContext[MyDeps]</code>, and Pydantic AI ensures your tools receive exactly the right types with full IDE autocomplete and static analysis support. The dependency container is constructed outside the agent and passed at call time: <code>agent.run_sync(prompt, deps=MyDeps(db=pool, client=http_client))</code>. This makes agents composable — the same agent definition works with different dependency configurations in different environments, supporting local development, testing, staging, and production without any code changes to the agent itself.</p>
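<p>The testability claim is easy to demonstrate outside the framework. The sketch below uses only the standard library; <code>OrderStore</code> and <code>FakeDB</code> are hypothetical names invented for this illustration, but the shape — a typed container passed in at call time — is the same pattern Pydantic AI&rsquo;s <code>deps_type</code> formalizes:</p>

```python
import asyncio
from dataclasses import dataclass
from typing import Protocol

class OrderStore(Protocol):
    """Anything that can fetch orders for a user."""
    def fetch_orders(self, user_id: int) -> list[dict]: ...

@dataclass
class Deps:
    db: OrderStore
    user_id: int

class FakeDB:
    """In-memory stand-in a test injects instead of a real pool."""
    def fetch_orders(self, user_id: int) -> list[dict]:
        return [{"id": 1, "user_id": user_id, "total": 42.0}]

async def get_orders(deps: Deps) -> list[dict]:
    # the tool body only ever sees the typed container, never a global
    return deps.db.fetch_orders(deps.user_id)

orders = asyncio.run(get_orders(Deps(db=FakeDB(), user_id=7)))
print(orders[0]["total"])  # 42.0
```

In production the same <code>Deps</code> is constructed with a real connection pool; the function body never changes.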
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-python" data-lang="python"><span style="display:flex;"><span><span style="color:#f92672">from</span> dataclasses <span style="color:#f92672">import</span> dataclass
</span></span><span style="display:flex;"><span><span style="color:#f92672">from</span> pydantic <span style="color:#f92672">import</span> BaseModel
</span></span><span style="display:flex;"><span><span style="color:#f92672">from</span> pydantic_ai <span style="color:#f92672">import</span> Agent, RunContext
</span></span><span style="display:flex;"><span><span style="color:#f92672">import</span> asyncpg
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#a6e22e">@dataclass</span>
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">class</span> <span style="color:#a6e22e">Deps</span>:
</span></span><span style="display:flex;"><span>    db_pool: asyncpg<span style="color:#f92672">.</span>Pool
</span></span><span style="display:flex;"><span>    user_id: int
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">class</span> <span style="color:#a6e22e">OrderSummary</span>(BaseModel):
</span></span><span style="display:flex;"><span>    total_orders: int
</span></span><span style="display:flex;"><span>    total_spent_usd: float
</span></span><span style="display:flex;"><span>    most_recent_order: str
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>agent <span style="color:#f92672">=</span> Agent(
</span></span><span style="display:flex;"><span>    <span style="color:#e6db74">&#34;anthropic:claude-sonnet-4-6&#34;</span>,
</span></span><span style="display:flex;"><span>    deps_type<span style="color:#f92672">=</span>Deps,
</span></span><span style="display:flex;"><span>    result_type<span style="color:#f92672">=</span>OrderSummary,
</span></span><span style="display:flex;"><span>    system_prompt<span style="color:#f92672">=</span><span style="color:#e6db74">&#34;Summarize the user&#39;s order history from the database.&#34;</span>
</span></span><span style="display:flex;"><span>)
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#a6e22e">@agent.tool</span>
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">async</span> <span style="color:#66d9ef">def</span> <span style="color:#a6e22e">get_orders</span>(ctx: RunContext[Deps]) <span style="color:#f92672">-&gt;</span> list[dict]:
</span></span><span style="display:flex;"><span>    <span style="color:#e6db74">&#34;&#34;&#34;Fetch all orders for the current user from the database.&#34;&#34;&#34;</span>
</span></span><span style="display:flex;"><span>    rows <span style="color:#f92672">=</span> <span style="color:#66d9ef">await</span> ctx<span style="color:#f92672">.</span>deps<span style="color:#f92672">.</span>db_pool<span style="color:#f92672">.</span>fetch(
</span></span><span style="display:flex;"><span>        <span style="color:#e6db74">&#34;SELECT * FROM orders WHERE user_id = $1 ORDER BY created_at DESC&#34;</span>,
</span></span><span style="display:flex;"><span>        ctx<span style="color:#f92672">.</span>deps<span style="color:#f92672">.</span>user_id
</span></span><span style="display:flex;"><span>    )
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">return</span> [dict(row) <span style="color:#66d9ef">for</span> row <span style="color:#f92672">in</span> rows]
</span></span></code></pre></div><h2 id="multi-agent-patterns-and-orchestration">Multi-Agent Patterns and Orchestration</h2>
<p>Multi-agent orchestration with Pydantic AI enables complex workflows where specialized agents hand off tasks to each other with full type safety preserved across agent boundaries. Pydantic AI supports this through agent delegation: one agent can call another agent&rsquo;s <code>run()</code> method inside a tool function, passing structured Pydantic models between agents at each handoff point. This eliminates a common failure mode in multi-agent systems where unstructured string passing between agents allows errors to propagate silently through a pipeline — in Pydantic AI, every agent boundary is an explicit type contract. A research-and-summarization pipeline might use a <code>ResearchAgent</code> that returns a <code>ResearchFindings(BaseModel)</code> with source URLs, key facts, and confidence scores, then pass that validated output to a <code>WriterAgent</code> that produces a <code>BlogPost(BaseModel)</code> with title, sections, and metadata. The Pydantic AI repository includes working examples of multi-agent patterns, including a coding agent that delegates code generation, review, and test execution to sub-agents. Each agent&rsquo;s typed output becomes the next agent&rsquo;s validated input, making the system debuggable, testable, and maintainable as the number of agents and the complexity of workflows grows.</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-python" data-lang="python"><span style="display:flex;"><span><span style="color:#f92672">from</span> pydantic <span style="color:#f92672">import</span> BaseModel
</span></span><span style="display:flex;"><span><span style="color:#f92672">from</span> pydantic_ai <span style="color:#f92672">import</span> Agent, RunContext
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">class</span> <span style="color:#a6e22e">ResearchFindings</span>(BaseModel):
</span></span><span style="display:flex;"><span>    topic: str
</span></span><span style="display:flex;"><span>    key_facts: list[str]
</span></span><span style="display:flex;"><span>    sources: list[str]
</span></span><span style="display:flex;"><span>    confidence: float
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">class</span> <span style="color:#a6e22e">BlogPost</span>(BaseModel):
</span></span><span style="display:flex;"><span>    title: str
</span></span><span style="display:flex;"><span>    introduction: str
</span></span><span style="display:flex;"><span>    sections: list[str]
</span></span><span style="display:flex;"><span>    conclusion: str
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>researcher <span style="color:#f92672">=</span> Agent(
</span></span><span style="display:flex;"><span>    <span style="color:#e6db74">&#34;openai:gpt-4o&#34;</span>,
</span></span><span style="display:flex;"><span>    result_type<span style="color:#f92672">=</span>ResearchFindings,
</span></span><span style="display:flex;"><span>    system_prompt<span style="color:#f92672">=</span><span style="color:#e6db74">&#34;Research topics thoroughly and cite your sources.&#34;</span>
</span></span><span style="display:flex;"><span>)
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>writer <span style="color:#f92672">=</span> Agent(
</span></span><span style="display:flex;"><span>    <span style="color:#e6db74">&#34;anthropic:claude-sonnet-4-6&#34;</span>,
</span></span><span style="display:flex;"><span>    result_type<span style="color:#f92672">=</span>BlogPost,
</span></span><span style="display:flex;"><span>    system_prompt<span style="color:#f92672">=</span><span style="color:#e6db74">&#34;Write engaging blog posts from research findings.&#34;</span>
</span></span><span style="display:flex;"><span>)
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#a6e22e">@writer.tool</span>
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">async</span> <span style="color:#66d9ef">def</span> <span style="color:#a6e22e">research_topic</span>(ctx: RunContext[<span style="color:#66d9ef">None</span>], topic: str) <span style="color:#f92672">-&gt;</span> dict:
</span></span><span style="display:flex;"><span>    <span style="color:#e6db74">&#34;&#34;&#34;Research a topic using the research agent and return structured findings.&#34;&#34;&#34;</span>
</span></span><span style="display:flex;"><span>    result <span style="color:#f92672">=</span> <span style="color:#66d9ef">await</span> researcher<span style="color:#f92672">.</span>run(topic)
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">return</span> result<span style="color:#f92672">.</span>data<span style="color:#f92672">.</span>model_dump()
</span></span></code></pre></div><h2 id="testing-and-evaluating-your-agents">Testing and Evaluating Your Agents</h2>
<p>Testing AI agents with Pydantic AI is fundamentally more tractable than with other frameworks because every agent interaction has an explicit typed contract. Pydantic AI provides <code>TestModel</code> — a deterministic mock that returns schema-conformant responses without making real API calls, essential for CI/CD pipelines where LLM API costs and latency make live testing impractical. The built-in eval framework extends this to production monitoring: define test cases with expected structured outputs, run them against your agent, and track pass rates over time as you change models or prompts — evaluation tooling that most agent frameworks leave entirely to the developer to build from scratch. For unit testing individual tools, the dependency injection pattern makes mocking trivial: inject a mock database or HTTP client via <code>deps</code>, call the tool function directly, and assert on its output without any LLM involvement. <code>pytest</code> integration is straightforward — use <code>agent.override(model=TestModel())</code> as a context manager to swap the real model for the test mock within a test function. For regression testing, record real LLM interactions with pytest-recording or VCR cassettes and replay them in CI.</p>
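<p>The dependency-injection side of this needs no framework support at all. Here is a minimal framework-free sketch of testing a tool function directly — <code>FakeDB</code>, <code>Deps</code>, and <code>get_orders</code> are illustrative stand-ins, not Pydantic AI APIs:</p>

```python
import asyncio
from dataclasses import dataclass

@dataclass
class FakeDB:
    rows: list
    async def fetch(self, query: str, user_id: int) -> list:
        # Deterministic stand-in for a real asyncpg connection
        return [r for r in self.rows if r["user_id"] == user_id]

@dataclass
class Deps:
    db: FakeDB
    user_id: int

async def get_orders(deps: Deps) -> list:
    # Same logic a @agent.tool body would run, minus the RunContext wrapper
    return await deps.db.fetch("SELECT * FROM orders WHERE user_id = $1", deps.user_id)

def test_get_orders():
    deps = Deps(db=FakeDB(rows=[{"user_id": 1, "total": 42}, {"user_id": 2, "total": 7}]), user_id=1)
    orders = asyncio.run(get_orders(deps))
    assert orders == [{"user_id": 1, "total": 42}]

test_get_orders()
```

<p>No LLM, no network, no mocking library — just a dataclass standing in for the real dependency.</p>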
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-python" data-lang="python"><span style="display:flex;"><span><span style="color:#f92672">from</span> pydantic_ai <span style="color:#f92672">import</span> Agent
</span></span><span style="display:flex;"><span><span style="color:#f92672">from</span> pydantic_ai.models.test <span style="color:#f92672">import</span> TestModel
</span></span><span style="display:flex;"><span><span style="color:#f92672">from</span> pydantic <span style="color:#f92672">import</span> BaseModel
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">class</span> <span style="color:#a6e22e">Sentiment</span>(BaseModel):
</span></span><span style="display:flex;"><span>    label: str
</span></span><span style="display:flex;"><span>    confidence: float
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>agent <span style="color:#f92672">=</span> Agent(<span style="color:#e6db74">&#34;openai:gpt-4o&#34;</span>, result_type<span style="color:#f92672">=</span>Sentiment)
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">def</span> <span style="color:#a6e22e">test_sentiment_analysis</span>():
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">with</span> agent<span style="color:#f92672">.</span>override(model<span style="color:#f92672">=</span>TestModel(custom_result_args<span style="color:#f92672">=</span>{<span style="color:#e6db74">&#34;label&#34;</span>: <span style="color:#e6db74">&#34;positive&#34;</span>, <span style="color:#e6db74">&#34;confidence&#34;</span>: <span style="color:#ae81ff">0.9</span>})):
</span></span><span style="display:flex;"><span>        result <span style="color:#f92672">=</span> agent<span style="color:#f92672">.</span>run_sync(<span style="color:#e6db74">&#34;I love this product!&#34;</span>)
</span></span><span style="display:flex;"><span>        <span style="color:#75715e"># TestModel returns schema-valid mock data without API calls</span>
</span></span><span style="display:flex;"><span>        <span style="color:#66d9ef">assert</span> isinstance(result<span style="color:#f92672">.</span>data, Sentiment)
</span></span><span style="display:flex;"><span>        <span style="color:#66d9ef">assert</span> result<span style="color:#f92672">.</span>data<span style="color:#f92672">.</span>label <span style="color:#f92672">in</span> [<span style="color:#e6db74">&#34;positive&#34;</span>, <span style="color:#e6db74">&#34;negative&#34;</span>, <span style="color:#e6db74">&#34;neutral&#34;</span>]
</span></span><span style="display:flex;"><span>        <span style="color:#66d9ef">assert</span> <span style="color:#ae81ff">0.0</span> <span style="color:#f92672">&lt;=</span> result<span style="color:#f92672">.</span>data<span style="color:#f92672">.</span>confidence <span style="color:#f92672">&lt;=</span> <span style="color:#ae81ff">1.0</span>
</span></span></code></pre></div><h2 id="observability-debugging-with-pydantic-logfire">Observability: Debugging With Pydantic Logfire</h2>
<p>Pydantic Logfire is the native observability backend for Pydantic AI, built on OpenTelemetry so traces, spans, and metrics export to any compatible backend — Grafana, Datadog, Honeycomb, or the Logfire SaaS platform. Integrating Logfire takes three lines of code: <code>pip install logfire</code>, <code>import logfire</code>, <code>logfire.configure()</code> — after which every agent run, tool call, model request, and validation event is automatically traced with full context. Each span captures the model name, prompt tokens, completion tokens, cost estimate, tool inputs and outputs, and validation results, giving you a complete audit trail for debugging agent failures in production. The cost tracking feature aggregates token usage across nested agent calls, making it straightforward to identify expensive prompts or tools that are invoked more than expected. For teams already using OpenTelemetry with a different backend, Logfire respects the <code>OTEL_EXPORTER_OTLP_ENDPOINT</code> environment variable — there&rsquo;s no Pydantic-specific lock-in. The Logfire integration also instruments the automatic retry mechanism, recording each failed validation attempt and the corrected LLM response, which is invaluable for diagnosing structured output failures and understanding whether your schemas or your prompts need adjustment.</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-python" data-lang="python"><span style="display:flex;"><span><span style="color:#f92672">import</span> logfire
</span></span><span style="display:flex;"><span><span style="color:#f92672">from</span> pydantic_ai <span style="color:#f92672">import</span> Agent
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>logfire<span style="color:#f92672">.</span>configure()  <span style="color:#75715e"># reads LOGFIRE_TOKEN from env</span>
</span></span><span style="display:flex;"><span>logfire<span style="color:#f92672">.</span>instrument_pydantic_ai()
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>agent <span style="color:#f92672">=</span> Agent(<span style="color:#e6db74">&#34;openai:gpt-4o&#34;</span>, system_prompt<span style="color:#f92672">=</span><span style="color:#e6db74">&#34;You are a helpful assistant.&#34;</span>)
</span></span><span style="display:flex;"><span>result <span style="color:#f92672">=</span> agent<span style="color:#f92672">.</span>run_sync(<span style="color:#e6db74">&#34;Summarize the benefits of type safety in Python.&#34;</span>)
</span></span><span style="display:flex;"><span><span style="color:#75715e"># Full trace — model, tokens, cost, tool calls — now visible in Logfire</span>
</span></span></code></pre></div><h2 id="production-best-practices">Production Best Practices</h2>
<p>Running Pydantic AI agents in production requires attention to error handling, concurrency, cost management, and deployment patterns that differ meaningfully from development usage. The framework&rsquo;s model gateway feature (available in <code>pydantic-ai[gateway]</code>) provides a unified proxy layer for routing requests across multiple providers — try GPT-4o, fall back to Claude Sonnet if rate-limited — and centralizing API key management across your infrastructure. Error handling best practice is to catch <code>UnexpectedModelBehavior</code> (raised once all automatic validation retries are exhausted; <code>ModelRetry</code> is the exception tool code raises to request a retry, not one your application should catch) at the application layer and implement graceful degradation rather than letting it propagate as a 500 error. Rate limiting is most effectively implemented with <code>asyncio.Semaphore</code> around concurrent agent runs, or with a task queue like ARQ or Celery for high-throughput batch workloads. Pydantic AI&rsquo;s async-native design means a single event loop can handle dozens of concurrent agent calls efficiently, but each call holds an open HTTP connection to the model API — connection pooling via a shared <code>httpx.AsyncClient</code> configured as a dependency significantly reduces per-call overhead. For FastAPI integration, mount agents as async route handlers that construct the dependency context from the request state and stream responses back using <code>StreamingResponse</code> with <code>run_stream()</code>, giving users real-time feedback while the agent works.</p>
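<p>The <code>asyncio.Semaphore</code> pattern mentioned above fits in a few lines. The <code>call_agent</code> coroutine here is an illustrative stand-in for <code>await agent.run(prompt)</code>, with a sleep simulating API latency:</p>

```python
import asyncio

async def call_agent(prompt: str) -> str:
    # Stand-in for `await agent.run(prompt)`; the sleep simulates API latency
    await asyncio.sleep(0.01)
    return f"answer:{prompt}"

async def rate_limited(sem: asyncio.Semaphore, prompt: str) -> str:
    # At most the semaphore's initial value of coroutines pass this point at once
    async with sem:
        return await call_agent(prompt)

async def main() -> list:
    sem = asyncio.Semaphore(10)  # cap simultaneous in-flight model calls
    prompts = [f"q{i}" for i in range(50)]
    # gather preserves input order regardless of completion order
    return await asyncio.gather(*(rate_limited(sem, p) for p in prompts))

results = asyncio.run(main())
```

<p>Fifty prompts run concurrently, but never more than ten hold a connection to the model API at once.</p>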
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-python" data-lang="python"><span style="display:flex;"><span><span style="color:#f92672">from</span> fastapi <span style="color:#f92672">import</span> FastAPI
</span></span><span style="display:flex;"><span><span style="color:#f92672">from</span> fastapi.responses <span style="color:#f92672">import</span> StreamingResponse
</span></span><span style="display:flex;"><span><span style="color:#f92672">from</span> pydantic_ai <span style="color:#f92672">import</span> Agent
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>app <span style="color:#f92672">=</span> FastAPI()
</span></span><span style="display:flex;"><span>agent <span style="color:#f92672">=</span> Agent(<span style="color:#e6db74">&#34;openai:gpt-4o&#34;</span>)
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#a6e22e">@app.post</span>(<span style="color:#e6db74">&#34;/chat&#34;</span>)
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">async</span> <span style="color:#66d9ef">def</span> <span style="color:#a6e22e">chat</span>(request: dict):
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">async</span> <span style="color:#66d9ef">def</span> <span style="color:#a6e22e">generate</span>():
</span></span><span style="display:flex;"><span>        <span style="color:#66d9ef">async</span> <span style="color:#66d9ef">with</span> agent<span style="color:#f92672">.</span>run_stream(request[<span style="color:#e6db74">&#34;message&#34;</span>]) <span style="color:#66d9ef">as</span> stream:
</span></span><span style="display:flex;"><span>            <span style="color:#66d9ef">async</span> <span style="color:#66d9ef">for</span> chunk <span style="color:#f92672">in</span> stream<span style="color:#f92672">.</span>stream_text(delta<span style="color:#f92672">=</span><span style="color:#66d9ef">True</span>):
</span></span><span style="display:flex;"><span>                <span style="color:#66d9ef">yield</span> chunk
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">return</span> StreamingResponse(generate(), media_type<span style="color:#f92672">=</span><span style="color:#e6db74">&#34;text/plain&#34;</span>)
</span></span></code></pre></div><h2 id="complete-project-a-production-ready-research-agent">Complete Project: A Production-Ready Research Agent</h2>
<p>This end-to-end example combines structured outputs, tool calling, dependency injection, and observability into a single production-ready agent that researches a topic and returns a structured report. Notice how every boundary — the dependency container, the tool return types, the final result — is explicitly typed, making the entire agent system statically analyzable with mypy or pyright and fully testable with <code>TestModel</code>.</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-python" data-lang="python"><span style="display:flex;"><span><span style="color:#f92672">import</span> logfire
</span></span><span style="display:flex;"><span><span style="color:#f92672">from</span> dataclasses <span style="color:#f92672">import</span> dataclass
</span></span><span style="display:flex;"><span><span style="color:#f92672">from</span> pydantic <span style="color:#f92672">import</span> BaseModel
</span></span><span style="display:flex;"><span><span style="color:#f92672">from</span> pydantic_ai <span style="color:#f92672">import</span> Agent, RunContext
</span></span><span style="display:flex;"><span><span style="color:#f92672">import</span> httpx
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>logfire<span style="color:#f92672">.</span>configure()
</span></span><span style="display:flex;"><span>logfire<span style="color:#f92672">.</span>instrument_pydantic_ai()
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#a6e22e">@dataclass</span>
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">class</span> <span style="color:#a6e22e">ResearchDeps</span>:
</span></span><span style="display:flex;"><span>    http_client: httpx<span style="color:#f92672">.</span>AsyncClient
</span></span><span style="display:flex;"><span>    max_sources: int <span style="color:#f92672">=</span> <span style="color:#ae81ff">5</span>
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">class</span> <span style="color:#a6e22e">Source</span>(BaseModel):
</span></span><span style="display:flex;"><span>    url: str
</span></span><span style="display:flex;"><span>    title: str
</span></span><span style="display:flex;"><span>    relevance_score: float
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">class</span> <span style="color:#a6e22e">ResearchReport</span>(BaseModel):
</span></span><span style="display:flex;"><span>    topic: str
</span></span><span style="display:flex;"><span>    summary: str
</span></span><span style="display:flex;"><span>    key_findings: list[str]
</span></span><span style="display:flex;"><span>    sources: list[Source]
</span></span><span style="display:flex;"><span>    confidence: float
</span></span><span style="display:flex;"><span>    follow_up_questions: list[str]
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>research_agent <span style="color:#f92672">=</span> Agent(
</span></span><span style="display:flex;"><span>    <span style="color:#e6db74">&#34;openai:gpt-4o&#34;</span>,
</span></span><span style="display:flex;"><span>    deps_type<span style="color:#f92672">=</span>ResearchDeps,
</span></span><span style="display:flex;"><span>    result_type<span style="color:#f92672">=</span>ResearchReport,
</span></span><span style="display:flex;"><span>    retries<span style="color:#f92672">=</span><span style="color:#ae81ff">2</span>,
</span></span><span style="display:flex;"><span>    system_prompt<span style="color:#f92672">=</span>(
</span></span><span style="display:flex;"><span>        <span style="color:#e6db74">&#34;You are an expert research assistant. Use the available tools to &#34;</span>
</span></span><span style="display:flex;"><span>        <span style="color:#e6db74">&#34;gather information, evaluate sources critically, and produce &#34;</span>
</span></span><span style="display:flex;"><span>        <span style="color:#e6db74">&#34;structured research reports with confidence scores.&#34;</span>
</span></span><span style="display:flex;"><span>    )
</span></span><span style="display:flex;"><span>)
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#a6e22e">@research_agent.tool</span>
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">async</span> <span style="color:#66d9ef">def</span> <span style="color:#a6e22e">web_search</span>(ctx: RunContext[ResearchDeps], query: str) <span style="color:#f92672">-&gt;</span> list[dict]:
</span></span><span style="display:flex;"><span>    <span style="color:#e6db74">&#34;&#34;&#34;Search the web for information. Returns a list of results with titles and URLs.&#34;&#34;&#34;</span>
</span></span><span style="display:flex;"><span>    resp <span style="color:#f92672">=</span> <span style="color:#66d9ef">await</span> ctx<span style="color:#f92672">.</span>deps<span style="color:#f92672">.</span>http_client<span style="color:#f92672">.</span>get(
</span></span><span style="display:flex;"><span>        <span style="color:#e6db74">&#34;https://api.search.example.com/search&#34;</span>,
</span></span><span style="display:flex;"><span>        params<span style="color:#f92672">=</span>{<span style="color:#e6db74">&#34;q&#34;</span>: query, <span style="color:#e6db74">&#34;limit&#34;</span>: ctx<span style="color:#f92672">.</span>deps<span style="color:#f92672">.</span>max_sources}
</span></span><span style="display:flex;"><span>    )
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">return</span> resp<span style="color:#f92672">.</span>json()[<span style="color:#e6db74">&#34;results&#34;</span>]
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#a6e22e">@research_agent.tool</span>
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">async</span> <span style="color:#66d9ef">def</span> <span style="color:#a6e22e">fetch_page_content</span>(ctx: RunContext[ResearchDeps], url: str) <span style="color:#f92672">-&gt;</span> str:
</span></span><span style="display:flex;"><span>    <span style="color:#e6db74">&#34;&#34;&#34;Fetch and return the main text content of a web page. Use to read full articles.&#34;&#34;&#34;</span>
</span></span><span style="display:flex;"><span>    resp <span style="color:#f92672">=</span> <span style="color:#66d9ef">await</span> ctx<span style="color:#f92672">.</span>deps<span style="color:#f92672">.</span>http_client<span style="color:#f92672">.</span>get(url, follow_redirects<span style="color:#f92672">=</span><span style="color:#66d9ef">True</span>)
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">return</span> resp<span style="color:#f92672">.</span>text[:<span style="color:#ae81ff">5000</span>]
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">async</span> <span style="color:#66d9ef">def</span> <span style="color:#a6e22e">run_research</span>(topic: str) <span style="color:#f92672">-&gt;</span> ResearchReport:
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">async</span> <span style="color:#66d9ef">with</span> httpx<span style="color:#f92672">.</span>AsyncClient(timeout<span style="color:#f92672">=</span><span style="color:#ae81ff">30.0</span>) <span style="color:#66d9ef">as</span> client:
</span></span><span style="display:flex;"><span>        deps <span style="color:#f92672">=</span> ResearchDeps(http_client<span style="color:#f92672">=</span>client, max_sources<span style="color:#f92672">=</span><span style="color:#ae81ff">5</span>)
</span></span><span style="display:flex;"><span>        result <span style="color:#f92672">=</span> <span style="color:#66d9ef">await</span> research_agent<span style="color:#f92672">.</span>run(
</span></span><span style="display:flex;"><span>            <span style="color:#e6db74">f</span><span style="color:#e6db74">&#34;Research the following topic thoroughly: </span><span style="color:#e6db74">{</span>topic<span style="color:#e6db74">}</span><span style="color:#e6db74">&#34;</span>,
</span></span><span style="display:flex;"><span>            deps<span style="color:#f92672">=</span>deps
</span></span><span style="display:flex;"><span>        )
</span></span><span style="display:flex;"><span>        <span style="color:#66d9ef">return</span> result<span style="color:#f92672">.</span>data
</span></span></code></pre></div><h2 id="faq">FAQ</h2>
<p><strong>Does Pydantic AI work with local models like Ollama?</strong></p>
<p>Yes. Pydantic AI supports Ollama out of the box with <code>Agent(&quot;ollama:llama3.2&quot;)</code>. For structured outputs, the model must support function calling or JSON mode — most modern Ollama models (Llama 3.2+, Mistral, Qwen 2.5) support this. Performance of automatic validation retries depends on the model&rsquo;s instruction-following capability; GPT-4o and Claude Sonnet achieve near-100% first-attempt success on well-designed schemas, while smaller local models may require more retries or simpler schemas.</p>
<p><strong>How many validation retries does Pydantic AI attempt before failing?</strong></p>
<p>By default, Pydantic AI performs one retry when structured output validation fails. Configure this with <code>retries</code> on the Agent constructor: <code>Agent(&quot;openai:gpt-4o&quot;, result_type=MyModel, retries=3)</code>. Each retry includes the Pydantic validation error message as additional context, giving the model the information it needs to correct its output. Set <code>retries=0</code> to disable automatic retries if you want to handle validation failures manually at the application layer.</p>
<p><strong>Can I use Pydantic AI alongside existing LangChain code?</strong></p>
<p>Pydantic AI operates independently of LangChain and doesn&rsquo;t integrate with LangChain chain abstractions. You can call Pydantic AI agents from within LangChain pipelines as ordinary Python function calls, passing strings or serialized Pydantic model outputs between them. For new agent development, Pydantic AI&rsquo;s type-safe approach is generally preferable; for existing LangChain projects, incremental adoption — replacing individual chains with Pydantic AI agents — is a practical migration strategy that avoids a full rewrite.</p>
<p><strong>How does Pydantic AI handle streaming with structured outputs?</strong></p>
<p>Streaming structured outputs is supported, with a caveat: inside <code>run_stream()</code> you can stream plain text with <code>stream_text()</code>, or iterate partial structured responses and validate them with <code>allow_partial</code> — each iteration yields a progressively more complete instance of your <code>BaseModel</code>, with full validation guaranteed only once the stream ends. Fields therefore arrive incomplete mid-stream, so drive real-time UI updates from the partial states and run any backend logic that depends on the whole object against the final validated result. This gives you streaming UX and a validated structured object from a single model call.</p>
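<p>A toy illustration of why partial structured streaming is harder than text streaming — plain <code>json.loads</code> only succeeds once the buffer is complete, whereas Pydantic AI&rsquo;s partial validation can repair truncated JSON mid-stream (the chunked stream below is simulated):</p>

```python
import json

def stream_chunks():
    # Simulated token stream of a structured JSON response
    yield '{"title": "Type Saf'
    yield 'ety", "sections": ["intro", '
    yield '"body"]}'

buffer = ""
partial_states = []
for chunk in stream_chunks():
    buffer += chunk
    try:
        # Naive approach: only a complete buffer parses; partial buffers are
        # skipped. Partial validation instead repairs truncated JSON so each
        # chunk can yield a progressively more complete object.
        partial_states.append(json.loads(buffer))
    except json.JSONDecodeError:
        continue

final = partial_states[-1]
```

<p>With plain parsing, only the last chunk produces a usable object — exactly the gap that partial validation closes for streaming UIs.</p>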
<p><strong>Is Pydantic AI production-ready for high-throughput applications?</strong></p>
<p>Yes, with appropriate architecture. Pydantic AI&rsquo;s async-native design supports hundreds of concurrent agent calls on a single event loop. For high throughput, use <code>asyncio.gather()</code> for parallel independent calls, a task queue (ARQ, Celery) for background processing, and the model gateway feature for automatic failover across providers. The framework&rsquo;s explicit type contracts also make debugging production incidents significantly faster than with loosely typed alternatives that pass unvalidated strings between components.</p>
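<p>Provider failover — the job a model gateway automates — can also be hand-rolled in a few lines. The two provider coroutines below are simulated stand-ins, not real SDK calls:</p>

```python
import asyncio

class RateLimitError(Exception):
    pass

async def call_openai(prompt: str) -> str:
    # Stand-in for the primary provider; simulate a 429 response
    raise RateLimitError("429 Too Many Requests")

async def call_anthropic(prompt: str) -> str:
    # Stand-in for the fallback provider
    return f"claude says: {prompt}"

async def run_with_failover(prompt: str) -> str:
    # Try providers in order, falling back when one is rate-limited
    for call in (call_openai, call_anthropic):
        try:
            return await call(prompt)
        except RateLimitError:
            continue
    raise RuntimeError("all providers rate-limited")

answer = asyncio.run(run_with_failover("hello"))
```

<p>The primary call fails with a simulated rate limit and the request transparently lands on the fallback provider.</p>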
]]></content:encoded></item></channel></rss>