#4443 Python Multi-Agent Framework using PydanticAI and Google Gemini LLMs

#4443 Provide an example multi-agent Python app using PydanticAI and Gemini LLMs

Below is an example of a multi-agent application using PydanticAI and Google Gemini models.

This example implements a common Orchestrator-Worker pattern. It features a primary router_agent that acts as the entry point, takes a user's request, and delegates the task to specialized sub-agents (weather_agent and currency_agent) via PydanticAI tools.

Prerequisites

First, install the required packages:

pip install pydantic-ai pydantic

Next, ensure you have your Gemini API key set up in your environment variables:

export GEMINI_API_KEY="your-gemini-api-key"

Python Code Example

import asyncio
from pydantic import BaseModel, Field
from pydantic_ai import Agent, RunContext

# Define structured outputs for the specialized agents to guarantee type safety
class WeatherResponse(BaseModel):
    city: str
    temperature: str
    condition: str

class CurrencyResponse(BaseModel):
    base_currency: str
    target_currency: str
    rate: float
    converted_amount: float

# ==========================================
# 1. Define Specialized Worker Agents
# ==========================================

# Weather Expert Agent
weather_agent = Agent(
    'google-gla:gemini-2.5-flash', # Or your preferred Gemini model string
    output_type=WeatherResponse,
    instructions="You are a weather expert. Use your tool to look up weather details for the requested city."
)

@weather_agent.tool_plain
def fetch_mock_weather(city: str) -> dict:
    """Mock database lookup for weather data."""
    # In a real app, you would make an HTTP request here
    mock_db = {
        "london": {"temperature": "15°C", "condition": "Rainy"},
        "tokyo": {"temperature": "22°C", "condition": "Sunny"},
        "new york": {"temperature": "19°C", "condition": "Cloudy"}
    }
    return mock_db.get(city.lower(), {"temperature": "Unknown", "condition": "Variable"})


# Currency Expert Agent
currency_agent = Agent(
    'google-gla:gemini-2.5-flash',
    output_type=CurrencyResponse,
    instructions="You are a financial assistant. Convert the requested currency amounts using provided tools."
)

@currency_agent.tool_plain
def get_exchange_rate(base: str, target: str) -> float:
    """Mock tool to fetch currency exchange rates."""
    rates = {
        ("USD", "EUR"): 0.92,
        ("USD", "GBP"): 0.78,
        ("EUR", "USD"): 1.09
    }
    return rates.get((base.upper(), target.upper()), 1.0)


# ==========================================
# 2. Define the Orchestrator (Router) Agent
# ==========================================
router_agent = Agent(
    'google-gla:gemini-2.5-flash',
    instructions=(
        "You are a master router assistant. Analyze the user prompt and delegate "
        "tasks to your specialized sub-agents using your available tools. "
        "Synthesize their answers into a final concise summary response for the user."
    )
)

# Bridge tool: Allows the router to invoke the weather agent asynchronously
@router_agent.tool
async def delegate_to_weather_agent(ctx: RunContext[None], city: str) -> str:
    """Use this tool when the user asks about weather information for a specific city."""
    print(f"🤖 [Router]: Delegating weather task for '{city}' to WeatherAgent...")
    
    # We forward the current run usage so token counts are tracked across agents
    result = await weather_agent.run(f"What is the weather in {city}?", usage=ctx.usage)
    
    # The output has been validated as a WeatherResponse object
    data: WeatherResponse = result.output
    return f"Weather in {data.city}: {data.temperature} and {data.condition}."

# Bridge tool: Allows the router to invoke the currency agent asynchronously
@router_agent.tool
async def delegate_to_currency_agent(ctx: RunContext[None], amount: float, from_currency: str, to_currency: str) -> str:
    """Use this tool when the user wants to convert money from one currency to another."""
    print(f"🤖 [Router]: Delegating conversion of {amount} {from_currency} to CurrencyAgent...")
    
    prompt = f"Convert {amount} from {from_currency} to {to_currency}."
    result = await currency_agent.run(prompt, usage=ctx.usage)
    
    data: CurrencyResponse = result.output
    return f"{amount} {data.base_currency} is {data.converted_amount:.2f} {data.target_currency} (Rate: {data.rate})."


# ==========================================
# 3. Execution
# ==========================================
async def main():
    # Prompt requiring coordination across BOTH specialized sub-agents
    user_prompt = "I'm traveling to London. What's the weather like there, and how much is $150 USD worth in EUR?"
    
    print(f"👤 [User]: {user_prompt}\n")
    
    # Run the orchestrator
    final_result = await router_agent.run(user_prompt)
    
    print("\n✨ [Final Response]:")
    print(final_result.output)

if __name__ == "__main__":
    asyncio.run(main())

Key Patterns Used

google-gla:gemini-2.5-flash String Identifier: PydanticAI resolves the provider-prefixed model string to a Gemini client and reads the API key from your environment, so no explicit client setup is needed.
Agent-to-Agent Handoff via Tools: The router_agent calls its sub-agents through ordinary tool functions (delegate_to_weather_agent, delegate_to_currency_agent). Because each tool awaits sub_agent.run(), the orchestrator receives the specialist's answer as a normal tool result.
Usage Forwarding: Passing usage=ctx.usage into each sub-agent run() call accumulates token and request counts on the parent run, so result.usage() on the router reflects the whole operation.
Type-Safe Structured Output (output_type): Workers enforce a BaseModel contract. When weather_agent finishes, PydanticAI validates Gemini's structured reply against the schema and returns a verified Python object.
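
To make the usage-forwarding idea concrete, here is a minimal standard-library sketch. It is not PydanticAI's implementation: UsageTally, run_worker, and run_router are hypothetical names, and word counts stand in for real token counts. It only shows why handing the same tracker object to the sub-agent makes the totals add up.

```python
from dataclasses import dataclass


@dataclass
class UsageTally:
    """Hypothetical stand-in for a shared usage tracker."""
    requests: int = 0
    total_tokens: int = 0

    def record(self, tokens: int) -> None:
        # One model call: bump the request count and add its tokens.
        self.requests += 1
        self.total_tokens += tokens


def run_worker(prompt: str, usage: UsageTally) -> str:
    # The worker records its own call on the tally it was handed.
    usage.record(tokens=len(prompt.split()))  # word count as a fake token count
    return f"worker answer for: {prompt}"


def run_router(prompt: str) -> tuple[str, UsageTally]:
    usage = UsageTally()
    usage.record(tokens=len(prompt.split()))  # the router's own model call
    # Forwarding the same object is what lets the parent see the child's usage.
    answer = run_worker("weather in London", usage=usage)
    return answer, usage


answer, usage = run_router("What is the weather in London today?")
print(usage.requests, usage.total_tokens)
```

If run_worker created its own UsageTally instead, the router's totals would silently omit the worker's call, which is the situation usage=ctx.usage avoids.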

Can one agent have more than one tool?

Absolutely. An agent can have as many tools as you need to give it. PydanticAI handles multiple tools by gathering them into a registry and converting their Python function signatures and docstrings into schemas that the LLM (like Gemini) can understand.

When the agent runs, it evaluates the user's prompt and decides which tool (or tools) to call, and in what order.

Here is a quick example of a single agent equipped with two distinct tools:

import asyncio
from pydantic_ai import Agent

# Define a single agent
research_agent = Agent(
    'google-gla:gemini-2.5-flash',
    instructions="You are a research assistant. Use your tools to gather facts."
)

# Tool 1: Plain tool for fetching corporate data
@research_agent.tool_plain
def fetch_company_revenue(company_name: str) -> str:
    """Look up the latest annual revenue for a given company."""
    mock_market = {"acme corp": "$1.2 Billion", "stark ind": "$45.6 Billion"}
    return mock_market.get(company_name.lower(), "Data not found.")

# Tool 2: Plain tool for fetching regulatory data
@research_agent.tool_plain
def check_compliance_status(company_name: str) -> str:
    """Check the current regulatory compliance status of a company."""
    mock_compliance = {"acme corp": "Compliant", "stark ind": "Under Audit"}
    return mock_compliance.get(company_name.lower(), "Unknown status.")

async def main():
    # Gemini will evaluate the prompt and determine it needs to call BOTH tools
    prompt = "Give me a quick brief on Acme Corp: check their revenue and compliance status."
    
    print(f"Request: {prompt}\n")
    result = await research_agent.run(prompt)
    print("Response:")
    print(result.output)

if __name__ == "__main__":
    asyncio.run(main())

How PydanticAI Manages Multiple Tools

Parallel Tool Calling: If the user prompt requires information from multiple tools (like the example above asking for both revenue and compliance), Gemini can choose to invoke multiple tools in a single turn. PydanticAI will execute them and pass the results back to the model seamlessly.
Context-Aware Tools (@agent.tool): If your tools need access to shared state or dependency injection (like a database connection pool or an HTTP client), you can use the standard @agent.tool decorator instead of _plain. This passes a RunContext as the first argument, allowing multiple tools to share the same underlying resources.
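
The parallel tool calling described above can be pictured with a small standard-library sketch. This is an illustration only, not PydanticAI's internals: the TOOLS registry, the register decorator, and the shape of the requested calls are assumptions made for the example.

```python
import asyncio
from typing import Awaitable, Callable

# Hypothetical registry mapping tool names to async functions.
TOOLS: dict[str, Callable[..., Awaitable[str]]] = {}


def register(func):
    """Add a function to the registry under its own name."""
    TOOLS[func.__name__] = func
    return func


@register
async def fetch_company_revenue(company_name: str) -> str:
    return {"acme corp": "$1.2 Billion"}.get(company_name.lower(), "Data not found.")


@register
async def check_compliance_status(company_name: str) -> str:
    return {"acme corp": "Compliant"}.get(company_name.lower(), "Unknown status.")


async def dispatch(calls: list[dict]) -> list[str]:
    """Run every requested tool call concurrently; results keep request order."""
    return await asyncio.gather(*(TOOLS[c["name"]](**c["args"]) for c in calls))


# Pretend the model asked for both tools in a single turn.
requested = [
    {"name": "fetch_company_revenue", "args": {"company_name": "Acme Corp"}},
    {"name": "check_compliance_status", "args": {"company_name": "Acme Corp"}},
]
results = asyncio.run(dispatch(requested))
print(results)
```

asyncio.gather returns results in the order the calls were passed in, so each result can be matched back to the tool call that produced it.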

How do the structured output classes work?

In PydanticAI, structured outputs bridge the gap between messy, unpredictable LLM text and strict, type-safe Python objects.

When you pass a Pydantic BaseModel to an agent's output_type parameter (called result_type in early releases), you are telling the agent: "You are not allowed to just chat. You must fill out this exact form."

Behind the scenes, PydanticAI and Gemini work together to enforce, validate, and parse this data. Here is exactly how it functions.

1. The Under-the-Hood Workflow

When you set output_type=YourModel and call agent.run(), PydanticAI executes a tight loop:

[Your Pydantic Model] 
       │
       ▼ (PydanticAI extracts JSON Schema)
[Gemini LLM] ──► (Generates raw JSON string matching schema)
       │
       ▼ (PydanticAI intercepts and parses string)
[Pydantic Validation] 
       ├──► Success: Returns verified Python object as `result.output`
       └──► Failure: Automatically retries (up to a limit), passing the error back to Gemini to fix

Schema Generation: PydanticAI extracts a JSON Schema from your Pydantic class (including types, field descriptions, and constraints).
Model Invocation: It passes this schema to Gemini. By default the schema is registered as an output tool that the model must call with matching arguments; Gemini's native structured output (response_schema) is an opt-in alternative.
Validation & Type Casting: When Gemini replies with JSON, PydanticAI runs it through Pydantic's validation engine, which converts the raw values into real Python types (integers, datetimes, nested models, etc.).
The Healing Loop: If Gemini invents a field name or returns an invalid type, PydanticAI catches the ValidationError, sends the error message back to Gemini, and asks it to correct the mistake, up to the configured retry limit.
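
The four steps above can be sketched without any LLM at all. The following is a simplified illustration using only the standard library, not PydanticAI's actual code: fake_model is a scripted stand-in for Gemini, and validate and run_with_retries are hypothetical helpers.

```python
import json
from dataclasses import dataclass


@dataclass
class WeatherResponse:
    city: str
    temperature: str
    condition: str


def validate(raw: str) -> WeatherResponse:
    """Parse JSON and check required fields; raise ValueError on any problem."""
    payload = json.loads(raw)  # json.JSONDecodeError is a ValueError subclass
    expected = ("city", "temperature", "condition")
    missing = [name for name in expected if name not in payload]
    if missing:
        raise ValueError(f"missing fields: {missing}")
    return WeatherResponse(**{name: str(payload[name]) for name in expected})


def fake_model(prompt: str, attempt: int) -> str:
    """Scripted stand-in for an LLM: the first reply uses a wrong field name."""
    if attempt == 0:
        return '{"city": "London", "temp": "15C", "condition": "Rainy"}'
    return '{"city": "London", "temperature": "15C", "condition": "Rainy"}'


def run_with_retries(prompt: str, max_retries: int = 1) -> WeatherResponse:
    feedback = ""
    for attempt in range(max_retries + 1):
        raw = fake_model(prompt + feedback, attempt)
        try:
            return validate(raw)
        except ValueError as exc:
            # Feed the validation error back so the next attempt can fix it.
            feedback = f"\nYour last reply was invalid: {exc}. Please fix it."
    raise RuntimeError("model never produced valid output")


result = run_with_retries("Weather in London?")
print(result)
```

The first scripted reply fails validation because it uses "temp" instead of "temperature"; the second passes, so the caller only ever sees a valid WeatherResponse.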

2. A Concrete Example: Constraints and Field Descriptions

You can use standard Pydantic features—like Field descriptions, literal choices, and value constraints—to drastically improve the model's accuracy.

from enum import Enum
from pydantic import BaseModel, Field, field_validator
from pydantic_ai import Agent

# 1. Define strict choices using standard Python Enums
class RiskLevel(str, Enum):
    LOW = "Low"
    MEDIUM = "Medium"
    HIGH = "High"

# 2. Build the structured blueprint
class SecurityAudit(BaseModel):
    # Gemini reads this description to understand exactly what to extract
    finding: str = Field(description="A concise summary of the discovered issue.")
    
    # Enforces strict categorical placement
    severity: RiskLevel = Field(description="The calculated risk tier.")
    
    # Pydantic will enforce that this must be parsed into an integer
    impact_score: int = Field(description="A severity score from 1 (lowest) to 10 (highest).", ge=1, le=10)

    # You can also add standard Pydantic validators for more complex rules.
    # Note: ge=1 and le=10 above already enforce this range; the validator is shown for illustration.
    @field_validator('impact_score')
    @classmethod
    def validate_score(cls, v: int) -> int:
        # If Gemini outputs a 12, validation fails, which triggers PydanticAI's retry loop
        if not (1 <= v <= 10):
            raise ValueError("Score must be between 1 and 10 inclusive")
        return v

# 3. Equip the agent
audit_agent = Agent(
    'google-gla:gemini-2.5-flash',
    output_type=SecurityAudit, # The agent is locked into returning a SecurityAudit object
    instructions="Analyze the raw log input and extract the key security finding."
)

async def main():
    log_line = "CRITICAL 2026-05-22 14:32:01 - Unauthorized access attempt detected on database cluster node-04. Source IP flagged."
    
    result = await audit_agent.run(log_line)
    
    # result.output is fully type-hinted and validated as a SecurityAudit instance
    audit: SecurityAudit = result.output
    
    print(f"Finding: {audit.finding}")
    print(f"Severity: {audit.severity.value}")
    print(f"Score: {audit.impact_score}")

if __name__ == "__main__":
    import asyncio

    asyncio.run(main())

3. Key Benefits of This Architecture

No Manual json.loads(): You don't have to write brittle regex or string parsing code to extract code blocks (```json) from text responses.
IDE Auto-complete: Because result.output is an instance of your model class, your IDE will auto-complete field names, and static type checkers like mypy will validate your downstream code.
Schema-Aware Generation: PydanticAI sends your JSON schema to Gemini at the API level rather than describing the format in prose. The model therefore sees the exact field names and types it must produce, which sharply reduces formatting errors; validation catches the rest.

Provide a getting started tutorial for PydanticAI

Welcome to the PydanticAI getting started tutorial.

PydanticAI is a type-safe, production-ready framework built by the creators of Pydantic. It treats LLM outputs like validated data forms, eliminating brittle string parsing and replacing it with pure Python type safety.

This quickstart guide will walk you through installation, creating a basic agent, enforcing a structured output schema, and adding custom tools.

1. Setup & Installation

PydanticAI supports major model providers out of the box. To avoid downloading unnecessary dependencies, you can install the core framework alongside the explicit client package for your model provider (e.g., Google Gemini).

# Install the core framework and Google GenAI dependencies
pip install "pydantic-ai-slim[google]" pydantic

Next, configure your API key as an environment variable in your terminal:

export GEMINI_API_KEY="your-gemini-api-key"

2. Step 1: The "Hello World" Agent

Let's start by initializing a basic, unstructured agent. PydanticAI accepts a simple string identifier for the model, in the form provider:model-name. We will use google-gla:gemini-2.5-flash.

import asyncio
from pydantic_ai import Agent

# Define the agent with a model and static system instructions
agent = Agent(
    'google-gla:gemini-2.5-flash',
    instructions="You are a helpful, witty IT support technician."
)

async def main():
    # Run the agent asynchronously
    result = await agent.run("Why is my computer making a clicking sound?")
    
    # The text reply is available on the .output attribute
    print(result.output)

if __name__ == "__main__":
    asyncio.run(main())

Tip: If you are sketching out a script or running a quick test in a non-async loop, you can use the synchronous shortcut: result = agent.run_sync("Your prompt").

3. Step 2: Enforcing Structured Output

The true power of PydanticAI shines when you lock the agent into a strict layout using a Pydantic BaseModel. PydanticAI sends the model's schema along with the request and validates the reply, so the result you receive always matches your data structure.

import asyncio
from pydantic import BaseModel, Field
from pydantic_ai import Agent

# Define your data contract
class CodeReview(BaseModel):
    is_safe: bool = Field(description="True if the code contains no safety or performance issues.")
    issue_found: str | None = Field(default=None, description="Detailed explanation of the problem, if any.")
    suggested_fix: str | None = Field(default=None, description="The corrected clean Python code snippet.")

# Attach the schema using output_type
review_agent = Agent(
    'google-gla:gemini-2.5-flash',
    output_type=CodeReview,
    instructions="Review the provided code block for typical architectural errors."
)

async def main():
    bad_code = "def append_to(element, to=[]):\n    to.append(element)\n    return to"
    
    result = await review_agent.run(bad_code)
    
    # result.output is now a validated, type-safe CodeReview object
    review: CodeReview = result.output
    
    print(f"Safe: {review.is_safe}")
    print(f"Issue: {review.issue_found}")
    print(f"Fix:\n{review.suggested_fix}")

if __name__ == "__main__":
    asyncio.run(main())

4. Step 3: Giving Your Agent Tools

To allow the agent to fetch live external data or run local system operations, you can register functions using decorators like @agent.tool_plain or @agent.tool.

PydanticAI extracts the function name, the type-hints of the parameters, and the docstring description to present a clean JSON schema to Gemini.

import asyncio
from pydantic_ai import Agent

agent = Agent(
    'google-gla:gemini-2.5-flash',
    instructions="You are a deployment coordinator. Use tools to check status."
)

# Use @agent.tool_plain for self-contained functions without shared system state
@agent.tool_plain
def fetch_server_uptime(host_id: str) -> str:
    """Check how long a specific host has been online continuously.
    
    Args:
        host_id: The identifier string for the server node (e.g. node-04).
    """
    # Parameters and description are automatically parsed and passed to Gemini
    mock_status = {"node-01": "Up 45 days", "node-02": "Down - Offline since yesterday"}
    return mock_status.get(host_id.lower(), "Host not found in inventory.")

async def main():
    prompt = "Is our secondary node (node-02) running fine right now?"
    
    # Gemini analyzes the query, calls fetch_server_uptime("node-02"), and crafts its response
    result = await agent.run(prompt)
    print(result.output)

if __name__ == "__main__":
    asyncio.run(main())
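
To show roughly what "extracting a schema from a function" means, here is a simplified standard-library sketch. It is an approximation for illustration, not PydanticAI's real schema generator: describe_tool and the JSON_TYPES mapping are hypothetical, and only a few primitive types are handled.

```python
import inspect
from typing import get_type_hints

# Hypothetical mapping from Python types to JSON Schema type names.
JSON_TYPES = {str: "string", int: "integer", float: "number", bool: "boolean"}


def describe_tool(func) -> dict:
    """Build a tool description from a function's name, type hints, and docstring."""
    hints = get_type_hints(func)
    hints.pop("return", None)  # the return type is not a parameter
    params = inspect.signature(func).parameters
    return {
        "name": func.__name__,
        "description": inspect.getdoc(func) or "",
        "parameters": {
            "type": "object",
            "properties": {
                name: {"type": JSON_TYPES.get(hint, "string")}
                for name, hint in hints.items()
            },
            # Parameters without a default value are required.
            "required": [
                name for name, p in params.items()
                if p.default is inspect.Parameter.empty
            ],
        },
    }


def fetch_server_uptime(host_id: str) -> str:
    """Check how long a specific host has been online continuously."""
    return "Up 45 days"


schema = describe_tool(fetch_server_uptime)
print(schema)
```

This is why clear parameter names, type hints, and docstrings matter: they are the only description of the tool that the model ever sees.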

5. Summary of Core Concepts

Concept	Description
Agent	The main controller class wrapping your system instructions, model parameters, and capabilities.
output_type	The argument that turns free-form LLM text into validated, typed data: replies are checked against your Pydantic model and retried on failure.
run() / run_sync()	Execution entry points that manage the runtime loop, tracking usage metadata and state.
@agent.tool_plain	A decorator that equips an agent with tools using clear type-hints and docstring descriptions.

Continue tutorial for multi-agent workflows

This continuation of the PydanticAI tutorial focuses on Multi-Agent Workflows.

In production environments, complex tasks are often handled more reliably by several focused agents than by a single monolithic prompt, so we break them down into multi-agent topologies. PydanticAI supports two primary multi-agent patterns out of the box:

Agent Delegation (Orchestration): A parent agent uses other agents as local tools, maintaining ultimate control over the execution loop.
Programmatic Agent Hand-off (Chaining): The application code orchestrates sequential execution, explicitly passing a validated Pydantic dataset out of one agent and injecting it directly into the next.

Pattern 1: Agent Delegation with Aggregated Token Tracking

In this workflow, the parent agent acts as a supervisor. To monitor total API usage costs accurately across the entire operation, you pass the parent agent's usage tracker (ctx.usage) straight down into the sub-agent’s .run() call.

Here is how to implement a nested research and summarization system:

import asyncio
from pydantic import BaseModel, Field
from pydantic_ai import Agent, RunContext

# Define the final data structure the user cares about
class ExecutiveSummary(BaseModel):
    core_theme: str = Field(description="The primary topic analyzed.")
    key_findings: list[str] = Field(description="Bullet points detailing critical facts extracted.")
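    # Caution: the model fills in this field and has no access to real usage data;
    # read actual counts from result.usage() instead.
    raw_token_count: int = Field(description="The model's own estimate of tokens used; not an authoritative count.")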

# 1. Initialize the specialized Deep Research sub-agent
research_agent = Agent(
    'google-gla:gemini-2.5-flash',
    instructions="You are a data-mining specialist. Extract hard, objective facts about the target topic."
)

# 2. Initialize the Supervisor agent
supervisor_agent = Agent(
    'google-gla:gemini-2.5-flash',
    output_type=ExecutiveSummary,
    instructions="You are an editor. Coordinate research via tools and synthesize the findings into our structured schema."
)

# 3. Expose the sub-agent to the supervisor as a tool
@supervisor_agent.tool
async def delegate_deep_research(ctx: RunContext[None], topic: str) -> str:
    """Use this tool to launch deep, atomic factual research on a specific sub-topic."""
    print(f"🔄 [Supervisor]: Delegating sub-task '{topic}' to Research Agent...")
    
    # CRITICAL: We pass usage=ctx.usage so sub-agent tokens are added to the parent's metrics
    sub_agent_result = await research_agent.run(
        f"Provide an objective fact-sheet about: {topic}", 
        usage=ctx.usage
    )
    return sub_agent_result.output

async def main():
    prompt = "Research the impact of Solid-State Batteries on electric vehicle range."
    
    # Run the top-level supervisor
    result = await supervisor_agent.run(prompt)
    
    summary: ExecutiveSummary = result.output
    print("\n✨ [Final Structured Result]:")
    print(f"Theme: {summary.core_theme}")
    print(f"Findings: {summary.key_findings}")
    
    # Inspect the combined usage metrics accumulated via ctx.usage
    print(f"\n📈 [Combined Usage Stats]:")
    print(f"Total Requests Made: {result.usage().requests}")
    print(f"Total Tokens Consumed: {result.usage().total_tokens}")

if __name__ == "__main__":
    asyncio.run(main())

Pattern 2: Programmatic Hand-off (Linear Pipeline Chaining)

When you need rigid control architecture—such as an automated code assembly pipeline where a Developer Agent completely finishes writing code before passing it off to a QA Auditor Agent—you don't want agents talking behind the scenes. You want your Python runtime processing the step boundaries.

Because PydanticAI yields strongly typed objects, static type checkers such as mypy or pyright can verify this hand-off before the code runs.

import asyncio
from pydantic import BaseModel, Field
from pydantic_ai import Agent

# Step 1 Schema: Output from the Architect
class CodeBlueprint(BaseModel):
    function_name: str
    raw_code: str = Field(description="The functional Python script implementation block.")

# Step 2 Schema: Output from the Auditor
class AuditReport(BaseModel):
    is_approved: bool
    vulnerabilities_found: list[str] = Field(description="List of bugs or edge cases found.")
    hardened_code: str = Field(description="The corrected, optimized final code block.")

# Define the two isolated pipeline agents
architect_agent = Agent(
    'google-gla:gemini-2.5-flash',
    output_type=CodeBlueprint,
    instructions="Write a specialized Python function based on the user requirement. Output only valid Python code inside the structural schema."
)

auditor_agent = Agent(
    'google-gla:gemini-2.5-flash',
    output_type=AuditReport,
    instructions="You are a senior code auditor. Inspect the incoming code blueprint for bugs, type errors, or security flaws."
)

async def run_pipeline(user_request: str):
    print(f"🚀 Starting pipeline for request: '{user_request}'\n")
    
    # --- PHASE 1: Architect execution ---
    architect_result = await architect_agent.run(user_request)
    blueprint: CodeBlueprint = architect_result.output
    print("✅ [Phase 1 Complete] Architect built code blueprint.")
    
    # --- PHASE 2: Programmatic Hand-off ---
    # We construct a message using the explicit, validated outputs from Phase 1
    auditor_prompt = (
        f"Review the function '{blueprint.function_name}'. "
        f"Here is the code body to inspect:\n{blueprint.raw_code}"
    )
    
    print("🔐 [Phase 2 Launching] Passing data to Auditor...")
    auditor_result = await auditor_agent.run(auditor_prompt)
    report: AuditReport = auditor_result.output
    
    # --- FINAL SYNTHESIS ---
    print("\n🏁 [Pipeline Complete] Final Audit Status:")
    print(f"Approved: {report.is_approved}")
    print(f"Issues Discovered: {report.vulnerabilities_found}")
    print(f"Final Executable Code:\n{report.hardened_code}")

if __name__ == "__main__":
    asyncio.run(run_pipeline("Write a function that parses a string date into a datetime object safely."))
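
One practical benefit of owning the step boundary in application code is that you can check the first agent's output before paying for the second call. The sketch below is a hypothetical addition, not part of the pipeline above: blueprint_problems is an illustrative helper, and a plain dataclass stands in for the CodeBlueprint model so the snippet runs with the standard library alone.

```python
import ast
from dataclasses import dataclass


@dataclass
class CodeBlueprint:
    """Stand-in for the CodeBlueprint model used in the pipeline."""
    function_name: str
    raw_code: str


def blueprint_problems(blueprint: CodeBlueprint) -> list[str]:
    """Return reasons the blueprint should not be handed to the next phase."""
    try:
        tree = ast.parse(blueprint.raw_code)
    except SyntaxError as exc:
        return [f"code does not parse: {exc.msg}"]
    # Collect every function defined anywhere in the generated code.
    defined = {
        node.name
        for node in ast.walk(tree)
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef))
    }
    if blueprint.function_name not in defined:
        return [f"function '{blueprint.function_name}' is not defined in the code"]
    return []


good = CodeBlueprint("parse_date", "def parse_date(text):\n    return text\n")
broken = CodeBlueprint("parse_date", "def parse_date(text:\n    return text\n")
misnamed = CodeBlueprint("parse_date", "def other(text):\n    return text\n")

for candidate in (good, broken, misnamed):
    print(blueprint_problems(candidate))
```

In run_pipeline, a check like this would sit between Phase 1 and Phase 2: if the returned list is non-empty, the pipeline can re-run the architect or stop early instead of sending unusable code to the auditor.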

3. Protecting Against Loop Runaways

When agents call each other through nested tools, it is easy to introduce an infinite loop if an agent misinterprets an error. PydanticAI provides UsageLimits as a built-in guardrail to halt runaway runs.

You can pass these constraints directly to the high-level runtime configuration:

from pydantic_ai.exceptions import UsageLimitExceeded
from pydantic_ai.usage import UsageLimits

# Enforce strict boundaries on each run
limits = UsageLimits(
    request_limit=5,           # Hard cap on LLM API requests in the run
    total_tokens_limit=50000,  # Stop once total token usage exceeds this budget
)

async def run_with_limits():
    # supervisor_agent is the agent defined in Pattern 1
    try:
        result = await supervisor_agent.run(
            "Coordinate the complex research task",
            usage_limits=limits,
        )
    except UsageLimitExceeded as exc:
        print(f"Run halted: {exc}")
    else:
        print(result.output)

If the run needs more than 5 model requests, for example because the agents keep bouncing back and forth trying to resolve a contradiction, PydanticAI terminates it with a UsageLimitExceeded exception, protecting you from unexpected spikes in API cost.
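
The mechanism behind a request cap is simple enough to sketch with the standard library. This is an illustration of the idea only, not how UsageLimits is implemented: RequestBudget, RequestBudgetExceeded, and bouncing_agents are hypothetical names.

```python
class RequestBudgetExceeded(Exception):
    """Hypothetical exception, standing in for a usage-limit error."""


class RequestBudget:
    def __init__(self, request_limit: int) -> None:
        self.request_limit = request_limit
        self.requests = 0

    def charge(self) -> None:
        """Count one model call, refusing it if the cap is already reached."""
        if self.requests >= self.request_limit:
            raise RequestBudgetExceeded(
                f"request limit of {self.request_limit} reached"
            )
        self.requests += 1


def bouncing_agents(budget: RequestBudget) -> None:
    """Simulate two agents that would otherwise call each other forever."""
    while True:
        budget.charge()  # supervisor's model call
        budget.charge()  # worker's model call


budget = RequestBudget(request_limit=5)
try:
    bouncing_agents(budget)
    outcome = "finished"
except RequestBudgetExceeded as exc:
    outcome = f"stopped: {exc}"

print(outcome, budget.requests)
```

The check happens before each call is made, so the loop is cut off at exactly the cap rather than one request past it.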

From <https://gemini.google.com/app/> Google Gemini (3.5 Flash)
