MCP server Pydantic AI integration

Pydantic AI is a Python agentic framework built by the Pydantic team that brings the type-safety and validation philosophy of Pydantic v2 to LLM agent construction. MCP support is built in: the MCPServerSSE and MCPServerStdio classes handle connection management, and the framework exposes MCP tools alongside native Python tools in the same agent. The type-safety angle adds a production benefit: Pydantic AI validates LLM-generated tool arguments against each tool's schema, auto-retries with validation error feedback when those arguments are malformed, and enforces typed agent return values. MCP tool output itself is passed through as text and is not validated. Monitoring the underlying MCP servers is still essential: strict typing makes schema mismatches surface as clear errors, but server downtime still looks like a mysterious timeout at the framework layer.

TL;DR

Pass MCPServerSSE(url='...') or MCPServerStdio(command='...') in the mcp_servers parameter of Agent(). Use async with agent.run_mcp_servers(): to hold a persistent connection across multiple agent runs in the same service. Define a Pydantic model as the agent's result_type to get typed, validated output. Set retries=3 so the agent auto-retries on ValidationError from malformed LLM-generated arguments. Monitor MCP servers with AliveMCP: Pydantic AI fails fast and clearly on schema errors, but network failures surface only as opaque connection errors and timeouts, and external monitoring closes that gap.

Pydantic AI's approach to MCP

Unlike LangChain (which uses an adapter library) or AutoGen (which requires manual function registration), Pydantic AI has MCP built into its core abstractions. MCP tools are treated identically to native Python tools from the agent's perspective — the framework discovers them via tools/list, sends arguments via tools/call, and validates results using the same Pydantic machinery as native tools.

Feature	Pydantic AI native tools	MCP tools in Pydantic AI
Tool definition	Python function with type annotations	MCP server `tools/list` response
Schema source	Pydantic/type-annotation inference	MCP tool `inputSchema` (JSON Schema)
Argument validation	Pydantic v2 `TypeAdapter`	Same Pydantic v2 validation on `inputSchema`
Retry on bad args	Auto-retry with `ValidationError` feedback	Same auto-retry mechanism
Result typing	Function return type	Text content from MCP response (string)
Auth injection	`RunContext[Deps]` dependency	Connection-level headers in `MCPServerSSE`

Basic setup with MCPServerSSE

import asyncio
import os
from pydantic import BaseModel
from pydantic_ai import Agent
from pydantic_ai.mcp import MCPServerSSE

# Define what you expect the agent to return
class ResearchResult(BaseModel):
    summary: str
    key_findings: list[str]
    confidence: float  # 0.0–1.0

# Create the agent with MCP server attached
research_agent = Agent(
    "claude-sonnet-4-6",
    mcp_servers=[
        MCPServerSSE(
            url="https://search.internal/mcp",
            headers={"Authorization": f"Bearer {os.environ['SEARCH_TOKEN']}"},
            timeout=30,
        )
    ],
    result_type=ResearchResult,  # agent output validated as ResearchResult
    system_prompt="You are a research assistant. Use the search tools to find information and return structured findings.",
    retries=3,  # retry on ValidationError (malformed tool args or structured output)
)

async def main():
    result = await research_agent.run("What are the key security vulnerabilities in MCP server implementations?")
    # result.data is typed as ResearchResult — IDE autocomplete works
    print(f"Summary: {result.data.summary}")
    print(f"Confidence: {result.data.confidence:.0%}")
    for finding in result.data.key_findings:
        print(f"  • {finding}")

asyncio.run(main())

The result_type=ResearchResult parameter tells Pydantic AI that the agent's final output must conform to the ResearchResult schema. If the LLM returns malformed JSON or omits required fields, Pydantic AI injects the ValidationError back into the conversation and retries (up to retries times), asking the LLM to fix the output format.
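
The retry loop described above can be pictured with a stdlib-only sketch. This is not Pydantic AI's implementation: validate_result, run_with_retries, and fake_llm are hypothetical names, and a hand-written check stands in for Pydantic validation. It only shows the shape of the loop: validate, feed the error back, try again.

```python
import json
from typing import Callable


def validate_result(raw: str) -> dict:
    """Stand-in for Pydantic validation: raise ValueError with a readable message."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"invalid JSON: {exc.msg}") from exc
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    for field in ("summary", "key_findings", "confidence"):
        if field not in data:
            raise ValueError(f"missing required field: {field}")
    confidence = data["confidence"]
    if not isinstance(confidence, (int, float)) or not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be a number between 0.0 and 1.0")
    return data


def run_with_retries(llm: Callable[[list[str]], str], prompt: str, retries: int = 3) -> dict:
    """Call the model; on a validation failure, append the error and try again."""
    messages = [prompt]
    for _ in range(retries + 1):  # one initial attempt plus `retries` retries
        raw = llm(messages)
        try:
            return validate_result(raw)
        except ValueError as exc:
            # Feed the validation error back so the model can correct itself
            messages.append(f"Validation failed: {exc}. Please fix the output.")
    raise RuntimeError(f"no valid output after {retries} retries")


def fake_llm(messages: list[str]) -> str:
    """Hypothetical model: omits a field first, then corrects itself after feedback."""
    if len(messages) == 1:
        return json.dumps({"summary": "s", "key_findings": ["a"]})
    return json.dumps({"summary": "s", "key_findings": ["a"], "confidence": 0.8})
```

With fake_llm, the first attempt fails on the missing confidence field and the second succeeds, so one retry is consumed. A model that never produces valid output exhausts the budget and raises.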

Stdio MCP servers and local packages

For locally installed MCP packages (stdio transport), use MCPServerStdio:

from pydantic_ai.mcp import MCPServerStdio

code_agent = Agent(
    "claude-sonnet-4-6",
    mcp_servers=[
        MCPServerStdio(
            command="npx",
            args=["-y", "@acme/code-runner-mcp"],
            env={"NODE_ENV": "production", "RUNNER_TIMEOUT": "30000"},
        ),
    ],
    system_prompt="You are a code execution agent. Write and run Python and JavaScript code.",
)

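Each MCPServerStdio spawns a subprocess when the agent's MCP servers context opens and terminates it when the context closes. The subprocess's stdin and stdout carry the MCP JSON-RPC messages, so the server must not write logs to stdout. Ensure the command is on PATH or use an absolute path. Do not rely on the subprocess seeing the parent's full environment: pass every variable the server needs through env, for example env={**os.environ, "NODE_ENV": "production"} to forward everything plus your overrides.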
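
To make the stdio transport concrete, here is a minimal sketch of newline-delimited JSON-RPC over a subprocess's stdin and stdout. It is an illustration only: the child process is a hypothetical toy responder written inline, it skips the initialize handshake a real MCP session starts with, and call_over_stdio is not a Pydantic AI function.

```python
import json
import subprocess
import sys

# Hypothetical child process: answers each JSON-RPC request line with an empty tool list.
CHILD_SOURCE = r"""
import json, sys
for line in sys.stdin:
    request = json.loads(line)
    response = {"jsonrpc": "2.0", "id": request["id"], "result": {"tools": []}}
    sys.stdout.write(json.dumps(response) + "\n")
    sys.stdout.flush()
"""


def call_over_stdio(method: str, request_id: int = 1) -> dict:
    """Spawn the child, send one newline-delimited JSON-RPC request, read one reply."""
    proc = subprocess.Popen(
        [sys.executable, "-c", CHILD_SOURCE],  # absolute interpreter path, no PATH lookup
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        text=True,
    )
    try:
        request = {"jsonrpc": "2.0", "id": request_id, "method": method}
        proc.stdin.write(json.dumps(request) + "\n")
        proc.stdin.flush()
        return json.loads(proc.stdout.readline())
    finally:
        proc.stdin.close()  # EOF on stdin lets the child leave its read loop
        proc.wait(timeout=5)
        proc.stdout.close()
```

The reply is matched to the request by its id, which is how a client pairs responses with outstanding calls on a single pipe.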

Persistent MCP connections across multiple runs

By default, each agent.run() call opens and closes the MCP connection — suitable for scripts and one-shot invocations. For a service that handles multiple requests, use agent.run_mcp_servers() to keep connections open across runs:

from contextlib import asynccontextmanager
from fastapi import FastAPI

@asynccontextmanager
async def lifespan(app: FastAPI):
    # Start MCP connections once at service startup
    async with research_agent.run_mcp_servers():
        yield
    # Connections close here at shutdown

app = FastAPI(lifespan=lifespan)

@app.post("/research")
async def research(query: str):
    # agent.run() reuses the already-open MCP connection
    result = await research_agent.run(query)
    return {
        "summary": result.data.summary,
        "findings": result.data.key_findings,
        "confidence": result.data.confidence,
    }

The MCP connection remains open for the lifetime of the run_mcp_servers() context. In-flight requests complete normally when the context closes — existing tool calls are not interrupted. This pattern saves the 100–300 ms MCP handshake overhead on every request in a high-throughput service.
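
The saving comes from paying the handshake once instead of once per request. The sketch below counts simulated handshakes under both patterns. FakeMCPConnection is a hypothetical stand-in, not a Pydantic AI class, and it models only the open/reuse behavior described above.

```python
import asyncio
from contextlib import asynccontextmanager


class FakeMCPConnection:
    """Hypothetical stand-in for an MCP client that counts handshakes."""

    def __init__(self) -> None:
        self.handshakes = 0
        self._open = False

    @asynccontextmanager
    async def session(self):
        """Open the connection unless an outer context already holds it open."""
        if self._open:
            yield self  # reuse the existing connection, no new handshake
            return
        self.handshakes += 1  # simulated initialize round trip
        self._open = True
        try:
            yield self
        finally:
            self._open = False

    async def run(self, query: str) -> str:
        async with self.session():
            return f"result for {query}"


async def per_call(n: int) -> int:
    """Each run opens and closes its own connection."""
    conn = FakeMCPConnection()
    for i in range(n):
        await conn.run(f"q{i}")
    return conn.handshakes


async def persistent(n: int) -> int:
    """The connection is opened once, as a lifespan handler would do."""
    conn = FakeMCPConnection()
    async with conn.session():
        for i in range(n):
            await conn.run(f"q{i}")
    return conn.handshakes
```

For five runs, per_call performs five handshakes and persistent performs one.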

RunContext and dependency injection with MCP

Pydantic AI's RunContext[Deps] allows injecting typed dependencies into native Python tools via the deps parameter. With MCP tools, you cannot inject dependencies directly into the tool handler (it runs on the MCP server, not in your Python process). Instead, use connection-level identity propagation:

from dataclasses import dataclass
from pydantic_ai import Agent, RunContext
from pydantic_ai.mcp import MCPServerSSE

@dataclass
class AgentDeps:
    user_id: str
    tenant_id: str
    auth_token: str

# For native Python tools: use RunContext to access deps.
# local_db is a placeholder for your own async database client, defined elsewhere.
async def get_user_profile(ctx: RunContext[AgentDeps]) -> str:
    """Get the current user's profile from the local database."""
    # ctx.deps.user_id is available here
    return await local_db.get_profile(ctx.deps.user_id)

# For MCP tools: inject identity at connection time
def make_agent(user_id: str, tenant_id: str, token: str) -> Agent[AgentDeps, str]:
    return Agent(
        "claude-sonnet-4-6",
        mcp_servers=[
            MCPServerSSE(
                url="https://analytics.internal/mcp",
                headers={
                    "Authorization": f"Bearer {token}",
                    "X-User-Id": user_id,
                    "X-Tenant-Id": tenant_id,
                },
            )
        ],
        tools=[get_user_profile],
        deps_type=AgentDeps,
    )

async def handle_request(user_id: str, tenant_id: str, token: str, query: str):
    agent = make_agent(user_id, tenant_id, token)
    result = await agent.run(query, deps=AgentDeps(user_id, tenant_id, token))
    return result.data

This creates a per-request agent instance with the user's credentials baked into the MCP connection headers. The MCP server reads these headers when the connection is established and keeps them as session context, for example with AsyncLocalStorage in a Node.js server or contextvars in a Python one, so tool handlers see the correct user identity without accepting it from tool arguments.
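
On the server side, the idea can be sketched in Python with contextvars, the stdlib counterpart to AsyncLocalStorage. Everything here is hypothetical (the handler names, the metric argument, and the way headers reach the session); it only illustrates that identity is bound from connection headers and that tool handlers ignore any identity passed in arguments.

```python
import asyncio
import contextvars
from dataclasses import dataclass


@dataclass(frozen=True)
class Identity:
    user_id: str
    tenant_id: str


# Per-session identity, set once from the connection headers
current_identity = contextvars.ContextVar("current_identity")


def identity_from_headers(headers: dict) -> Identity:
    """Build the identity from the X-User-Id and X-Tenant-Id headers."""
    return Identity(user_id=headers["X-User-Id"], tenant_id=headers["X-Tenant-Id"])


async def query_analytics(arguments: dict) -> str:
    """Tool handler: identity comes from the context, never from tool arguments."""
    identity = current_identity.get()
    # A user_id smuggled into the arguments is ignored on purpose
    return f"tenant={identity.tenant_id} user={identity.user_id} metric={arguments['metric']}"


async def handle_session(headers: dict, arguments: dict) -> str:
    """Simulate one session: bind identity at connect time, then serve a tool call."""
    current_identity.set(identity_from_headers(headers))
    return await query_analytics(arguments)
```

Because each asyncio task runs in its own copy of the context, concurrent sessions do not see each other's identity.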

Schema validation and automatic retry

One of Pydantic AI's key advantages is automatic retry on ValidationError. When the LLM generates tool arguments that don't match the MCP tool's inputSchema, Pydantic AI catches the validation error, includes the error message in the next conversation turn, and asks the LLM to correct its output:

# In your MCP server — complex schemas degrade LLM accuracy:
# Bad: nested objects cause frequent validation errors
{
  "name": "search",
  "inputSchema": {
    "type": "object",
    "properties": {
      "filters": {
        "type": "object",
        "properties": {
          "date_range": { "type": "object", "properties": { "start": {...}, "end": {...} } }
        }
      }
    }
  }
}

# Good: flat schemas generate valid arguments more reliably
{
  "name": "search",
  "inputSchema": {
    "type": "object",
    "properties": {
      "query": { "type": "string", "description": "Search terms" },
      "start_date": { "type": "string", "description": "ISO date, e.g. 2026-01-01" },
      "end_date": { "type": "string", "description": "ISO date, e.g. 2026-12-31" },
      "max_results": { "type": "integer", "minimum": 1, "maximum": 100 }
    },
    "required": ["query"]
  }
}

Pydantic AI retries up to retries times (default 1). Set retries=3 for MCP tools with complex schemas. If the model still fails after three attempts, further retries rarely help and each one costs a full LLM call, so simplify the schema instead of raising the limit.
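
One way to keep schemas flat is to check their nesting depth before publishing a tool. The helper below is a hypothetical lint, not part of Pydantic AI or the MCP SDK, and the two-level threshold is an assumption to tune. The NESTED and FLAT schemas are illustrative versions of the two shapes above.

```python
def schema_depth(schema: dict) -> int:
    """Return how many levels of nested objects or arrays a JSON Schema declares."""
    children = list(schema.get("properties", {}).values())
    items = schema.get("items")
    if isinstance(items, dict):
        children.append(items)
    nested = [schema_depth(child) for child in children if isinstance(child, dict)]
    is_container = schema.get("type") in ("object", "array")
    return (1 if is_container else 0) + max(nested, default=0)


def too_deep(schema: dict, max_depth: int = 2) -> bool:
    """Flag schemas nested beyond max_depth (an assumed threshold)."""
    return schema_depth(schema) > max_depth


# Illustrative schemas only; the field types are assumptions for this example
NESTED = {
    "type": "object",
    "properties": {
        "filters": {
            "type": "object",
            "properties": {
                "date_range": {
                    "type": "object",
                    "properties": {"start": {"type": "string"}, "end": {"type": "string"}},
                }
            },
        }
    },
}

FLAT = {
    "type": "object",
    "properties": {
        "query": {"type": "string"},
        "start_date": {"type": "string"},
        "end_date": {"type": "string"},
        "max_results": {"type": "integer", "minimum": 1, "maximum": 100},
    },
    "required": ["query"],
}
```

The nested schema has three container levels and is flagged; the flat one has a single level and passes.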

Testing with TestModel and mock MCP servers

Pydantic AI provides TestModel for deterministic tests that don't make real LLM API calls. For MCP tools in tests, use an MCPServerStdio pointing to a lightweight test server, or mock the MCP protocol directly:

import pytest
from pydantic_ai import Agent
from pydantic_ai.messages import ToolCallPart
from pydantic_ai.models.test import TestModel
from pydantic_ai.mcp import MCPServerStdio

# research_agent is the agent defined in the basic setup section

@pytest.mark.asyncio
async def test_research_agent_returns_structured_result():
    # TestModel replaces the LLM: deterministic, no API calls
    with research_agent.override(model=TestModel()):
        # TestModel generates a structured response that matches result_type
        result = await research_agent.run("test query")
        assert isinstance(result.data.summary, str)
        assert len(result.data.key_findings) > 0
        assert 0.0 <= result.data.confidence <= 1.0

@pytest.mark.asyncio
async def test_mcp_tool_called_correctly():
    # Integration test: the MCP server configured on research_agent must be reachable
    async with research_agent.run_mcp_servers():
        result = await research_agent.run("Find papers on MCP security")
    # Collect the name of every tool the model called during the run
    called = [
        part.tool_name
        for message in result.all_messages()
        for part in message.parts
        if isinstance(part, ToolCallPart)
    ]
    assert called, "expected at least one tool call"
    assert result.data is not None

Keep unit tests (using TestModel) and integration tests (using real MCP connections) in separate test suites. Unit tests are fast and run on every commit; integration tests run against real MCP servers in CI and catch protocol-layer issues that TestModel cannot simulate.

Monitoring MCP servers in Pydantic AI applications

Pydantic AI's type strictness means two failure classes look very different at the application layer. Schema mismatches (wrong argument types, missing fields) produce clear ValidationError messages with field paths and expected types. Server downtime produces httpx.ConnectError, asyncio.TimeoutError, or SSE disconnections — these surface as opaque connection errors that give no diagnostic information about whether the MCP server is down, overloaded, or experiencing a network partition.

The monitoring recommendation follows directly from this asymmetry: Pydantic AI handles schema errors well without external help; it handles infrastructure failures poorly. AliveMCP monitors the infrastructure layer — probing the MCP server's initialize endpoint every minute and alerting on the first connection failure. When your Pydantic AI application gets a timeout error, AliveMCP's status tells you immediately whether it's an infrastructure failure (server is down — alert already fired) or an application-level issue (server is up — look at your code). This separation reduces mean time to diagnosis from "check everything" to "check the right layer."
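
The layer separation can be sketched as a small triage helper. This is an assumption-laden illustration: SchemaError stands in for a validation failure such as pydantic.ValidationError, server_up is whatever your monitor reports, and real code would also need to treat httpx transport errors as infrastructure failures.

```python
import asyncio


class SchemaError(Exception):
    """Hypothetical stand-in for a validation failure such as pydantic.ValidationError."""


def classify_failure(exc: BaseException) -> str:
    """Sort an exception into the schema layer or the infrastructure layer."""
    if isinstance(exc, SchemaError):
        return "schema"
    # ConnectionError and the builtin TimeoutError are both subclasses of OSError
    if isinstance(exc, (OSError, asyncio.TimeoutError)):
        return "infrastructure"
    return "unknown"


def triage(exc: BaseException, server_up: bool) -> str:
    """Combine the failure class with external monitor status to pick a next step."""
    kind = classify_failure(exc)
    if kind == "schema":
        return "fix the tool arguments or schema"
    if kind == "infrastructure" and not server_up:
        return "server is down: wait for recovery or fail over"
    if kind == "infrastructure":
        return "server is up: check client network, auth, and timeouts"
    return "inspect application code"
```

The second argument is what external monitoring contributes: the same timeout leads to a different next step depending on whether the server is known to be up.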

Frequently asked questions

Does Pydantic AI validate MCP tool output or just input arguments?

Pydantic AI validates tool input arguments (the arguments field in tools/call) using the MCP tool's inputSchema. MCP tool output (the content array in the tool response) is passed through as text; Pydantic AI does not validate it against a schema because that content is free-form. If you need typed tool output, define a Pydantic model, have your MCP server return the result as a JSON string in the text content, and parse it in your application code by calling model_validate_json on the model class, for example ResearchResult.model_validate_json(content[0]["text"]).

Can I mix native Pydantic AI tools and MCP tools in the same agent?

Yes. The Agent constructor accepts both tools=[list_of_python_functions] and mcp_servers=[list_of_mcp_server_configs]. The LLM sees all tools together and selects among them. Prefer native tools for logic that needs access to the Python process context (database connections, in-memory state) and MCP tools for logic that should be deployable and versioned independently (external API calls, compute-heavy operations).

How do I handle MCP server restarts during a long Pydantic AI agent run?

For SSE transport (MCPServerSSE), a server restart drops the SSE connection and Pydantic AI raises a connection error on the next tool call attempt. To make restarts graceful, configure the MCP server for zero-downtime deployment (the new server process accepts connections before the old one terminates). For stdio transport (MCPServerStdio), a crash kills the subprocess and the next tool call fails with a pipe error. Recovering means re-entering the MCP servers context so that a fresh subprocess is spawned, then retrying the run; wrap that in your own retry logic.
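
A generic retry wrapper is one way to absorb a restart. The sketch below is hypothetical application code, not a Pydantic AI feature: run_with_reconnect retries any async operation on connection-style errors with exponential backoff, and it assumes the operation opens a fresh connection on every attempt.

```python
import asyncio
from typing import Awaitable, Callable, TypeVar

T = TypeVar("T")


async def run_with_reconnect(
    operation: Callable[[], Awaitable[T]],
    attempts: int = 3,
    base_delay: float = 0.5,
) -> T:
    """Retry `operation` on connection-style errors with exponential backoff."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return await operation()
        except (ConnectionError, TimeoutError, asyncio.TimeoutError):
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            await asyncio.sleep(base_delay * 2**attempt)
    raise AssertionError("unreachable")
```

In an agent service, the operation would be a small coroutine that enters the MCP servers context and then calls agent.run(query), so each attempt starts from a new connection or subprocess. BrokenPipeError is a subclass of ConnectionError, so a dead stdio pipe is covered by the same except clause.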

What is the performance overhead of Pydantic AI's schema validation for MCP tool calls?

Pydantic v2's validation is implemented in Rust and extremely fast — validating a typical MCP tool call argument schema takes microseconds. The overhead is negligible compared to the network round-trip to the MCP server (typically 10–300 ms for HTTP) or the LLM inference time (typically 1–30 seconds). The retries from validation failures are the only significant overhead — each retry is a full LLM call. Keeping MCP tool schemas flat and simple minimizes validation failures and retry cost.
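
A back-of-the-envelope estimate shows why retries dominate. The function below is a rough model under stated assumptions: each failed validation costs one extra LLM call, the tool runs once after the arguments validate, and the 50 microsecond validation cost is an assumed placeholder, not a measurement.

```python
def tool_call_latency_s(
    llm_s: float,
    tool_ms: float,
    failed_validations: int,
    validation_us: float = 50.0,
) -> float:
    """Estimate end-to-end latency in seconds for one tool call."""
    llm_calls = 1 + failed_validations  # every failed validation triggers another LLM call
    validation_s = llm_calls * validation_us / 1_000_000
    return llm_calls * llm_s + tool_ms / 1000 + validation_s


# Example figures: 5 s per LLM call, 200 ms tool round trip
clean = tool_call_latency_s(llm_s=5.0, tool_ms=200.0, failed_validations=0)
retried = tool_call_latency_s(llm_s=5.0, tool_ms=200.0, failed_validations=2)
```

With these figures the clean call takes about 5.2 seconds and the twice-retried call about 15.2 seconds. Validation contributes well under a millisecond in both cases.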

Can I use Pydantic AI with an MCP server built in TypeScript?

Yes. MCP is a protocol, not a language pairing. Your Pydantic AI Python agent connects to any compliant MCP server regardless of implementation language. TypeScript/Node.js MCP servers using the official @modelcontextprotocol/sdk package are fully compatible with Pydantic AI's MCPServerSSE client. This is one of MCP's key advantages over framework-specific tool systems — you can build MCP servers in the language best suited for the task.