MCP servers with OpenAI Agents SDK

The OpenAI Agents SDK (the openai-agents Python package, imported as agents) ships with native MCP support. You attach an MCP server to an Agent using MCPServerStreamableHttp, MCPServerSse, or MCPServerStdio, and once the server is connected the SDK handles the protocol lifecycle: initialize handshake, tools/list discovery, schema conversion to OpenAI function-call format, dispatching tools/call on each LLM tool request, and injecting results back into the conversation. The SDK's orchestration features (Handoffs, Guardrails, and Tracing) compose cleanly with MCP tools, so MCP-backed tools participate in the same safety, routing, and observability pipeline as native Python tools. The one gap the SDK doesn't close: it cannot detect when the MCP server itself goes down between calls. That's where external monitoring becomes essential.

TL;DR

Create an MCPServerStreamableHttp(params={"url": "..."}), connect it (async with server: or await server.connect()), and pass it to Agent(mcp_servers=[server]). For a long-running service, connect once at startup and keep the connection open instead of reconnecting on every request. By default the SDK calls tools/list at the start of every run; pass cache_tools_list=True to reuse the list, and call invalidate_tools_cache() when the MCP server adds or removes tools. Monitor MCP servers with AliveMCP: the SDK only learns that a server is down when a run tries to use it, so external probing is the only reliable way to detect server downtime before it kills agent runs.

OpenAI Agents SDK MCP architecture

The SDK treats MCP servers as a source of tools at the agent level, separate from native Python tools. When an agent runs, the SDK calls tools/list on each attached server and converts each MCP tool's inputSchema into a function tool in OpenAI's function-calling format. These are merged with any native Python tools and sent to the model in a single tools array.

Step	SDK action	MCP protocol call
Connection open	Establish transport, send client info	`initialize`
Tool discovery	Fetch available tools, convert to OpenAI format	`tools/list`
LLM tool call	Parse tool_call, dispatch to server	`tools/call`
Result injection	Inject tool output as `tool` role message	— (client-side)
Connection close	Terminate transport	— (transport close)
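
The tool discovery step in the table can be pictured as a plain dictionary mapping. The sketch below is an illustration of the idea, not the SDK's internal code: the helper name mcp_tool_to_openai_function and the sample search_web tool are assumptions made for the example. It targets the Chat Completions function-tool shape.

```python
from typing import Any


def mcp_tool_to_openai_function(tool: dict[str, Any]) -> dict[str, Any]:
    """Map one MCP tool descriptor onto a Chat Completions function-tool entry."""
    # MCP tools may take no arguments; fall back to an empty object schema.
    schema = tool.get("inputSchema") or {"type": "object", "properties": {}}
    return {
        "type": "function",
        "function": {
            "name": tool["name"],
            "description": tool.get("description", ""),
            "parameters": schema,
        },
    }


# Hypothetical tools/list result from a search server.
mcp_tools = [
    {
        "name": "search_web",
        "description": "Search the web and return the top results.",
        "inputSchema": {
            "type": "object",
            "properties": {"query": {"type": "string"}},
            "required": ["query"],
        },
    }
]

openai_tools = [mcp_tool_to_openai_function(t) for t in mcp_tools]
```

The JSON Schema in inputSchema passes through unchanged, which is why MCP tools map so directly onto function calling.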

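The SDK does not connect servers for you: a server must be connected (with async with or connect()) before the agent runs, and the agent lists the server's tools at the start of each run. If you open the connection inside every request handler, each request pays for a full MCP handshake plus a tools/list round trip. For a service handling many requests, keep the connection open and cache the tool list to eliminate that overhead.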

Basic setup with MCPServerStreamableHttp

import asyncio
from agents import Agent, Runner
from agents.mcp import MCPServerStreamableHttp

async def main():
    # Connect to a Streamable HTTP MCP server; the context manager
    # opens the connection on entry and cleans it up on exit
    async with MCPServerStreamableHttp(
        name="search",
        params={
            "url": "https://search.internal/mcp",
            "headers": {"Authorization": "Bearer sk-..."},
            "timeout": 30,
        },
    ) as search_server:
        research_agent = Agent(
            name="ResearchAgent",
            model="gpt-4o",
            instructions="You are a research assistant. Use the search and fetch tools to answer questions thoroughly.",
            mcp_servers=[search_server],
        )
        result = await Runner.run(
            research_agent,
            "What are the most common MCP server failure modes in production?"
        )
        print(result.final_output)

asyncio.run(main())

The SDK provides one class per transport: MCPServerStreamableHttp for the Streamable HTTP transport and MCPServerSse for the older HTTP+SSE transport. Pick the class that matches what your server speaks. For MCP servers using stdio transport (local processes), use MCPServerStdio instead.

Stdio MCP servers for local tools

from agents import Agent, Runner
from agents.mcp import MCPServerStdio

async def run_code_task(task: str) -> str:
    # The subprocess is spawned when the context is entered
    # and terminated when it exits
    async with MCPServerStdio(
        params={
            "command": "npx",
            "args": ["-y", "@acme/code-runner-mcp"],
            "env": {"NODE_ENV": "production"},
        },
    ) as code_server:
        code_agent = Agent(
            name="CodeAgent",
            model="gpt-4o",
            instructions="You are a code execution agent. Write and run Python and JavaScript code to solve problems.",
            mcp_servers=[code_server],
        )
        result = await Runner.run(code_agent, task)
        return result.final_output

Each MCPServerStdio instance spawns a subprocess when it connects and keeps it alive until the connection is closed. If you create a new instance for every agent run, each run gets its own subprocess; nothing is shared between instances. Use an HTTP-based MCP server in production to avoid subprocess management overhead and to share a single server instance across multiple agent runs.

Persistent MCP connections for services

In a FastAPI or ASGI service, re-opening the MCP connection on every request adds 50–300 ms of handshake latency. Connect the server once at startup, keep it open for the life of the process, and reuse it across all requests. Passing cache_tools_list=True also skips the per-run tools/list call:

from contextlib import asynccontextmanager
from fastapi import FastAPI
from agents import Agent, Runner
from agents.mcp import MCPServerStreamableHttp

search_server = MCPServerStreamableHttp(
    name="search",
    params={"url": "https://search.internal/mcp"},
    cache_tools_list=True,  # reuse the tool list instead of calling tools/list on every run
)

search_agent = Agent(
    name="SearchAgent",
    model="gpt-4o",
    instructions="Answer questions using the search tools.",
    mcp_servers=[search_server],
)

@asynccontextmanager
async def lifespan(app: FastAPI):
    # Open the MCP connection once at service startup
    async with search_server:
        yield
    # Connection closes at shutdown

app = FastAPI(lifespan=lifespan)

@app.post("/ask")
async def ask(question: str):
    result = await Runner.run(search_agent, question)
    return {"answer": result.final_output}

With cache_tools_list=True, the tool list is fetched on the first run and reused afterwards. If the MCP server adds or removes tools while the connection is live, the agent won't see those changes until you call invalidate_tools_cache() on the server or reconnect. Without caching, the SDK calls tools/list on every run and picks up changes automatically, at the cost of one extra round trip per run. Build MCP servers so that tool changes are backward-compatible (add tools, don't remove them) to avoid requiring service restarts.
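
The staleness trade-off is easier to see in isolation. Below is a minimal sketch using only the standard library; ToolListCache and FakeServer are hypothetical stand-ins written for this example, not SDK classes. A cached list survives a change on the server until it is explicitly invalidated.

```python
import asyncio
from typing import Awaitable, Callable, Optional


class ToolListCache:
    """Cache the result of an async tools/list call until invalidated."""

    def __init__(self, fetch: Callable[[], Awaitable[list]]) -> None:
        self._fetch = fetch
        self._tools: Optional[list] = None

    async def list_tools(self) -> list:
        if self._tools is None:  # only hit the server on a cold cache
            self._tools = await self._fetch()
        return self._tools

    def invalidate(self) -> None:
        self._tools = None


class FakeServer:
    """Stand-in for an MCP server that counts tools/list calls."""

    def __init__(self) -> None:
        self.tools = ["search_web"]
        self.calls = 0

    async def tools_list(self) -> list:
        self.calls += 1
        return list(self.tools)


async def demo() -> tuple:
    server = FakeServer()
    cache = ToolListCache(server.tools_list)
    first = await cache.list_tools()
    server.tools.append("fetch_url")  # the server gains a tool
    stale = await cache.list_tools()  # still the cached list
    cache.invalidate()
    fresh = await cache.list_tools()  # refetched, now includes fetch_url
    return first, stale, fresh, server.calls
```

Running asyncio.run(demo()) shows that the server is queried twice in total: once on the cold cache and once after invalidation.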

Multi-agent handoffs with MCP tools

The OpenAI Agents SDK's Handoffs feature routes the conversation to a specialist agent based on the current context. MCP tools from the delegating agent are not automatically passed to the target agent — each agent carries its own MCP server list:

from agents import Agent, Runner, handoff
from agents.mcp import MCPServerStreamableHttp

# MCP server used only by the specialist agent
analytics_server = MCPServerStreamableHttp(
    name="analytics",
    params={"url": "https://analytics.internal/mcp"},
)

# Specialist agent with its own MCP tools
data_agent = Agent(
    name="DataAgent",
    model="gpt-4o",
    instructions="Query databases and return structured data. Summarize findings clearly.",
    mcp_servers=[analytics_server],
)

# Triage agent: no MCP tools, but can hand off to data_agent
triage_agent = Agent(
    name="TriageAgent",
    model="gpt-4o-mini",
    instructions="Understand the user's question. If it requires data analysis, hand off to the DataAgent.",
    handoffs=[handoff(data_agent)],
)

async def handle_request(question: str) -> str:
    # Connected per request here for brevity; a long-running service
    # would connect once at startup instead
    async with analytics_server:
        result = await Runner.run(triage_agent, question)
    return result.final_output

When triage_agent hands off to data_agent, the SDK lists tools from data_agent's MCP servers, which must already be connected at that point. If you're using persistent connections, connect the servers for every agent in your handoff graph at startup, not just those of the entry-point agent.
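
One way to avoid missing a server is to walk the handoff graph and open everything it references in a single AsyncExitStack. The sketch below uses only the standard library. FakeServer and FakeAgent are hypothetical stand-ins for the SDK's objects, and it assumes handoffs holds agent objects directly (with the real SDK, entries created by handoff() are wrappers that you would need to unwrap first).

```python
import asyncio
from contextlib import AsyncExitStack
from dataclasses import dataclass, field


@dataclass
class FakeServer:
    """Stand-in for an MCP server that works as an async context manager."""
    name: str
    connected: bool = False

    async def __aenter__(self) -> "FakeServer":
        self.connected = True
        return self

    async def __aexit__(self, *exc_info) -> None:
        self.connected = False


@dataclass
class FakeAgent:
    name: str
    mcp_servers: list = field(default_factory=list)
    handoffs: list = field(default_factory=list)


def collect_servers(entry: FakeAgent) -> list:
    """Walk the handoff graph from the entry agent; return each server once."""
    seen_agents, servers, pending = set(), [], [entry]
    while pending:
        agent = pending.pop()
        if id(agent) in seen_agents:  # handoff graphs may contain cycles
            continue
        seen_agents.add(id(agent))
        for server in agent.mcp_servers:
            if not any(server is known for known in servers):
                servers.append(server)
        pending.extend(agent.handoffs)
    return servers


async def demo() -> list:
    analytics = FakeServer("analytics")
    data_agent = FakeAgent("DataAgent", mcp_servers=[analytics])
    triage_agent = FakeAgent("TriageAgent", handoffs=[data_agent])
    data_agent.handoffs.append(triage_agent)  # cycle back to triage
    async with AsyncExitStack() as stack:
        for server in collect_servers(triage_agent):
            await stack.enter_async_context(server)
        connected_during = analytics.connected
    return [connected_during, analytics.connected]
```

The exit stack closes every server in reverse order at shutdown, so a service lifespan hook only needs to hold one context open.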

Guardrails and MCP tool results

OpenAI Agents SDK Guardrails run as tripwires on agent input and output. Input Guardrails run alongside the agent's first model call and output Guardrails run on the final output; either can trigger an early termination if a safety condition is met. Output Guardrails receive the agent's final output rather than the raw message history, so MCP tool results are checked to the extent that they shape that output:

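from agents import Agent, GuardrailFunctionOutput, RunContextWrapper, Runner, output_guardrail
from agents.mcp import MCPServerStreamableHttp
from pydantic import BaseModel

class SafetyCheck(BaseModel):
    is_safe: bool
    reason: str

# Small, fast classifier agent that plays the role of the safety model
safety_agent = Agent(
    name="SafetyChecker",
    model="gpt-4o-mini",
    instructions="Decide whether the text is safe to return to the user. Explain briefly.",
    output_type=SafetyCheck,
)

# Guardrail that checks agent output (including tool-informed responses) for safety
@output_guardrail
async def content_safety_guardrail(
    ctx: RunContextWrapper, agent: Agent, output: str
) -> GuardrailFunctionOutput:
    # Run a fast safety check on the final output
    check = await Runner.run(safety_agent, output, context=ctx.context)
    return GuardrailFunctionOutput(
        output_info=check.final_output,
        tripwire_triggered=not check.final_output.is_safe,
    )

# Connect this server (async with or connect()) before running the agent
tools_server = MCPServerStreamableHttp(params={"url": "https://tools.internal/mcp"})

monitored_agent = Agent(
    name="MonitoredAgent",
    model="gpt-4o",
    instructions="Answer questions using available tools.",
    mcp_servers=[tools_server],
    output_guardrails=[content_safety_guardrail],
)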

MCP tool call arguments (what the LLM sends to the tool) are visible to tracing but not to Guardrails by default. If your MCP tools can receive sensitive instructions through LLM-generated arguments, add input Guardrails that inspect the agent's input before tool calls execute. The SDK does not currently provide a per-tool-call hook for Guardrails.

Tracing MCP tool calls

The SDK's built-in tracing captures the full agent run timeline, including which MCP tools were called and with what arguments. Traces are sent to the OpenAI dashboard by default, and can also be routed to external systems via a custom trace processor:

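from datetime import datetime

from agents import add_trace_processor
from agents.tracing import TracingProcessor

class MCPAuditProcessor(TracingProcessor):
    """Log every function-tool span (MCP tools included) for audit and debugging."""

    def on_span_end(self, span):
        data = span.span_data
        if data.type != "function" or not (span.started_at and span.ended_at):
            return
        # started_at / ended_at are ISO 8601 strings, so parse before subtracting
        elapsed = datetime.fromisoformat(span.ended_at) - datetime.fromisoformat(span.started_at)
        print(f"MCP tool: {data.name}")
        print(f"  Input:  {data.input}")
        print(f"  Output: {str(data.output)[:200]}")
        print(f"  Time:   {elapsed.total_seconds():.3f}s")

    # The remaining hooks are required by the interface but unused here
    def on_trace_start(self, trace): pass
    def on_trace_end(self, trace): pass
    def on_span_start(self, span): pass
    def shutdown(self): pass
    def force_flush(self): pass

add_trace_processor(MCPAuditProcessor())

# All subsequent Runner.run() calls will emit MCP tool call spans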

MCP tool call duration in traces tells you the response-time profile of each server. If a particular MCP tool consistently takes 2+ seconds when others complete in 200 ms, that server is a latency outlier — worth investigating with distributed tracing inside the MCP server itself. AliveMCP's per-server response-time history complements SDK-level traces by providing an external measurement of server health over time.
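
Spotting a latency outlier from collected durations takes only a few lines. The sketch below assumes you have already exported (tool name, seconds) pairs from your trace processor. The function name, the sample data, and the 5x threshold are illustrative choices, not SDK or AliveMCP features.

```python
from collections import defaultdict
from statistics import median


def find_latency_outliers(spans, factor=5.0):
    """Return tools whose median duration exceeds factor x the median across tools."""
    by_tool = defaultdict(list)
    for tool_name, seconds in spans:
        by_tool[tool_name].append(seconds)
    medians = {tool: median(values) for tool, values in by_tool.items()}
    baseline = median(medians.values())  # typical per-tool latency
    return {tool: m for tool, m in medians.items() if m > factor * baseline}


# Hypothetical durations in seconds, as (tool name, duration) pairs.
samples = [
    ("search_web", 0.21), ("search_web", 0.19), ("search_web", 0.20),
    ("fetch_url", 0.18), ("fetch_url", 0.22), ("fetch_url", 0.20),
    ("run_report", 2.4), ("run_report", 2.1), ("run_report", 2.6),
]

outliers = find_latency_outliers(samples)
```

Medians are used instead of means so that a single slow call does not flag an otherwise healthy tool.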

Monitoring MCP servers in OpenAI agent pipelines

The SDK's error behavior when an MCP server is unavailable depends on when the failure occurs. If the server is down when you open the connection (calling connect() or entering async with), the SDK raises a connection error immediately, which is visible and catchable. If the server goes down while a persistent connection is live (between agent runs), the next tools/list or tools/call attempt fails with a transport error mid-request. Depending on how that error is handled, the run aborts, or the agent continues without tool results and may hallucinate answers or loop.

Neither failure mode is detectable in advance from inside the SDK: it only learns that the server is down when it tries to use it. AliveMCP probes the MCP server's initialize endpoint every 60 seconds and alerts on the first failure. In a persistent-connection setup, this means you know the server is down within a minute, before most agent runs would detect the failure themselves. Set AliveMCP alerts to route to the same on-call channel as your OpenAI usage alerts: when both fire together, it's the MCP server; when only the OpenAI alert fires, it's the model or the application.
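
To make the idea of an external probe concrete, here is a minimal standard-library sketch of a health check that sends an MCP initialize request over HTTP. It is a generic illustration, not a description of how AliveMCP works internally. The function names, the client name health-probe, and the protocolVersion value are assumptions; a real client would also follow up with a notifications/initialized message.

```python
import json
import urllib.error
import urllib.request


def build_initialize_request(request_id: int = 1) -> bytes:
    """Build the JSON-RPC body for an MCP initialize call."""
    payload = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "initialize",
        "params": {
            # Assumption: set this to a protocol version your server supports.
            "protocolVersion": "2025-06-18",
            "capabilities": {},
            "clientInfo": {"name": "health-probe", "version": "0.1.0"},
        },
    }
    return json.dumps(payload).encode("utf-8")


def probe(url: str, timeout: float = 5.0) -> bool:
    """Return True if the server answers initialize with an HTTP 2xx status."""
    request = urllib.request.Request(
        url,
        data=build_initialize_request(),
        headers={
            "Content-Type": "application/json",
            "Accept": "application/json, text/event-stream",
        },
        method="POST",
    )
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return 200 <= response.status < 300
    except OSError:  # covers URLError, HTTPError, refused connections, timeouts
        return False
```

A scheduler would call probe("https://search.internal/mcp") on an interval and alert on the first False. Servers that require authentication would also need the same headers the agent sends.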

Frequently asked questions

Does OpenAI Agents SDK support MCP resources and prompts, or only tools?

As of mid-2026, the SDK's MCP integration focuses on tools (tools/list and tools/call). MCP resources and prompts are not surfaced through the standard Agent API — you'd need to call those protocol methods directly via the underlying transport client. Tools are the primary integration surface because they map cleanly onto OpenAI's function-calling API. Resources and prompts can be added to the conversation as context manually before calling Runner.run().

Can I use multiple MCP servers with a single OpenAI agent?

Yes. Pass a list to mcp_servers: Agent(mcp_servers=[server_a, server_b]), where each entry is a connected server instance. The SDK merges the tool lists from all servers into a single tools array for the model. Tool names must be unique across servers: if two MCP servers define tools with the same name, the SDK raises an error rather than silently picking one. Prefix tool names in your MCP servers to prevent collisions across servers: search_web vs db_search.
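
However a clash is resolved at run time, it is cheaper to catch duplicate names before the agent starts. The sketch below is a hypothetical pre-flight check over tool names you have already listed from each server; the function name and the sample catalog are made up for the example.

```python
from collections import defaultdict


def find_tool_name_collisions(tools_by_server):
    """Map each tool name exposed by more than one server to those servers."""
    owners = defaultdict(list)
    for server_name, tool_names in tools_by_server.items():
        for tool_name in set(tool_names):  # ignore repeats within one server
            owners[tool_name].append(server_name)
    return {
        name: sorted(servers)
        for name, servers in owners.items()
        if len(servers) > 1
    }


# Hypothetical tool names gathered from two servers.
catalog = {
    "web": ["search", "fetch_url"],
    "db": ["search", "run_query"],
}

collisions = find_tool_name_collisions(catalog)
```

Here both servers expose a tool called search, so the check reports it along with the two owning servers. Renaming the tools to search_web and db_search clears the report.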

How does the SDK handle MCP tool errors — isError:true in the tool response?

When an MCP server returns a tools/call response with isError: true, the SDK injects the error content into the conversation as a tool result and lets the LLM decide what to do next. The LLM typically retries with different arguments, falls back to a different tool, or reports the failure in its final output. The SDK does not automatically retry tools/call on isError — retry logic is the LLM's responsibility within the agent loop.
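
The flow can be sketched as a small conversion from a tools/call result to a tool-role message. This is an illustration under stated assumptions rather than the SDK's implementation: the function name is made up, only text content blocks are handled, and the "Tool call failed:" prefix is a choice made for this example.

```python
def tool_result_to_message(tool_call_id: str, result: dict) -> dict:
    """Flatten an MCP tools/call result into a Chat Completions tool message."""
    text_parts = [
        block.get("text", "")
        for block in result.get("content", [])
        if block.get("type") == "text"
    ]
    text = "\n".join(text_parts)
    if result.get("isError"):
        # Mark failures so the model can tell them apart from normal results.
        text = f"Tool call failed: {text}"
    return {"role": "tool", "tool_call_id": tool_call_id, "content": text}


# Hypothetical error result returned by an MCP server.
error_result = {
    "content": [{"type": "text", "text": "query must not be empty"}],
    "isError": True,
}

message = tool_result_to_message("call_123", error_result)
```

Because the error reaches the model as ordinary tool output, the model can read the reason and decide whether to retry with different arguments.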

What's the difference between OpenAI Agents SDK MCP support and using the MCP Python SDK directly?

The OpenAI Agents SDK handles the full agent loop: LLM inference, tool dispatch, result injection, Handoffs, and Guardrails. It uses MCP as a tool source. The MCP Python SDK (mcp package) is a protocol library — it handles the MCP wire protocol but has no concept of an agent loop, LLM inference, or orchestration. Use the OpenAI Agents SDK if you're building OpenAI-model agents and want the orchestration layer managed for you. Use the MCP Python SDK directly if you're building an MCP server, or if you need to integrate MCP tools into a custom agent loop.

Can I connect an OpenAI agent to an MCP server that requires OAuth or API keys?

Yes. For HTTP-based MCP servers, pass credentials in the params dict: MCPServerStreamableHttp(params={"url": "...", "headers": {"Authorization": "Bearer ..."}}). For MCP servers using the MCP Authorization spec (OAuth 2.1 with PKCE), the SDK does not currently handle the OAuth flow automatically; you need to exchange tokens outside the SDK and pass the access token in headers. For API key auth, headers are the correct approach. Avoid putting secrets in the MCP server URL itself, as URLs are logged in traces.