Transports guide · 2026-06-06 · Production MCP servers

MCP Server Transports Guide: Choosing Between stdio, SSE, and Streamable HTTP

MCP server transport selection is unusual because the three options do not overlap in capability — each has hard constraints that rule it out for entire categories of deployment. stdio cannot serve a public API. SSE cannot be deployed stateless to a serverless runtime. Streamable HTTP requires SDK 1.1.0+ and clients that understand the 2025-03-26 spec revision. Below all three runs JSON-RPC 2.0 — the same wire protocol in every case, so the McpServer logic you write works identically regardless of transport. The decision is mechanical once you know your deployment context: one key question — who connects and from where? — resolves the transport choice for most servers. This guide covers how each transport works at the wire level, its hard limits, when it is the right choice, and how to migrate when your requirements outgrow what you started with.

TL;DR

One question decides the transport. If the only client is the developer who wrote the server and it runs on their machine: stdio. If the server must be reachable by multiple clients over a network — now or at any future point — use Streamable HTTP. Use SSE only if you need to support legacy clients that do not yet implement the Streamable HTTP spec.
stdio is local-only, one host at a time, not externally monitorable. Any console.log() to stdout corrupts the message stream and breaks every tool call. Redirect all logging to process.stderr or a file. stdio servers cannot be registered in public MCP directories — they have no URL.
SSE requires two coordinated endpoints and session affinity at the load balancer. GET /sse opens a long-lived SSE connection; POST /messages?sessionId=… receives client requests. The session ID is passed in the first SSE event. Keep-alive comments every 15–30 seconds prevent proxy idle-timeout disconnections. SSE is incompatible with serverless.
Streamable HTTP uses a single POST endpoint and supports stateless mode for serverless. POST /mcp handles all client-to-server traffic. The response is either inline JSON (simple request/response) or an SSE stream in the response body (when the tool emits progress notifications) — selected automatically, no configuration. Stateless mode (sessionIdGenerator: undefined) makes each POST self-contained and works on Lambda, Cloudflare Workers, and Vercel.
The JSON-RPC protocol runs identically over all three transports. All MCP communication is JSON-RPC 2.0: requests have an id field and expect a matching response; notifications have no id and expect no response. Every session starts with a three-message initialize handshake. The two-tier error model — isError: true in the result (LLM-recoverable) versus a JSON-RPC error field (protocol-level, LLM usually cannot recover) — applies regardless of which transport carries the message.

The protocol underneath: JSON-RPC 2.0

All three transports carry the same wire format: JSON-RPC 2.0. Understanding the protocol at this level matters less for routine development — the SDK abstracts it — and more for debugging: when something goes wrong, reading raw JSON-RPC messages in the MCP Inspector's protocol log or in a transport-level interceptor reveals the failure faster than any higher-level diagnostic.

There are three message types: requests, responses, and notifications. A request has an id, a method, and optional params; the receiver must send a matching response with the same id, containing either a result or an error field, never both. A notification has no id, and neither side sends a response to it. Every MCP session begins with a fixed three-message handshake before any tool calls: the client sends an initialize request, the server responds with its capabilities and protocol version, and the client sends a notifications/initialized notification to confirm it is ready. Tools are then discovered via tools/list and called via tools/call.
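
As a rough illustration of the message types and the handshake order, the sketch below models them with hand-written TypeScript types. These are not the SDK's own types, and the client and server names are placeholders chosen for the example.

```typescript
// Minimal JSON-RPC 2.0 message shapes as used by MCP (a sketch, not the SDK's types).
type JsonRpcId = string | number;

interface JsonRpcRequest { jsonrpc: '2.0'; id: JsonRpcId; method: string; params?: unknown }
interface JsonRpcNotification { jsonrpc: '2.0'; method: string; params?: unknown }
interface JsonRpcResponse {
  jsonrpc: '2.0';
  id: JsonRpcId;
  result?: unknown;
  error?: { code: number; message: string };
}
type JsonRpcMessage = JsonRpcRequest | JsonRpcNotification | JsonRpcResponse;

// Classify a message by the presence of `method` and `id`.
function classifyMessage(msg: JsonRpcMessage): 'request' | 'notification' | 'response' {
  if ('method' in msg) return 'id' in msg ? 'request' : 'notification';
  return 'response';
}

// The three-message handshake, in order. Names and versions are placeholders.
const handshakeMessages: JsonRpcMessage[] = [
  {
    jsonrpc: '2.0', id: 1, method: 'initialize',
    params: {
      protocolVersion: '2025-03-26',
      capabilities: {},
      clientInfo: { name: 'example-client', version: '0.0.1' },
    },
  },
  {
    jsonrpc: '2.0', id: 1,
    result: {
      protocolVersion: '2025-03-26',
      capabilities: { tools: {} },
      serverInfo: { name: 'example-server', version: '0.0.1' },
    },
  },
  { jsonrpc: '2.0', method: 'notifications/initialized' },
];

// One label per handshake message, in wire order.
const handshakeKinds = handshakeMessages.map(classifyMessage);
```

Reading a protocol log with this classification in mind makes it easier to spot a handshake that stalls after the first or second message.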

The error model has two tiers that are easy to conflate. A tool that runs successfully but produces a failure result — a file not found, a query returning zero rows, a rate limit hit — should return { content: [...], isError: true } inside the result object. The LLM receives the error message as readable content and can retry with adjusted arguments. A tool that throws an unhandled exception causes the SDK to emit a JSON-RPC error response with code -32603; the LLM typically cannot recover from this. The rule is: use isError: true for application failures; let JSON-RPC errors surface only for genuine protocol violations (invalid method, malformed request envelope, unhandled panic). The transport layer is irrelevant to this distinction — the same rule applies on stdio, SSE, and Streamable HTTP.
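
The two tiers look quite different on the wire. The sketch below builds one of each by hand; lookupUser is a hypothetical tool invented for this example, and the types are simplified stand-ins rather than the SDK's definitions.

```typescript
// Tier 1: the tool ran, but the outcome is a failure the LLM can act on.
interface ToolResult {
  content: { type: 'text'; text: string }[];
  isError?: boolean;
}

function toolFailure(message: string): ToolResult {
  return { content: [{ type: 'text', text: message }], isError: true };
}

// Tier 2: a protocol-level failure, carried in the JSON-RPC `error` field.
interface RpcErrorResponse {
  jsonrpc: '2.0';
  id: number | string;
  error: { code: number; message: string };
}

function internalError(id: number | string, message: string): RpcErrorResponse {
  return { jsonrpc: '2.0', id, error: { code: -32603, message } };
}

// Hypothetical lookup tool: a missing row is an application failure, not a crash.
function lookupUser(users: Map<string, string>, userId: string): ToolResult {
  const found = users.get(userId);
  if (found === undefined) {
    return toolFailure(`No user with id "${userId}". Try a different id.`);
  }
  return { content: [{ type: 'text', text: found }] };
}

const demoUsers = new Map([['u1', 'Ada']]);
const missingUser = lookupUser(demoUsers, 'u2');        // tier 1: readable, retryable
const crashResponse = internalError(7, 'Internal error'); // tier 2: protocol-level
```

The tier 1 result still travels inside a normal JSON-RPC result field, which is why the model sees it as content it can react to.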

stdio transport: local process pipes

The stdio transport is the simplest MCP deployment model: the host application spawns the server as a child process and communicates via the process's stdin and stdout pipes using newline-delimited JSON-RPC messages. Claude Desktop, Cursor, Windsurf, and other MCP host applications use this model for locally installed servers configured in claude_desktop_config.json (or the host's equivalent). There is no network, no authentication surface, and no connection management — the IPC channel is the process relationship itself.
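
To make the newline-delimited framing concrete, here is a minimal decoder sketch. It assumes one JSON message per line and is an illustration only; the SDK ships its own read buffer, which this does not reproduce.

```typescript
// Accumulates raw chunks and returns one parsed message per complete line.
class LineDelimitedDecoder {
  private buffer = '';

  push(chunk: string): unknown[] {
    this.buffer += chunk;
    const messages: unknown[] = [];
    let newlineIndex = this.buffer.indexOf('\n');
    while (newlineIndex !== -1) {
      const line = this.buffer.slice(0, newlineIndex).trim();
      this.buffer = this.buffer.slice(newlineIndex + 1);
      // JSON.parse throws on a non-JSON line, which is how a stray banner breaks the stream.
      if (line.length > 0) messages.push(JSON.parse(line));
      newlineIndex = this.buffer.indexOf('\n');
    }
    return messages;
  }
}

const decoder = new LineDelimitedDecoder();
// A pipe may split a message anywhere; nothing is emitted until the newline arrives.
const firstBatch = decoder.push('{"jsonrpc":"2.0","id":1,"method":"tools/li');
const secondBatch = decoder.push('st"}\n{"jsonrpc":"2.0","method":"notifications/initialized"}\n');
```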

The most common failure with stdio servers is stdout contamination. Any output written to stdout — a startup banner, a console.log() debug statement, an unhandled exception printing a stack trace, an npm install warning — is treated by the host as a JSON-RPC message. The host tries to parse it and fails, silently breaking every subsequent tool call in the session. The correct pattern is absolute: redirect all logging to process.stderr or a log file. Use console.error() instead of console.log(), or configure your logger (pino, winston) to write to process.stderr as its destination. Before publishing any stdio server, verify that node server.js 2>/dev/null | cat produces only valid JSON-RPC output.
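
One way to automate that check is to capture the server's stdout and flag any line that is not a JSON-RPC 2.0 message. The helper below is a sketch under that assumption; the function name and the sample output are invented for illustration.

```typescript
// Returns the lines of captured stdout that are NOT valid JSON-RPC 2.0 messages.
function findStdoutContamination(capturedStdout: string): string[] {
  const offenders: string[] = [];
  for (const line of capturedStdout.split('\n')) {
    if (line.trim() === '') continue;
    try {
      const parsed: unknown = JSON.parse(line);
      const looksLikeJsonRpc =
        typeof parsed === 'object' && parsed !== null &&
        (parsed as { jsonrpc?: unknown }).jsonrpc === '2.0';
      if (!looksLikeJsonRpc) offenders.push(line);
    } catch {
      offenders.push(line); // not JSON at all, e.g. a startup banner
    }
  }
  return offenders;
}

const capturedSample = [
  'Server listening...',
  '{"jsonrpc":"2.0","id":1,"result":{}}',
].join('\n');
const contaminatedLines = findStdoutContamination(capturedSample);
```

An empty result means the captured output is safe to hand to a host; any entry is a line to move to stderr.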

Claude Desktop reads server configuration from claude_desktop_config.json. The env object in the server entry lists environment variables that are injected at spawn time; the host process's own environment is not inherited. This means API keys, database paths, and feature flags must be declared explicitly. tsx server.ts works as the command for uncompiled TypeScript without a build step. The config file location differs by operating system: ~/Library/Application Support/Claude/claude_desktop_config.json on macOS, %APPDATA%\Claude\claude_desktop_config.json on Windows, ~/.config/claude/claude_desktop_config.json on Linux.
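
The shape of a server entry can be sketched by generating the JSON from TypeScript. The server name, script path, and variable names below are placeholders, not values from any real project.

```typescript
interface McpServerEntry {
  command: string;
  args: string[];
  env: Record<string, string>;
}

// Builds the JSON text for claude_desktop_config.json with a single server entry.
function buildDesktopConfig(serverName: string, entry: McpServerEntry): string {
  return JSON.stringify({ mcpServers: { [serverName]: entry } }, null, 2);
}

const desktopConfigJson = buildDesktopConfig('my-server', {
  command: 'tsx',
  args: ['/absolute/path/to/server.ts'],
  // Everything the server needs must be listed here explicitly.
  env: { MY_API_KEY: 'replace-me', DB_PATH: '/absolute/path/to/data.db' },
});
```

Absolute paths are used in the sketch because the host, not your shell, decides the working directory at spawn time.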

stdio's hard limits are non-negotiable: it is local-only (no network endpoint), supports exactly one host at a time (two processes cannot both write to stdin reliably), has no authentication mechanism (the OS process model is the only access control), and produces no monitorable network endpoint. There is no URL to probe, no HTTP server to health-check, and no way for an external monitor to verify the server is functioning. For testing, replace the stdin/stdout pipes with InMemoryTransport.createLinkedPair() — this creates a linked in-process transport pair that runs the full MCP protocol at microsecond latency with no actual pipes, making unit tests fast and deterministic.

stdio is the right choice when: the server is a personal productivity tool used only by the developer who wrote it; the server needs local filesystem access and should run as the user's own process with their permissions; the server is distributed via npm and installed locally with npx your-server; or the server processes sensitive data that should never leave the local machine. Any requirement that falls outside these bounds — shared team access, public API, uptime monitoring, multi-user sessions — means the server has outgrown stdio.

SSE transport: HTTP + Server-Sent Events

The SSE transport was the first HTTP-based MCP transport and remains appropriate when browser clients or legacy MCP clients that do not yet support Streamable HTTP need to be served. Its architecture uses two coordinated HTTP endpoints: a GET /sse endpoint that the client opens and keeps alive as a long-lived Server-Sent Events connection; and a POST /messages endpoint that the client uses to send requests to the server. The server pushes JSON-RPC responses and notifications as SSE events on the GET connection.

The pairing mechanism between the two endpoints is the first SSE event the server emits — an endpoint event whose data is the POST URL including the session ID as a query parameter. The client reads this event, extracts the POST URL, and uses it for all subsequent requests in the session. This means the session ID is established by the server and communicated to the client over the SSE connection, not negotiated upfront. One SSEServerTransport instance is created per client connection and stored in a Map keyed by session ID; the POST handler looks up the transport by the session ID in the query string and calls transport.handlePostMessage(req, res). The POST response is always HTTP 202 Accepted — the actual tool result arrives as a later SSE event on the GET connection, not synchronously in the POST response body.
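
From the client's side, the pairing step amounts to parsing the first SSE frame and extracting the session ID from the announced URL. The sketch below shows that with a simplified frame parser; it trims field values rather than following the full SSE parsing rules, and the sample session ID is made up.

```typescript
interface SseEvent { eventName: string; data: string }

// Parse one SSE frame (the text up to a blank line) into its event name and data.
function parseSseFrame(frame: string): SseEvent {
  let eventName = 'message'; // SSE default when no "event:" field is present
  const dataLines: string[] = [];
  for (const line of frame.split('\n')) {
    if (line.startsWith('event:')) eventName = line.slice(6).trim();
    else if (line.startsWith('data:')) dataLines.push(line.slice(5).trim());
  }
  return { eventName, data: dataLines.join('\n') };
}

// Pull the sessionId query parameter out of the POST URL announced by the server.
function sessionIdFromEndpoint(postUrl: string): string | undefined {
  const captured = /[?&]sessionId=([^&]+)/.exec(postUrl)?.[1];
  return captured === undefined ? undefined : decodeURIComponent(captured);
}

const firstFrame = 'event: endpoint\ndata: /messages?sessionId=abc-123';
const endpointEvent = parseSseFrame(firstFrame);
const pairedSessionId = sessionIdFromEndpoint(endpointEvent.data);
```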

Three operational requirements affect production SSE deployments. First, CORS: browser clients require CORS headers on both the GET /sse and POST /messages endpoints. Apply the cors() middleware with a specific origin rather than a wildcard, because origin: '*' prevents the browser from sending credentials (cookies, Authorization headers), which breaks any server with per-user authentication. Second, keep-alive: proxies, load balancers, and CDNs apply an idle-connection timeout that can terminate SSE connections after as little as 30–60 seconds of silence. Send an SSE comment (: keep-alive\n\n) every 15–30 seconds to reset the idle timer. Third, session affinity: GET /sse and POST /messages for the same session must reach the same backend instance, because the transport object is in that instance's memory. Configure nginx with ip_hash, AWS ALB with sticky sessions, or Caddy with a consistent-hash load balancer policy.
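
The keep-alive rule is simple enough to express as two small checks. This is a sketch only: the 20-second default is an assumption picked from the 15–30 second range above, and a real server would call keepAliveDue from a timer and write the frame to the open response.

```typescript
// An SSE comment frame: a line starting with ":" followed by a blank line.
const KEEP_ALIVE_FRAME = ': keep-alive\n\n';

// Decide whether a keep-alive is due, given when the stream was last written to.
function keepAliveDue(lastWriteMs: number, nowMs: number, intervalMs: number = 20000): boolean {
  return nowMs - lastWriteMs >= intervalMs;
}

// True when the interval leaves headroom under the proxy's idle timeout.
function intervalIsSafe(intervalMs: number, proxyIdleTimeoutMs: number): boolean {
  return intervalMs < proxyIdleTimeoutMs;
}
```

Clients ignore comment frames, so the extra traffic never surfaces as an event.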

SSE is incompatible with serverless because the GET /sse connection is a persistent HTTP response that must stay open for the duration of the session. Lambda functions and Cloudflare Workers cap how long a single invocation can run, and those caps are typically far shorter than a session lifetime. SSE is also a legacy transport in the sense that the MCP specification's 2025-03-26 revision introduced Streamable HTTP as the modern replacement. New clients prefer Streamable HTTP; however, not all deployed clients have updated yet, and browser extensions specifically use the native EventSource API, which is a natural fit for SSE. The recommended transition strategy is to mount both transports simultaneously (SSE handlers on /sse and /messages, Streamable HTTP on /mcp) and announce a deprecation window of four to eight weeks before removing the SSE handlers.

Streamable HTTP transport: the modern single-endpoint model

The Streamable HTTP transport, introduced in the MCP 2025-03-26 specification revision, solves the three main operational problems with SSE: the dual-endpoint architecture (replaced with a single POST /mcp endpoint), the serverless incompatibility (resolved by stateless mode), and the inability to serve simple request/response exchanges without a persistent connection (resolved by inline JSON responses). It requires SDK version 1.1.0+ from @modelcontextprotocol/sdk.

All client-to-server communication goes through a single POST /mcp endpoint. The client sends requests in the POST body; the server responds with either an inline JSON response (a plain application/json body containing the JSON-RPC response object) or an SSE stream in the response body (a text/event-stream response where events carry the result and any notifications). The response mode is selected automatically: the SDK emits inline JSON when no notifications are sent before the result, and switches to SSE mode when sendNotification() is called or progress updates are emitted before the result is returned. No configuration is needed; the client declares support for both by sending Accept: application/json, text/event-stream.
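
The selection rule described above can be written down as a pair of pure functions. This is a sketch of the rule as this guide states it, not of the SDK's internals; the function names are invented here, and actual behavior may vary with SDK version and options.

```typescript
type ResponseMode = 'json' | 'sse';

// True when the Accept header lists both media types the transport may answer with.
function acceptsBothModes(acceptHeader: string): boolean {
  const mediaTypes = acceptHeader
    .split(',')
    .map((part) => (part.split(';')[0] ?? '').trim().toLowerCase());
  return mediaTypes.includes('application/json') && mediaTypes.includes('text/event-stream');
}

// Stream only if something (a notification or progress update) precedes the result.
function selectResponseMode(notificationsBeforeResult: number): ResponseMode {
  return notificationsBeforeResult > 0 ? 'sse' : 'json';
}

function contentTypeFor(mode: ResponseMode): string {
  return mode === 'sse' ? 'text/event-stream' : 'application/json';
}
```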

Session management uses an HTTP header. On the first POST (the initialize request), the server sends an Mcp-Session-Id response header; the client includes this header in all subsequent requests for the same session. The Express handler checks for the Mcp-Session-Id request header: if absent, it creates a new StreamableHTTPServerTransport instance and registers it in a Map from the onsessioninitialized callback, which fires once the session ID has been generated; if present, it routes to the existing transport. Session cleanup uses a setInterval that evicts sessions where lastSeen is older than 30 minutes, preventing the Map from growing without bound for abandoned sessions.
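
The Map-plus-eviction bookkeeping can be sketched independently of the SDK. SessionStore below is a hypothetical helper written for this example; the transport type is left generic, and the clock is passed in explicitly so the logic can be exercised without timers.

```typescript
interface SessionRecord<T> { transport: T; lastSeen: number }

class SessionStore<T> {
  private sessions = new Map<string, SessionRecord<T>>();
  private maxIdleMs: number;

  constructor(maxIdleMs: number = 30 * 60 * 1000) {
    this.maxIdleMs = maxIdleMs;
  }

  add(sessionId: string, transport: T, nowMs: number): void {
    this.sessions.set(sessionId, { transport, lastSeen: nowMs });
  }

  // Look up a session and refresh its lastSeen timestamp.
  touch(sessionId: string, nowMs: number): T | undefined {
    const record = this.sessions.get(sessionId);
    if (record === undefined) return undefined;
    record.lastSeen = nowMs;
    return record.transport;
  }

  // Remove sessions idle for longer than maxIdleMs; returns the evicted ids.
  evictIdle(nowMs: number): string[] {
    const evicted: string[] = [];
    for (const [sessionId, record] of this.sessions) {
      if (nowMs - record.lastSeen > this.maxIdleMs) {
        this.sessions.delete(sessionId);
        evicted.push(sessionId);
      }
    }
    return evicted;
  }

  get size(): number { return this.sessions.size; }
}

const MINUTE_MS = 60 * 1000;
const sessionStore = new SessionStore<string>();
sessionStore.add('a', 'transport-a', 0);
sessionStore.add('b', 'transport-b', 0);
sessionStore.touch('b', 20 * MINUTE_MS);                       // session b stays active
const evictedIds = sessionStore.evictIdle(31 * MINUTE_MS);     // only a has been idle > 30 min
```

In a server, evictIdle(Date.now()) would run from the cleanup interval, and each evicted transport would also be closed.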

Stateless mode sets sessionIdGenerator: undefined in the StreamableHTTPServerTransport constructor. Each POST becomes a self-contained request-response cycle with no session state carried between calls. This is compatible with Lambda, Cloudflare Workers, and Vercel because there is no persistent connection and no in-memory session to route. The tradeoff is that tools cannot accumulate per-session state — each call arrives without context from previous calls in the same session. Stateless mode is correct for tools that are purely functional (search, format, calculate, fetch-and-return) and for any serverless deployment. Stateful mode is needed for tools that build context across calls or stream progress updates on long-running operations.

Sticky session requirements for stateful Streamable HTTP are the same as for SSE: POST requests for the same session must reach the same backend instance. Configure load balancer routing on the Mcp-Session-Id header value using nginx's hash $http_mcp_session_id consistent, AWS ALB's stickiness cookie, Caddy's lb_policy header Mcp-Session-Id, or HAProxy's balance hdr(Mcp-Session-Id). Stateless mode has no affinity requirement and can use any load balancing policy. See the deployment guide for the complete Caddy and PM2 configuration for Streamable HTTP servers.
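
The idea behind header-based routing is that a deterministic hash of the session ID always selects the same backend. The sketch below uses FNV-1a with a plain modulo, which is simpler than the consistent hashing a real load balancer applies: with modulo, changing the backend list remaps most sessions. The backend names are placeholders.

```typescript
// FNV-1a 32-bit hash: small and deterministic, good enough for a routing illustration.
function fnv1a(text: string): number {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0; // keep the value an unsigned 32-bit integer
  }
  return hash;
}

// Map a session ID to one backend. Same ID and same backend list give the same answer.
function pickBackend(sessionId: string, backends: string[]): string {
  if (backends.length === 0) throw new Error('no backends configured');
  const chosen = backends[fnv1a(sessionId) % backends.length];
  if (chosen === undefined) throw new Error('unreachable: index out of range');
  return chosen;
}

const backendPool = ['app-1:3000', 'app-2:3000', 'app-3:3000'];
const firstRoute = pickBackend('session-42', backendPool);
const secondRoute = pickBackend('session-42', backendPool);
```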

The transport decision matrix

The transport selection guide covers the full decision logic. The decision for most servers reduces to one of ten use-case patterns:

Use case	Transport	Reason
Personal productivity tool for your own use	stdio	No ops overhead, no auth surface, runs as your user process
Local filesystem access (read files, run scripts)	stdio	Inherits OS permissions; keeping it local avoids network attack surface
npm-distributed tool, installed with `npx`	stdio	Host spawns `npx your-server`; no server to deploy or maintain
Shared team API, 2–20 developers	Streamable HTTP (stateful)	Multi-client, requires auth, needs a deployed URL, benefits from uptime monitoring
Public SaaS MCP API, multi-tenant	Streamable HTTP (stateful)	Registerable in public directories; scales horizontally with sticky routing
Serverless deployment (Lambda, Cloudflare Workers, Vercel)	Streamable HTTP (stateless mode)	No persistent connection; stateless mode makes each POST self-contained
Browser extension or web app direct connection	SSE	Native `EventSource` API; browsers lack a Streamable HTTP client in most frameworks yet
Legacy MCP client compatibility required	SSE + Streamable HTTP (both mounted)	Old clients use SSE endpoints; new clients use `/mcp`; same `McpServer` instance serves both
LLM agent framework integration	Streamable HTTP	Modern agent runtimes implement the 2025-03-26 spec; stateless mode simplifies orchestration
Development-time tooling under active iteration	stdio	No port to manage; restart via `tsx --watch`; Inspector connects directly
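
The table condenses into a small decision function. The sketch below is one possible reading of it, not an exhaustive encoding of every row; the DeploymentContext fields are names invented for this example.

```typescript
type TransportChoice = 'stdio' | 'sse' | 'streamable-http-stateful' | 'streamable-http-stateless';

interface DeploymentContext {
  networkClients: boolean;    // must anyone reach the server over a network?
  legacySseClients: boolean;  // are there clients that only speak the SSE transport?
  serverless: boolean;        // Lambda, Cloudflare Workers, Vercel, and similar
  needsSessionState: boolean; // do tools build context across calls?
}

// Returns every transport to mount; two entries means "mount both during a transition".
function chooseTransports(ctx: DeploymentContext): TransportChoice[] {
  if (!ctx.networkClients) return ['stdio'];
  const httpChoice: TransportChoice =
    ctx.serverless || !ctx.needsSessionState
      ? 'streamable-http-stateless'
      : 'streamable-http-stateful';
  // SSE needs a persistent connection, so it is never paired with serverless.
  if (ctx.legacySseClients && !ctx.serverless) return ['sse', httpChoice];
  return [httpChoice];
}

const personalTool = chooseTransports({
  networkClients: false, legacySseClients: false, serverless: false, needsSessionState: false,
});
const lambdaApi = chooseTransports({
  networkClients: true, legacySseClients: false, serverless: true, needsSessionState: false,
});
const teamApiWithLegacy = chooseTransports({
  networkClients: true, legacySseClients: true, serverless: false, needsSessionState: true,
});
```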

The McpServer core is transport-agnostic — all tool registrations, prompt handlers, and resource definitions are identical across transports. The only difference is the startup entry point: which transport class you instantiate and which Express routes you add. The cleanest pattern for servers that need to support multiple deployment targets is a factory function — createServer() returns a configured McpServer instance, and the entry point file selects the transport based on an environment variable:

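import { randomUUID } from 'node:crypto';
import { StdioServerTransport } from '@modelcontextprotocol/sdk/server/stdio.js';
import { StreamableHTTPServerTransport } from '@modelcontextprotocol/sdk/server/streamableHttp.js';
// `server` is the McpServer instance returned by the createServer() factory.
const transport = process.env.MCP_TRANSPORT === 'stdio'
  ? new StdioServerTransport()
  : new StreamableHTTPServerTransport({ sessionIdGenerator: () => randomUUID() });
await server.connect(transport);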

This pattern also keeps unit tests transport-agnostic — the createServer() factory is tested with InMemoryTransport.createLinkedPair() regardless of which transport the deployed binary uses.

Migrating from SSE to Streamable HTTP

Existing SSE-based servers can migrate to Streamable HTTP incrementally without a flag-day cutover. The core migration is a transport-layer change — the McpServer instance and all tool registrations are unchanged. The five-step process:

Upgrade the SDK. Bump @modelcontextprotocol/sdk to version 1.1.0 or later. StreamableHTTPServerTransport is not available in earlier versions. Run your existing SSE tests to confirm the upgrade does not break anything before adding the new transport.
Mount Streamable HTTP alongside SSE. Add a POST /mcp handler using StreamableHTTPServerTransport to your existing Express app. The SSE handlers at GET /sse and POST /messages stay in place. The same McpServer instance can serve both transports, or you can create a second instance sharing the same tool registrations via the factory pattern.
Test with a Streamable HTTP client. Use the MCP Inspector (select Streamable HTTP in the transport dropdown, provide the /mcp URL) or an SDK client configured with StreamableHTTPClientTransport to verify the new endpoint handles the initialize handshake and all tool calls correctly.
Update the registry listing. If the server is listed in MCP.so, Glama, Smithery, or the Official Registry, update the endpoint URL to the /mcp path and confirm the transport type is set to Streamable HTTP. New clients will prefer the Streamable HTTP endpoint once it appears in the listing.
Remove SSE handlers after the transition window. Announce the deprecation timeline to any known integrators (four to eight weeks is a reasonable window). After the deadline, remove the GET /sse and POST /messages handlers. If you receive error reports from clients that missed the transition, add a temporary handler at /sse that returns HTTP 410 Gone with a message pointing to /mcp.

The migration is low-risk because the same JSON-RPC protocol runs over both transports. A tool that works correctly over SSE will work identically over Streamable HTTP — the only thing that changes is how the message bytes move between client and server.

Transport choice and external monitoring

Transport selection has a direct consequence for whether a server can be externally monitored. This matters because an MCP server that is unreachable to clients — crashed and not restarted, network-partitioned, certificate expired, initialize handler broken — is invisible to in-process health checks, PM2 process monitors, and any tool that relies on the process being alive. The server process can be running and returning HTTP 200 on a health endpoint while every MCP session fails at the initialize handshake — a failure mode that in-process monitors systematically miss.

stdio servers have no URL and cannot be externally monitored. AliveMCP, Datadog, UptimeRobot, and any external probe all require an HTTP endpoint to connect to. A stdio server that stops responding, crashes between spawns, or produces stdout contamination that breaks tool calls is invisible to all external monitors. If your server has grown to the point where uptime matters to users who are not you, migrating to an HTTP transport is a prerequisite for monitoring — not optional.

SSE servers are probeable via a two-step sequence: open GET /sse, read the endpoint event to extract the POST URL, then POST an initialize request and read the JSON-RPC response from the SSE stream. This is more complex than a simple HTTP probe, but it exercises the actual protocol path. AliveMCP probes SSE servers using this sequence, validating the protocolVersion in the initialize response and the tool list against a stored baseline to detect schema drift.

Streamable HTTP servers are the simplest to probe: POST an initialize request to /mcp and read the inline JSON response. No persistent connection, no SSE stream to parse, no session ID handshake to complete first. A single HTTP request exercises the full protocol handshake. AliveMCP probes Streamable HTTP servers via this method every 60 seconds and surfaces the result on a public status page at /status/<server-slug> — visible to the server author, to users checking before they integrate, and to agent platform teams that need supply-chain uptime visibility across dozens of third-party MCP endpoints.
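
The logic of such a probe can be sketched as a request builder plus a response check. This is a generic illustration, not AliveMCP's implementation: the function names and client identity are placeholders, it assumes the server answers initialize with an inline JSON body, and the HTTP call itself is left out so the checks stay pure.

```typescript
interface ProbeResult { healthy: boolean; reason: string }

// Body for the probe's single POST. The client name and version are placeholders.
function buildInitializeProbe(protocolVersion: string): string {
  return JSON.stringify({
    jsonrpc: '2.0',
    id: 1,
    method: 'initialize',
    params: {
      protocolVersion,
      capabilities: {},
      clientInfo: { name: 'example-probe', version: '0.0.1' },
    },
  });
}

// HTTP 200 alone is not enough: the JSON-RPC payload must be a valid initialize result.
function evaluateProbeResponse(httpStatus: number, bodyText: string, expectedVersion: string): ProbeResult {
  if (httpStatus !== 200) return { healthy: false, reason: `HTTP ${httpStatus}` };
  let parsed: unknown;
  try {
    parsed = JSON.parse(bodyText);
  } catch {
    return { healthy: false, reason: 'body is not JSON' };
  }
  if (typeof parsed !== 'object' || parsed === null) {
    return { healthy: false, reason: 'body is not a JSON object' };
  }
  const body = parsed as { result?: { protocolVersion?: unknown }; error?: unknown };
  if (body.error !== undefined) return { healthy: false, reason: 'JSON-RPC error in initialize' };
  if (body.result?.protocolVersion !== expectedVersion) {
    return { healthy: false, reason: 'unexpected protocolVersion' };
  }
  return { healthy: true, reason: 'initialize handshake ok' };
}

const goodBody = '{"jsonrpc":"2.0","id":1,"result":{"protocolVersion":"2025-03-26","capabilities":{}}}';
const healthyProbe = evaluateProbeResponse(200, goodBody, '2025-03-26');
// A plain health endpoint answering 200 does not pass the protocol-level check.
const falsePositive = evaluateProbeResponse(200, '{"status":"ok"}', '2025-03-26');
```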

The pattern that works well for teams operating production Streamable HTTP servers is a two-tier monitoring setup: an external probe that verifies the full MCP protocol handshake from outside the deployment (AliveMCP or equivalent), plus in-process performance monitoring that catches degradation before it becomes an outage. The external probe catches deployment-layer failures (process crashed, network unreachable, certificate expired, broken initialize handler); the in-process monitoring catches application-layer failures (heap growth, p99 latency creep, CPU hot paths). Neither covers what the other sees. Structured logging at the transport level — logging every initialize, tools/list, and tools/call with latency and outcome — provides the correlation data needed to triage failures that the external probe detects but the application logs need to explain.

Quick reference: transport comparison

Property	stdio	SSE	Streamable HTTP
Protocol	JSON-RPC over stdin/stdout pipes	JSON-RPC over HTTP + SSE	JSON-RPC over HTTP POST
Endpoints	None (IPC)	GET /sse + POST /messages	POST /mcp
Network access	Local only	Any network	Any network
Simultaneous clients	One	Many (with sticky routing)	Many (stateful) / Unlimited (stateless)
Authentication support	None (OS process model)	Full (HTTP middleware)	Full (HTTP middleware)
Serverless compatible	No	No	Yes (stateless mode)
Load balancer affinity	N/A	Required (session affinity)	Required (stateful) / None (stateless)
External monitoring	Not possible	Via GET /sse → POST sequence	Via POST /mcp initialize
MCP spec status	Current	Legacy (pre-2025-03-26)	Current (2025-03-26+)
SDK version required	Any	Any	1.1.0+

Where transport meets deployment

Transport selection does not exist in isolation — it interacts with every other deployment decision. Authentication is only possible on HTTP transports; stdio servers have no authentication surface. CI/CD pipelines for HTTP-transport servers need a smoke test that connects via the MCP protocol (not just checks HTTP 200), because a broken initialize handler passes all HTTP health checks while failing every user session. Reverse proxy configuration for SSE requires proxy_buffering off and proxy_read_timeout 3600s in nginx — the defaults break SSE delivery. Streamable HTTP requires Caddy or nginx configuration that handles both application/json and text/event-stream response content types on the same endpoint. Docker health checks for HTTP-transport servers should use the SDK's client to run a full initialize + tools/list sequence, not a curl /health HTTP probe, to match the production failure mode that matters.

The single most important architectural decision most MCP server authors delay too long is transport choice. Starting with stdio for speed of development is reasonable — zero configuration, immediate feedback, Inspector works out of the box. But stdio is a local development transport with a hard capability ceiling. Any server that is expected to be used by more than one person, that needs uptime guarantees, that should appear in the public MCP ecosystem, or that should be monitored for reliability will need an HTTP transport. The earlier that migration happens, the less code accumulates on top of a transport that cannot support the server's actual operating requirements.