MCP server Streamable HTTP transport
Streamable HTTP is the current standard for remote MCP servers, introduced in the MCP spec revision of March 2025. It replaces the dual-endpoint SSE transport with a single POST /mcp endpoint that can return either an inline JSON response or an SSE stream in the response body — whichever suits the tool's needs. This guide covers StreamableHTTPServerTransport setup, session management via the Mcp-Session-Id header, stateless mode for serverless deployments, and the differences from the older SSE transport.
TL;DR
Mount a single POST /mcp handler on your Express app. Create one StreamableHTTPServerTransport per session (or use stateless mode for serverless). The transport reads the Mcp-Session-Id request header to route messages to the right session. Simple tool calls get an inline JSON response; tools that emit progress notifications get an SSE stream in the response body. For stateless deployments (Lambda, Cloudflare Workers), set sessionIdGenerator: undefined — each POST is self-contained with no session state.
Why Streamable HTTP replaces the SSE transport
The SSE transport had three deployment friction points: (1) it required two separate HTTP endpoints with coordinated routing; (2) persistent SSE connections break serverless execution models; (3) load balancers needed sticky session configuration so the GET and POST requests for the same session always hit the same server instance. Streamable HTTP addresses all three:
| Concern | SSE transport | Streamable HTTP |
|---|---|---|
| Endpoints | GET /sse + POST /messages (two endpoints, coordinated routing) | POST /mcp (single endpoint) |
| Serverless compatibility | Incompatible — persistent SSE connection required | Stateless mode: each POST is self-contained |
| Load balancer setup | Sticky sessions required for session affinity | Stateless: no affinity needed; stateful: session ID routes correctly |
| Streaming | SSE stream on GET connection | SSE stream embedded in POST response body (when needed) |
| Simple request/response | Response arrives via separate SSE event | Inline JSON in the POST response body |
Express + StreamableHTTPServerTransport setup
The stateful version — one transport per client session, appropriate for long-running agents with session-scoped state:
import express from 'express';
import { randomUUID } from 'node:crypto';
import { McpServer } from '@modelcontextprotocol/sdk/server/mcp.js';
import { StreamableHTTPServerTransport } from '@modelcontextprotocol/sdk/server/streamableHttp.js';
import { isInitializeRequest } from '@modelcontextprotocol/sdk/types.js';

const app = express();
app.use(express.json());

// Active sessions: sessionId → transport
const sessions = new Map<string, StreamableHTTPServerTransport>();

function createMcpServer(): McpServer {
  const server = new McpServer({ name: 'my-server', version: '1.0.0' });
  // Register tools here...
  return server;
}

app.post('/mcp', async (req, res) => {
  const sessionId = req.headers['mcp-session-id'] as string | undefined;

  if (sessionId) {
    // Existing session: route to its transport
    const transport = sessions.get(sessionId);
    if (!transport) {
      res.status(404).json({ error: 'Session not found or expired' });
      return;
    }
    await transport.handleRequest(req, res, req.body);
    return;
  }

  // No session header: only an initialize request may start a new session
  if (!isInitializeRequest(req.body)) {
    res.status(400).json({ error: 'Missing Mcp-Session-Id header' });
    return;
  }

  const newSessionId = randomUUID();
  const transport = new StreamableHTTPServerTransport({
    sessionIdGenerator: () => newSessionId,
    // Runs once the initialize request has been processed
    onsessioninitialized: (id) => {
      sessions.set(id, transport);
    },
  });
  transport.onclose = () => {
    sessions.delete(newSessionId);
  };

  const server = createMcpServer();
  await server.connect(transport);
  await transport.handleRequest(req, res, req.body);
});

// Explicit session termination
app.delete('/mcp', async (req, res) => {
  const sessionId = req.headers['mcp-session-id'] as string | undefined;
  const transport = sessionId ? sessions.get(sessionId) : undefined;
  if (transport && sessionId) {
    await transport.close();
    sessions.delete(sessionId);
  }
  res.status(200).end();
});

app.listen(3000);
The transport sends the session ID back to the client in the Mcp-Session-Id response header of the initialize response. The client stores this ID and includes it in all subsequent requests.
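Seen from the client, that round trip can be sketched as below. This is only an illustration, not the SDK's client: the helper names `captureSessionId` and `withSession` are hypothetical, and the `HeaderReader` type stands in for any object with a `get` method, such as a fetch `Headers` instance.

```typescript
// Hypothetical helpers illustrating the Mcp-Session-Id round trip (not SDK APIs).

/** Anything that can look up a response header by name, e.g. fetch's Headers. */
interface HeaderReader {
  get(name: string): string | null;
}

/** Read the session ID assigned by the server on the initialize response, if any. */
function captureSessionId(responseHeaders: HeaderReader): string | undefined {
  return responseHeaders.get("Mcp-Session-Id") ?? undefined;
}

/** Build the headers for a follow-up POST /mcp in the same session. */
function withSession(sessionId: string | undefined): Record<string, string> {
  const headers: Record<string, string> = {
    "Content-Type": "application/json",
    // Streamable HTTP clients advertise both response formats.
    Accept: "application/json, text/event-stream",
  };
  // A stateless server assigns no ID, so the header is simply omitted.
  if (sessionId !== undefined) headers["Mcp-Session-Id"] = sessionId;
  return headers;
}
```

Keeping the "no session ID" case explicit lets the same client code talk to both stateful and stateless servers.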
Stateless mode for serverless deployments
Serverless functions (Lambda, Cloudflare Workers, Vercel Functions) have no persistent memory between requests. Streamable HTTP supports a stateless mode where each POST is fully self-contained — no session state, no transport map, no cleanup. Set sessionIdGenerator: undefined:
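// serverless-handler.ts: one McpServer and one transport per request, stateless
import type { IncomingMessage, ServerResponse } from 'node:http';
import { McpServer } from '@modelcontextprotocol/sdk/server/mcp.js';
import { StreamableHTTPServerTransport } from '@modelcontextprotocol/sdk/server/streamableHttp.js';

// Node-style (req, res) signature. Fetch-style runtimes (Request in, Response out)
// need an adapter in front of this. parsedBody is the already-parsed JSON body.
export async function handler(req: IncomingMessage, res: ServerResponse, parsedBody?: unknown): Promise<void> {
  const server = new McpServer({ name: 'my-server', version: '1.0.0' });
  // Register tools...
  const transport = new StreamableHTTPServerTransport({
    sessionIdGenerator: undefined, // stateless: no session header assigned
  });
  // Release per-request resources once the response is finished
  res.on('close', () => {
    void transport.close();
    void server.close();
  });
  await server.connect(transport);
  await transport.handleRequest(req, res, parsedBody);
}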
In stateless mode: the server creates a fresh McpServer instance per request; there is no session ID in the response headers; each POST must be self-contained (client sends the full request context every time). This is appropriate for stateless tools (web search, API wrappers, calculations) but not for tools that accumulate state across calls in the same session.
Stateless mode cannot deliver server-initiated messages outside of a request, because there is no session or persistent channel to push them on. Within a single request nothing changes: the response body is either an inline JSON result or an SSE stream that closes after the final response, so a long-running stateless tool can still stream progress notifications in that request's response body.
Response modes: inline JSON vs SSE stream
The server answers each POST in one of two modes, depending on whether anything has to be sent before the final result:
- Inline JSON: if the tool handler returns a result immediately with no progress notifications, the response is sent with Content-Type: application/json and the result as the response body. This is the common case for fast, synchronous tools.
- SSE stream: if the server sends any notifications (progress updates, resource change events) before the final result, the response is sent with Content-Type: text/event-stream and the events are streamed. The client reads the stream until the final result event arrives.
The client must declare that it accepts both formats via the Accept header: Accept: application/json, text/event-stream. Which mode is used is the server's choice. The TypeScript SDK exposes an enableJsonResponse transport option for this, so check the default in your SDK version rather than assuming inline JSON.
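import { z } from 'zod';

// Placeholder for your own long-running work; it reports percent complete via the callback.
declare function processDocument(url: string, onProgress: (pct: number) => Promise<void>): Promise<string>;

// Tool that triggers SSE streaming via progress notifications
server.tool(
  'process_document',
  'Process a large document with progress updates',
  { url: z.string().url() },
  async ({ url }, { sendNotification, _meta }) => {
    // Echo the progress token the client sent with the request; without one, skip progress updates
    const progressToken = _meta?.progressToken;
    const report = async (progress: number) => {
      if (progressToken === undefined) return;
      await sendNotification({
        method: 'notifications/progress',
        params: { progressToken, progress, total: 100 },
      });
    };

    await report(0);
    const result = await processDocument(url, report);
    // The notifications above travel on the SSE stream of this POST response
    return { content: [{ type: 'text', text: result }] };
  }
);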
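On the receiving side, a client has to cope with both response modes. The following is a minimal sketch of that branching, assuming the whole response body has already been read into a string (a production client would parse the SSE stream incrementally). The function name `parseMcpResponse` is hypothetical and not part of the SDK.

```typescript
// Sketch: extract JSON-RPC messages from either response mode.
type JsonRpcMessage = { jsonrpc: "2.0"; [key: string]: unknown };

function parseMcpResponse(contentType: string, body: string): JsonRpcMessage[] {
  // Ignore parameters such as "; charset=utf-8"
  const mediaType = (contentType.split(";")[0] ?? "").trim().toLowerCase();

  if (mediaType === "application/json") {
    // Inline mode: the body is a single message or an array of messages
    const parsed = JSON.parse(body);
    return Array.isArray(parsed) ? parsed : [parsed];
  }

  if (mediaType === "text/event-stream") {
    const messages: JsonRpcMessage[] = [];
    // SSE events are separated by a blank line; normalise CRLF first
    for (const event of body.replace(/\r\n/g, "\n").split("\n\n")) {
      const data = event
        .split("\n")
        .filter((line) => line.startsWith("data:"))
        .map((line) => line.slice(5).replace(/^ /, ""))
        .join("\n");
      if (data) messages.push(JSON.parse(data));
    }
    return messages;
  }

  throw new Error(`Unexpected Content-Type: ${contentType}`);
}
```

In the streamed case the progress notifications come first and the message carrying the request's `id` is the final result.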
Session management details
The session lifecycle in stateful mode:
- Client sends POST /mcp with an initialize request body and no Mcp-Session-Id header.
- Server creates a new transport and session ID, connects the McpServer, and calls transport.handleRequest().
- The transport sends the initialize response with Mcp-Session-Id: <id> in the response headers.
- All subsequent client requests include Mcp-Session-Id: <id>. The server looks up the transport by ID and calls transport.handleRequest().
- Client sends DELETE /mcp with the session ID header to terminate. Alternatively, the server closes the transport itself (for example on idle timeout), which fires the transport's onclose handler.
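The routing decision at the heart of this lifecycle can be written as a small pure function, which makes it easy to unit-test separately from Express. This is a sketch: `routePost` and the `Route` type are hypothetical names. The 400 for a non-initialize request without a session header and the 404 for an unknown session follow the status codes the MCP spec recommends.

```typescript
// Sketch: decide what to do with an incoming POST /mcp in stateful mode.
type Route =
  | { action: "create-session" }
  | { action: "use-session"; sessionId: string }
  | { action: "reject"; status: 400 | 404; reason: string };

function routePost(
  sessionIdHeader: string | undefined,
  isInitialize: boolean,
  knownSessions: ReadonlySet<string>,
): Route {
  if (sessionIdHeader === undefined) {
    // Only an initialize request may arrive without a session ID
    return isInitialize
      ? { action: "create-session" }
      : { action: "reject", status: 400, reason: "Missing Mcp-Session-Id header" };
  }
  if (!knownSessions.has(sessionIdHeader)) {
    // Expired or terminated: the client is expected to re-initialize
    return { action: "reject", status: 404, reason: "Session not found or expired" };
  }
  return { action: "use-session", sessionId: sessionIdHeader };
}
```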
Session expiry: unlike SSE (where a dropped connection is immediately visible), stateful Streamable HTTP sessions can be "orphaned" if the client drops without sending DELETE. Add a TTL-based cleanup using a timestamp updated on each request:
const sessionLastSeen = new Map<string, number>();

// In the POST /mcp handler, once the session is resolved: record activity
sessionLastSeen.set(sessionId, Date.now());

// Once a minute, evict sessions idle for over 30 minutes
setInterval(() => {
  const cutoff = Date.now() - 30 * 60 * 1000;
  for (const [id, ts] of sessionLastSeen) {
    if (ts < cutoff) {
      // close() is async: log failures instead of leaving the rejection unhandled
      sessions.get(id)?.close().catch(console.error);
      sessions.delete(id);
      sessionLastSeen.delete(id);
    }
  }
}, 60_000);
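If you want to unit-test the eviction rule without waiting on real timers, the same idea can be wrapped in a small class that takes the current time as an argument. This is a sketch; `IdleTracker` is a hypothetical name, and closing the evicted transports is left to the caller.

```typescript
// Sketch: TTL bookkeeping with an injectable clock for testing.
class IdleTracker {
  private readonly lastSeen = new Map<string, number>();
  private readonly ttlMs: number;

  constructor(ttlMs: number) {
    this.ttlMs = ttlMs;
  }

  /** Record activity for a session. */
  touch(id: string, now: number = Date.now()): void {
    this.lastSeen.set(id, now);
  }

  /** Remove and return the IDs that have been idle for longer than ttlMs. */
  evict(now: number = Date.now()): string[] {
    const expired: string[] = [];
    for (const [id, ts] of this.lastSeen) {
      if (now - ts > this.ttlMs) {
        expired.push(id);
        this.lastSeen.delete(id); // deleting during Map iteration is safe
      }
    }
    return expired;
  }
}
```

The caller would run `evict()` on an interval and close the transport for each returned ID.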
Load balancer configuration
For stateful Streamable HTTP behind a load balancer, all requests for a session must reach the same server instance — the session state is in-memory. Configure sticky sessions using the Mcp-Session-Id header or a cookie:
| Load balancer | Sticky session config |
|---|---|
| nginx | In the upstream block, use hash $http_mcp_session_id consistent; (routes by session header) or ip_hash (routes by client IP) |
| AWS ALB | Target group attribute: stickiness.enabled=true. ALB stickiness is cookie-based, so the client must retain and resend the stickiness cookie set on the first response |
| Caddy | lb_policy header Mcp-Session-Id in the reverse_proxy block |
| HAProxy | balance hdr(Mcp-Session-Id) in the backend block |
For stateless mode, sticky sessions are unnecessary — route freely across instances. This is the main scaling advantage of stateless mode.
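For the stateful case, the header-based rows in the table all reduce to the same idea: hash the session ID and map the hash to a backend. The sketch below illustrates that with FNV-1a and a plain modulo. It is not what any of those load balancers literally runs (nginx's consistent option, for instance, uses a hash ring), and `pickBackend` is a hypothetical name.

```typescript
// Sketch: deterministic session-to-backend mapping by hashing the session ID.

/** 32-bit FNV-1a hash of a string, returned as an unsigned integer. */
function fnv1a(input: string): number {
  let hash = 0x811c9dc5;
  for (let i = 0; i < input.length; i++) {
    hash ^= input.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash;
}

/** Pick the backend for a session; the same ID always maps to the same backend. */
function pickBackend(sessionId: string, backends: readonly string[]): string {
  if (backends.length === 0) throw new Error("No backends configured");
  return backends[fnv1a(sessionId) % backends.length] as string;
}
```

Two caveats apply. With plain modulo hashing, changing the number of backends remaps most sessions, which is why consistent hashing is usually preferred. Also, the initialize request carries no session ID, so header-based routing only holds if the instance that created the session is also the one its ID hashes to.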
If you need stateful sessions without sticky routing (for true horizontal scaling), move session state to an external store: serialize the transport's session context to Redis and restore it on each request. The MCP SDK does not support this natively — you would need to implement serialization at the application layer.
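As a rough illustration of application-layer serialization, the sketch below round-trips a session record through a string, which is the form a store such as Redis would hold. The `SessionState` shape, its fields, and the key prefix are all hypothetical. It covers only your own session data, not the SDK transport object, which is not serializable this way.

```typescript
// Sketch: serialize and restore application-level session state.
// The shape below is an assumption for illustration only.
interface SessionState {
  userId: string;
  createdAt: number;
  data: Record<string, unknown>;
}

/** Key under which a session would be stored (hypothetical naming scheme). */
function sessionKey(sessionId: string): string {
  return `mcp:session:${sessionId}`;
}

function serializeSession(state: SessionState): string {
  return JSON.stringify(state);
}

/** Returns undefined when the key is missing or the payload is not a valid session. */
function restoreSession(raw: string | null): SessionState | undefined {
  if (raw === null) return undefined; // missing or expired in the store
  let parsed: unknown;
  try {
    parsed = JSON.parse(raw);
  } catch {
    return undefined;
  }
  if (typeof parsed !== "object" || parsed === null) return undefined;
  const c = parsed as Partial<SessionState>;
  if (typeof c.userId !== "string" || typeof c.createdAt !== "number") return undefined;
  if (typeof c.data !== "object" || c.data === null) return undefined;
  return { userId: c.userId, createdAt: c.createdAt, data: c.data };
}
```

Validating on restore matters because the store is shared: a record written by an older deployment may not match the current shape.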
Security setup
Streamable HTTP servers are HTTP APIs — apply the same hardening you would to any REST API:
- Authentication: validate an Authorization: Bearer <token> header before creating a new session. Reject unauthenticated initialize requests with 401. See MCP server authentication and OAuth 2.1 integration.
- Rate limiting: apply rate limiting per IP and per session ID to prevent tool-call flooding.
- Input validation: validate all tool arguments with Zod schemas before processing. The transport delivers raw JSON-RPC bodies, so validation is your responsibility.
- CORS: if browser clients connect, set Access-Control-Allow-Origin to specific origins, not *, especially for authenticated servers.
- TLS: serve over HTTPS in production. AliveMCP probes only HTTPS endpoints for HTTP-based transports.
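The authentication and CORS items in the list above can be sketched as two small helpers that run before any session is created. These are illustrative only: `extractBearerToken` and `allowedOrigin` are hypothetical names, and verifying the token itself (signature, expiry, scopes) is out of scope here.

```typescript
// Sketch: pre-session checks for the Authorization and Origin headers.

/** Return the bearer token from an Authorization header, or undefined if absent or malformed. */
function extractBearerToken(authorization: string | undefined): string | undefined {
  if (authorization === undefined) return undefined;
  // The auth scheme name is case-insensitive
  const match = /^Bearer +(\S+)$/i.exec(authorization.trim());
  return match ? match[1] : undefined;
}

/**
 * Return the value to send in Access-Control-Allow-Origin, or undefined
 * to omit the header. Never falls back to "*".
 */
function allowedOrigin(origin: string | undefined, allowList: ReadonlySet<string>): string | undefined {
  return origin !== undefined && allowList.has(origin) ? origin : undefined;
}
```

A handler would respond with 401 when `extractBearerToken` returns undefined (or the token fails verification) and only then go on to create a transport.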
Migrating from SSE transport
If you have an existing SSE transport server, migration to Streamable HTTP is straightforward because the McpServer core doesn't change — only the transport layer:
- Replace SSEServerTransport imports with StreamableHTTPServerTransport.
- Replace the dual-endpoint (GET /sse + POST /messages) handlers with a single POST /mcp handler.
- Update the session map logic to use the Mcp-Session-Id header instead of the session ID query parameter.
- Add a DELETE /mcp handler for explicit session termination.
- Run both transports simultaneously during the transition: mount both sets of handlers and let each client use the endpoints of the transport it supports.
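Clients built on @modelcontextprotocol/sdk 1.10.0 or later can speak Streamable HTTP. A client that supports both transports typically tries POST /mcp first and falls back to SSE if that request fails with a 4xx status; older clients use the SSE endpoints only.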
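The client-side choice between the two transports during a migration can be sketched as a decision on the HTTP status of the first initialize POST. This follows the backwards-compatibility guidance in the MCP spec (fall back on a 4xx such as 404 or 405). Treating 401 and 403 as authentication failures rather than a fallback signal is an assumption of this sketch, and `chooseTransport` is a hypothetical name.

```typescript
// Sketch: pick a transport from the status of the initial POST to the MCP endpoint.
type TransportChoice = "streamable-http" | "legacy-sse" | "fail";

function chooseTransport(initializePostStatus: number): TransportChoice {
  if (initializePostStatus >= 200 && initializePostStatus < 300) {
    return "streamable-http";
  }
  // Assumption: auth errors mean "fix credentials", not "wrong transport"
  if (initializePostStatus === 401 || initializePostStatus === 403) {
    return "fail";
  }
  if (initializePostStatus >= 400 && initializePostStatus < 500) {
    // e.g. 404 or 405: the server likely only speaks the older SSE transport
    return "legacy-sse";
  }
  return "fail";
}
```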
AliveMCP monitoring for Streamable HTTP servers
AliveMCP probes Streamable HTTP servers by sending an initialize request to POST /mcp and validating the response. The probe tests both the inline JSON path (no session, stateless-compatible) and the session header path (stateful sessions). Monitored metrics include:
- Initialize handshake latency (p50/p95/p99)
- Session creation success rate
- TLS certificate validity and expiry date
- HTTP response codes (4xx and 5xx trigger alerts)
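For readers reproducing the latency figures from their own probe logs, the sketch below computes percentiles with the nearest-rank method. How AliveMCP computes its p50/p95/p99 is not stated here, so treat the method as an assumption; `percentile` is a hypothetical helper.

```typescript
// Sketch: nearest-rank percentile over a set of latency samples (milliseconds).
function percentile(samples: readonly number[], p: number): number {
  if (samples.length === 0) throw new Error("No samples");
  if (p <= 0 || p > 100) throw new RangeError("p must be in (0, 100]");
  // Copy before sorting so the caller's array is left untouched
  const sorted = [...samples].sort((a, b) => a - b);
  // Nearest rank: the smallest value with at least p% of samples at or below it
  const rank = Math.ceil((p / 100) * sorted.length);
  return sorted[rank - 1] as number;
}

const latenciesMs = [120, 80, 95, 300, 110, 105, 90, 100, 85, 115];
const p50 = percentile(latenciesMs, 50); // 100
const p95 = percentile(latenciesMs, 95); // 300
```

With only ten samples, p95 and p99 both land on the single slowest request, which is why tail percentiles need a reasonably large window to be meaningful.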
Once your Streamable HTTP server is deployed, add it to AliveMCP to get 60-second uptime probes, a public status page, and downtime alerts via webhook or Slack.
Related questions
Do I need to handle both GET and POST on /mcp?
Not necessarily. All client-to-server messages go over POST. The spec additionally lets a client send GET /mcp to open an SSE stream for server-initiated messages; a server that does not offer that stream should answer the GET with 405 Method Not Allowed. Either way, avoid mounting a health check UI on GET /mcp. A simple health check is better placed at GET /health, a separate path that doesn't conflict with the MCP endpoint.
Can one server support both SSE and Streamable HTTP simultaneously?
Yes: mount both sets of endpoints in the same Express app. SSE clients use GET /sse + POST /messages; Streamable HTTP clients use POST /mcp. They share the same tool definitions but use independent transport instances. This is the recommended approach during migration. Because a server instance is connected to one transport at a time, create a separate McpServer per connection from the same factory; the tool logic is identical regardless of transport.
How does streaming work without a persistent connection?
The streaming happens within a single HTTP POST request's response body. The client sends the request; the server starts streaming SSE events as the response body (setting Content-Type: text/event-stream). When the final result is ready, the server sends a final event and closes the response. The connection is open for the duration of the streaming response only — not permanently. For the next tool call, the client opens a new POST request.
What SDK version introduced StreamableHTTPServerTransport?
StreamableHTTPServerTransport was introduced in @modelcontextprotocol/sdk version 1.10.0, which implements the MCP spec revision of March 26, 2025. If your package.json has an older SDK version, run npm install @modelcontextprotocol/sdk@latest to get it. The class is exported from @modelcontextprotocol/sdk/server/streamableHttp.js.
Is Streamable HTTP compatible with edge runtimes like Cloudflare Workers?
In stateless mode, yes in principle: each POST is self-contained. Note that StreamableHTTPServerTransport.handleRequest takes Node.js request and response objects rather than the Web Standard Request and Response, so on Workers you may need a Cloudflare-specific adapter, or a fetch-based handler variant if your SDK version provides one. In stateful mode, no: edge functions have no shared memory between invocations, so the session map pattern doesn't work without an external session store like Cloudflare KV.
Further reading
- MCP server stdio transport — local process communication guide
- MCP server SSE transport — HTTP+SSE remote server setup
- MCP server transport comparison — stdio vs SSE vs Streamable HTTP
- MCP server JSON-RPC 2.0 — protocol messages and lifecycle
- MCP server authentication — OAuth 2.1 and API key validation
- MCP server streaming — progress notifications in long tool calls
- MCP server load balancing — horizontal scaling patterns
- MCP server Redis — shared state for multi-instance deployments
- AliveMCP — uptime monitoring for Streamable HTTP MCP servers