
MCP server on Vercel

Vercel's serverless and Edge Runtime functions can run MCP server logic, but the platform's stateless, short-lived execution model creates real constraints for MCP. The initialize/tools-list/tool-call sequence requires a session to persist across multiple HTTP requests — on Vercel, there's no in-process session state between function invocations. This guide covers exactly what works (stateless tool handlers, Edge SSE for simple cases), what requires workarounds (session state externalized to Redis or KV), and when to use a persistent-process platform instead.

TL;DR

Stateless MCP tool handlers (no session state, each call is independent) work well on Vercel. Use Next.js App Router API routes or Vercel Edge Functions. SSE streaming works on Edge Runtime. Session-stateful MCP servers require external storage (Vercel KV / Upstash Redis) to reconstruct state on each invocation — workable but adds latency. Function timeout (10s on Hobby, 60s on Pro) limits long tool executions. For MCP servers with complex session state or long-running tools, Railway, Render, or Fly.io are better fits. Use AliveMCP to monitor the public endpoint regardless of platform.

The stateless invocation problem

The MCP protocol is designed around a session: the client sends initialize, gets back the server's capabilities, then calls tools in sequence. In a persistent-process server (Node.js on Railway, for example), all of this happens in one process that lives as long as the session.

Vercel functions are invoked per request and then frozen or killed. Between the initialize call and the first tools/call, the function instance that handled initialize may be gone. In-memory state — the transport object, any session-scoped caches, accumulated context — is lost.

The practical consequence: you must externalize all session state. Every MCP request handler must load state from an external store (Redis, Vercel KV, Postgres) at the start of the request and write it back at the end. For truly stateless tool handlers (no session-scoped state needed), this overhead is zero.
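The load-then-save cycle described above can be sketched as a small wrapper. This is a minimal illustration, not an SDK feature: `SessionStore`, `MemoryStore`, and `withSession` are hypothetical names, and the in-memory store only stands in for Redis, Vercel KV, or Postgres during local testing.

```typescript
// Hypothetical store interface; on Vercel this would wrap Redis, Vercel KV, or Postgres.
interface SessionStore<S> {
  get(id: string): Promise<S | undefined>;
  set(id: string, state: S): Promise<void>;
}

// In-memory stand-in for local testing only; it does NOT survive across invocations.
class MemoryStore<S> implements SessionStore<S> {
  private data = new Map<string, S>();
  async get(id: string): Promise<S | undefined> {
    return this.data.get(id);
  }
  async set(id: string, state: S): Promise<void> {
    this.data.set(id, state);
  }
}

// Load state at the start of the request, run the handler,
// then persist whatever state the handler returns.
async function withSession<S, R>(
  store: SessionStore<S>,
  sessionId: string,
  initial: () => S,
  handler: (state: S) => Promise<{ state: S; result: R }>,
): Promise<R> {
  const loaded = (await store.get(sessionId)) ?? initial();
  const { state, result } = await handler(loaded);
  await store.set(sessionId, state);
  return result;
}
```

A handler that never touches session state can skip the wrapper entirely, which is why the overhead is zero for stateless tools.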

Next.js App Router route handler

For a stateless MCP server (tool handlers that don't need session context beyond what the client sends with each request), a Next.js App Router route handler is a clean fit:

// app/api/mcp/route.ts
import { NextRequest } from 'next/server';
import { z } from 'zod'; // needed for the z.string() schema below
import { McpServer } from '@modelcontextprotocol/sdk/server/mcp.js';
import { StreamableHTTPServerTransport } from '@modelcontextprotocol/sdk/server/streamableHttp.js';

export const runtime = 'nodejs'; // or 'edge' for Edge Runtime

export async function POST(req: NextRequest) {
  const body = await req.json();

  const server = new McpServer({
    name: 'my-mcp-server',
    version: '1.0.0',
  });

  // Register stateless tools
  server.tool('get-weather', { location: z.string() }, async ({ location }) => {
    const data = await fetchWeather(location); // pure function, no session state
    return { content: [{ type: 'text', text: JSON.stringify(data) }] };
  });

  const transport = new StreamableHTTPServerTransport({ sessionIdGenerator: undefined });
  await server.connect(transport);

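  // Note: in the official TypeScript SDK, handleRequest is typed for Node's
  // IncomingMessage and ServerResponse. Passing a Web Request/Response as below
  // is schematic; a route handler needs an adapter between the two.
  const response = await transport.handleRequest(req, new Response());
  return response;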
}

The key: sessionIdGenerator: undefined tells the transport not to manage session state. Each POST is stateless — the server is created fresh, tools are registered, the request is handled, and the function returns. This works for tool handlers that are pure functions of their inputs.
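To make "stateless" concrete at the message level, here is a hand-rolled sketch that answers `tools/list` and `tools/call` purely from the incoming request, with no SDK and nothing kept between calls. The `tools` registry, the `echo` tool, and `handleStateless` are hypothetical, and the response shapes are simplified relative to the full MCP schema.

```typescript
type JsonRpcRequest = { jsonrpc: '2.0'; id: number | string; method: string; params?: any };
type JsonRpcResponse = {
  jsonrpc: '2.0';
  id: number | string;
  result?: unknown;
  error?: { code: number; message: string };
};
type ToolResult = { content: Array<{ type: 'text'; text: string }> };

// Hypothetical tool registry: every handler is a pure function of its arguments.
const tools: Record<string, (args: Record<string, unknown>) => ToolResult> = {
  echo: (args) => ({ content: [{ type: 'text', text: String(args.message ?? '') }] }),
};

// Everything needed to answer is in `req`; no state is read or written elsewhere.
function handleStateless(req: JsonRpcRequest): JsonRpcResponse {
  if (req.method === 'tools/list') {
    const list = Object.keys(tools).map((name) => ({ name, inputSchema: { type: 'object' } }));
    return { jsonrpc: '2.0', id: req.id, result: { tools: list } };
  }
  if (req.method === 'tools/call') {
    const tool = tools[req.params?.name];
    if (!tool) {
      // -32602 is the JSON-RPC "Invalid params" code
      return { jsonrpc: '2.0', id: req.id, error: { code: -32602, message: 'Unknown tool' } };
    }
    return { jsonrpc: '2.0', id: req.id, result: tool(req.params?.arguments ?? {}) };
  }
  // -32601 is the JSON-RPC "Method not found" code
  return { jsonrpc: '2.0', id: req.id, error: { code: -32601, message: 'Method not found' } };
}
```

Because the function depends only on its argument, any function instance can serve any request, which is the property Vercel's execution model needs.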

SSE streaming on Edge Runtime

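Vercel's Edge Runtime supports streaming responses, which is required for SSE. The Streamable HTTP transport (the successor to the SSE transport in newer MCP SDK versions) works with Vercel's streaming infrastructure. The handler below shows the underlying mechanism with a hand-written SSE stream rather than the SDK transport: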

// app/api/mcp/route.ts — streaming version
import { NextRequest } from 'next/server';

export const runtime = 'edge';
export const maxDuration = 60; // Pro plan: 60s max

export async function GET(req: NextRequest) {
  const { readable, writable } = new TransformStream();
  const writer = writable.getWriter();
  const encoder = new TextEncoder();

  // Send SSE events for long-running tool operations
  const sendEvent = (data: unknown) => {
    writer.write(encoder.encode(`data: ${JSON.stringify(data)}\n\n`));
  };

  // Handle MCP session in background (Edge-compatible)
  (async () => {
    try {
      sendEvent({ type: 'connected' });
      // ... tool execution with progress events
      writer.close();
    } catch (err) {
      writer.abort(err);
    }
  })();

  return new Response(readable, {
    headers: {
      'Content-Type': 'text/event-stream',
      'Cache-Control': 'no-cache',
      'Connection': 'keep-alive',
    },
  });
}
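The `sendEvent` helper above writes only a `data:` line. If progress events need a name or an ID, or if a payload may contain newlines, the frame has to be built a little more carefully. The following is a sketch of that formatting; `formatSseEvent` is a hypothetical helper name, while the `event`, `id`, and `data` field names come from the SSE wire format.

```typescript
// Build one SSE frame as a string.
function formatSseEvent(
  data: string | Record<string, unknown>,
  opts: { event?: string; id?: string } = {},
): string {
  const lines: string[] = [];
  if (opts.event) lines.push(`event: ${opts.event}`);
  if (opts.id) lines.push(`id: ${opts.id}`);
  // A multi-line payload needs one "data:" line per line of text.
  const text = typeof data === 'string' ? data : JSON.stringify(data);
  for (const line of text.split('\n')) lines.push(`data: ${line}`);
  // A blank line terminates the frame.
  return lines.join('\n') + '\n\n';
}
```

Inside the route handler, the result would be passed through the encoder, for example `writer.write(encoder.encode(formatSseEvent({ type: 'progress' }, { event: 'progress' })))`.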

Edge Runtime has a hard limit on execution time (Vercel Hobby: 30s, Pro: 60s). Long-running tool executions that exceed this limit will be cut off mid-stream. For tools that routinely take more than 30 seconds, use an async pattern: queue the work, return a job ID immediately, and have the client poll for the result.
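The queue-and-poll pattern can be sketched as two functions: one that registers the work and returns a job ID at once, and one that reports the job's status. All names here (`Job`, `jobs`, `startJob`, `getJob`) are hypothetical. The in-memory `Map` is for illustration only; on Vercel the job table would have to live in an external store, and the work itself would have to run somewhere that outlives the request, such as a queue or worker.

```typescript
type Job<T> =
  | { status: 'pending' }
  | { status: 'done'; result: T }
  | { status: 'failed'; error: string };

// Illustration only: this table disappears when the function instance does.
const jobs = new Map<string, Job<unknown>>();

// Register the work and return a job ID before the work finishes.
function startJob<T>(work: () => Promise<T>): string {
  const jobId = Math.random().toString(36).slice(2);
  jobs.set(jobId, { status: 'pending' });
  work()
    .then((result) => jobs.set(jobId, { status: 'done', result }))
    .catch((err) => jobs.set(jobId, { status: 'failed', error: String(err) }));
  return jobId;
}

// What a status endpoint would return when the client polls.
function getJob(jobId: string): Job<unknown> | undefined {
  return jobs.get(jobId);
}
```

The tool call itself then returns only the job ID, and the client polls until the status is `done` or `failed`.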

Session state with Vercel KV

For MCP servers that need session state (accumulated context, tool call history, per-user data), externalize state to Vercel KV (Upstash Redis under the hood):

import { NextRequest } from 'next/server';
import { kv } from '@vercel/kv';

interface McpSessionState {
  userId: string;
  toolCallHistory: Array<{ tool: string; result: unknown; ts: number }>;
  contextSummary: string;
}

export async function POST(req: NextRequest) {
  const sessionId = req.headers.get('mcp-session-id') ?? crypto.randomUUID();
  const body = await req.json();

  // Load session state from KV
  const state = await kv.get<McpSessionState>(`session:${sessionId}`) ?? {
    userId: body.params?.clientInfo?.name ?? 'anonymous',
    toolCallHistory: [],
    contextSummary: ''
  };

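  // Handle the MCP request using loaded state.
  // handleMcpRequest is an application-defined dispatcher (not an SDK function)
  // that may mutate `state` before it is saved below.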
  const result = await handleMcpRequest(body, state);

  // Save updated state (30-minute TTL)
  await kv.set(`session:${sessionId}`, state, { ex: 1800 });

  return Response.json(result, {
    headers: { 'mcp-session-id': sessionId }
  });
}

Each invocation adds one KV read + one KV write. At Vercel KV pricing (~$0.0002/request), this is negligible for low-to-moderate traffic. At high request rates, the KV round-trip (5–15ms) becomes a noticeable addition to tool call latency.
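As a rough worked example of the overhead, the arithmetic can be written out as below. The constants are taken from the figures quoted above and should be treated as assumptions, not current pricing; `estimateKvOverhead` is a hypothetical helper, and it assumes the read and the write happen one after the other.

```typescript
// Assumed figures from the text above: two KV operations per invocation
// and roughly $0.0002 per KV request.
const KV_OPS_PER_INVOCATION = 2;
const KV_COST_PER_REQUEST_USD = 0.0002;

function estimateKvOverhead(invocationsPerMonth: number, roundTripMs: number) {
  const kvRequests = invocationsPerMonth * KV_OPS_PER_INVOCATION;
  return {
    kvRequests,
    monthlyCostUsd: kvRequests * KV_COST_PER_REQUEST_USD,
    // The read and the write are sequential, so both round trips add to latency.
    addedLatencyMs: KV_OPS_PER_INVOCATION * roundTripMs,
  };
}
```

With 10,000 invocations a month and a 10 ms round trip, this gives 20,000 KV requests, about $4, and 20 ms of added latency per tool call.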

Function timeout limits

Vercel function timeouts depend on the plan. The figures used in this guide are 10s on Hobby and 60s on Pro.

MCP tool executions that involve LLM calls, web scraping, or file processing can easily take 10–60 seconds. Design tool handlers to either:

  1. Complete within the timeout — appropriate for fast, deterministic operations
  2. Use the async job pattern — return a job ID immediately, let the client poll a separate /api/mcp/status/[jobId] endpoint
  3. Stream partial results — use SSE to send incremental results before the timeout

Set maxDuration in your route configuration to the maximum you're willing to pay for, not the platform limit:

export const maxDuration = 30; // Don't let a runaway tool eat 60s
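`maxDuration` caps the whole invocation. To fail an individual tool call cleanly before that cap is reached, the handler can race the work against a timer. The helper below is a sketch with a hypothetical name (`withTimeout`). It stops waiting for the work but does not cancel it, so the underlying operation needs its own cancellation if that matters.

```typescript
// Reject if the work does not settle within `ms`; the timer is cleared either way.
async function withTimeout<T>(work: Promise<T>, ms: number): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_, reject) => {
    timer = setTimeout(() => reject(new Error(`Tool exceeded ${ms} ms`)), ms);
  });
  try {
    return await Promise.race([work, timeout]);
  } finally {
    clearTimeout(timer);
  }
}
```

A tool handler could wrap its slow call as `await withTimeout(fetchWeather(location), 8000)` and return an MCP error result from the `catch` branch instead of letting the platform kill the function.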

When to choose Vercel vs a persistent server

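Use Vercel when your tool handlers are stateless (each call is independent) and complete within the function timeout.

Use Railway, Render, or Fly.io instead when the server keeps complex session state or runs long-running tools.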

The cost crossover: Vercel Pro is $20/month for 1M function invocations. Railway Starter is $5/month for a continuously-running instance. If your MCP server handles more than ~5,000 sessions per month with 10+ tool calls per session (50,000+ invocations), per-invocation pricing may exceed fixed-instance pricing.

External monitoring for Vercel MCP servers

Vercel provides function-level metrics (invocation count, error rate, p99 duration) in the dashboard. These show whether functions are being invoked and whether they're erroring — but they don't show whether the MCP protocol layer is functioning correctly from a client's perspective.

An MCP client that receives a 500 Internal Server Error on the initialize call sees a broken server; Vercel's metrics show one error event. An MCP client that receives a malformed JSON response sees a protocol failure; Vercel's metrics show a successful invocation. Add your Vercel deployment URL to AliveMCP to catch protocol failures that function metrics don't surface. See AliveMCP for setup.
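The difference between an HTTP failure and a protocol failure can be made concrete with a small probe that sends `initialize` and checks the shape of the reply. This is a sketch under stated assumptions, not how any particular monitoring service works: `probeMcp`, `isValidInitializeResponse`, and `FetchLike` are hypothetical names, the protocol version string is one published MCP revision, and the probe assumes the server answers with a plain JSON body rather than an SSE stream.

```typescript
// Minimal fetch-like signature so the probe can be exercised without a network.
type FetchLike = (
  url: string,
  init: { method: string; headers: Record<string, string>; body: string },
) => Promise<{ ok: boolean; status: number; text(): Promise<string> }>;

// Shape check: JSON-RPC 2.0 envelope whose result carries serverInfo and capabilities.
function isValidInitializeResponse(payload: unknown): boolean {
  if (typeof payload !== 'object' || payload === null) return false;
  const msg = payload as { jsonrpc?: unknown; result?: { serverInfo?: unknown; capabilities?: unknown } };
  return (
    msg.jsonrpc === '2.0' &&
    typeof msg.result === 'object' && msg.result !== null &&
    typeof msg.result.serverInfo === 'object' &&
    typeof msg.result.capabilities === 'object'
  );
}

async function probeMcp(
  url: string,
  fetchImpl: FetchLike,
): Promise<'ok' | 'http-error' | 'protocol-error'> {
  const res = await fetchImpl(url, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json', Accept: 'application/json, text/event-stream' },
    body: JSON.stringify({
      jsonrpc: '2.0',
      id: 1,
      method: 'initialize',
      params: {
        protocolVersion: '2025-03-26',
        capabilities: {},
        clientInfo: { name: 'probe', version: '0.0.1' },
      },
    }),
  });
  if (!res.ok) return 'http-error'; // visible in function metrics as an error
  try {
    // A 200 with a malformed body looks like success in function metrics.
    return isValidInitializeResponse(JSON.parse(await res.text())) ? 'ok' : 'protocol-error';
  } catch {
    return 'protocol-error';
  }
}
```

The `protocol-error` branch is the case function-level metrics miss: the invocation succeeded, but no MCP client could use the response.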

Related questions

Can I use the MCP stdio transport on Vercel?

No. The stdio transport expects the client to launch the server as a local subprocess and exchange messages over its stdin/stdout. Vercel functions are reachable only over HTTP, so a remote client has no process pipes to attach to. HTTP/SSE or the Streamable HTTP transport are the only options for Vercel deployments.

Does Vercel support the newer Streamable HTTP transport?

Yes. The Streamable HTTP transport (introduced in the MCP SDK as a successor to SSE transport) works with both Vercel's Node.js runtime and Edge Runtime. It uses standard HTTP POST requests for messages and streaming responses for server pushes — both work with Vercel's infrastructure. Use StreamableHTTPServerTransport from @modelcontextprotocol/sdk/server/streamableHttp.js.

How do I handle cold starts for MCP clients with tight timeouts?

Vercel cold starts typically take 100–500ms for Node.js (shorter for Edge Runtime). Most MCP clients have timeouts of 10–30 seconds for the initialize request, so cold starts are usually within tolerance. For clients with tighter requirements, warm your function with scheduled pings (a Vercel Cron job that calls your MCP endpoint every 5 minutes), or use Edge Runtime (no cold start for already-deployed functions in the same region as the client).
