
MCP server edge runtime patterns

Edge runtimes — Cloudflare Workers, Deno Deploy, Vercel Edge Functions, and Bun on a VPS — share a common set of constraints that differ fundamentally from running a Node.js process on a long-lived server. This guide covers the patterns that work across all of them: handling statelessness, externalizing session state to KV or Redis, optimizing cold starts, working within CPU time limits, and monitoring MCP servers that have no central process to check.

TL;DR

Edge runtimes give you global distribution, near-zero cold starts, and no infrastructure to manage. In exchange, you give up a persistent process, file system access, native Node.js modules, and long-lived connections. For MCP servers this matters because the MCP protocol assumes a session: an initialize handshake followed by tool calls. On edge runtimes, each HTTP request gets a fresh execution context. Solve this by using stateless tool handlers wherever possible and externalizing session state to Cloudflare KV, Deno KV, Vercel KV, or Upstash Redis when you actually need cross-request state. Use AliveMCP for protocol-level monitoring: edge deployments have no central process, so external probing is the only reliable view into what clients actually experience.

The five edge runtime constraints that affect MCP servers

All edge runtimes impose the same five constraints, regardless of provider. Understanding them upfront prevents the most common porting failures:

| Constraint | Implication for MCP | Mitigation |
| --- | --- | --- |
| No persistent process | Session state from initialize is lost between requests | External KV/Redis; stateless tool handlers |
| No file system | Can't read local config files, certs, or data at runtime | Environment bindings / secrets store |
| No native Node.js modules | fs, child_process, net, native addons fail | Vet every dependency with wrangler dev / deno check |
| CPU time limit (10ms–60s, depending on plan) | Long-running tool handlers are killed | Async dispatch: start_job / get_job_result pattern |
| No long-lived connections | SSE transport assumes a persistent process; WebSocket lifetime limited | Use StreamableHTTPServerTransport; stateless mode |

The transport constraint is the most disruptive. The MCP TypeScript SDK ships three transports: stdio (for local processes), SSE (assumes a long-lived HTTP server), and StreamableHTTP (works in stateless environments). On every edge runtime, use StreamableHTTPServerTransport with sessionIdGenerator: undefined for stateless mode or paired with external KV for stateful mode.

Stateless vs stateful: when you actually need session state

Most MCP tool handlers are stateless — they receive arguments, call an API, and return a result. Stateless handlers work identically on edge runtimes and on a long-lived Node.js server. The only difference is that you instantiate a new McpServer per request instead of once at startup:

// Stateless edge MCP server: works on Cloudflare Workers, Deno Deploy, Vercel Edge
// Each request gets a fresh server instance, so no state is shared between requests
import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
import { StreamableHTTPServerTransport } from "@modelcontextprotocol/sdk/server/streamableHttp.js";
import { z } from "zod";

async function handleRequest(request: Request): Promise<Response> {
  const server = new McpServer({ name: "stateless-mcp", version: "1.0.0" });

  // Pure function tool: no session state, no side effects on server object
  server.tool("lookup_user", "Look up user by ID", { id: z.string() },
    async ({ id }) => {
      const user = await db.users.findById(id);   // db is initialized once (module scope)
      if (!user) return { isError: true, content: [{ type: "text", text: "User not found" }] };
      return { content: [{ type: "text", text: JSON.stringify(user) }] };
    }
  );

  const transport = new StreamableHTTPServerTransport({
    sessionIdGenerator: undefined,   // stateless: no session concept
  });
  await server.connect(transport);
  // NOTE: the SDK's Node.js transport expects Node-style req/res objects. On
  // fetch-style runtimes this call assumes a web-standard transport or adapter
  // that accepts a Request and returns a Response; check your SDK version.
  return transport.handleRequest(request, {});
}

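Stateless handlers cover the majority of MCP use cases: API wrappers, database lookups, search tools, computation tools. Session state is only required when a tool must build on data from an earlier call in the same conversation, such as the draft-appending tool below.

For those cases, externalize the state to KV with a session ID as the key: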

// KV-backed session state — generic pattern, works on any edge runtime
// Replace kv.get/kv.set with your provider's KV API (see below)
server.tool("add_to_draft", "Append text to the current draft",
  { text: z.string(), sessionId: z.string() },
  async ({ text, sessionId }) => {
    const existing = await kv.get(sessionId) ?? "";
    const updated = existing + "\n" + text;
    await kv.set(sessionId, updated, { ttl: 1800 }); // 30 min TTL
    return { content: [{ type: "text", text: `Draft now ${updated.length} chars.` }] };
  }
);

KV session state: provider APIs side by side

Each edge runtime has its own KV API. The pattern is identical — get, set with TTL, delete — only the syntax differs:

// Cloudflare Workers KV
const value = await env.MY_KV.get("session:abc123");
await env.MY_KV.put("session:abc123", JSON.stringify(state), { expirationTtl: 1800 });
await env.MY_KV.delete("session:abc123");

// Deno KV (built-in, no import needed on Deno Deploy)
const kv = await Deno.openKv();
const entry = await kv.get(["sessions", "abc123"]);
const value = entry.value;
await kv.set(["sessions", "abc123"], state, { expireIn: 1_800_000 }); // ms
await kv.delete(["sessions", "abc123"]);

// Vercel KV (Redis-compatible, import from @vercel/kv)
import { kv } from "@vercel/kv";
const value = await kv.get("session:abc123");
await kv.set("session:abc123", state, { ex: 1800 });
await kv.del("session:abc123");

// Upstash Redis (works on any edge runtime — HTTP-based, no TCP)
import { Redis } from "@upstash/redis";
const redis = new Redis({ url: env.UPSTASH_URL, token: env.UPSTASH_TOKEN });
const value = await redis.get("session:abc123");
await redis.set("session:abc123", state, { ex: 1800 });
await redis.del("session:abc123");

Upstash Redis is the cross-runtime choice: because it uses HTTP rather than raw TCP, it works on every edge runtime including those that block native TCP connections. The trade-off is ~5–15ms additional latency per read/write versus provider-native KV.
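The easiest detail to get wrong when switching between these APIs is the TTL unit and option name. The following is a minimal sketch of a helper that hides that difference. The `Provider` type and both function names are hypothetical and not part of any provider SDK; only the option names (`expirationTtl`, `expireIn`, `ex`) and key shapes come from the snippets above.

```typescript
// Hypothetical helper: none of these names come from a provider SDK.
type Provider = "cloudflare" | "deno" | "vercel" | "upstash";

// Build the TTL option object that each provider's write call expects.
function ttlOption(provider: Provider, ttlSeconds: number): Record<string, number> {
  switch (provider) {
    case "cloudflare":
      return { expirationTtl: ttlSeconds };   // seconds
    case "deno":
      return { expireIn: ttlSeconds * 1000 }; // milliseconds
    case "vercel":
    case "upstash":
      return { ex: ttlSeconds };              // seconds (Redis EX)
  }
}

// Build the key in each provider's idiom: Deno KV uses array keys,
// the others use flat strings.
function sessionKey(provider: Provider, sessionId: string): string | string[] {
  return provider === "deno" ? ["sessions", sessionId] : `session:${sessionId}`;
}
```

With a helper like this, a 30-minute session TTL is written once as `1800` and converted at the call site, for example `kv.set(key, state, ttlOption("deno", 1800))`.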

Cold start behavior and alert thresholds

Edge runtimes have dramatically different cold start characteristics than Node.js servers. If your MCP monitor's timeout threshold is too tight, cold-start latency triggers false alerts; if it is too loose, it masks real failures:

| Runtime | Cold start (typical) | Recommended AliveMCP timeout | Notes |
| --- | --- | --- | --- |
| Cloudflare Workers | <5ms | 500ms | V8 isolate reuse; rarely cold |
| Deno Deploy | 50–150ms | 1,000ms | V8 isolate; TypeScript compilation cached |
| Vercel Edge Runtime | 50–200ms | 1,000ms | Edge-optimized; shorter than Node.js Vercel functions |
| Vercel Node.js Functions | 100–500ms | 2,000ms | Full Node bootstrap; more variable |
| Bun (self-hosted VPS) | 50–200ms | 1,000ms | Faster than Node.js; no isolate reuse overhead |

Set AliveMCP's timeout threshold at 2× the 95th-percentile cold start you observe in your dashboard: tight enough to catch real failures, loose enough to avoid cold-start false positives. On Cloudflare Workers, where cold starts are sub-5ms, that rule alone would give an unrealistically tight value once network round-trip time is included, so a 500ms threshold is appropriate.
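The 2× p95 rule can be sketched as a small calculation. This is an illustration only: the function names are hypothetical, the nearest-rank percentile method is one of several reasonable choices, and the 500ms default floor is an assumption taken from the Cloudflare Workers row of the table.

```typescript
// Nearest-rank percentile over a list of samples (milliseconds).
function percentile(samples: number[], p: number): number {
  if (samples.length === 0) throw new Error("no samples");
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.ceil((p * sorted.length) / 100); // 1-based rank
  return sorted[Math.max(0, rank - 1)];
}

// Timeout threshold: 2x the p95 cold start, never below a floor.
// floorMs is an assumed lower bound, not an AliveMCP setting.
function timeoutThresholdMs(coldStartsMs: number[], floorMs = 500): number {
  return Math.max(floorMs, 2 * percentile(coldStartsMs, 95));
}
```

The floor keeps very fast runtimes from ending up with a threshold shorter than a normal network round trip.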

Long-running tools: async dispatch pattern

CPU time limits (10ms on Cloudflare free, 30s on paid; 60s on most serverless plans) prevent long-running synchronous tool handlers. The async dispatch pattern replaces a blocking call with a two-tool interaction:

// Tool 1: start the job, return immediately with a job ID
server.tool("start_report", "Generate a report (async — poll for completion)",
  { filters: z.object({ start: z.string(), end: z.string() }) },
  async ({ filters }) => {
    const jobId = crypto.randomUUID();
    // Enqueue work — provider-specific (Queue, background task, etc.)
    await enqueueJob(jobId, filters);
    return { content: [{ type: "text", text: `Report started. Job ID: ${jobId}. Poll get_report_result.` }] };
  }
);

// Tool 2: poll for the result
server.tool("get_report_result", "Poll for a report result",
  { jobId: z.string() },
  async ({ jobId }) => {
    const result = await kv.get(`report:${jobId}`);
    if (!result) return { content: [{ type: "text", text: "Pending. Try again in a few seconds." }] };
    return { content: [{ type: "text", text: result }] };
  }
);

The agent (LLM) calls start_report, receives a job ID, then polls get_report_result in a loop. This pattern works on any edge runtime with no changes, because the job itself runs in a separate background worker or queue consumer that is not bound by the request handler's CPU time limit.
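One gap in the polling tool as written is that a mistyped or expired job ID looks the same as a job that is still running. A possible refinement, sketched below, assumes that start_report writes a `pending` marker under `report:${jobId}` and that the background consumer overwrites it with a `done` or `failed` record. The `JobRecord` shape and `describeJob` function are hypothetical names for illustration.

```typescript
// Hypothetical record shape stored under `report:${jobId}`; not part of any SDK.
type JobRecord =
  | { status: "pending" }
  | { status: "done"; result: string }
  | { status: "failed"; error: string };

type ToolResult = { isError?: boolean; content: { type: "text"; text: string }[] };

// Turn the stored record (or its absence) into an MCP tool result.
function describeJob(jobId: string, raw: string | null): ToolResult {
  const text = (t: string) => [{ type: "text" as const, text: t }];
  if (raw === null) {
    // No marker at all: the ID was never issued or its TTL has passed.
    return { isError: true, content: text(`Unknown or expired job ID: ${jobId}`) };
  }
  const record = JSON.parse(raw) as JobRecord;
  switch (record.status) {
    case "pending":
      return { content: text("Pending. Try again in a few seconds.") };
    case "done":
      return { content: text(record.result) };
    case "failed":
      return { isError: true, content: text(`Job failed: ${record.error}`) };
  }
}
```

Inside get_report_result this would be called as `describeJob(jobId, await kv.get(...))`, so the agent can stop polling when the job has failed or never existed.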

Monitoring edge-deployed MCP servers

Edge runtimes fundamentally change the monitoring problem. A Node.js server is a single process at a known IP — you can check if the process is alive, read its memory usage, and tail its logs. Edge runtimes distribute your code across 35–300 locations with no central process. The only reliable monitoring is external protocol probing from outside the edge network.

External monitoring catches failures that nothing inside the deployment can see. Start by probing the server the way a client would:

# Verify your edge MCP server from the same path your clients use
curl -sS -X POST https://mcp.example.com/ \
  -H "Content-Type: application/json" \
  -H "Accept: application/json, text/event-stream" \
  -d '{"jsonrpc":"2.0","id":1,"method":"initialize","params":{"protocolVersion":"2024-11-05","capabilities":{},"clientInfo":{"name":"health-check","version":"1.0"}}}' \
  | jq '.result.protocolVersion'
# Expected: "2024-11-05" (or the current MCP protocol version your server declares)
# If the server answers with text/event-stream instead of plain JSON,
# strip the SSE "data: " prefix from the body before piping it to jq

Add this to your CI/CD post-deploy step. Wire it to AliveMCP for continuous 60-second probing — edge runtimes can have transient failures that a one-time CI check won't catch.
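For a post-deploy script, the same check can be written so that it accepts either response format a Streamable HTTP server may return. The sketch below is an assumption-laden illustration: `parseRpcBody` and `checkInitialize` are hypothetical names, and the SSE handling only covers a body that carries a single JSON-RPC message.

```typescript
// Extract the JSON-RPC message from a response body.
// Handles a plain JSON body and a single-message SSE body ("data: {...}").
function parseRpcBody(contentType: string, body: string): any {
  if (contentType.includes("text/event-stream")) {
    const dataLines = body
      .split(/\r?\n/)
      .filter((line) => line.startsWith("data:"))
      .map((line) => line.slice(5).replace(/^ /, "")); // drop "data:" and one optional space
    if (dataLines.length === 0) throw new Error("SSE body contained no data lines");
    return JSON.parse(dataLines.join("\n"));
  }
  return JSON.parse(body);
}

// Return null when the initialize response looks healthy, else a reason string.
function checkInitialize(contentType: string, body: string, expectedVersion: string): string | null {
  const msg = parseRpcBody(contentType, body);
  if (msg.error) return `JSON-RPC error ${msg.error.code}: ${msg.error.message}`;
  const version = msg.result?.protocolVersion;
  if (version !== expectedVersion) return `unexpected protocolVersion: ${String(version)}`;
  return null;
}
```

In a deploy script you would pass in the `content-type` header and text body of a `fetch` POST carrying the same initialize payload as the curl command, and fail the pipeline when the function returns a non-null reason.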

Choosing between edge and traditional server deployment

Edge runtimes are not always the right choice for MCP servers. Use this decision table:

| Choose edge when | Choose traditional server when |
| --- | --- |
| Tools are pure functions of their inputs | Tools maintain state across calls in the same session |
| Tool execution takes <30s | Tool execution takes >60s (video transcoding, large ML inference) |
| You need global low latency (<50ms) | You need private network access (internal databases, VPCs) |
| Burst traffic with zero scaling effort | High per-request memory (>256MB) or CPU (native ML models) |
| Zero infrastructure management | WebSocket transport with long-lived connections |

The hybrid pattern works well: run stateless edge functions for your most common, fast-path tools, and maintain a traditional server (Railway, Render, Fly.io) for the long-running or stateful tools. Present both under the same MCP server endpoint using a router — the agent doesn't need to know which backend handles each tool.
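The routing decision in such a hybrid setup can be as simple as a lookup on the tool name. The following is a rough sketch under stated assumptions: both backend URLs and the tool names in the set are hypothetical, and forwarding the request is left out.

```typescript
// Hypothetical backends and tool names, for illustration only.
const EDGE_BACKEND = "https://edge.mcp.example.com/";
const SERVER_BACKEND = "https://server.mcp.example.com/";

// Tools that need a long-lived process (slow or stateful).
const SERVER_ONLY_TOOLS = new Set(["transcode_video", "add_to_draft"]);

function backendFor(toolName: string): string {
  return SERVER_ONLY_TOOLS.has(toolName) ? SERVER_BACKEND : EDGE_BACKEND;
}

// Pick a backend for a JSON-RPC message; anything that is not a
// tools/call for a server-only tool goes to the edge.
function routeMessage(msg: { method: string; params?: { name?: string } }): string {
  if (msg.method === "tools/call" && msg.params?.name) {
    return backendFor(msg.params.name);
  }
  return EDGE_BACKEND;
}
```

A real router would also have to merge the tools/list responses of both backends so the agent sees one combined tool catalog; that part is omitted here.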

Frequently asked questions

Can I share state between multiple edge function invocations without external KV?

Module-level variables persist within the same isolate instance across warm requests (the isolate is reused for a short window after the first invocation). However, this is not reliable: isolates are evicted on low traffic, and on globally distributed runtimes each edge location has its own isolate pool. Never use module-level mutable state as a substitute for real session storage — treat it the same way you would an in-process cache: useful for deduplication within a burst, not for session data that must survive across requests or across edge locations.
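To make the "in-process cache" framing concrete, here is a minimal sketch of a module-scope cache with a short TTL. The `WarmCache` class is a hypothetical name; the injectable clock exists only so the expiry logic can be tested without waiting.

```typescript
// Best-effort in-isolate cache. Entries vanish whenever the isolate is
// evicted, so callers must always be able to fall back to the real source.
class WarmCache<V> {
  private entries = new Map<string, { value: V; expiresAt: number }>();
  private ttlMs: number;
  private now: () => number;

  constructor(ttlMs: number, now: () => number = Date.now) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  get(key: string): V | undefined {
    const hit = this.entries.get(key);
    if (!hit) return undefined;
    if (hit.expiresAt <= this.now()) {
      this.entries.delete(key); // expired: drop it and report a miss
      return undefined;
    }
    return hit.value;
  }

  set(key: string, value: V): void {
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });
  }
}
```

A miss here should always fall through to KV or the upstream API; a hit only saves a round trip during a burst of warm requests on the same isolate.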

Does StreamableHTTPServerTransport work the same as SSEServerTransport from a client's perspective?

For most clients, yes. The StreamableHTTP transport supports both synchronous (JSON response) and streaming (SSE-over-HTTP) response modes. Claude Desktop, the MCP Python SDK client, and the TypeScript SDK client all support StreamableHTTP. If you're integrating with a client that only supports the original SSE transport (it connects via GET and then sends tool calls via POST), check the client's SDK version — SSE-only clients need to upgrade or you need to run the SSE transport on a long-lived server. Most clients in 2025 and newer support StreamableHTTP.

How do I handle secrets differently across edge providers?

Each provider has a secrets mechanism: Cloudflare Workers uses wrangler secret put (accessed as env.SECRET), Deno Deploy uses the project dashboard secrets (accessed as Deno.env.get("SECRET")), and Vercel uses Environment Variables (accessed as process.env.SECRET on both the Node.js runtime and the Edge Runtime). The pattern is the same: store secrets in the provider's encrypted store, never commit them to source control. For multi-cloud setups, Doppler or Infisical can sync secrets to all providers from a single source of truth.
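If the same handler code has to run on more than one provider, one option is a small lookup function that tries each mechanism in turn. This is a sketch under assumptions: `readSecret` is a hypothetical helper, and `bindings` stands for the Cloudflare-style `env` object passed to the handler.

```typescript
// Look up a secret by name: explicit bindings first (Cloudflare-style env),
// then Deno.env, then process.env. Throws if it is missing everywhere.
function readSecret(name: string, bindings: Record<string, unknown> = {}): string {
  const fromBinding = bindings[name];
  if (typeof fromBinding === "string") return fromBinding;

  const g = globalThis as any; // runtime globals differ per provider
  const fromDeno = g.Deno?.env?.get?.(name);
  if (typeof fromDeno === "string") return fromDeno;

  const fromProcess = g.process?.env?.[name];
  if (typeof fromProcess === "string") return fromProcess;

  throw new Error(`Missing secret: ${name}`);
}
```

Failing loudly on a missing secret at the start of a request makes a forgotten binding show up as a clear error rather than as an obscure downstream failure.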

What's the right way to handle MCP session IDs on stateless edge runtimes?

With sessionIdGenerator: undefined, the StreamableHTTP transport runs in fully stateless mode: no session ID is generated or tracked, and each request is handled independently. Any mcp-session-id header a client sends is ignored, and the server treats every request as standalone. This is correct for stateless tools. If you need session continuity, generate a UUID session ID (sessionIdGenerator: () => crypto.randomUUID()) and route by that ID to a Durable Object (Cloudflare) or use it as the KV key for state storage. The client must include the session ID header on subsequent calls. The SDK's StreamableHTTP client transport does this automatically once it has received the ID in the initialize response; its sessionId option is only needed to resume an existing session.
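When the session ID doubles as a KV key, it is worth validating the header before using it, since it arrives from the client. A minimal sketch follows, assuming the IDs are UUIDs as produced by crypto.randomUUID(); the function name and the `session:` key prefix are illustrative choices.

```typescript
// Matches the 8-4-4-4-12 hex layout produced by crypto.randomUUID().
const UUID_RE = /^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$/i;

// Derive the KV key for a request, or null when there is no usable session ID.
// Accepts anything with a Headers-like get() method.
function sessionKeyFromHeaders(headers: { get(name: string): string | null }): string | null {
  const id = headers.get("mcp-session-id");
  if (id === null || !UUID_RE.test(id)) return null;
  return `session:${id.toLowerCase()}`;
}
```

A null result can be treated as "no session": either handle the request statelessly or reject it, depending on whether the tool being called needs stored state.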

Why can't I just use a health check HTTP endpoint instead of AliveMCP?

A /healthz endpoint tells you the HTTP server is alive. It doesn't tell you whether the MCP protocol handshake succeeds, whether the tools/list response is correct, or whether tool calls are being handled. An edge function can return 200 from /healthz while the MCP initialize handler fails due to a missing environment binding — a common post-deploy issue. AliveMCP probes the actual MCP initialize endpoint and verifies the protocol response, so it catches failures that HTTP health checks miss.
