MCP server on Cloudflare Workers

Cloudflare Workers runs your MCP server in V8 isolates at 300+ edge locations worldwide — sub-5ms cold starts, global distribution, and zero infrastructure to manage. The trade-off is a runtime that is not Node.js: no process-level globals, no native Node modules, and SSE connections that live for the duration of a single HTTP request. This guide shows you the patterns that work in that environment and the monitoring challenges that come with distributing your MCP server across the planet.

TL;DR

Cloudflare Workers supports MCP servers via the StreamableHTTPServerTransport in the MCP SDK. Treat every HTTP request as stateless: an isolate may be reused, but nothing guarantees that two requests share memory. Use Durable Objects if you need to preserve session state across tool calls (common for MCP servers that maintain conversation context). Monitor edge-deployed MCP servers with AliveMCP to verify the protocol handshake from the same external path your clients use. A health check that runs inside your Worker cannot see routing, DNS, or deployment problems at the edge, so external probing is the closest view of what users actually experience.

V8 isolates vs Node.js: what changes for MCP

Cloudflare Workers does not run Node.js. Your Worker runs inside a V8 isolate: a lightweight JavaScript context with no persistent file system, no child processes, no access to the operating system, and only the subset of Node.js modules that the nodejs_compat flag provides. What this means for MCP server development:

Capability	Node.js MCP server	Workers MCP server
File system access	Yes (`fs`, `path`)	No — use R2 or KV for storage
TCP connections	Yes (`net`)	Outbound only, via connect() from cloudflare:sockets; use fetch() for HTTP
Process env vars	`process.env.FOO`	`env.FOO` from Worker bindings
npm packages	Most packages work	Node.js-API-dependent packages fail
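Cold start	100ms–2s (Node bootstrap)	<5ms (V8 isolate startup)
Max execution time	Unlimited (long-lived process)	30s of CPU time per request by default on paid plans (10ms on free); wall-clock time is not capped while the client stays connected
SSE connection lifetime	Hours (long-lived process)	Duration of the HTTP request; ends when the client disconnects or the CPU limit is reached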

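The @modelcontextprotocol/sdk package works in Workers with two caveats: it must be imported from the ESM-compatible entry point, and you must use the StreamableHTTPServerTransport rather than the SSE-specific transport that assumes a long-lived Node process. Check the handleRequest signature in the SDK version you install: in releases where it takes Node's IncomingMessage and ServerResponse, a fetch handler needs a web-standard variant of the transport or a small adapter.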

Basic MCP server on Workers

The minimal Workers MCP server uses the Fetch API handler and the MCP SDK's streamable HTTP transport. Install with npm install @modelcontextprotocol/sdk and create src/index.ts:

// src/index.ts — MCP server on Cloudflare Workers
import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
import { StreamableHTTPServerTransport } from "@modelcontextprotocol/sdk/server/streamableHttp.js";
import { z } from "zod";

export default {
  async fetch(request: Request, env: Env, ctx: ExecutionContext): Promise<Response> {
    if (request.method !== "POST") {
      return new Response("Method not allowed", { status: 405 });
    }

    const server = new McpServer({
      name: "my-mcp-server",
      version: "1.0.0",
    });

    // Register tools using env bindings instead of process.env
    server.tool(
      "search_docs",
      "Search the documentation index",
      { query: z.string().describe("Search query") },
      async ({ query }) => {
        // Use fetch() for outbound HTTP — no native Node TCP
        const res = await fetch(`${env.DOCS_API_URL}/search?q=${encodeURIComponent(query)}`, {
          headers: { Authorization: `Bearer ${env.DOCS_API_KEY}` },
        });
        if (!res.ok) throw new Error(`Docs API error: ${res.status}`);
        const data = await res.json();
        return { content: [{ type: "text", text: JSON.stringify(data) }] };
      }
    );

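    // Stateless mode: no session ID, because this server and transport
    // exist only for the current request.
    const transport = new StreamableHTTPServerTransport({
      sessionIdGenerator: undefined,
    });

    // connect() only attaches the transport. The initialize handshake
    // happens when the client sends its initialize request.
    await server.connect(transport);
    // Assumption: a handleRequest that takes a fetch Request and returns a Response.
    // SDK versions whose handleRequest takes Node's (req, res) need an adapter.
    return transport.handleRequest(request, { waitUntil: ctx.waitUntil.bind(ctx) });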
  },
} satisfies ExportedHandler<Env>;

interface Env {
  DOCS_API_URL: string;
  DOCS_API_KEY: string;
}

The wrangler.toml declares non-secret configuration under [vars]. Secrets are uploaded separately, and both reach your code as properties of env rather than process.env:

# wrangler.toml
name = "my-mcp-server"
main = "src/index.ts"
compatibility_date = "2026-01-01"
compatibility_flags = ["nodejs_compat"]  # enables subset of Node.js APIs

[vars]
DOCS_API_URL = "https://docs.example.com/api"

# Secrets added via: wrangler secret put DOCS_API_KEY
# Access at runtime as env.DOCS_API_KEY (never in wrangler.toml)

[[routes]]
pattern = "mcp.example.com/*"
zone_name = "example.com"

The nodejs_compat flag enables the most commonly needed Node.js APIs (Buffer, crypto, stream) without running a full Node runtime. Transports that assume a long-lived Node process, such as the SDK's legacy SSE server transport, may still fail. Use StreamableHTTPServerTransport, which is designed to work in stateless HTTP environments.

Stateful sessions with Durable Objects

Workers should be treated as stateless: an isolate may serve many requests, but consecutive requests from the same client can land on different isolates or different locations, so in-memory state cannot be relied on. For most MCP tools this is fine, because tools/call is stateless at the protocol level. But some MCP server patterns require state across calls: maintaining a user's in-progress task list, accumulating file edits before committing, or holding a database transaction open.

Cloudflare Durable Objects solve this: each Durable Object instance is a long-lived actor with a guaranteed-single execution context and persistent storage. Route each MCP session to a dedicated Durable Object:

// src/session.ts — Durable Object for stateful MCP sessions
import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
import { StreamableHTTPServerTransport } from "@modelcontextprotocol/sdk/server/streamableHttp.js";
import { z } from "zod";
import type { DurableObjectState } from "@cloudflare/workers-types"; // type-only: nothing to load at runtime

export class MCPSession {
  private state: DurableObjectState;
  private server: McpServer;

  constructor(state: DurableObjectState, env: Env) {
    this.state = state;

    this.server = new McpServer({ name: "stateful-mcp", version: "1.0.0" });

    this.server.tool(
      "add_task",
      "Add a task to the current session's task list",
      { task: z.string() },
      async ({ task }) => {
        // storage.get() is generic; without <string[]> the value is typed unknown
        const tasks = (await this.state.storage.get<string[]>("tasks")) ?? [];
        tasks.push(task);
        await this.state.storage.put("tasks", tasks);
        return { content: [{ type: "text", text: `Added. Task list now has ${tasks.length} items.` }] };
      }
    );

    this.server.tool(
      "list_tasks",
      "List all tasks in the current session",
      {},
      async () => {
        const tasks = (await this.state.storage.get<string[]>("tasks")) ?? [];
        return { content: [{ type: "text", text: tasks.length ? tasks.join("\n") : "No tasks yet." }] };
      }
    );
  }

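  // Created on the first request and reused, so the server is connected once
  // and the session stays initialized while this object remains in memory.
  private transport?: StreamableHTTPServerTransport;

  async fetch(request: Request): Promise<Response> {
    if (!this.transport) {
      this.transport = new StreamableHTTPServerTransport({
        sessionIdGenerator: () => this.state.id.toString(),
      });
      await this.server.connect(this.transport);
    }
    // Assumption: a handleRequest that takes a fetch Request and returns a Response.
    return this.transport.handleRequest(request, {});
  }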
}

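// src/index.ts: route requests to Durable Objects by session ID
export { MCPSession } from "./session"; // the class must be exported from the main module

export default {
  async fetch(request: Request, env: Env): Promise<Response> {
    // The session ID issued by MCPSession is the object's own ID
    // (state.id.toString()), so map it back with idFromString().
    // idFromName() would hash the string to a different object.
    const sessionId = request.headers.get("mcp-session-id");
    const id = sessionId
      ? env.MCP_SESSION.idFromString(sessionId) // throws on a malformed ID
      : env.MCP_SESSION.newUniqueId();
    const stub = env.MCP_SESSION.get(id);
    return stub.fetch(request);
  },
};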

interface Env {
  MCP_SESSION: DurableObjectNamespace;
}

# wrangler.toml — Durable Object binding
[[durable_objects.bindings]]
name = "MCP_SESSION"
class_name = "MCPSession"

[[migrations]]
tag = "v1"
new_classes = ["MCPSession"]

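Each MCP client sends the Mcp-Session-Id header it received at initialization, so every call in a session is routed to the same Durable Object. Two lifetimes matter here. In-memory state, including the McpServer instance, is discarded when an idle object is evicted from memory, which can happen after a short period of inactivity. Data written through state.storage persists until you delete it. Keep anything the session needs in storage, and delete it or migrate it to KV/R2 at session close so abandoned sessions do not accumulate.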
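
Because the session ID in the example above is the Durable Object's own ID, which stringifies to 64 hexadecimal characters, a router can reject malformed headers before it touches the namespace. The following is a minimal sketch under that assumption; resolveSessionId and the small structural interfaces are hypothetical names introduced here so the snippet compiles without the Workers type package.

```typescript
// Minimal structural types standing in for the Durable Object namespace binding.
interface ObjectId { toString(): string }
interface SessionNamespace<Id extends ObjectId> {
  newUniqueId(): Id;
  idFromString(hex: string): Id; // the real runtime throws on IDs it does not recognize
}

// Durable Object IDs stringify to 64 hexadecimal characters.
const DO_ID_PATTERN = /^[0-9a-f]{64}$/i;

type Resolved<Id> =
  | { ok: true; id: Id; isNew: boolean }
  | { ok: false; reason: string };

// Hypothetical helper: map an Mcp-Session-Id header value to a Durable Object ID.
function resolveSessionId<Id extends ObjectId>(
  ns: SessionNamespace<Id>,
  header: string | null,
): Resolved<Id> {
  if (header === null) {
    // No header yet: this is a new session, so mint a fresh object ID.
    return { ok: true, id: ns.newUniqueId(), isNew: true };
  }
  if (!DO_ID_PATTERN.test(header)) {
    return { ok: false, reason: "malformed session id" };
  }
  try {
    return { ok: true, id: ns.idFromString(header), isNew: false };
  } catch {
    return { ok: false, reason: "unknown session id" };
  }
}
```

In a Worker, a failed result would typically become a 400 or 404 response, and a successful one would be passed to env.MCP_SESSION.get(id).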

Environment bindings: the Workers equivalent of process.env

Node.js MCP servers read secrets from process.env and files. Workers read from bindings: typed references to secrets, KV namespaces, R2 buckets, and other Workers (service bindings) that Cloudflare injects at runtime:

# Add secrets (never put in wrangler.toml or commit to git)
wrangler secret put OPENAI_API_KEY
wrangler secret put DATABASE_URL

# List secrets for a deployment
wrangler secret list

# KV namespace for cached tool results
# (older Wrangler releases spell this `wrangler kv:namespace create`)
wrangler kv namespace create "MCP_CACHE"
# Prints the new namespace id; add it to wrangler.toml:
# [[kv_namespaces]]
# binding = "CACHE"
# id = "abc123..."

// Accessing bindings in tools. `env` is captured from the enclosing
// fetch(request, env, ctx) handler; the SDK does not pass it to tool callbacks.
server.tool("get_cached", "Get a cached value", { key: z.string() }, async ({ key }) => {
  // KV for read-heavy lookups
  const cached = await env.CACHE.get(key);
  if (cached) return { content: [{ type: "text", text: cached }] };

  // Fetch fresh, store in KV with 1-hour TTL.
  // fetchFreshData is a placeholder for your own loader.
  const fresh = await fetchFreshData(key, env.DATABASE_URL);
  await env.CACHE.put(key, JSON.stringify(fresh), { expirationTtl: 3600 });
  return { content: [{ type: "text", text: JSON.stringify(fresh) }] };
});
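
The read-through caching in the tool above can be factored into a helper that is testable without a Workers runtime. This is a sketch, not part of any Cloudflare API: KVLike is a hand-written subset of the KV binding (get and put with expirationTtl), and getOrLoad and memoryKV are hypothetical names.

```typescript
// Minimal subset of the KV binding surface used below.
interface KVLike {
  get(key: string): Promise<string | null>;
  put(key: string, value: string, options?: { expirationTtl?: number }): Promise<void>;
}

// Hypothetical helper: return the cached JSON value, or load it, cache it, and return it.
async function getOrLoad<T>(
  cache: KVLike,
  key: string,
  load: () => Promise<T>,
  ttlSeconds = 3600,
): Promise<{ value: T; hit: boolean }> {
  const cached = await cache.get(key);
  if (cached !== null) {
    return { value: JSON.parse(cached) as T, hit: true };
  }
  const value = await load();
  await cache.put(key, JSON.stringify(value), { expirationTtl: ttlSeconds });
  return { value, hit: false };
}

// In-memory stand-in for KV, intended for unit tests only (it ignores the TTL).
function memoryKV(): KVLike {
  const store = new Map<string, string>();
  return {
    get: async (key) => store.get(key) ?? null,
    put: async (key, value) => {
      store.set(key, value);
    },
  };
}
```

Inside a tool handler this would be called as getOrLoad(env.CACHE, key, () => fetchFreshData(key, env.DATABASE_URL)), with the real KV binding in place of memoryKV().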

Monitoring edge-deployed MCP servers

Cloudflare Workers deployments present a unique monitoring challenge: your server runs across 300+ edge locations simultaneously, with no central process to monitor. Traditional uptime monitoring that pings one IP address only tests the nearest edge location — a failure in Frankfurt won't be caught by a probe from San Jose.

Three classes of failures affect Workers MCP servers that internal health checks cannot catch:

Region-specific cold start failures: V8 isolate cold starts are fast (<5ms) but not zero. A timeout threshold set too tight for cold-start latency causes false timeouts in low-traffic regions where isolates frequently need to be restarted.
Durable Object re-initialization: An idle Durable Object is evicted from memory, and its constructor runs again on the next call. That initialization path can fail if stored data no longer matches what the current class expects, for example after a deploy that changed the storage layout. The failure shows as a 500 from outside even though no tool code path triggered it.
Wrangler deploy to wrong environment: Workers supports named environments such as staging and production. A deploy command with a missing --env production flag deploys the top-level environment instead of production. Your internal validation passes; users hitting mcp.example.com (production) see the old version.

AliveMCP probes your Workers MCP endpoint externally on the full protocol path — sending a real initialize request and verifying the protocolVersion response — from outside Cloudflare's edge. This catches protocol-level failures that HTTP status monitoring misses: a 200 response with a malformed MCP body still looks like an outage to any agent client trying to run a tool call.

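# Verify your Workers deployment is serving the MCP protocol correctly.
# Streamable HTTP servers expect both media types in Accept, and
# initialize params must include a capabilities object.
curl -sS -X POST https://mcp.example.com/ \
  -H "Content-Type: application/json" \
  -H "Accept: application/json, text/event-stream" \
  -d '{"jsonrpc":"2.0","id":1,"method":"initialize","params":{"protocolVersion":"2024-11-05","capabilities":{},"clientInfo":{"name":"health-check","version":"1.0"}}}'

# Expected response includes (as plain JSON, or inside an SSE "data:" line):
# { "result": { "protocolVersion": "2024-11-05", "serverInfo": {...}, "capabilities": {...} } }
# Pipe through `jq .` only when the server replies with plain JSON.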

Add this check to your post-deployment verification script. In CI/CD (GitHub Actions or Cloudflare Workers CI), run it after wrangler deploy to catch deploy failures before they affect users.
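
A post-deployment script also needs to decide whether the response body is a valid initialize result, not just whether the request succeeded. The sketch below is one way to do that, assuming the body is either plain JSON or SSE-framed with data: lines. checkInitialize and extractJsonRpc are hypothetical helpers, and the optional expectedServerVersion argument is an assumption added here to catch deploys to the wrong environment.

```typescript
interface InitializeCheck {
  ok: boolean;
  protocolVersion?: string;
  serverVersion?: string;
  problem?: string;
}

// Extract the JSON-RPC payload from a plain JSON body or an SSE-framed body.
function extractJsonRpc(body: string): unknown {
  const trimmed = body.trim();
  if (trimmed.startsWith("{")) return JSON.parse(trimmed);
  // SSE framing: use the last `data:` line as the message payload.
  const dataLines = trimmed
    .split(/\r?\n/)
    .filter((line) => line.startsWith("data:"))
    .map((line) => line.slice("data:".length).trim());
  const last = dataLines.pop();
  if (last === undefined) throw new Error("no JSON or SSE data found");
  return JSON.parse(last);
}

// Hypothetical check: does the body contain an initialize result with a protocolVersion?
function checkInitialize(body: string, expectedServerVersion?: string): InitializeCheck {
  let payload: unknown;
  try {
    payload = extractJsonRpc(body);
  } catch {
    return { ok: false, problem: "body is not JSON-RPC" };
  }
  const result = (payload as { result?: Record<string, unknown> } | null)?.result;
  if (typeof result !== "object" || result === null) {
    return { ok: false, problem: "missing result" };
  }
  const protocolVersion = result["protocolVersion"];
  if (typeof protocolVersion !== "string") {
    return { ok: false, problem: "missing protocolVersion" };
  }
  const serverInfo = result["serverInfo"] as { version?: unknown } | undefined;
  const serverVersion = typeof serverInfo?.version === "string" ? serverInfo.version : undefined;
  if (expectedServerVersion !== undefined && serverVersion !== expectedServerVersion) {
    return { ok: false, protocolVersion, serverVersion, problem: "unexpected server version" };
  }
  return { ok: true, protocolVersion, serverVersion };
}
```

A CI step could pass the curl output to checkInitialize together with the version string it just deployed and fail the job when ok is false.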

CPU time limits and long-running tools

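Cloudflare Workers enforces a CPU time limit per request: 30 seconds by default on paid plans and 10ms on the free plan. CPU time counts active computation only; time spent waiting on fetch() or storage does not count against it. CPU-heavy MCP tools, such as large data transformations, parsing of scraped pages, or multi-step API orchestration with heavy post-processing, can hit this limit, and clients then see Cloudflare error 1102 (Worker exceeded resource limits).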

Two patterns handle this constraint:

// Pattern 1: Async dispatch to a Queue: initiate work, poll for results.
// `env` and `ctx` are captured from the enclosing fetch(request, env, ctx) handler;
// the SDK does not pass them to tool callbacks.
// Assumed bindings: SCRAPE_QUEUE (Queues producer) and RESULTS_KV (KV namespace).

server.tool("start_scrape", "Start a scrape job (async)", { url: z.string() }, async ({ url }) => {
  const jobId = crypto.randomUUID();
  await env.SCRAPE_QUEUE.send({ jobId, url });
  return { content: [{ type: "text", text: `Scrape started. Job ID: ${jobId}. Poll get_scrape_result to check status.` }] };
});

server.tool("get_scrape_result", "Poll scrape job result", { jobId: z.string() }, async ({ jobId }) => {
  const result = await env.RESULTS_KV.get(jobId);
  if (!result) return { content: [{ type: "text", text: "Job pending or not found." }] };
  return { content: [{ type: "text", text: result }] };
});

// Pattern 2: Use ctx.waitUntil() for background work after response
server.tool("cache_warm", "Warm cache for common queries", { keys: z.array(z.string()) }, async ({ keys }) => {
  // Respond immediately, do expensive work in background.
  // fetchAndCache is a placeholder for your own function.
  ctx.waitUntil(
    Promise.all(keys.map(key => fetchAndCache(key, env)))
  );
  return { content: [{ type: "text", text: `Warming ${keys.length} cache entries in background.` }] };
});

Pattern 1 (async dispatch) suits any operation that cannot finish within a single request's limits. Pattern 2 (waitUntil) works for background work that doesn't need to return a result to the agent: the response is sent immediately, and the runtime keeps the Worker alive for up to 30 seconds after the response so the pending promises can settle.
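
Pattern 1 only works if something consumes the queue and writes results where get_scrape_result can find them. The following sketch shows that consumer logic under stated assumptions: QueueMessage mirrors the ack() and retry() methods of Cloudflare Queues messages, ResultStore is a hand-written subset of the KV binding, and processScrapeBatch and the scrape callback are hypothetical names.

```typescript
// Minimal structural types standing in for the Queues and KV bindings.
interface ScrapeJob { jobId: string; url: string }
interface QueueMessage<T> { body: T; ack(): void; retry(): void }
interface ResultStore {
  put(key: string, value: string, options?: { expirationTtl?: number }): Promise<void>;
}

// Hypothetical consumer logic: run each job and store the outcome under its job ID.
async function processScrapeBatch(
  messages: ReadonlyArray<QueueMessage<ScrapeJob>>,
  results: ResultStore,
  scrape: (url: string) => Promise<string>, // assumed scraping function
): Promise<void> {
  for (const message of messages) {
    const { jobId, url } = message.body;
    try {
      const text = await scrape(url);
      // Keep results for one day so pollers have time to collect them.
      await results.put(jobId, JSON.stringify({ status: "done", text }), {
        expirationTtl: 86400,
      });
      message.ack();
    } catch {
      // Ask the queue to redeliver this message instead of failing the whole batch.
      message.retry();
    }
  }
}
```

In a Worker this would be called from the queue handler, for example `async queue(batch, env) { await processScrapeBatch(batch.messages, env.RESULTS_KV, myScrape); }`, where myScrape is your own function.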

Frequently asked questions

Can I use the standard Node.js MCP SDK on Cloudflare Workers?

With restrictions. The core server classes in @modelcontextprotocol/sdk are plain JavaScript with no native bindings, so they load in Workers. Enable the nodejs_compat compatibility flag in wrangler.toml to get Buffer, stream, and crypto support. What won't work: the stdio transport (it relies on process stdin/stdout and child processes), any tool that depends on a real file system or child_process, and packages with native bindings (e.g., better-sqlite3). Test your full tool set with wrangler dev before deploying; the local runtime catches most Workers-specific incompatibilities.

How do I handle secrets in Workers without exposing them in wrangler.toml?

Use wrangler secret put SECRET_NAME to upload secrets directly to Cloudflare's encrypted secret store — they never appear in your source files or wrangler.toml. Access them at runtime via env.SECRET_NAME in your Worker handler. For local development, create a .dev.vars file (which wrangler dev reads) with your local values — this file should be in .gitignore. Never put credentials in [vars] in wrangler.toml — those are committed to git and are not encrypted at rest.
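
A secret that was never uploaded shows up at runtime as a missing property on env, often deep inside a tool call. One hedged way to surface this earlier is a small guard at the top of the fetch handler. missingBindings and assertBindings below are hypothetical helpers, not part of Wrangler or the Workers runtime.

```typescript
// Hypothetical helper: list required bindings that are absent or empty.
function missingBindings(env: object, required: readonly string[]): string[] {
  const record = env as Record<string, unknown>;
  return required.filter((name) => {
    const value = record[name];
    return value === undefined || value === null || value === "";
  });
}

// Hypothetical helper: throw once, early, with the names of all missing bindings.
function assertBindings(env: object, required: readonly string[]): void {
  const missing = missingBindings(env, required);
  if (missing.length > 0) {
    // Report names only; never include secret values in error messages.
    throw new Error(`Missing Worker bindings: ${missing.join(", ")}`);
  }
}
```

Calling assertBindings(env, ["DOCS_API_URL", "DOCS_API_KEY"]) before constructing the server turns a missing secret into one clear error rather than a failed outbound request.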

Does Cloudflare Workers support WebSocket transport for MCP?

Yes, but with caveats. Workers supports WebSocket connections via the WebSocket API, and Durable Objects offer WebSocket hibernation, so idle connections can persist without accruing duration charges. You'll need a Durable Object to hold the server-side session across WebSocket frames, because pure stateless Workers cannot maintain WebSocket state. WebSocket is not one of the standard MCP transports, so check whether the SDK version you use ships a server-side WebSocket transport; you may have to implement the Transport interface yourself. For most MCP use cases, StreamableHTTPServerTransport is simpler and avoids WebSocket upgrade complexity.

What's the correct way to monitor a Workers MCP server with AliveMCP?

Add your Workers URL (e.g., https://mcp.example.com) as a monitor in AliveMCP with the MCP protocol check type. AliveMCP sends a full initialize JSON-RPC request and verifies the response includes a valid protocolVersion, not just an HTTP 200. This catches Workers-specific issues: a 200 response whose body is not valid MCP, for example when your own error handling swallows an exception and returns an HTML or empty body (an unhandled exception surfaces as a 500 with Cloudflare error 1101); schema drift where tools/list returns an empty array because tool registration failed; and region-specific failures if Cloudflare routes AliveMCP's probe to a different edge location than your usual test client.

How do I debug a Workers MCP server that works locally but fails in production?

The most common cause is a Node.js API used in a tool handler that isn't provided by nodejs_compat. Run wrangler dev --remote to test against the deployed Workers runtime rather than local emulation. Use wrangler tail to stream production logs in real time; the logs include exception details for unhandled errors. If a tool only fails intermittently, check CPU time in the Worker's metrics in the Cloudflare dashboard: tools that aggregate large datasets may exceed the CPU time limit on large inputs even when they pass on typical test data.