Deployment guide · 2026-06-11 · MCP server hosting

MCP Server Hosting: Railway, Render, Vercel, AWS, and Docker Compose Compared

Every MCP server hosting decision reduces to one question that most platform comparison guides skip: does this platform maintain a persistent process between the client's initialize call and its subsequent tools/call requests? The MCP protocol is session-oriented: a client establishes a session, negotiates capabilities, and then calls tools in that session context. Platforms that kill the process between requests (serverless, free-tier sleep-on-inactivity) break this contract in ways that don't appear in simple HTTP API deployments. This guide maps five platforms (Railway, Render, Vercel, AWS ECS Fargate, and Docker Compose) against the constraints that actually matter for MCP, and explains when each is the right choice. Fly.io appears in the decision matrix for comparison but is not covered in its own section.

TL;DR decision matrix

Platform	Session model	Stateful MCP sessions	Free tier trap	Best for
Railway	Persistent process	Yes (Starter plan+)	Free tier sleeps on inactivity — cold start breaks SSE clients	Fastest path from prototype to production PaaS
Render	Persistent process	Yes (Starter plan+)	Free tier spins down after 15 min — 30–60s cold start	render.yaml Blueprint teams; zero-downtime health-gated deploys
Vercel	Serverless (per-request)	Only with external state (Redis/KV)	10s function timeout on Hobby; frozen between requests by design	Stateless tool handlers, existing Next.js apps, Edge SSE
AWS ECS Fargate	Persistent process	Yes (ALB sticky sessions)	No sleep; minimum cost ~$15–20/mo for always-on container	Production traffic, IAM integration, multi-region, enterprise
Docker Compose	Persistent process	Yes (local or self-hosted)	No managed platform — you handle uptime, TLS, and routing	Local development, self-hosted VPS, fully private deployments
Fly.io	Persistent process	Yes (session affinity via fly-prefer-region)	Free allowance covers 256 MB shared-CPU VM; idle_timeout caveat for SSE	Global edge regions, lowest latency for geographically distributed clients

The constraint that matters: persistent process vs. per-request execution

A standard REST API has no session concept. Each request arrives, the handler runs, the response is returned, and the process may or may not be reused for the next request. This makes serverless platforms a natural fit for REST APIs: there's nothing to lose when the function instance is frozen between requests.

MCP breaks this assumption. The protocol is defined by an initialize handshake that establishes capabilities, followed by a sequence of requests that may build on each other — tool calls accumulate context, resources are subscribed with change notifications, and Server-Sent Events maintain a long-lived push channel from server to client. The session is the unit of execution, not the individual request.

The practical consequence: when a platform freezes or kills your process between an initialize call and a subsequent tools/call, the session state is gone. The in-process transport object is gone. The MCP client gets a protocol error, not an HTTP error — the type of failure that most MCP clients don't retry gracefully because the protocol assumes the session is persistent.
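A minimal sketch of this failure, assuming a server that keeps its sessions in an in-process Map. The `ProcessInstance` class and its method names are hypothetical and stand in for whatever transport registry a real server holds; the only point is that a fresh process has no record of a session created by another one.

```typescript
// Hypothetical model of one server process holding MCP sessions in memory.
type CallResult =
  | { ok: true; tool: string }
  | { ok: false; error: string };

class ProcessInstance {
  // Session state lives only in this process's memory.
  private sessions = new Map<string, { initializedAt: number }>();
  private readonly label: string;

  constructor(label: string) {
    this.label = label;
  }

  // Stands in for the initialize handshake: creates and remembers a session.
  initialize(): string {
    const sessionId = `${this.label}-${this.sessions.size + 1}`;
    this.sessions.set(sessionId, { initializedAt: Date.now() });
    return sessionId;
  }

  // Stands in for tools/call: fails if this process never saw the session.
  callTool(sessionId: string, tool: string): CallResult {
    if (!this.sessions.has(sessionId)) {
      return { ok: false, error: `unknown session ${sessionId}` };
    }
    return { ok: true, tool };
  }
}

const processA = new ProcessInstance("a");
const sessionId = processA.initialize();
const sameProcess = processA.callTool(sessionId, "get-data");

// The platform froze or replaced the process: the new instance starts empty.
const processB = new ProcessInstance("b");
const freshProcess = processB.callTool(sessionId, "get-data");
```

Here `sameProcess` succeeds and `freshProcess` fails, even though the client sent the same session ID both times.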

This single constraint divides the platform landscape into two categories: persistent-process platforms (Railway, Render, AWS Fargate, Docker Compose, Fly.io) and per-request platforms (Vercel, Lambda, Cloudflare Workers). The persistent-process platforms can run any MCP server. The per-request platforms can run MCP servers with stateless tool handlers — but only if you externalize all session state to Redis or another store that survives function restarts.

Railway: fast path for persistent MCP servers

Railway is the fastest route from a working local MCP server to a hosted one. It auto-detects Node.js projects via nixpacks, provisions Postgres and Redis services with a few clicks, and manages environment variables through a project-scoped UI. For MCP specifically, three things need to be right.

First: bind to process.env.PORT. Railway assigns the port dynamically — hardcoding port 3000 means Railway routes traffic to a port your server isn't listening on. The fix is a one-line change to your server startup, but it's the most common reason Railway MCP deployments silently fail.

Second: use HTTP/SSE transport, not stdio. Railway runs containers behind its networking layer, and stdio transport requires subprocess pipes that Railway doesn't expose. Any MCP server with stdio-only transport needs a transport layer change before it can deploy anywhere cloud-hosted — Railway, Render, or AWS alike.

Third: configure a health check path that tests the MCP layer, not just the HTTP layer. Railway's built-in health check confirms your process is up and HTTP 200 is reachable, but it doesn't verify the MCP handshake is functional. The right approach is a /healthz endpoint that runs a lightweight initialize handshake internally and returns 200 only if the MCP transport can complete it.
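The first and third points can be sketched together. This is a minimal illustration, not Railway's or the MCP SDK's API: `resolvePort` and `healthStatus` are hypothetical helper names, the 3000 fallback is an assumption for local runs, and the shape check relies only on the fields an MCP initialize result carries (protocolVersion, capabilities, serverInfo).

```typescript
// Hypothetical helpers; the names and the local fallback port are assumptions.

// Read the platform-assigned port, falling back only for local development.
function resolvePort(
  env: Record<string, string | undefined>,
  fallback = 3000,
): number {
  const raw = env.PORT;
  if (raw === undefined || raw.trim() === "") return fallback;
  const port = Number(raw);
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new Error(`Invalid PORT value: ${raw}`);
  }
  return port;
}

// Decide the /healthz status from the JSON-RPC response to an internal
// initialize request. Returns 200 only if the handshake produced a usable result.
function healthStatus(response: unknown): 200 | 503 {
  if (typeof response !== "object" || response === null) return 503;
  const result = (response as { result?: unknown }).result;
  if (typeof result !== "object" || result === null) return 503;
  const r = result as Record<string, unknown>;
  const ok =
    typeof r.protocolVersion === "string" &&
    typeof r.capabilities === "object" && r.capabilities !== null &&
    typeof r.serverInfo === "object" && r.serverInfo !== null;
  return ok ? 200 : 503;
}

// Usage inside a server entry point (not executed here):
//   const port = resolvePort(process.env);
//   httpServer.listen(port, "0.0.0.0");
```

Throwing on a malformed PORT is a deliberate choice in this sketch: a crash at startup is easier to diagnose than a server quietly listening on the wrong port.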

The free tier caveat is significant: Railway's free tier sleeps services on inactivity. For an MCP server, a cold start from sleep takes 5–15 seconds — long enough to cause most MCP clients to time out before the SSE channel is established. The Starter plan ($5/mo) eliminates the sleep policy and is effectively required for any MCP server that expects to be used by external clients. For SQLite state, use Railway's persistent volumes mounted at a stable path, with WAL mode enabled (PRAGMA journal_mode = WAL) to handle concurrent reads from health checks during write-heavy operations.

Render: health-gated deploys and Blueprint teams

Render is structurally similar to Railway — persistent containers, managed Postgres and Redis, environment variable management — but with a few differences that matter for teams. The most important one for MCP servers is how Render handles deploys: a new container starts, Render sends health check requests to healthCheckPath, and traffic is only switched to the new container after the health check passes. If the new deploy can't pass the health check, Render auto-rolls back to the previous version. This makes the health check gate more consequential than on platforms that cut over immediately.

The operational implication: if your /healthz endpoint does an actual MCP initialize handshake before returning 200, Render won't route production traffic to a new deploy unless the new version's MCP layer is genuinely functional. A crashed transport layer, a misconfigured database migration, or a breaking change in tool registration causes the deploy to fail before any production client is affected. This is the right behavior, and it makes Render's deploy mechanism naturally aligned with MCP server correctness.

Render's render.yaml Blueprint format is worth knowing for teams that want infrastructure-as-code:

services:
  - type: web
    name: mcp-server
    env: node
    plan: starter        # Required — free plan spins down on inactivity
    buildCommand: npm ci && npm run build
    startCommand: node dist/index.js
    healthCheckPath: /healthz
    envVars:
      - key: NODE_ENV
        value: production
      - key: REDIS_URL
        fromService:
          name: mcp-redis
          type: redis
          property: connectionString
    disk:
      name: mcp-data
      mountPath: /data    # Mount point for SQLite file
      sizeGB: 1

  - type: redis
    name: mcp-redis
    plan: starter
    ipAllowList: []      # Required for Redis services; an empty list allows only internal connections

The same free-tier caveat applies as Railway: Render's free web services spin down after 15 minutes of inactivity and take 30–60 seconds to cold start. An MCP client hitting a cold-started Render service gets a connection timeout, not an HTTP error — the SSE handshake just hangs until the process is ready. Use the Starter plan for any publicly accessible MCP server.

Docker Compose: local development and self-hosted production

Docker Compose occupies a different category from the managed platforms: it's not a hosting service, it's a local process orchestrator that happens to be usable for self-hosted production with the right additions. Understanding what it does and doesn't provide clarifies when it's the right choice.

What Compose gives you is multi-service orchestration with dependency ordering. Your MCP server, Redis, Postgres, and any background workers start in the right order (via depends_on with health check conditions), share a private network by service name, and have access to a shared .env file for secrets. The container isolation means your MCP server behaves identically in local development and in a self-hosted VPS deployment, eliminating an entire class of "works on my machine" production surprises.

The key Compose pattern for MCP server reliability is dependency ordering with health conditions rather than simple ordering:

services:
  mcp-server:
    build: .
    ports:
      - "3000:3000"
    depends_on:
      redis:
        condition: service_healthy
      db-migrate:
        condition: service_completed_successfully  # Block until migration exits 0
    environment:
      - NODE_ENV=production
    volumes:
      - ./data:/data

  db-migrate:
    build: .
    command: node dist/migrate.js
    depends_on:
      postgres:
        condition: service_healthy

  redis:
    image: redis:7-alpine
    healthcheck:
      test: ["CMD", "redis-cli", "ping"]
      interval: 5s
      timeout: 3s
      retries: 5

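  postgres:
    image: postgres:16-alpine
    environment:
      - POSTGRES_USER=mcpuser
      - POSTGRES_PASSWORD=${POSTGRES_PASSWORD}  # Assumed to be set in .env; the image refuses to start without a password
    healthcheck:
      test: ["CMD", "pg_isready", "-U", "mcpuser"]
      interval: 5s
      timeout: 3s
      retries: 5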

The service_completed_successfully condition for the migration service is the important part: your MCP server won't start until the migration process has run its final statement and exited 0. Without this, there's a race between the server starting and the database schema being ready.
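To make the exit-code contract concrete, here is a minimal sketch of what a migration entry point might look like. The source does not show dist/migrate.js, so `runMigrations` and the step shape are hypothetical, and the steps are synchronous only to keep the example short. The grounded part is the contract itself: exit code 0 is what satisfies service_completed_successfully.

```typescript
// Hypothetical migration runner; names and step shape are assumptions.
interface MigrationStep {
  name: string;
  apply: () => void; // a real step would run SQL against Postgres
}

// Run steps in order and stop at the first failure.
// Returns the process exit code: 0 only if every step succeeded.
function runMigrations(
  steps: MigrationStep[],
  log: (msg: string) => void = () => {},
): number {
  for (const step of steps) {
    try {
      step.apply();
      log(`applied ${step.name}`);
    } catch (err) {
      log(`failed ${step.name}: ${String(err)}`);
      return 1; // a non-zero exit keeps mcp-server from starting
    }
  }
  return 0;
}

// In the entry point (not executed here):
//   process.exit(runMigrations(steps, console.log));
```

Stopping at the first failed step matters here: a partially applied schema followed by exit 0 would let the server start against a database it cannot use.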

What Compose doesn't give you for self-hosted production: TLS termination, automatic Let's Encrypt certificate provisioning, ingress routing, or any form of platform-level monitoring. The conventional addition is Traefik as one more service in the same Compose file: Traefik handles TLS via Let's Encrypt, reverse-proxies HTTPS to your MCP server's HTTP port, and exposes a dashboard for routing inspection. The MCP server, Redis, and Postgres are then on an internal Docker network that Traefik can reach but external traffic cannot.

Compose is the right choice when you're in active local development (the fastest iteration loop, no cloud round-trip), when you need a fully private deployment with no data leaving your infrastructure, or when you're operating a VPS directly and want the same config that runs locally to run in production.

Vercel: stateless tools work, stateful sessions need engineering

Vercel is the outlier in this comparison — it's a serverless platform, and the MCP session model conflicts with serverless by default. Understanding exactly where the conflict is (and isn't) determines when Vercel is a viable choice.

The conflict is at the session boundary: Vercel functions are frozen or killed between requests. The MCP initialize handler runs, stores the transport object in memory, and returns. When the next tools/call arrives, it may go to a completely different function instance — or the same instance after it was frozen and thawed. Either way, the in-memory transport object is gone. The transport is the session.

The resolution is to use the MCP SDK's StreamableHTTPServerTransport in stateless mode (sessionIdGenerator: undefined) combined with Vercel KV or Upstash Redis to externalize any session state your tools need:

// app/api/mcp/route.ts
import { McpServer } from '@modelcontextprotocol/sdk/server/mcp.js';
import { StreamableHTTPServerTransport } from '@modelcontextprotocol/sdk/server/streamableHttp.js';
import { kv } from '@vercel/kv';
import { z } from 'zod';
// fetchFromDatabase is an application-specific helper, assumed to be defined elsewhere.

export async function POST(req: Request) {
  // Do not call req.json() here: consuming the body stream would leave
  // nothing for the transport to parse.

  // Stateless transport: no session ID, each POST is self-contained
  const transport = new StreamableHTTPServerTransport({ sessionIdGenerator: undefined });
  const server = new McpServer({ name: 'my-mcp', version: '1.0.0' });

  server.tool('get-data', { id: z.string() }, async ({ id }) => {
    // All state must be fetched from KV on each invocation
    const cached = await kv.get(`data:${id}`);
    if (cached) return { content: [{ type: 'text', text: JSON.stringify(cached) }] };

    const data = await fetchFromDatabase(id);
    await kv.set(`data:${id}`, data, { ex: 1800 }); // 30-minute TTL
    return { content: [{ type: 'text', text: JSON.stringify(data) }] };
  });

  await server.connect(transport);
  // Assumes a transport whose handleRequest accepts a Web Request and returns a
  // Response. Some SDK versions expect Node req/res objects here, in which case
  // an adapter between the two is needed.
  return transport.handleRequest(req);
}

The cases where Vercel works well without the session-state workaround are MCP servers whose tools are stateless by nature: lookup tools (database reads by ID), transformation tools (format conversion, text processing), and integration tools that call external APIs and return results without accumulating state. If every tool call is self-contained — the input contains everything the tool needs — the serverless model matches the use case.

The cases where Vercel struggles: anything involving tool call sequences that build on each other (tool A produces an intermediate result that tool B uses), long-running computations (Hobby plan: 10s timeout; Pro plan: 60s), and SSE-based streaming that needs a persistent connection beyond a single response body.

The honest guidance: if your MCP server already lives in a Next.js app and its tools are stateless, Vercel is the natural hosting choice. If you're designing an MCP server from scratch and need sessions with accumulated state, start with Railway or Render — the architectural simplicity is worth the cost difference.

AWS ECS Fargate: production-grade with more configuration surface

AWS ECS Fargate is the right choice when you need enterprise-grade controls that managed PaaS platforms don't expose: IAM task roles for AWS SDK credential injection (no hardcoded keys, no environment variables with secrets), private VPC networking between your MCP server and its databases, Application Load Balancer sticky sessions to ensure MCP clients maintain session affinity, and fine-grained IAM policies that restrict what your MCP server can do within AWS.

The two configuration details that matter most for MCP on Fargate are sticky sessions on the ALB target group and stopTimeout on the container definition. Sticky sessions use a session cookie to route all requests in an MCP session to the same container instance. Without stickiness, an SSE connection established to container A can have its subsequent tools/call requests routed to container B, which has no session state for that client. The fix is three target group attributes, with the cookie duration set to one hour to match the MCP session lifetime:

{
  "stickiness.enabled": "true",
  "stickiness.type": "lb_cookie",
  "stickiness.lb_cookie.duration_seconds": "3600"
}

The stopTimeout setting gives your container time to drain active MCP sessions before ECS terminates it during a deployment or scale-in event. The default is 30 seconds, which is too short for an MCP session that's executing a long-running tool. Setting it to 60 seconds gives active sessions time to complete their current tool call and close gracefully. In the same container definition, a startPeriod of 20 seconds keeps health check failures during initial startup from counting against the container:

{
  "stopTimeout": 60,
  "healthCheck": {
    "command": ["CMD-SHELL", "curl -f http://localhost:3000/healthz || exit 1"],
    "interval": 30,
    "timeout": 5,
    "retries": 3,
    "startPeriod": 20
  }
}
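On the application side, stopTimeout only helps if the server reacts to SIGTERM. The following is a minimal sketch of that drain logic, not an ECS or SDK API: `DrainTracker` and its methods are hypothetical names, and the wiring to a SIGTERM handler is shown only in a comment.

```typescript
// Hypothetical in-flight tracker for graceful shutdown; names are assumptions.
class DrainTracker {
  private inFlight = 0;
  private draining = false;

  // Called when a tool call starts. Returns false once shutdown has begun,
  // so the caller can reject new work instead of starting it.
  begin(): boolean {
    if (this.draining) return false;
    this.inFlight += 1;
    return true;
  }

  // Called when a tool call finishes, whether it succeeded or failed.
  end(): void {
    if (this.inFlight > 0) this.inFlight -= 1;
  }

  // Called from the SIGTERM handler.
  startDraining(): void {
    this.draining = true;
  }

  // True when shutdown was requested and no tool call is still running.
  isDrained(): boolean {
    return this.draining && this.inFlight === 0;
  }
}

// Wiring sketch (not executed here):
//   process.on("SIGTERM", () => {
//     tracker.startDraining();
//     const timer = setInterval(() => {
//       if (tracker.isDrained()) { clearInterval(timer); process.exit(0); }
//     }, 250);
//   });
// ECS sends SIGKILL once stopTimeout elapses, so the drain has to finish before then.
```

A tool call that outlasts stopTimeout is still cut off, so the timeout should be chosen from the longest tool call you expect rather than from an average.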

AWS App Runner is a simpler alternative for teams who want managed container hosting without VPC configuration: App Runner auto-provisions a load balancer, handles TLS, and auto-scales based on request volume. The tradeoff is that App Runner's load balancer doesn't support sticky sessions, which makes it suitable only for stateless MCP servers or stateless tool handlers with external session storage. For full stateful MCP sessions, ECS Fargate's ALB stickiness is required.

AWS Lambda has the same fundamental limitation as Vercel: the execution environment is frozen between invocations, and the 15-minute function timeout means even a Lambda Provisioned Concurrency configuration can't maintain a session indefinitely. Lambda is not a good fit for MCP servers that maintain session state — the session lifetime caps at the function timeout, and cold starts introduce multi-hundred-millisecond delays that violate SSE connection expectations.

How to choose: five decision paths

The platforms sort into five decision paths based on what your constraints actually are.

Fastest path to a working hosted MCP server: Railway. Nixpacks auto-detects Node.js, the PORT env var binding is a one-line change, and you can add Redis or Postgres in two clicks. The only non-negotiable: Starter plan ($5/mo) to avoid sleep-on-inactivity. Total time from working local MCP server to hosted: under one hour, mostly waiting for the initial build.

Team development with infrastructure-as-code: Render. The render.yaml Blueprint is code-reviewable and reproducible. Health-gated deploys mean a broken MCP transport layer auto-rolls back before clients are affected. The operational model maps well to a team that code-reviews infrastructure changes.

Stateless tools embedded in an existing Next.js app: Vercel. If you're already on Vercel and your MCP tools don't need session state beyond what the client sends with each call, the existing deployment pipeline handles it. No new infrastructure, no new bills.

Production enterprise deployment or existing AWS footprint: AWS ECS Fargate. IAM task roles, private VPC, ALB sticky sessions, CloudWatch Logs, and integration with every other AWS service. Higher configuration cost but the right choice when compliance, audit logging, or existing AWS infrastructure governs where you can deploy.

Local development or fully private VPS: Docker Compose. Identical configuration runs locally and in production. No platform dependency, no data leaving your infrastructure, full control over every service. Add Traefik for TLS and ingress if exposing to the public internet.

What changes about monitoring across platforms

Every platform provides infrastructure-level health checking — HTTP 200 from your /healthz endpoint, process restart on crash, container health status. None of them verify that the MCP protocol layer is functioning correctly. The difference between "process is up and returning HTTP 200" and "MCP initialize handshake succeeds and tools list is correctly advertised" is the difference between a health check and a protocol probe.

The failures that platform health checks miss:

A transport initialization bug that causes the MCP handshake to hang without returning an error (HTTP 200 from /healthz, but SSE connections stall indefinitely)
A tool registration error that happens after the server starts, causing tools/list to return an empty list (health check passes, but the server is useless)
A database connection pool exhaustion that causes tool handlers to queue indefinitely (health check endpoint doesn't acquire a connection, so it passes; tool calls don't return)
A TLS certificate expiry that breaks HTTPS clients while the internal HTTP health check continues to pass (health checks are usually on the internal port, not the public TLS endpoint)

These failure modes are invisible to Railway, Render, Vercel, AWS ALB health checks, and Docker Compose health commands alike. External protocol monitoring — a probe that connects over the same path your users do, sends a real MCP initialize request, validates the tool list, and measures end-to-end latency — is the only monitoring that catches all of them. AliveMCP runs this kind of probe every 60 seconds for every public MCP endpoint and notifies you within 15 minutes of a protocol failure, regardless of which platform you're on.

The monitoring setup is identical whether you're on Railway, Render, or AWS: add your server's public domain to AliveMCP, and the probe runs automatically. The only platform-specific difference is Vercel's function timeout — a probe that sends an initialize request and doesn't get a response within 10 seconds (Hobby) or 60 seconds (Pro) will correctly classify the server as down, even if the function is technically running.
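As a rough illustration of the difference between a health check and a protocol probe, the sketch below classifies a single probe run. It is not AliveMCP's implementation: `classifyProbe`, the `ProbeObservation` shape, and the status labels are all hypothetical. It assumes the probe has already sent initialize and tools/list over the public endpoint and recorded what came back.

```typescript
// Hypothetical probe classifier; the observation shape and labels are assumptions.
interface ProbeObservation {
  initializeOk: boolean; // the initialize handshake returned a result
  toolNames: string[];   // names returned by tools/list
  elapsedMs: number;     // end-to-end time for the whole probe
}

type ProbeStatus = "up" | "handshake_failed" | "tools_missing" | "timeout";

function classifyProbe(
  obs: ProbeObservation,
  expectedTools: string[],
  timeoutMs: number,
): ProbeStatus {
  // A response that arrives after the deadline counts as down.
  if (obs.elapsedMs > timeoutMs) return "timeout";
  if (!obs.initializeOk) return "handshake_failed";
  // An empty or incomplete tool list means the server is up but useless.
  const advertised = new Set(obs.toolNames);
  const missing = expectedTools.filter((t) => !advertised.has(t));
  if (obs.toolNames.length === 0 || missing.length > 0) return "tools_missing";
  return "up";
}
```

The "tools_missing" case is the one an HTTP 200 health check cannot see: the process answers, the handshake completes, and the server still has nothing to offer.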

The one thing all these platforms have in common

Every platform in this comparison — managed PaaS, serverless, container orchestration, self-hosted Compose — shares one property: a crashed or misconfigured MCP server fails silently from the LLM client's perspective. The agent calls a tool, the tool call times out or returns a protocol error, and the agent either retries (making the problem worse if the server is under load) or gives up (making the problem invisible to you until a user complains).

The correct mental model for MCP server operations borrows from web API ops: if your REST API were down for 15 minutes, you'd want to know within 60 seconds. If your MCP server's initialize handshake is failing, the LLM sessions that depend on it are silently broken at the protocol level, and none of Railway's dashboard, Render's health gate, Vercel's function metrics, or AWS CloudWatch will tell you. External protocol monitoring closes that gap.

See also: the MCP server deployment guide for PM2, systemd, nginx, and zero-downtime deployment patterns for VPS deployments, and MCP server production resilience patterns for backpressure, idempotency, and schema evolution in production MCP servers.