MCP server on AWS

AWS offers three practical options for deploying MCP servers: ECS Fargate (persistent containers behind an Application Load Balancer), App Runner (managed container hosting with less configuration), and Lambda (serverless, but with fundamental limitations for MCP sessions). ECS Fargate is the right choice for production MCP servers that need session affinity, IAM-based credential access, and predictable performance. App Runner works well for simpler cases. Lambda works only for fully stateless, short-lived tool handlers.

TL;DR

Use ECS Fargate for production: define a task with your MCP server container, put an Application Load Balancer in front with target group stickiness enabled, use an IAM task role instead of hardcoded credentials, and store secrets in AWS Secrets Manager (not environment variables in the task definition). For simpler cases without custom networking requirements, App Runner auto-scales and manages the load balancer for you. Lambda is not suitable for MCP servers that maintain session state — use it only for stateless tool handlers with the same caveats as Vercel. Monitor the public ALB endpoint with AliveMCP for external protocol verification.

ECS Fargate task definition

A minimal ECS task definition for an MCP server:

{
  "family": "mcp-server",
  "networkMode": "awsvpc",
  "requiresCompatibilities": ["FARGATE"],
  "cpu": "256",
  "memory": "512",
  "executionRoleArn": "arn:aws:iam::ACCOUNT:role/ecsTaskExecutionRole",
  "taskRoleArn": "arn:aws:iam::ACCOUNT:role/mcp-server-task-role",
  "containerDefinitions": [
    {
      "name": "mcp-server",
      "image": "ACCOUNT.dkr.ecr.REGION.amazonaws.com/mcp-server:latest",
      "portMappings": [
        { "containerPort": 3000, "protocol": "tcp" }
      ],
      "environment": [
        { "name": "NODE_ENV", "value": "production" },
        { "name": "PORT", "value": "3000" }
      ],
      "secrets": [
        {
          "name": "DATABASE_URL",
          "valueFrom": "arn:aws:secretsmanager:REGION:ACCOUNT:secret:mcp-server/database-url"
        },
        {
          "name": "REDIS_URL",
          "valueFrom": "arn:aws:secretsmanager:REGION:ACCOUNT:secret:mcp-server/redis-url"
        }
      ],
      "healthCheck": {
        "command": ["CMD-SHELL", "curl -sf http://localhost:3000/healthz || exit 1"],
        "interval": 15,
        "timeout": 5,
        "retries": 3,
        "startPeriod": 20
      },
      "logConfiguration": {
        "logDriver": "awslogs",
        "options": {
          "awslogs-group": "/ecs/mcp-server",
          "awslogs-region": "us-east-1",
          "awslogs-stream-prefix": "ecs"
        }
      },
      "stopTimeout": 60
    }
  ]
}

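Key points: taskRoleArn grants permissions to your running container (e.g., S3 read, DynamoDB write) without hardcoded credentials; on Fargate the container gets short-lived credentials from the ECS container credentials endpoint, not from the EC2 instance metadata service (IMDS). executionRoleArn grants ECS itself permission to pull the image, write logs, and read the Secrets Manager values referenced under secrets. secrets injects those values as environment variables at container start, so only the secret ARN, never the value, appears in the task definition, CloudTrail logs, or the ECS console. stopTimeout: 60 gives the container 60 seconds after SIGTERM to drain sessions before ECS sends SIGKILL. The container health check shells out to curl, so the image must include it.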
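The health check and stopTimeout settings in the task definition above only help if the container actually serves /healthz and reacts to SIGTERM. The following is a minimal sketch of that wiring, assuming a plain node:http server; healthStatus, startServer, and the draining flag are hypothetical names, and the MCP transport itself is left out.

```typescript
import { createServer } from 'node:http';
import type { Server } from 'node:http';

// Hypothetical drain flag, flipped on SIGTERM so health checks start failing.
let draining = false;

// Status code for /healthz: 200 while serving, 503 once the task is draining.
export function healthStatus(isDraining: boolean): number {
  return isDraining ? 503 : 200;
}

// Hypothetical entry point; the MCP transport is omitted from this sketch.
export function startServer(port: number): Server {
  const server = createServer((req, res) => {
    if (req.url === '/healthz') {
      res.writeHead(healthStatus(draining), { 'Content-Type': 'application/json' });
      res.end(JSON.stringify({ draining }));
      return;
    }
    // Requests for the MCP endpoint would be handled here.
    res.writeHead(404);
    res.end();
  });

  process.once('SIGTERM', () => {
    draining = true;
    // Stop accepting new connections; the callback runs once open ones finish.
    server.close(() => process.exit(0));
    // Exit on our own shortly before the 60-second stopTimeout elapses.
    setTimeout(() => process.exit(1), 55000).unref();
  });

  server.listen(port);
  return server;
}
```

Calling startServer(3000) matches the port in the task definition. Long-lived streaming connections can keep server.close() from completing, which is why the sketch also sets a hard exit a few seconds before the stop timeout.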

IAM task role — no hardcoded credentials

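The taskRoleArn grants AWS permissions to your running MCP server without hardcoded access keys. Create a task role with least-privilege permissions. The Secrets Manager statement below is needed only if your code reads secrets at runtime; values injected through the secrets block of the task definition are read by the execution role instead: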

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "s3:GetObject",
        "s3:PutObject"
      ],
      "Resource": "arn:aws:s3:::my-mcp-bucket/*"
    },
    {
      "Effect": "Allow",
      "Action": [
        "secretsmanager:GetSecretValue"
      ],
      "Resource": "arn:aws:secretsmanager:*:*:secret:mcp-server/*"
    }
  ]
}

In your Node.js code, the AWS SDK's default credential provider chain picks up the task role credentials from the ECS container credentials endpoint, with no configuration needed:

import { S3Client, GetObjectCommand } from '@aws-sdk/client-s3';
import { z } from 'zod';

// No credentials in code: the SDK fetches task role credentials from the
// ECS container credentials endpoint.
const s3 = new S3Client({ region: 'us-east-1' });

// `server` is the McpServer instance created elsewhere in your application.
server.tool('get-document', { key: z.string() }, async ({ key }) => {
  const cmd = new GetObjectCommand({ Bucket: 'my-mcp-bucket', Key: key });
  const response = await s3.send(cmd);
  // Body is a stream; transformToString() reads it fully into a string
  const body = await response.Body?.transformToString();
  return { content: [{ type: 'text', text: body ?? '' }] };
});

Application Load Balancer with session stickiness

For MCP servers that maintain per-session in-memory state, configure ALB target group stickiness so requests from the same session always route to the same ECS task:

# AWS CLI — enable stickiness on the target group
aws elbv2 modify-target-group-attributes \
  --target-group-arn arn:aws:elasticloadbalancing:... \
  --attributes \
    Key=stickiness.enabled,Value=true \
    Key=stickiness.type,Value=lb_cookie \
    Key=stickiness.lb_cookie.duration_seconds,Value=3600

The ALB sets a cookie (AWSALB) on the first response. Subsequent requests that include this cookie are routed to the same target, so stickiness only works with MCP clients that store and resend cookies; requests from clients that ignore cookies are balanced across all tasks. Set the stickiness duration to cover the longest expected MCP session: 3600 seconds (1 hour) is a reasonable default.

Configure the ALB health check on the target group to hit /healthz:

aws elbv2 modify-target-group \
  --target-group-arn arn:aws:elasticloadbalancing:... \
  --health-check-path /healthz \
  --health-check-interval-seconds 15 \
  --healthy-threshold-count 2 \
  --unhealthy-threshold-count 3

The ALB only sends traffic to healthy targets. If an ECS task fails its health check, the ALB stops routing to it and sends requests from sessions that were pinned to it to another healthy task, which has no record of their in-memory state, so those sessions fail on their next request. Design your MCP clients to handle reconnection gracefully.
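Graceful reconnection on the client side could look roughly like the sketch below. It assumes a hypothetical Session interface and a connect callback that opens a connection and runs the initialize handshake; neither comes from a specific MCP SDK, and a real client would wrap its SDK's client object instead.

```typescript
// Hypothetical minimal session shape; a real client would wrap an MCP SDK client.
export interface Session {
  callTool(name: string, args: Record<string, unknown>): Promise<unknown>;
}

export class ReconnectingClient {
  private session: Session | null = null;
  private readonly connect: () => Promise<Session>;
  private readonly maxAttempts: number;
  private readonly baseDelayMs: number;

  // `connect` is assumed to open a connection and run the initialize handshake.
  constructor(connect: () => Promise<Session>, maxAttempts = 3, baseDelayMs = 100) {
    this.connect = connect;
    this.maxAttempts = maxAttempts;
    this.baseDelayMs = baseDelayMs;
  }

  async callTool(name: string, args: Record<string, unknown>): Promise<unknown> {
    let lastError: unknown;
    for (let attempt = 1; attempt <= this.maxAttempts; attempt++) {
      try {
        // Create the session lazily; after a failure this re-runs initialize.
        const session = this.session ?? (this.session = await this.connect());
        return await session.callTool(name, args);
      } catch (err) {
        lastError = err;
        this.session = null; // discard the broken session before the next attempt
        if (attempt < this.maxAttempts) {
          // Exponential backoff: baseDelayMs, then 2x, 4x, ...
          const delay = this.baseDelayMs * 2 ** (attempt - 1);
          await new Promise<void>((resolve) => setTimeout(resolve, delay));
        }
      }
    }
    throw lastError;
  }
}
```

Retrying a tool call blindly is only safe when the tool is idempotent, and any state held in the old task's memory is still lost, so the retry restores connectivity rather than session context.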

AWS App Runner — simpler alternative

AWS App Runner manages the container orchestration, load balancer, auto-scaling, and TLS certificate for you. It's less configurable than ECS Fargate but requires far less setup, and it suits MCP servers that don't need custom VPC networking or fine-grained ALB configuration:

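# CloudFormation resource for an App Runner service (nest it under a logical ID in Resources).
# A private ECR image also needs SourceConfiguration.AuthenticationConfiguration.AccessRoleArn.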
Type: AWS::AppRunner::Service
Properties:
  ServiceName: mcp-server
  SourceConfiguration:
    ImageRepository:
      ImageIdentifier: ACCOUNT.dkr.ecr.REGION.amazonaws.com/mcp-server:latest
      ImageRepositoryType: ECR
      ImageConfiguration:
        Port: "3000"
        # CloudFormation expects a list of Name/Value pairs here, not a map
        RuntimeEnvironmentVariables:
          - Name: NODE_ENV
            Value: production
          - Name: PORT
            Value: "3000"
  InstanceConfiguration:
    Cpu: 0.5 vCPU
    Memory: 1 GB
  HealthCheckConfiguration:
    Protocol: HTTP
    Path: /healthz
    HealthyThreshold: 2
    UnhealthyThreshold: 3
    Interval: 10

App Runner limitations relevant to MCP servers: no session stickiness (any request can go to any healthy instance, so session state must be avoided or externalized), no access to VPC resources by default (a VPC Connector is required to reach private Redis/RDS), and no persistent volumes (App Runner cannot mount EFS; use S3 or a database for durable storage).
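Externalized session state usually comes down to a small storage contract that every instance reads and writes on each request. The sketch below shows one possible shape; SessionData, SessionStore, and recordToolCall are hypothetical names, and the in-memory class is only a stand-in for a shared store such as Redis.

```typescript
// Hypothetical session record; the fields are illustrative only.
export interface SessionData {
  toolCalls: string[];
}

// Contract every instance codes against. A Redis-backed class would implement the same methods.
export interface SessionStore {
  get(sessionId: string): Promise<SessionData | undefined>;
  set(sessionId: string, data: SessionData, ttlSeconds: number): Promise<void>;
}

// In-memory stand-in for local runs and tests. It does NOT share state across instances.
export class MemorySessionStore implements SessionStore {
  private readonly entries = new Map<string, { data: SessionData; expiresAt: number }>();

  async get(sessionId: string): Promise<SessionData | undefined> {
    const entry = this.entries.get(sessionId);
    if (entry === undefined) return undefined;
    if (entry.expiresAt <= Date.now()) {
      this.entries.delete(sessionId); // expired: behave as if it never existed
      return undefined;
    }
    return entry.data;
  }

  async set(sessionId: string, data: SessionData, ttlSeconds: number): Promise<void> {
    this.entries.set(sessionId, { data, expiresAt: Date.now() + ttlSeconds * 1000 });
  }
}

// Load, modify, save on every request, so no handler relies on instance memory.
export async function recordToolCall(
  store: SessionStore,
  sessionId: string,
  tool: string,
): Promise<SessionData> {
  const current = (await store.get(sessionId)) ?? { toolCalls: [] };
  const updated: SessionData = { toolCalls: [...current.toolCalls, tool] };
  await store.set(sessionId, updated, 3600); // refresh the TTL on each call
  return updated;
}
```

The load, modify, save cycle here is not atomic, so two concurrent requests for the same session can overwrite each other; a shared store would need its own atomic update or locking for that case.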

Why Lambda doesn't work for most MCP servers

Lambda functions are invoked per-request and frozen between requests. The MCP session model — initialize → tools/list → tool calls → session close — requires state to persist across multiple HTTP requests. Lambda has three problems for this:

1. No persistent SSE connections: Lambda can return streaming responses (via Function URLs with RESPONSE_STREAM mode), but each invocation is independent and is cut off at Lambda's hard 15-minute timeout, so an SSE stream cannot outlive a single invocation.
2. Cold starts add latency to initialize: a Lambda cold start for a Node.js MCP server typically takes 100–800ms, which is added to the initialize handshake of any session that lands on a cold instance.
3. In-memory session state dies with the function: between two tool calls in the same session, Lambda may invoke a different instance, and any in-memory state from the first call is gone.

Lambda works if your MCP server is genuinely stateless: each tool call is a pure function of its inputs, with no session context or accumulated history needed. See MCP server on Vercel for the same tradeoffs in a more developer-friendly serverless platform.
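To make "genuinely stateless" concrete, here is a minimal sketch of a handler whose response depends only on the request body. The event and result shapes are an assumed subset of the Function URL payload, the word-count tool is hypothetical, and the sketch answers only tools/call, so it is not a complete MCP server.

```typescript
// Assumed subset of the Lambda Function URL event and response (payload format 2.0).
interface UrlEvent { body?: string }
interface UrlResult { statusCode: number; headers: Record<string, string>; body: string }

// Hypothetical stateless tool: the output depends only on the arguments.
export function wordCount(text: string): number {
  return text.trim().split(/\s+/).filter((word) => word.length > 0).length;
}

function reply(payload: object): UrlResult {
  return { statusCode: 200, headers: { 'Content-Type': 'application/json' }, body: JSON.stringify(payload) };
}

export async function handler(event: UrlEvent): Promise<UrlResult> {
  let request: { id?: unknown; method?: string; params?: { name?: string; arguments?: { text?: unknown } } };
  try {
    request = JSON.parse(event.body ?? '') ?? {};
  } catch {
    return reply({ jsonrpc: '2.0', id: null, error: { code: -32700, message: 'Parse error' } });
  }
  const id = request.id ?? null;
  if (request.method !== 'tools/call') {
    return reply({ jsonrpc: '2.0', id, error: { code: -32601, message: 'Method not found' } });
  }
  const text = request.params?.arguments?.text;
  if (request.params?.name !== 'word-count' || typeof text !== 'string') {
    return reply({ jsonrpc: '2.0', id, error: { code: -32602, message: 'Invalid params' } });
  }
  // Nothing is read from or written to module-level state, so any instance can answer.
  return reply({ jsonrpc: '2.0', id, result: { content: [{ type: 'text', text: String(wordCount(text)) }] } });
}
```

Because the handler keeps no module-level state, it does not matter which Lambda instance receives a given call. A fuller version would answer initialize and tools/list from static data in the same way.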

CloudWatch logging and metrics

The awslogs log driver in the task definition sends all container stdout/stderr to CloudWatch Logs. Structure your logs as JSON for easy filtering:

// Structured logging — CloudWatch can filter on these fields
console.log(JSON.stringify({
  level: 'info',
  event: 'tool_call',
  tool: toolName,
  sessionId,
  durationMs: Date.now() - start,
  success: true
}));

Create CloudWatch metric filters on the log group to extract metrics like tool call duration and error rate. Use these to set CloudWatch Alarms that notify your team when error rates spike.
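Metric filters work best when every tool call emits exactly one log line with the same fields. One way to get that, sketched here with a hypothetical withToolLogging wrapper, is to time each handler centrally instead of logging inside every tool. The sketch assumes the wrapper is created once per session, so sessionId is fixed at wrap time.

```typescript
type ToolHandler<A, R> = (args: A) => Promise<R>;

// Wraps a tool handler so each call emits one JSON log line with duration and outcome.
export function withToolLogging<A, R>(
  tool: string,
  sessionId: string,
  handler: ToolHandler<A, R>,
  log: (line: string) => void = console.log, // injectable so tests can capture output
): ToolHandler<A, R> {
  return async (args) => {
    const start = Date.now();
    let success = false;
    try {
      const result = await handler(args);
      success = true;
      return result;
    } finally {
      // Runs on both success and failure, so errors are counted too.
      log(JSON.stringify({
        level: success ? 'info' : 'error',
        event: 'tool_call',
        tool,
        sessionId,
        durationMs: Date.now() - start,
        success,
      }));
    }
  };
}
```

With log lines in this shape, a filter pattern along the lines of { ($.event = "tool_call") && ($.success IS FALSE) } should count failed calls, and durationMs can be extracted as the metric value for latency.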

ECS also publishes service-level CPU and memory utilization metrics to CloudWatch automatically; per-task and per-container metrics require Container Insights. Set up an alarm on memory utilization above 80%: MCP servers that accumulate session context can exhaust memory if session cleanup isn't working correctly.

External monitoring beyond CloudWatch

CloudWatch metrics and alarms tell you whether your ECS tasks are running, consuming CPU/memory, and logging errors. They don't tell you whether the MCP protocol is functioning correctly from outside AWS. A misconfigured ALB listener rule, an expired ACM certificate, or a DNS propagation issue causes all external MCP clients to fail while CloudWatch shows healthy tasks.

Add your ALB domain (https://mcp.yourdomain.com) or App Runner URL to AliveMCP. AliveMCP probes from outside AWS, running the full initialize → tools/list sequence over HTTPS, and alerts when the protocol layer fails — including infrastructure failures that CloudWatch can't see. See MCP server observability for combining CloudWatch, distributed tracing (X-Ray), and external monitoring into a complete picture.
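For a sense of what an external initialize → tools/list check involves, here is a rough sketch of such a probe. It is not AliveMCP's implementation. It assumes the server speaks the Streamable HTTP transport and answers with plain JSON rather than an SSE stream; probeMcp, the protocol version string, and the clientInfo values are assumptions chosen for illustration.

```typescript
// Minimal fetch shape so a fake can be injected in tests; the global fetch (Node 18+) satisfies it.
type FetchLike = (
  url: string,
  init: { method: string; headers: Record<string, string>; body: string },
) => Promise<{ ok: boolean; headers: { get(name: string): string | null }; json(): Promise<any> }>;

// Hypothetical probe: returns the tool names, or throws if any step of the sequence fails.
export async function probeMcp(url: string, fetchImpl: FetchLike = fetch): Promise<string[]> {
  const headers: Record<string, string> = {
    'Content-Type': 'application/json',
    Accept: 'application/json, text/event-stream',
  };
  const post = (message: object) =>
    fetchImpl(url, { method: 'POST', headers, body: JSON.stringify({ jsonrpc: '2.0', ...message }) });

  // Step 1: initialize. The protocol version and clientInfo values are assumptions.
  const init = await post({
    id: 1,
    method: 'initialize',
    params: { protocolVersion: '2025-03-26', capabilities: {}, clientInfo: { name: 'probe', version: '0.0.1' } },
  });
  if (!init.ok) throw new Error('initialize failed');

  // Servers that track sessions return an id that later requests must echo.
  const sessionId = init.headers.get('mcp-session-id');
  if (sessionId) headers['Mcp-Session-Id'] = sessionId;

  // Step 2: tell the server the handshake is complete (a notification, so no id).
  await post({ method: 'notifications/initialized' });

  // Step 3: tools/list must return a tools array.
  const list = await post({ id: 2, method: 'tools/list' });
  const body = list.ok ? await list.json() : undefined;
  if (!Array.isArray(body?.result?.tools)) throw new Error('tools/list failed');
  return body.result.tools.map((tool: { name: string }) => tool.name);
}
```

Run from a machine outside AWS, a check like this exercises DNS, TLS, the ALB listener, and the protocol layer in one pass, which is exactly the path that task-level CloudWatch metrics never touch.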