MCP server Docker
Containerizing an MCP server requires three decisions that a standard Node.js Dockerfile doesn't address: transport selection (stdio doesn't work inside a container the way it works on a developer machine), graceful SIGTERM handling (docker stop sends SIGTERM before SIGKILL, and your server must drain active sessions in that window), and health checks (the Docker HEALTHCHECK directive only runs a command inside the container, so it can send a single initialize request but cannot verify reachability from outside; supplement it with an external probe).
TL;DR
Use HTTP/SSE transport inside a container: stdio is a point-to-point pipe, so it can't serve multiple clients, sit behind a reverse proxy, or be probed over the network. Handle SIGTERM explicitly: stop accepting new sessions, wait up to 30 seconds for active sessions to complete, then exit, and raise Docker's 10-second default stop grace period to cover that window. Add a Docker HEALTHCHECK that sends a minimal initialize request; pair it with AliveMCP for external probing from outside the container network. Set memory limits appropriate for your tool workload, because tool calls that spawn subprocesses or load large files can spike well above baseline RSS. Use multi-stage builds to keep the image small.
Why stdio doesn't work inside Docker
When an MCP client uses stdio transport, it spawns the server as a child process and communicates via stdin/stdout pipes. A containerized server is started by the container runtime, not by the client, so there is no such pipe by default. You can technically pipe stdin/stdout into a container with docker run -i, but then:
- Only one client can connect at a time (the pipe is point-to-point).
- No reverse proxy can sit in front of it — no TLS, no load balancing.
- No external probe can reach it: AliveMCP and other monitoring tools require an HTTP endpoint.
- Health checks from Docker or Kubernetes require a network endpoint or a shell command that simulates one.
HTTP/SSE transport is what you want inside a container. It's also what you want for any server accessed by remote clients over a network. See MCP server deployment for transport selection in more depth.
Dockerfile for a Node.js MCP server
A production-grade Dockerfile for a Node.js MCP server using HTTP/SSE transport:
# syntax=docker/dockerfile:1
FROM node:22-alpine AS deps
WORKDIR /app
COPY package.json package-lock.json ./
RUN npm ci --omit=dev
FROM node:22-alpine AS runtime
WORKDIR /app
# Copy only production dependencies
COPY --from=deps /app/node_modules ./node_modules
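# Copy application source. Use a .dockerignore that excludes node_modules so
# the host's copy (with dev dependencies) doesn't overwrite the one above.
COPY . .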
# Non-root user for security
RUN addgroup -S mcp && adduser -S mcp -G mcp
USER mcp
EXPOSE 3000
# Respond to Docker health checks
HEALTHCHECK --interval=30s --timeout=10s --start-period=10s --retries=3 \
  CMD wget -qO- --post-data='{"jsonrpc":"2.0","id":1,"method":"initialize","params":{"protocolVersion":"2024-11-05","capabilities":{},"clientInfo":{"name":"healthcheck","version":"1"}}}' \
    --header='Content-Type: application/json' \
    http://localhost:3000/mcp | grep -q protocolVersion || exit 1
CMD ["node", "--enable-source-maps", "index.js"]
Key decisions in this Dockerfile:
- Multi-stage build: the deps stage installs only production dependencies (--omit=dev). The runtime stage copies from deps, keeping dev tools (TypeScript compiler, test runner, etc.) out of the production image.
- Alpine base: 60–80% smaller than node:22. If your tools need glibc (native addons, some database drivers), use node:22-slim instead of Alpine.
- Non-root user: the MCP server runs as mcp, not root. This limits the blast radius if a tool exposes a code execution vulnerability.
- HEALTHCHECK: sends a real initialize request to port 3000 using wget (available in Alpine). Docker marks the container unhealthy if the response doesn't contain protocolVersion. The --start-period=10s gives the server time to start before checks begin.
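On a base image without wget, the same check could be written as a small Node script and run from HEALTHCHECK with CMD ["node", "healthcheck.js"]. The following is a minimal sketch, not part of any SDK. It assumes Node 22's built-in fetch, the /mcp path and port 3000 from the Dockerfile above, and a server whose initialize response body contains protocolVersion. The file name healthcheck.js and the helper names are hypothetical.

```javascript
// healthcheck.js (hypothetical file name)

// Build the same initialize request that the wget-based HEALTHCHECK sends.
function buildInitializeRequest() {
  return {
    jsonrpc: '2.0',
    id: 1,
    method: 'initialize',
    params: {
      protocolVersion: '2024-11-05',
      capabilities: {},
      clientInfo: { name: 'healthcheck', version: '1' },
    },
  };
}

// Mirror `grep -q protocolVersion`: healthy if the body mentions the field.
function isHealthyBody(body) {
  return typeof body === 'string' && body.includes('protocolVersion');
}

// POST the request and resolve to true or false. The timeout also bounds
// reading the body, in case the server answers with an open event stream.
async function probe(url, timeoutMs = 5000) {
  const res = await fetch(url, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify(buildInitializeRequest()),
    signal: AbortSignal.timeout(timeoutMs),
  });
  return res.ok && isHealthyBody(await res.text());
}

// To use it as a health check, end the file with:
// probe('http://localhost:3000/mcp')
//   .then((ok) => process.exit(ok ? 0 : 1), () => process.exit(1));
```

Docker only looks at the exit code, so any failure (connection refused, timeout, unexpected body) should map to a non-zero exit.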
Signal handling and graceful shutdown
When you run docker stop, Docker sends SIGTERM to PID 1. After a configurable grace period (default 10 seconds), it sends SIGKILL. If your server ignores SIGTERM, it will be killed with active sessions open — the client receives a connection error and the tool call result is lost.
If you launch Node.js with CMD ["node", "index.js"], Node is PID 1 and receives SIGTERM directly. If you use a shell form CMD node index.js, the shell is PID 1 and may not forward SIGTERM to Node. Always use the exec form (JSON array) for CMD.
Graceful shutdown in the MCP server application code:
// Track open sessions so shutdown can wait for them.
// `server` is your MCP server instance and `httpServer` its HTTP listener.
const activeSessions = new Set();
server.on('session', (session) => {
  activeSessions.add(session);
  session.on('close', () => activeSessions.delete(session));
});

process.on('SIGTERM', async () => {
  console.log('SIGTERM received, draining sessions');
  // Stop accepting new connections
  httpServer.close();
  // Wait for active sessions to finish, up to 30 seconds
  const deadline = Date.now() + 30_000;
  while (activeSessions.size > 0 && Date.now() < deadline) {
    await new Promise(r => setTimeout(r, 500));
  }
  if (activeSessions.size > 0) {
    console.warn(`Forcibly terminating ${activeSessions.size} sessions`);
  }
  process.exit(0);
});
Increase Docker's stop grace period to match your drain window: docker stop --time=35 <container> or set stop_grace_period: 35s in your compose file. If the drain window exceeds the grace period, Docker kills the container before sessions drain.
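One way to keep the two numbers from drifting apart is to derive the drain window from the grace period instead of hard-coding both. This is a sketch under assumptions: STOP_GRACE_SECONDS is a hypothetical environment variable you would set alongside stop_grace_period, and the 5-second safety margin is an arbitrary choice.

```javascript
// Derive the drain window from the container's stop grace period. The safety
// margin leaves time for logging and process.exit() before Docker sends SIGKILL.
function drainWindowMs(stopGraceSeconds, safetyMarginMs = 5000) {
  const budget = stopGraceSeconds * 1000 - safetyMarginMs;
  if (!Number.isFinite(budget) || budget <= 0) {
    throw new RangeError(
      `stop grace period of ${stopGraceSeconds}s leaves no time to drain`
    );
  }
  return budget;
}

// STOP_GRACE_SECONDS is hypothetical; 35 matches the grace period above.
const graceSeconds = Number(process.env.STOP_GRACE_SECONDS ?? 35);
const DRAIN_WINDOW_MS = drainWindowMs(graceSeconds);
```

With a 35-second grace period this yields a 30-second drain window. With Docker's 10-second default it yields only 5 seconds, which is why the default is rarely enough.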
Resource limits
MCP tool calls can be memory-intensive in ways that a baseline RSS measurement doesn't predict. Common patterns that spike memory inside a container:
- File-reading tools: a tool that reads a large file into memory allocates proportionally to the file size. If users can pass arbitrary file paths or URLs, memory usage is unbounded without an explicit limit in the tool implementation.
- Subprocess tools: tools that shell out (child_process.exec, execa) run inside the same container, so their memory counts against the container's limit. A tool that calls ffmpeg or pandoc on large input can temporarily double memory usage.
- Concurrent sessions: each active MCP session holds state in memory. N concurrent sessions times per-session memory is your baseline concurrent load.
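For the file-reading case, an explicit limit in the tool implementation can be as simple as checking the file size before reading. This is a minimal sketch: the helper name readFileBounded and the 10 MiB cap are assumptions, and a tool that fetches URLs would need a streaming byte counter instead.

```javascript
const fs = require('node:fs');

// Hypothetical cap; derive it from the container's memory limit and the
// number of concurrent sessions you expect.
const MAX_FILE_BYTES = 10 * 1024 * 1024; // 10 MiB

// Read a file for a tool call, refusing anything larger than maxBytes so a
// single request cannot allocate an unbounded buffer.
function readFileBounded(filePath, maxBytes = MAX_FILE_BYTES) {
  const { size } = fs.statSync(filePath);
  if (size > maxBytes) {
    throw new RangeError(
      `${filePath} is ${size} bytes; limit is ${maxBytes} bytes`
    );
  }
  return fs.readFileSync(filePath);
}
```

Rejecting the request up front turns a potential out-of-memory kill into an ordinary tool error that the client can handle.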
Set Docker memory limits explicitly:
# docker-compose.yml
services:
  mcp-server:
    image: my-mcp-server:latest
    mem_limit: 512m
    memswap_limit: 512m # disables swap
    cpus: "1.0"
    env_file: .env
    ports:
      - "3000:3000"
    restart: unless-stopped
Start with a limit 2–3× your measured baseline RSS, then tune based on actual usage. Out-of-memory kills appear in Docker logs as Killed with exit code 137. Monitor docker stats output during load tests to find your practical ceiling before you hit it in production.
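The sizing rule is simple arithmetic, and the server can log its own RSS to supply the baseline. In the sketch below the helper names are hypothetical and the default factor of 2.5 is just a midpoint of the 2–3× range.

```javascript
const MIB = 1024 * 1024;

// Suggest a starting mem_limit (in MiB) as a multiple of measured baseline RSS.
function suggestMemLimitMiB(baselineRssBytes, factor = 2.5) {
  return Math.ceil((baselineRssBytes * factor) / MIB);
}

// Exit code 137 is 128 + 9: the process was terminated by SIGKILL, which is
// the signal the kernel OOM killer sends.
function wasKilledBySigkill(exitCode) {
  return exitCode === 128 + 9;
}

// Log the current RSS of this process, for example on a timer during a load test.
const rssBytes = process.memoryUsage().rss;
console.log(
  `rss=${(rssBytes / MIB).toFixed(1)} MiB, ` +
  `suggested mem_limit=${suggestMemLimitMiB(rssBytes)}m`
);
```

A SIGKILL sent after the stop grace period expires also produces exit code 137, so the code alone doesn't prove an out-of-memory kill. The OOMKilled field in docker inspect's State output distinguishes the two cases.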
docker-compose example with monitoring
A complete compose.yml for a production MCP server with Caddy as a TLS-terminating reverse proxy:
services:
  mcp-server:
    build: .
    env_file: .env
    mem_limit: 512m
    memswap_limit: 512m
    restart: unless-stopped
    stop_grace_period: 35s
    networks:
      - internal
    healthcheck:
      test: ["CMD-SHELL", "wget -qO- --post-data='{\"jsonrpc\":\"2.0\",\"id\":1,\"method\":\"initialize\",\"params\":{\"protocolVersion\":\"2024-11-05\",\"capabilities\":{},\"clientInfo\":{\"name\":\"hc\",\"version\":\"1\"}}}' --header='Content-Type: application/json' http://localhost:3000/mcp | grep -q protocolVersion || exit 1"]
      interval: 30s
      timeout: 10s
      retries: 3
      start_period: 10s

  caddy:
    image: caddy:2-alpine
    ports:
      - "80:80"
      - "443:443"
    volumes:
      - ./Caddyfile:/etc/caddy/Caddyfile:ro
      - caddy_data:/data
      - caddy_config:/config
    networks:
      - internal
    depends_on:
      mcp-server:
        condition: service_healthy

networks:
  internal:

volumes:
  caddy_data:
  caddy_config:
The depends_on: condition: service_healthy ensures Caddy only starts routing traffic after the MCP server's HEALTHCHECK passes. The Caddyfile proxies mcp.yourdomain.com to http://mcp-server:3000 with automatic TLS.
External monitoring beyond the HEALTHCHECK
The Docker HEALTHCHECK tells Docker whether the container is healthy from inside the container network. It does not tell you whether the server is reachable from the public internet — a broken firewall rule, an expired TLS certificate, or a failed DNS record won't show up as a Docker health failure. AliveMCP probes your public endpoint from outside the container network every 60 seconds, verifying TLS, DNS, and the full MCP initialize sequence end-to-end. The two checks complement each other: internal health checks detect process-level failures; external probes detect network-level failures.
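To make the internal/external distinction concrete, here is a rough sketch of how an external probe might attribute a failure to a layer. It is an illustration only, not AliveMCP's implementation. It assumes Node 22's built-in fetch, which wraps the underlying network error in error.cause. The function names are hypothetical and the code lists are not exhaustive.

```javascript
// Node.js/OpenSSL error codes grouped by the layer that most likely broke.
const DNS_CODES = new Set(['ENOTFOUND', 'EAI_AGAIN']);
const TLS_CODES = new Set([
  'CERT_HAS_EXPIRED',
  'DEPTH_ZERO_SELF_SIGNED_CERT',
  'UNABLE_TO_VERIFY_LEAF_SIGNATURE',
  'ERR_TLS_CERT_ALTNAME_INVALID',
]);
const NETWORK_CODES = new Set([
  'ECONNREFUSED',
  'ECONNRESET',
  'ETIMEDOUT',
  'EHOSTUNREACH',
]);

// fetch() rejects with "TypeError: fetch failed" and puts the real error in
// `cause`, so check there first and fall back to the error itself.
function classifyProbeError(error) {
  const code = error?.cause?.code ?? error?.code;
  if (DNS_CODES.has(code)) return 'dns';
  if (TLS_CODES.has(code)) return 'tls';
  if (NETWORK_CODES.has(code) || error?.name === 'TimeoutError') return 'network';
  return 'unknown';
}

// Send an initialize request to the public URL and report which layer failed.
async function probePublicEndpoint(url, timeoutMs = 10000) {
  const request = {
    jsonrpc: '2.0',
    id: 1,
    method: 'initialize',
    params: {
      protocolVersion: '2024-11-05',
      capabilities: {},
      clientInfo: { name: 'external-probe', version: '1' },
    },
  };
  try {
    const res = await fetch(url, {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify(request),
      signal: AbortSignal.timeout(timeoutMs),
    });
    const ok = res.ok && (await res.text()).includes('protocolVersion');
    return { ok, layer: ok ? null : 'mcp' };
  } catch (error) {
    return { ok: false, layer: classifyProbeError(error) };
  }
}
```

A result of dns, tls, or network points at infrastructure that the in-container HEALTHCHECK never exercises, while mcp means the endpoint was reachable but the handshake itself failed.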
See MCP server uptime monitoring and MCP server TLS certificate monitoring for what AliveMCP monitors beyond the protocol handshake.
Related questions
Do I need a separate container for Caddy?
Not necessarily. You can run Caddy and the MCP server in a single container using a process supervisor like s6-overlay. But the two-container pattern (MCP server + Caddy) is easier to update independently — update the MCP server without touching the TLS config, or update Caddy for security patches without rebuilding the application image. Docker Compose makes two-container setups straightforward.
What base image should I use for a Python MCP server?
Use python:3.12-slim for most Python MCP servers. The full python:3.12 image includes many development tools that aren't needed at runtime. For servers that use compiled extensions (numpy, cryptography), slim is usually fine, because the Debian slim base still has the necessary runtime libraries. Alpine uses musl rather than glibc, which causes compatibility issues with some Python packages; prefer slim over Alpine for Python.
How do I handle secrets that need to rotate without restarting the container?
Environment variables are fixed when a process starts, so rotating a secret delivered that way means recreating the container (for example, docker compose up after modifying .env), though not rebuilding the image. To rotate without a restart, mount secrets as files (a secrets volume updated by a sidecar) and either read the file at each tool invocation rather than caching it at startup, or implement a SIGHUP handler that re-reads secrets from the filesystem.
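The SIGHUP approach can be sketched in a few lines. The helper names here are hypothetical, and the file path you pass in depends on where your secrets volume is mounted (Compose secrets are mounted under /run/secrets/<name> by default).

```javascript
const fs = require('node:fs');

// Hold the current secret value and re-read it from a mounted file on demand.
function createSecretStore(filePath) {
  let value = fs.readFileSync(filePath, 'utf8').trim();
  return {
    get: () => value,
    reload() {
      value = fs.readFileSync(filePath, 'utf8').trim();
    },
  };
}

// Reload on SIGHUP so a sidecar can trigger it after updating the file,
// for example with `docker kill --signal=HUP <container>`.
function reloadOnSighup(store) {
  process.on('SIGHUP', () => {
    try {
      store.reload();
      console.log('secrets reloaded');
    } catch (err) {
      // Keep serving with the previous value if the new file is unreadable.
      console.error('secret reload failed:', err.message);
    }
  });
}
```

Tool handlers would call store.get() on each invocation, so the next call after a reload picks up the new value without dropping active sessions.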
My MCP server uses stdio transport — how do I wrap it for Docker?
If the SDK you're using supports both transports, add the HTTP/SSE transport and expose port 3000. If the implementation only supports stdio, use the @modelcontextprotocol/sdk's StdioServerTransport → SSEServerTransport bridge pattern, or switch to an SDK that supports both. Don't try to proxy stdio through a TCP tunnel inside a container — the complexity isn't worth it and the pattern doesn't generalize to horizontal scaling.
Further reading
- MCP server deployment — transport, probes, and rolling-restart safety
- MCP server on Kubernetes — readiness probes, PDBs, and autoscaling
- MCP server health checks — the full initialize probe sequence
- MCP server load testing — finding the session limit before degradation
- MCP server TLS certificate monitoring
- AliveMCP — external monitoring for your containerized MCP server