MCP server Fastify — high-throughput MCP HTTP transport with Fastify

Fastify's low-overhead architecture and schema-driven pipeline make it a good host for MCP servers that need to handle hundreds of concurrent tool-call sessions. The main integration hurdle is Fastify's request and reply abstraction: the MCP SDK's transport reads from and writes to the raw Node.js IncomingMessage and ServerResponse, so the route handler has to bypass Fastify's normal reply pipeline. This guide covers each step, from body parsing and raw response access to rate limiting and monitoring with AliveMCP.

TL;DR

Let Fastify parse the JSON body (this guide registers an explicit parser with fastify.addContentTypeParser('application/json', ...)) and pass the parsed object to the MCP SDK as the third argument of handleRequest. Pass request.raw and reply.raw so StreamableHTTPServerTransport has direct access to the Node.js IncomingMessage and ServerResponse for SSE streaming. Point AliveMCP at your /health route, and at /mcp itself for the protocol-layer failures that Fastify's schema validation will never surface.

Project setup and Fastify plugin structure

Fastify uses a plugin-based encapsulation model. Wrapping the MCP transport logic in a fastify-plugin lets you compose it into larger applications without breaking Fastify's scope rules. Install all required packages:

npm init -y
npm install @modelcontextprotocol/sdk fastify @fastify/cors @fastify/rate-limit fastify-plugin uuid zod
npm install -D typescript @types/node tsx

Create the plugin skeleton. The fastify-plugin wrapper opts out of Fastify's context encapsulation so that decorators and hooks added inside the plugin are visible to the parent scope. That matters here because the health route, registered on the root instance, calls the mcpSessionCount decorator that the plugin adds.

// src/mcp-plugin.ts
import fp from 'fastify-plugin';
import type { FastifyInstance } from 'fastify';
import { StreamableHTTPServerTransport } from '@modelcontextprotocol/sdk/server/streamableHttp.js';
import { isInitializeRequest } from '@modelcontextprotocol/sdk/types.js';
import { v4 as uuidv4 } from 'uuid';
import { createMcpServer } from './server.js';

// Tell TypeScript about the decorator added below
declare module 'fastify' {
  interface FastifyInstance {
    mcpSessionCount(): number;
  }
}

export const mcpPlugin = fp(async (fastify: FastifyInstance) => {
  // One transport per MCP session, keyed by session ID
  const sessions = new Map<string, StreamableHTTPServerTransport>();

  // Expose session count on the instance for health checks
  fastify.decorate('mcpSessionCount', () => sessions.size);

  await registerContentTypeParser(fastify);
  await registerRoutes(fastify, sessions);

  // Close every open transport when fastify.close() is called;
  // allSettled keeps one failing close from blocking the rest
  fastify.addHook('onClose', async () => {
    const closes = Array.from(sessions.values()).map((t) => t.close());
    await Promise.allSettled(closes);
  });
});

Decorating the Fastify instance with mcpSessionCount keeps the health endpoint decoupled from the plugin internals while still exposing the metric you need for operational visibility.

Raw body parsing and the reply.raw hijack pattern

Fastify's default body parser parses a JSON request body into a JavaScript object before your route handler sees it. The MCP SDK's handleRequest method accepts that pre-parsed object as its third argument for POST requests, so the handler never needs the raw body buffer. What the SDK does need is the raw ServerResponse (res) to write SSE events. Fastify wraps res in its own Reply object; you access the underlying Node.js response via reply.raw.

async function registerContentTypeParser(fastify: FastifyInstance) {
  // Have Fastify read application/json bodies as a string and parse them
  // here, so route handlers receive a plain object for the MCP SDK.
  fastify.addContentTypeParser(
    'application/json',
    { parseAs: 'string' },
    (req, body, done) => {
      try {
        done(null, JSON.parse(body as string));
      } catch (err) {
        // Report malformed JSON as a client error rather than a 500
        (err as Error & { statusCode?: number }).statusCode = 400;
        done(err as Error);
      }
    }
  );
}

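// Hand the response over to the MCP transport. reply.hijack() stops Fastify
// from sending its own response; headers set through reply.header() (for
// example by @fastify/cors) are copied first because Fastify does not flush
// them when the transport writes to reply.raw.
function handOff(reply: import('fastify').FastifyReply) {
  for (const [name, value] of Object.entries(reply.getHeaders())) {
    if (value !== undefined) reply.raw.setHeader(name, value);
  }
  reply.hijack();
}

async function registerRoutes(
  fastify: FastifyInstance,
  sessions: Map<string, StreamableHTTPServerTransport>
) {
  fastify.post('/mcp', async (request, reply) => {
    const sessionId = request.headers['mcp-session-id'] as string | undefined;

    if (sessionId) {
      const transport = sessions.get(sessionId);
      if (!transport) {
        // A session ID the server does not know: expired or never issued
        return reply.status(404).send({ error: 'Unknown session' });
      }
      handOff(reply);
      // Pass reply.raw so the transport can write directly to the socket
      await transport.handleRequest(request.raw, reply.raw, request.body);
      return reply;
    }

    // Without a session ID, only an initialize request is acceptable
    if (!isInitializeRequest(request.body)) {
      return reply.status(400).send({ error: 'Expected initialize request' });
    }

    const newId = uuidv4();
    const transport = new StreamableHTTPServerTransport({
      sessionIdGenerator: () => newId,
      onsessioninitialized: (id) => { sessions.set(id, transport); },
    });
    transport.onclose = () => { sessions.delete(newId); };

    const server = createMcpServer();
    await server.connect(transport);
    handOff(reply);
    await transport.handleRequest(request.raw, reply.raw, request.body);
    return reply;
  });

  // Server-to-client SSE stream for an existing session
  fastify.get('/mcp', async (request, reply) => {
    const sessionId = request.headers['mcp-session-id'] as string | undefined;
    if (!sessionId || !sessions.has(sessionId)) {
      return reply.status(404).send({ error: 'Unknown session' });
    }
    handOff(reply);
    await sessions.get(sessionId)!.handleRequest(request.raw, reply.raw);
    return reply;
  });

  // Explicit session termination; answered through Fastify's normal reply path
  fastify.delete('/mcp', async (request, reply) => {
    const sessionId = request.headers['mcp-session-id'] as string | undefined;
    if (sessionId && sessions.has(sessionId)) {
      await sessions.get(sessionId)!.close();
      sessions.delete(sessionId);
    }
    return reply.status(204).send();
  });
}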

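The key insight is passing request.raw and reply.raw rather than the Fastify wrappers. If you accidentally pass the Fastify reply object, the transport cannot write SSE events, because Reply is a wrapper around its own send() pipeline and does not implement the ServerResponse stream interface. Calling reply.hijack() before handing off tells Fastify not to send a response of its own. The trade-off is that anything Fastify applies inside reply.send(), such as onSend hooks and headers set with reply.header(), does not run for a response written through reply.raw.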
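The branching in the POST handler can be made explicit and unit-testable by pulling it into a pure function. The sketch below is illustrative only: classifyPost and looksLikeInitialize are hypothetical names, and looksLikeInitialize is a simplified stand-in for the SDK's isInitializeRequest. It returns 404 for a session ID the server does not recognize and 400 for a sessionless request that is not an initialize call, following the status codes described in the MCP Streamable HTTP transport specification.

```typescript
// Simplified stand-in for the SDK's isInitializeRequest: checks only the
// JSON-RPC envelope and method name, not the full initialize params.
function looksLikeInitialize(body: unknown): boolean {
  if (typeof body !== 'object' || body === null) return false;
  const msg = body as Record<string, unknown>;
  return msg.jsonrpc === '2.0' && msg.method === 'initialize';
}

type PostDecision =
  | { action: 'reuse'; sessionId: string }
  | { action: 'create' }
  | { action: 'reject'; status: 400 | 404 };

// Decide what POST /mcp should do, given the session header, the set of
// live session IDs, and the parsed request body.
function classifyPost(
  sessionId: string | undefined,
  liveSessions: ReadonlySet<string>,
  body: unknown
): PostDecision {
  if (sessionId !== undefined) {
    return liveSessions.has(sessionId)
      ? { action: 'reuse', sessionId }
      : { action: 'reject', status: 404 }; // unknown or expired session
  }
  return looksLikeInitialize(body)
    ? { action: 'create' }
    : { action: 'reject', status: 400 }; // no session and not an initialize
}
```

Keeping the decision separate from the transport calls means the status-code rules can be tested without starting a Fastify instance.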

CORS and rate limiting with Fastify plugins

Fastify's plugin ecosystem covers both CORS and rate limiting with first-party packages. Register @fastify/cors before any route declarations and @fastify/rate-limit to protect your tool call endpoints from runaway clients.

// src/index.ts
import Fastify from 'fastify';
import cors from '@fastify/cors';
import rateLimit from '@fastify/rate-limit';
import { mcpPlugin } from './mcp-plugin.js';

const fastify = Fastify({ logger: true });

await fastify.register(cors, {
  // Falls back to reflecting any origin; set ALLOWED_ORIGINS in production
  origin: process.env.ALLOWED_ORIGINS?.split(',') ?? true,
  methods: ['GET', 'POST', 'DELETE', 'OPTIONS'],
  allowedHeaders: ['Content-Type', 'Mcp-Session-Id', 'Authorization'],
  exposedHeaders: ['Mcp-Session-Id'],
  credentials: true,
});

// Attach a rate limit to POST /mcp, which is declared inside mcpPlugin.
// Declaring fastify.post('/mcp', ...) here without a handler would throw,
// so the config is injected through an onRoute hook instead. The hook is
// added before @fastify/rate-limit is registered so the route config is
// already set when that plugin's own onRoute hook inspects it.
fastify.addHook('onRoute', (routeOptions) => {
  if (routeOptions.url === '/mcp' && routeOptions.method === 'POST') {
    routeOptions.config = {
      ...routeOptions.config,
      rateLimit: {
        max: 60,          // 60 tool calls per window
        timeWindow: '1 minute',
        keyGenerator: (req) =>
          (req.headers['mcp-session-id'] as string) ?? req.ip,
      },
    };
  }
});

await fastify.register(rateLimit, {
  global: false, // apply per-route
});

await fastify.register(mcpPlugin);

// Health endpoint — probed by AliveMCP every 60 seconds
fastify.get('/health', async () => {
  return {
    status: 'ok',
    sessions: fastify.mcpSessionCount(),
    uptime: process.uptime(),
    ts: Date.now(),
  };
});

await fastify.listen({ port: Number(process.env.PORT ?? 3000), host: '0.0.0.0' });
console.log('MCP Fastify server started');

Rate limiting by Mcp-Session-Id rather than IP address is more accurate for MCP servers: a single NAT gateway could share an IP across hundreds of legitimate sessions, while a runaway script flooding tool calls will have a distinctive session ID. Requests that arrive without the header, which includes every initialize call, fall back to the client IP. See the MCP server rate limiting guide for strategies on adaptive limits and per-tool quotas.
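The key selection can be made a little more defensive than a bare `??` fallback, which would treat an empty header as a valid key. The following is a minimal sketch, assuming a hypothetical helper named rateLimitKey that a keyGenerator could delegate to; it is not part of @fastify/rate-limit.

```typescript
// Headers as Node exposes them: a value may be a string, a list, or absent.
type HeaderValue = string | string[] | undefined;

// Build a rate-limit bucket key: prefer the MCP session ID, fall back to IP.
// The prefixes keep a session ID from colliding with an IP-shaped string.
function rateLimitKey(headers: Record<string, HeaderValue>, ip: string): string {
  const raw = headers['mcp-session-id'];
  const first = Array.isArray(raw) ? raw[0] : raw;
  const sessionId = first?.trim();
  return sessionId ? `session:${sessionId}` : `ip:${ip}`;
}
```

Because the session header is supplied by the client, a hostile client can rotate it to obtain fresh buckets, so an IP-based or authenticated-identity limit is still worth keeping alongside it.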

Request observability with Fastify hooks

Fastify's lifecycle hooks give you a clean place to add structured logging and metrics without polluting route handler code. An onRequest hook fires before body parsing, making it ideal for logging MCP session IDs. An onResponse hook fires after the response has been sent and, unlike onSend, still runs when the handler writes through reply.raw, so it is the place to capture status codes and response times for the MCP routes. For the long-lived GET /mcp stream it fires only once the stream closes.

fastify.addHook('onRequest', async (request) => {
  const sessionId = request.headers['mcp-session-id'];
  if (sessionId) {
    request.log.info({ sessionId, method: request.method, url: request.url }, 'MCP request');
  }
});

// onResponse runs even for replies written via reply.raw; onSend does not
fastify.addHook('onResponse', async (request, reply) => {
  const sessionId = request.headers['mcp-session-id'];
  if (sessionId) {
    request.log.info(
      { sessionId, statusCode: reply.statusCode, ms: reply.elapsedTime },
      'MCP response'
    );
  }
});

// Parse MCP method name from request body for fine-grained logging
fastify.addHook('preHandler', async (request) => {
  const body = request.body as Record<string, unknown> | undefined;
  if (body?.method) {
    request.log.info({ mcpMethod: body.method }, 'MCP method dispatched');
  }
});

Fastify's native JSON schema validation catches malformed HTTP request bodies, but it has no knowledge of the MCP protocol layer. A request body that is valid JSON but contains an unknown MCP method or a missing jsonrpc field will pass Fastify's validation and only fail inside the SDK. That's why AliveMCP probes the MCP protocol layer directly — sending real MCP initialize requests and checking for valid JSON-RPC responses — rather than just checking for HTTP 200.
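To see what "valid JSON but not valid JSON-RPC" looks like in practice, a preHandler hook could classify each body before logging it. This is a minimal sketch using a hypothetical classifyJsonRpc helper; it checks only the JSON-RPC 2.0 envelope and knows nothing about which MCP methods exist, which remains the SDK's job.

```typescript
type JsonRpcKind = 'request' | 'notification' | 'response' | 'invalid';

// Classify a parsed body by its JSON-RPC 2.0 envelope alone.
function classifyJsonRpc(body: unknown): { kind: JsonRpcKind; method?: string } {
  if (typeof body !== 'object' || body === null || Array.isArray(body)) {
    return { kind: 'invalid' };
  }
  const msg = body as Record<string, unknown>;
  if (msg.jsonrpc !== '2.0') return { kind: 'invalid' };

  const hasId = typeof msg.id === 'string' || typeof msg.id === 'number';
  if (typeof msg.method === 'string') {
    // A method with an id expects a reply; without one it is a notification
    return { kind: hasId ? 'request' : 'notification', method: msg.method };
  }
  // Clients also POST responses, for example when answering a server request
  if (hasId && ('result' in msg || 'error' in msg)) return { kind: 'response' };
  return { kind: 'invalid' };
}
```

Logging the kind alongside the method makes it easier to spot clients that send well-formed JSON with a broken envelope.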

Graceful shutdown and tool registration

Fastify's fastify.close() method triggers all registered onClose hooks, which is where we placed the session cleanup logic in the plugin. Combine this with OS signal handling to ensure clean shutdowns during deployments.

// src/server.ts
import { McpServer } from '@modelcontextprotocol/sdk/server/mcp.js';
import { z } from 'zod';

export function createMcpServer(): McpServer {
  const server = new McpServer({
    name: 'my-fastify-mcp',
    version: '1.0.0',
  });

  server.tool(
    'ping',
    'Check server reachability',
    {},
    async () => ({
      content: [{ type: 'text', text: `pong at ${new Date().toISOString()}` }],
    })
  );

  server.tool(
    'fetch_data',
    'Fetches data from an internal API',
    { endpoint: z.string().url() },
    async ({ endpoint }) => {
      const response = await fetch(endpoint);
      const text = await response.text();
      return { content: [{ type: 'text', text }] };
    }
  );

  return server;
}

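// src/index.ts: graceful shutdown
const shutdown = async (signal: string) => {
  fastify.log.info(`${signal} received, shutting down`);
  try {
    // close() stops accepting connections and runs the onClose hooks,
    // including the session cleanup registered in the plugin
    await fastify.close();
    process.exit(0);
  } catch (err) {
    fastify.log.error(err, 'Error during shutdown');
    process.exit(1);
  }
};

process.on('SIGTERM', () => shutdown('SIGTERM'));
process.on('SIGINT',  () => shutdown('SIGINT'));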

During a rolling deployment, fastify.close() runs the plugin's onClose hook, which closes all active MCP sessions before the process exits. AliveMCP will detect the health endpoint going offline and immediately notify your on-call channel. If the new instance doesn't come up healthy within your configured timeout, the alert escalates, giving you a clear signal that the deployment failed rather than leaving you to discover errors in client logs hours later. For containerized deployments, see the MCP server Docker guide and the authentication guide for securing Fastify MCP servers.
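Orchestrators sometimes deliver more than one signal during a shutdown, for example SIGINT followed by SIGTERM. A small guard keeps the second signal from starting a second close. This is a sketch with a hypothetical shutdownOnce helper, not something Fastify provides.

```typescript
// Wrap an async close function so that repeated calls share one shutdown
// promise instead of starting the close sequence again.
function shutdownOnce(close: () => Promise<void>): () => Promise<void> {
  let pending: Promise<void> | undefined;
  return () => {
    if (pending === undefined) pending = close();
    return pending;
  };
}
```

In the server above, the wrapped function would call fastify.close(), and both signal handlers would invoke the same wrapper.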

Frequently asked questions

Why does Fastify's schema validation break my MCP server?

If you define a JSON schema on the POST /mcp route using Fastify's schema option, Fastify will reject any request body that doesn't match — including valid MCP requests that contain fields your schema doesn't anticipate. The safest approach is to omit the schema option on MCP routes entirely and let the MCP SDK handle validation. You can still validate the outer HTTP structure (e.g., ensuring the body is an object with a jsonrpc field) without specifying the full MCP schema, which changes between protocol versions.
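If you do want an outer check, a schema along these lines validates only the envelope. Treat it as a sketch under stated assumptions: mcpEnvelopeSchema is a hypothetical name, and because it requires an object body it would reject batched JSON-RPC arrays, which some MCP protocol revisions allow.

```typescript
// Hypothetical route schema: validates only the outer JSON-RPC envelope.
// additionalProperties is deliberately left unset so that id, method,
// params, and any future fields pass through for the SDK to validate.
const mcpEnvelopeSchema = {
  body: {
    type: 'object',
    required: ['jsonrpc'],
    properties: {
      jsonrpc: { const: '2.0' },
    },
  },
} as const;
```

It would be passed as the schema option of the route, for example fastify.post('/mcp', { schema: mcpEnvelopeSchema }, handler).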

How do I handle request timeouts for long-running tool calls?

Fastify does not impose a 30-second request timeout of its own: its connectionTimeout and requestTimeout options default to 0 (disabled), and keepAliveTimeout defaults to 72 seconds. Timeouts of around 30 or 60 seconds usually come from a reverse proxy or load balancer in front of the server. If you have set connectionTimeout or requestTimeout when constructing the Fastify instance, raise them for long-running MCP tool calls, or call request.raw.setTimeout(0) inside the route handler to disable the socket timeout for that request. Alternatively, design long-running tools to return immediately with a resource URI and have the client poll for results, keeping individual HTTP requests short. The GET /mcp SSE channel is already a long-lived connection, so make sure your infrastructure (load balancer, nginx) has long idle timeouts on that route specifically.
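The return-a-URI-and-poll approach can be sketched with a small in-memory store. Everything here is an assumption made for illustration: the JobStore class, the job:// URI scheme, and the sequential IDs (a real server would use unguessable IDs and expire old entries).

```typescript
type Job =
  | { status: 'running' }
  | { status: 'done'; result: string }
  | { status: 'failed'; error: string };

class JobStore {
  private jobs = new Map<string, Job>();
  private nextId = 0;

  // Start tracking a job and return the URI the tool hands back immediately.
  start(): string {
    const id = String(++this.nextId);
    this.jobs.set(id, { status: 'running' });
    return `job://${id}`;
  }

  // Record the outcome once the background work finishes.
  complete(uri: string, result: string): void {
    this.update(uri, { status: 'done', result });
  }

  fail(uri: string, error: string): void {
    this.update(uri, { status: 'failed', error });
  }

  // What a polling client sees; undefined for an unknown URI.
  read(uri: string): Job | undefined {
    return this.jobs.get(this.idOf(uri));
  }

  private update(uri: string, job: Job): void {
    const id = this.idOf(uri);
    if (this.jobs.has(id)) this.jobs.set(id, job); // ignore unknown jobs
  }

  private idOf(uri: string): string {
    return uri.replace(/^job:\/\//, '');
  }
}
```

A tool handler would call start(), kick off the work without awaiting it, and return the URI as its text content; a second tool or resource read would call read() for the status.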

Can I use Fastify's built-in Pino logger to trace MCP tool calls end-to-end?

Yes. Fastify uses Pino for structured logging by default, so every log line includes a reqId that correlates all hooks within a single HTTP request. To add the mcp-session-id header value to your log context, replace the request logger in an onRequest or preHandler hook: request.log = request.log.child({ sessionId }). Calling child() without assigning the result has no effect on later log lines. This gives you a trace from the incoming POST through to the tool handler response, which is invaluable for debugging intermittent failures that only appear under concurrent load.

What's the performance difference between Fastify and Express for MCP workloads?

Fastify's own published benchmarks show it serving several times as many requests per second as Express on trivial JSON routes, mostly thanks to its faster router and schema-compiled JSON serialization; exact ratios depend on the versions and hardware tested. For MCP workloads where each tool call involves external I/O (database queries, API calls), the difference at the HTTP layer is usually negligible compared to the tool execution time, and responses written through reply.raw bypass Fastify's serializer entirely. Fastify's lower per-request overhead matters most when a server handles thousands of concurrent SSE connections each sending keep-alive events, or a very high rate of short requests.

How does AliveMCP detect protocol-layer failures that Fastify's validation misses?

AliveMCP doesn't just send a GET request to /health — it can be configured to send a real MCP initialize request to your /mcp endpoint and verify the response is a valid JSON-RPC initialize result with the correct protocol version. This catches failures where the HTTP server is running and returning 200 but the MCP SDK layer has crashed, the session store is corrupted, or a tool registration error prevents the server from completing initialization. Fastify's schema validation only checks the JSON structure; it has no awareness of MCP semantics. Configure deep MCP protocol checks at alivemcp.com.
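The shape of such a protocol-level check can be sketched as two pure functions: one that builds the initialize request and one that validates the reply. This is not AliveMCP's implementation; the function names, the example-probe client name, and the choice of protocol version string passed in are assumptions for illustration.

```typescript
// Build the JSON-RPC body for an MCP initialize request.
function buildInitializeRequest(id: number, protocolVersion: string) {
  return {
    jsonrpc: '2.0' as const,
    id,
    method: 'initialize' as const,
    params: {
      protocolVersion,
      capabilities: {},
      clientInfo: { name: 'example-probe', version: '0.0.1' }, // hypothetical
    },
  };
}

// Check that a parsed response is a well-formed initialize result for `id`.
function isInitializeResult(resp: unknown, id: number): boolean {
  if (typeof resp !== 'object' || resp === null) return false;
  const msg = resp as Record<string, unknown>;
  if (msg.jsonrpc !== '2.0' || msg.id !== id || 'error' in msg) return false;

  const result = msg.result;
  if (typeof result !== 'object' || result === null) return false;
  const r = result as Record<string, unknown>;
  return (
    typeof r.protocolVersion === 'string' &&
    typeof r.capabilities === 'object' && r.capabilities !== null &&
    typeof r.serverInfo === 'object' && r.serverInfo !== null
  );
}
```

A probe built on these would POST the request to /mcp with an Accept header listing both application/json and text/event-stream, and would need to extract the JSON payload from the SSE data line if the server answers with an event stream.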