
MCP server worker threads

Node.js runs tool handler code on a single event-loop thread. An async function that calls await someDatabase.query() yields to the event loop while waiting for I/O — but an async function that runs a CPU-intensive computation (image transcoding, PDF generation, bcrypt hashing, ML inference) blocks the entire event loop thread until that computation returns. Every other pending tool call waits. The fix is worker threads: move the CPU-intensive work to a separate OS thread, return the result to the main thread as a Promise, and let the event loop continue handling other requests while the worker runs.

TL;DR

Install piscina, a managed Node.js worker thread pool. Create a worker file that exports the CPU-intensive function. In your MCP handler, call pool.run(args) instead of calling the function directly. pool.run() returns a Promise that resolves when the worker completes, so the main event loop thread is free to handle other tool calls while the worker runs. During graceful shutdown, call pool.close() to let in-flight tasks finish, or pool.destroy() to stop all workers immediately.

When to use worker threads in an MCP server

The rule is simple: if a synchronous operation takes more than a few milliseconds on the main thread, it blocks every other tool call for that duration. Identify these operations with CPU profiling — any function that appears wide in a flame graph under a tool handler is a candidate.

Operation	Typical duration	Use worker thread?
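bcrypt / argon2 hash (cost=12)	200–600ms	Yes for pure-JS or synchronous implementations (bcryptjs, hashSync); the native bcrypt and argon2 packages already hash on libuv's thread pool when called asynchronously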
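PDF generation (puppeteer/pdfkit)	500ms–5s	Yes for pdfkit (pure JS); puppeteer renders in a separate browser process, so awaiting it does not block the event loop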
Image resize / transcode (sharp)	50–500ms	Depends: sharp's own pipeline already runs on libuv's thread pool, but CPU-bound JavaScript around it (pixel loops, pure-JS codecs) still needs a worker thread
JSON.parse on large payload (>1MB)	5–50ms	Consider — profile first
Regex on large or untrusted input	1ms–∞ (ReDoS)	Yes — isolates catastrophic backtracking
Database query (postgres, sqlite)	0.1–100ms	No with an async driver (I/O-bound); synchronous drivers such as better-sqlite3 run queries on the calling thread and do block
HTTP fetch to external API	50–2000ms	No — I/O-bound, already async
Zod validation on small schema	<1ms	No — not worth the overhead
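
The sketch below makes the rule concrete. It is not part of the guide's server code: burnCpu and timerLateness are hypothetical helpers that show how a synchronous computation delays an unrelated timer, in the same way it would delay another tool call.

```typescript
// Hypothetical helper: spin for roughly `ms` milliseconds without yielding.
function burnCpu(ms: number): void {
  const end = performance.now() + ms;
  while (performance.now() < end) {
    // busy-wait: stands in for hashing, transcoding, parsing, ...
  }
}

// Resolves with how many milliseconds late a timer fired.
function timerLateness(requestedMs: number, work: () => void): Promise<number> {
  return new Promise((resolve) => {
    const start = performance.now();
    setTimeout(() => resolve(performance.now() - start - requestedMs), requestedMs);
    work(); // runs synchronously, so the timer cannot fire until it returns
  });
}

// A 5 ms timer scheduled just before 100 ms of synchronous work fires
// far later than requested, because the event loop never gets to run it.
timerLateness(5, () => burnCpu(100)).then((lateMs) => {
  console.log(`timer fired ${lateMs.toFixed(0)} ms late`);
});
```

Wrapping the work in an async function would not change the result: async only helps when the function actually awaits something that completes off the main thread.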

Detecting event loop blocking

Before reaching for worker threads, confirm that your handler is actually blocking the event loop. The simplest test: run two concurrent tool calls and verify the second one doesn't wait for the first to finish.

// test/blocking.test.ts — detects event loop blocking in tool handlers
import { it, expect } from 'vitest';
import { Client } from '@modelcontextprotocol/sdk/client/index.js';
import { InMemoryTransport } from '@modelcontextprotocol/sdk/inMemory.js';
import { createServer } from '../src/server.js';

it('concurrent tool calls are not serialized by event loop blocking', async () => {
  const [serverTransport, clientTransport] = InMemoryTransport.createLinkedPair();
  const server = createServer();
  await server.connect(serverTransport);
  const client = new Client({ name: 'test', version: '1.0.0' }, { capabilities: {} });
  await client.connect(clientTransport);

  // Start the slow call first, then time only the fast call.
  // Timing both with Promise.all would not detect blocking: the total is
  // dominated by the slow call whether or not it blocks the event loop.
  const slowCall = client.callTool({ name: 'cpu_intensive_tool', arguments: { size: 100 } });
  const start = performance.now();
  await client.callTool({ name: 'fast_tool', arguments: {} });
  const fastMs = performance.now() - start;
  await slowCall;

  // If cpu_intensive_tool takes 200ms and blocks, fast_tool is stuck behind it
  // and takes ≈ 200ms. If it does not block, fast_tool returns in a few ms.
  expect(fastMs).toBeLessThan(50); // adjust: well below the slow handler's duration
});
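
Outside of tests, Node's built-in perf_hooks module can report how long the event loop was stalled. The following is a minimal sketch, assuming a hypothetical helper named maxLoopDelayMs that wraps any piece of async work; the 10 ms resolution and the settle delays are illustrative choices, not recommendations from this guide.

```typescript
import { monitorEventLoopDelay } from 'node:perf_hooks';

const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));

// Runs `work` and returns the largest event loop delay sample (in ms) seen meanwhile.
async function maxLoopDelayMs(work: () => Promise<unknown>): Promise<number> {
  const histogram = monitorEventLoopDelay({ resolution: 10 }); // sample every 10 ms
  histogram.enable();
  try {
    await sleep(25); // let the sampler tick before the work starts
    await work();
    await sleep(25); // let the sampler tick again after the work ends
  } finally {
    histogram.disable();
  }
  return histogram.max / 1e6; // the histogram records nanoseconds
}

// Usage: 80 ms of synchronous work shows up as a sample of about that size.
maxLoopDelayMs(async () => {
  const end = performance.now() + 80;
  while (performance.now() < end) { /* simulate CPU-bound work */ }
}).then((ms) => console.log(`max event loop delay: ${ms.toFixed(1)} ms`));
```

An idle loop typically reports a value close to the sampling resolution, so look for samples far above it before moving a handler to a worker.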

piscina — managed worker thread pool

piscina manages a pool of worker threads, queues tasks, handles errors and crashes, and supports task cancellation via AbortController. It is the recommended way to use worker threads in Node.js MCP servers because it handles the boilerplate of pool sizing, message passing, and worker lifecycle.
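
For comparison, this is roughly the boilerplate a pool takes over, sketched with Node's built-in worker_threads module only. The inline worker source (eval: true) and the sumInWorker name are assumptions made to keep the example self-contained; a real project would point Worker at a compiled file.

```typescript
import { Worker } from 'node:worker_threads';

// Worker code as a string so the example fits in one file.
const workerSource = `
  const { parentPort, workerData } = require('node:worker_threads');
  let sum = 0;
  for (let i = 0; i < workerData.iterations; i++) sum += i % 7;
  parentPort.postMessage(sum);
`;

// Spawns a fresh worker per call and wraps its message/error events in a Promise.
function sumInWorker(iterations: number): Promise<number> {
  return new Promise((resolve, reject) => {
    const worker = new Worker(workerSource, { eval: true, workerData: { iterations } });
    worker.once('message', resolve);
    worker.once('error', reject);
    worker.once('exit', (code) => {
      // No-op if the Promise already settled; catches workers that die silently.
      if (code !== 0) reject(new Error(`Worker exited with code ${code}`));
    });
  });
}

sumInWorker(7).then((sum) => console.log(sum)); // 21
```

This version pays the thread startup cost on every call and has no queue, concurrency limit, or cancellation. Those are the parts a pool adds.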

npm install piscina

// src/workers/hash-worker.ts — worker file
// This file runs inside the worker thread.
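// Piscina calls the default export; a named export can be selected with the `name` option of pool.run().
// Note: the native bcrypt package already hashes on libuv's thread pool when called asynchronously,
// so a worker pays off most with pure-JS implementations such as bcryptjs or with synchronous calls.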
import bcrypt from 'bcrypt';

export default async function hashPassword(
  args: { password: string; rounds: number }
): Promise<string> {
  return bcrypt.hash(args.password, args.rounds);
}

// src/server.ts — main thread
import Piscina from 'piscina';
import { fileURLToPath } from 'url';
import path from 'path';
import { Server } from '@modelcontextprotocol/sdk/server/index.js';
import { CallToolRequestSchema, ListToolsRequestSchema } from '@modelcontextprotocol/sdk/types.js';

const __dirname = path.dirname(fileURLToPath(import.meta.url));

// Create the pool once at module level — not inside the handler
const hashPool = new Piscina({
  filename: path.join(__dirname, 'workers/hash-worker.js'), // compiled JS
  minThreads: 1,
  maxThreads: 4, // tune to (CPU cores - 1) to leave one for the event loop
  idleTimeout: 30_000, // terminate idle workers after 30s
});

const server = new Server(
  { name: 'auth-mcp', version: '1.0.0' },
  { capabilities: { tools: {} } }
);

server.setRequestHandler(ListToolsRequestSchema, async () => ({
  tools: [
    {
      name: 'hash_password',
      description: 'Hash a password with bcrypt. Use this before storing a new password.',
      inputSchema: {
        type: 'object',
        properties: {
          password: { type: 'string', description: 'Plaintext password to hash' },
          rounds: { type: 'number', description: 'bcrypt cost factor (10–14 typical)', default: 12 },
        },
        required: ['password'],
      },
    },
  ],
}));

server.setRequestHandler(CallToolRequestSchema, async (request) => {
  if (request.params.name === 'hash_password') {
    const { password, rounds = 12 } = request.params.arguments as {
      password: string;
      rounds?: number;
    };

    // pool.run() returns a Promise — the event loop is free while the worker hashes
    const hash = await hashPool.run({ password, rounds });

    return {
      content: [{ type: 'text', text: JSON.stringify({ hash }) }],
    };
  }
  throw new Error(`Unknown tool: ${request.params.name}`);
});
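
The maxThreads comment above suggests leaving one core for the event loop. A small sketch of that sizing rule follows; recommendedMaxThreads is a hypothetical helper, and os.cpus().length is only an approximation, since inside containers it can report the host's cores instead of the CPU quota.

```typescript
import * as os from 'node:os';

// Leave one core for the event loop, but never go below one worker.
function recommendedMaxThreads(cores: number = os.cpus().length): number {
  return Math.max(1, cores - 1);
}

console.log(recommendedMaxThreads(8)); // 7
console.log(recommendedMaxThreads(1)); // 1
```

The result can be passed as maxThreads when constructing the pool. Treat it as a starting point and adjust after measuring under realistic load.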

Task cancellation with AbortController

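Long-running worker tasks should support cancellation. Piscina accepts an AbortSignal in the options passed to pool.run(). If the signal fires before the task starts, the task is removed from the queue. If it fires while the task is running, Piscina terminates the worker thread that is running it and starts a replacement, so no cancellation logic is needed inside the worker, but any state that worker held is lost. In both cases the Promise returned by pool.run() rejects with an AbortError.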

// Cancel a worker task that is still queued or running after 10 seconds
server.setRequestHandler(CallToolRequestSchema, async (request) => {
  const { password, rounds = 12 } = request.params.arguments as {
    password: string;
    rounds?: number;
  };
  const controller = new AbortController();

  // Set a per-task timeout
  const timeout = setTimeout(() => controller.abort(), 10_000);

  try {
    const result = await hashPool.run(
      { password, rounds }, // the worker expects both fields
      { signal: controller.signal }
    );
    return { content: [{ type: 'text', text: JSON.stringify({ hash: result }) }] };
  } catch (err) {
    if (controller.signal.aborted) {
      return { content: [{ type: 'text', text: 'Hash operation timed out' }], isError: true };
    }
    return { content: [{ type: 'text', text: `Hash failed: ${String(err)}` }], isError: true };
  } finally {
    clearTimeout(timeout);
  }
});
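
The handler above only cancels on a timeout. If a per-request cancellation signal is also available (for example one supplied by your SDK version or transport, which is an assumption here), the two can be combined into a single signal. The helper name timeoutSignal and its return shape are hypothetical.

```typescript
// Returns a signal that aborts after `ms` milliseconds or when `parent` aborts,
// whichever happens first, plus a dispose() function to release the timer.
function timeoutSignal(
  ms: number,
  parent?: AbortSignal,
): { signal: AbortSignal; dispose: () => void } {
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(new Error(`Timed out after ${ms} ms`)), ms);
  const onParentAbort = () => controller.abort(parent?.reason);

  if (parent?.aborted) onParentAbort();
  else parent?.addEventListener('abort', onParentAbort, { once: true });

  return {
    signal: controller.signal,
    dispose: () => {
      clearTimeout(timer);
      parent?.removeEventListener('abort', onParentAbort);
    },
  };
}

// Usage: a client-side cancellation propagates to the combined signal.
const clientCancel = new AbortController();
const { signal, dispose } = timeoutSignal(10_000, clientCancel.signal);
clientCancel.abort();
console.log(signal.aborted); // true
dispose();
```

Pass the combined signal to pool.run() in place of controller.signal and call dispose() in the finally block, where the handler above calls clearTimeout.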

SharedArrayBuffer for zero-copy large data

By default, data passed between the main thread and a worker is serialized (structured clone) — copying the data. For large payloads (image buffers, large arrays), this copy is expensive. SharedArrayBuffer lets both threads read and write the same memory without copying.

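// Share a large Buffer with a worker: one copy into shared memory,
// then no structured-clone copy when the task is posted
import fs from 'node:fs/promises';

const imageBuffer = await fs.readFile('/path/to/image.jpg');

// Allocate shared memory of the same size
const sharedBuffer = new SharedArrayBuffer(imageBuffer.byteLength);
const shared = new Uint8Array(sharedBuffer);
// Copy from the Buffer view itself: imageBuffer.buffer may be a larger
// underlying ArrayBuffer, so wrapping it directly can copy the wrong bytes
shared.set(imageBuffer);

// Pass the SharedArrayBuffer: the worker sees the same memory, nothing is cloned
const result = await imagePool.run({ sharedBuffer, width: 800, height: 600 });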

// In the worker:
// export default function resize({ sharedBuffer, width, height }) {
//   const input = Buffer.from(sharedBuffer);
//   return sharp(input).resize(width, height).toBuffer();
// }

Use SharedArrayBuffer only when benchmarks confirm that serialization is a meaningful bottleneck. It adds complexity (shared-memory race conditions are possible if multiple workers write simultaneously) and requires a secure context in browsers (not relevant for server-side, but worth noting).
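
To see that the memory really is shared, the following sketch has a worker modify a SharedArrayBuffer in place and reads the change from the main thread. It uses only Node's built-in worker_threads module; the inline worker source and the invertInWorker name are assumptions made to keep the example in one file.

```typescript
import { Worker } from 'node:worker_threads';

// Worker code as a string: inverts every byte of the shared buffer in place.
const invertSource = `
  const { parentPort, workerData } = require('node:worker_threads');
  const pixels = new Uint8Array(workerData.shared);
  for (let i = 0; i < pixels.length; i++) pixels[i] = 255 - pixels[i];
  parentPort.postMessage('done');
`;

function invertInWorker(shared: SharedArrayBuffer): Promise<void> {
  return new Promise((resolve, reject) => {
    const worker = new Worker(invertSource, { eval: true, workerData: { shared } });
    worker.once('message', () => resolve());
    worker.once('error', reject);
  });
}

const sharedPixels = new SharedArrayBuffer(3);
const view = new Uint8Array(sharedPixels);
view.set([0, 10, 255]);

invertInWorker(sharedPixels).then(() => {
  // The main thread observes the worker's writes without receiving any data back.
  console.log(Array.from(view)); // [255, 245, 0]
});
```

Only the "done" message crosses the thread boundary. With a regular ArrayBuffer the worker would have modified its own clone and the main thread's view would be unchanged.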

Graceful worker shutdown

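Worker threads live inside the Node.js process, so calling process.exit() while a worker is mid-task kills that task abruptly. Before exiting, call pool.close(), which waits for in-flight tasks to finish and then stops the workers. pool.destroy() is the immediate alternative: it stops all workers at once and rejects the Promises of any pending tasks.

// Graceful shutdown: close the server transport first, then shut down worker pools
async function shutdown() {
  await server.close();         // stop accepting new MCP requests
  await hashPool.close();       // wait for in-flight tasks, then stop workers
  process.exit(0);
}

process.on('SIGTERM', shutdown);
process.on('SIGINT', shutdown);

If a worker is mid-task when close() is called, piscina waits for it to finish, up to the pool's closeTimeout option (in milliseconds). Set closeTimeout to your longest expected task time plus a safety margin, and fall back to pool.destroy() when you cannot wait. close() and closeTimeout are available in recent piscina releases; older versions only provide destroy().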
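
Because the same function is registered for both SIGTERM and SIGINT, it can be invoked more than once, and a stuck step can keep the process alive indefinitely. A possible hardening, sketched with a hypothetical createShutdown helper that takes the exit function as a parameter so it can be tested:

```typescript
type ShutdownStep = () => Promise<void>;

// Builds a shutdown function that runs `steps` in order at most once and
// calls exit(1) if they fail or do not finish within `deadlineMs`.
function createShutdown(
  steps: ShutdownStep[],
  exit: (code: number) => void,
  deadlineMs: number,
): () => Promise<void> {
  let running: Promise<void> | undefined;
  return () => {
    if (!running) {
      running = (async () => {
        const timer = setTimeout(() => exit(1), deadlineMs);
        try {
          for (const step of steps) await step();
          exit(0);
        } catch {
          exit(1);
        } finally {
          clearTimeout(timer);
        }
      })();
    }
    return running; // repeated signals share the same in-progress shutdown
  };
}

// Usage with stand-in steps; a second call does not run the steps again.
const demoShutdown = createShutdown(
  [async () => console.log('closing server'), async () => console.log('stopping pool')],
  (code) => console.log(`exit(${code})`),
  15_000,
);
void demoShutdown();
void demoShutdown();
```

In a server, the steps would be the server close call and the pool shutdown call, and exit would be process.exit.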

Error handling in workers

Worker thread errors propagate to the main thread as rejected Promises from pool.run(). Catch them in the tool handler and return an isError: true result. If the rejection escapes the handler, the MCP SDK turns it into a JSON-RPC error response (code -32603, internal error), which clients typically treat as a protocol-level failure instead of a tool result the model can read and retry.

try {
  const result = await hashPool.run({ password, rounds: 12 }); // the worker expects both fields
  return { content: [{ type: 'text', text: JSON.stringify({ hash: result }) }] };
} catch (err) {
  // Worker threw — log for diagnostics, return isError so LLM can retry
  console.error(JSON.stringify({ event: 'worker_error', tool: 'hash_password', err: String(err) }));
  return {
    content: [{ type: 'text', text: 'Password hashing failed. Please retry.' }],
    isError: true,
  };
}

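When a worker dies on its own (an uncaught exception, or exceeding the memory limits set through the worker's resourceLimits), piscina terminates it and starts a replacement automatically. The pending task fails with an error, which your handler catches and converts to an isError: true response. Native crashes such as a segfault in an addon are different: worker threads share the process, so the whole server goes down and only an external supervisor can restart it.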
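
When several tools delegate to workers, the try/catch pattern can be factored into one place. The following is a sketch only: runAsToolResult and the ToolResult type are hypothetical names, and the result shape simply mirrors the objects returned by the handlers in this guide.

```typescript
type ToolResult = {
  content: Array<{ type: 'text'; text: string }>;
  isError?: boolean;
};

// Runs `task` and converts either outcome into a tool result object.
async function runAsToolResult<T>(
  tool: string,
  task: () => Promise<T>,
  format: (value: T) => string,
): Promise<ToolResult> {
  try {
    const value = await task();
    return { content: [{ type: 'text', text: format(value) }] };
  } catch (err) {
    // Log details for diagnostics; return a generic message to the client.
    console.error(JSON.stringify({ event: 'worker_error', tool, err: String(err) }));
    return {
      content: [{ type: 'text', text: `${tool} failed. Please retry.` }],
      isError: true,
    };
  }
}

// Usage with a stand-in task that rejects, as a crashed worker would.
runAsToolResult('hash_password', () => Promise.reject(new Error('worker crashed')), String)
  .then((result) => console.log(result.isError)); // true
```

In a handler, the task would be the pool.run() call and format would serialize its return value.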