
MCP server worker threads

Node.js runs tool handler code on a single event-loop thread. An async function that calls await someDatabase.query() yields to the event loop while waiting for I/O — but an async function that runs a CPU-intensive computation (image transcoding, PDF generation, bcrypt hashing, ML inference) blocks the entire event loop thread until that computation returns. Every other pending tool call waits. The fix is worker threads: move the CPU-intensive work to a separate OS thread, return the result to the main thread as a Promise, and let the event loop continue handling other requests while the worker runs.

TL;DR

Install piscina, a managed Node.js worker thread pool. Create a worker file that exports the CPU-intensive function. In your MCP handler, call pool.run(args) instead of calling the function directly. pool.run() returns a Promise that resolves when the worker completes, so the main event loop thread is free to handle other tool calls while the worker runs. During graceful shutdown, call pool.close() (available in recent piscina releases) to let in-flight tasks finish, or pool.destroy() to stop all workers immediately.

When to use worker threads in an MCP server

The rule is simple: if a synchronous operation takes more than a few milliseconds on the main thread, it blocks every other tool call for that duration. Identify these operations with CPU profiling — any function that appears wide in a flame graph under a tool handler is a candidate.

Operation | Typical duration | Use worker thread?
bcrypt / argon2 hash (cost=12) | 200–600 ms | Yes, always
PDF generation (puppeteer/pdfkit) | 500 ms–5 s | Yes, always
Image resize / transcode (sharp) | 50–500 ms | Consider: sharp already runs libvips on libuv's thread pool, so only CPU-bound JavaScript around it needs a worker; profile first
JSON.parse on large payload (>1 MB) | 5–50 ms | Consider, profile first
Regex on large or untrusted input | 1 ms–∞ (ReDoS) | Yes, isolates catastrophic backtracking
Database query (postgres, sqlite) | 0.1–100 ms | No, I/O-bound and already async
HTTP fetch to external API | 50–2000 ms | No, I/O-bound and already async
Zod validation on small schema | <1 ms | No, not worth the overhead
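
The rule above can be observed in isolation. The following sketch uses only Node.js built-ins: it schedules a 0 ms timer and then spins synchronously. The names busyWait and timerLateness are hypothetical helpers, and the busy loop merely stands in for real CPU-bound work.

```typescript
import { performance } from 'node:perf_hooks';

// Spins on the current thread for roughly `ms` milliseconds.
// Stand-in for real CPU-bound work such as hashing or transcoding.
export function busyWait(ms: number): void {
  const end = performance.now() + ms;
  while (performance.now() < end) {
    // intentionally empty: burn CPU without yielding to the event loop
  }
}

// Schedules a 0 ms timer, then blocks synchronously for `blockMs`.
// Resolves with the time that elapsed before the timer callback ran.
export function timerLateness(blockMs: number): Promise<number> {
  return new Promise((resolve) => {
    const scheduledAt = performance.now();
    setTimeout(() => resolve(performance.now() - scheduledAt), 0);
    busyWait(blockMs); // the timer callback cannot run until this returns
  });
}
```

Calling timerLateness(100) should resolve with a value of at least 100 even though the timer was due immediately. Every pending tool call on the same thread experiences the same delay.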

Detecting event loop blocking

Before reaching for worker threads, confirm that your handler is actually blocking the event loop. The simplest test: run two concurrent tool calls and verify the second one doesn't wait for the first to finish.

// test/blocking.test.ts — detects event loop blocking in tool handlers
import { it, expect } from 'vitest';
import { Client } from '@modelcontextprotocol/sdk/client/index.js';
import { InMemoryTransport } from '@modelcontextprotocol/sdk/inMemory.js';
import { createServer } from '../src/server.js';

it('concurrent tool calls are not serialized by event loop blocking', async () => {
  const [serverTransport, clientTransport] = InMemoryTransport.createLinkedPair();
  const server = createServer();
  await server.connect(serverTransport);
  const client = new Client({ name: 'test', version: '1.0.0' }, { capabilities: {} });
  await client.connect(clientTransport);

  // Time the fast call while the slow call is in flight.
  // If the slow handler blocks the event loop, fast_tool cannot be served until
  // it returns, so fast_tool's latency ≈ the slow handler's duration.
  // If the slow handler is non-blocking, fast_tool returns almost immediately.
  const start = performance.now();
  const slow = client.callTool({ name: 'cpu_intensive_tool', arguments: { size: 100 } });
  await client.callTool({ name: 'fast_tool', arguments: {} });
  const fastMs = performance.now() - start;
  await slow; // let the slow call finish before the test ends

  // Example: cpu_intensive_tool takes ~200ms. Blocked, fast_tool also takes ~200ms;
  // unblocked, it takes a few milliseconds. Timing Promise.all over both calls
  // cannot tell these cases apart, because the total is ~200ms either way.
  expect(fastMs).toBeLessThan(50); // adjust to a fraction of your slow handler's time
});
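
Outside of tests, the built-in perf_hooks module can report event loop delay directly. This is a minimal sketch, assuming you take snapshots periodically or around a load test. startLoopDelayMonitor is a hypothetical helper name, and the 10 ms sampling resolution is an arbitrary choice.

```typescript
import { monitorEventLoopDelay } from 'node:perf_hooks';

// Starts sampling event loop delay and returns helpers to read and stop it.
export function startLoopDelayMonitor(resolutionMs = 10) {
  const histogram = monitorEventLoopDelay({ resolution: resolutionMs });
  histogram.enable();
  return {
    // Histogram values are reported in nanoseconds; convert to milliseconds.
    snapshot() {
      return {
        p99Ms: histogram.percentile(99) / 1e6,
        maxMs: histogram.max / 1e6,
      };
    },
    // Stops sampling so the monitor no longer schedules its internal timer.
    stop() {
      histogram.disable();
    },
  };
}
```

A p99 or max far above the sampling resolution while a tool is running suggests that some handler is holding the event loop.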

piscina — managed worker thread pool

piscina manages a pool of worker threads, queues tasks, handles errors and crashes, and supports task cancellation via AbortController. It is the recommended way to use worker threads in Node.js MCP servers because it handles the boilerplate of pool sizing, message passing, and worker lifecycle.
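
For a sense of the boilerplate that piscina takes over, here is a sketch of a single-task offload using only the built-in worker_threads module. The helper name fibInWorker and the Fibonacci workload are illustrative, and the worker source is inlined as a string only to keep the sketch in one file. It starts a fresh thread per call and has no queueing, reuse, or pool sizing, which is what a pool adds.

```typescript
import { Worker } from 'node:worker_threads';

// Worker source as plain JavaScript (eval workers run as CommonJS).
// A real project would point `new Worker()` at a compiled worker file instead.
const fibWorkerSource = `
  const { parentPort, workerData } = require('node:worker_threads');
  // Deliberately CPU-bound: naive recursive Fibonacci.
  const fib = (n) => (n < 2 ? n : fib(n - 1) + fib(n - 2));
  parentPort.postMessage(fib(workerData.n));
`;

// Hypothetical helper: runs one task in a fresh worker and settles a Promise.
export function fibInWorker(n: number): Promise<number> {
  return new Promise((resolve, reject) => {
    const worker = new Worker(fibWorkerSource, { eval: true, workerData: { n } });
    worker.once('message', resolve); // result posted by the worker
    worker.once('error', reject);    // uncaught exception inside the worker
    worker.once('exit', (code) => {
      // Rejecting an already-settled Promise is a no-op, so this only
      // matters when the worker exits abnormally without posting a result.
      if (code !== 0) reject(new Error(`Worker exited with code ${code}`));
    });
  });
}
```

While fibInWorker(40) runs, timers and other handlers on the main thread keep executing, because the recursion happens on the worker's thread.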

npm install piscina bcrypt
// src/workers/hash-worker.ts — worker file
// This file runs inside the worker thread.
// It must export a default function (or export named functions for piscina).
import bcrypt from 'bcrypt';

export default async function hashPassword(
  args: { password: string; rounds: number }
): Promise<string> {
  return bcrypt.hash(args.password, args.rounds);
}
// src/server.ts — main thread
import Piscina from 'piscina';
import { fileURLToPath } from 'url';
import path from 'path';
import { Server } from '@modelcontextprotocol/sdk/server/index.js';
import { CallToolRequestSchema, ListToolsRequestSchema } from '@modelcontextprotocol/sdk/types.js';

const __dirname = path.dirname(fileURLToPath(import.meta.url));

// Create the pool once at module level — not inside the handler
const hashPool = new Piscina({
  filename: path.join(__dirname, 'workers/hash-worker.js'), // compiled JS
  minThreads: 1,
  maxThreads: 4, // tune to (CPU cores - 1) to leave one for the event loop
  idleTimeout: 30_000, // terminate idle workers after 30s
});

const server = new Server(
  { name: 'auth-mcp', version: '1.0.0' },
  { capabilities: { tools: {} } }
);

server.setRequestHandler(ListToolsRequestSchema, async () => ({
  tools: [
    {
      name: 'hash_password',
      description: 'Hash a password with bcrypt. Use this before storing a new password.',
      inputSchema: {
        type: 'object',
        properties: {
          password: { type: 'string', description: 'Plaintext password to hash' },
          rounds: { type: 'number', description: 'bcrypt cost factor (10–14 typical)', default: 12 },
        },
        required: ['password'],
      },
    },
  ],
}));

server.setRequestHandler(CallToolRequestSchema, async (request) => {
  if (request.params.name === 'hash_password') {
    const { password, rounds = 12 } = request.params.arguments as {
      password: string;
      rounds?: number;
    };

    // pool.run() returns a Promise — the event loop is free while the worker hashes
    const hash = await hashPool.run({ password, rounds });

    return {
      content: [{ type: 'text', text: JSON.stringify({ hash }) }],
    };
  }
  throw new Error(`Unknown tool: ${request.params.name}`);
});

Task cancellation with AbortController

Long-running worker tasks should support cancellation. Piscina's run() accepts an AbortSignal. If the signal fires before the task starts, the task is removed from the queue. If it fires while the task is running, piscina (per its documentation) terminates the worker thread that is executing the task and starts a replacement, so the worker function gets no chance to clean up. In both cases the Promise returned by pool.run() rejects.

// Abort a worker task that exceeds a per-task timeout
server.setRequestHandler(CallToolRequestSchema, async (request) => {
  const { password, rounds = 12 } = request.params.arguments as {
    password: string;
    rounds?: number;
  };
  const controller = new AbortController();

  // Set a per-task timeout
  const timeout = setTimeout(() => controller.abort(), 10_000);

  try {
    const result = await hashPool.run(
      { password, rounds },
      { signal: controller.signal }
    );
    return { content: [{ type: 'text', text: JSON.stringify({ hash: result }) }] };
  } catch (err) {
    if (controller.signal.aborted) {
      return { content: [{ type: 'text', text: 'Hash operation timed out' }], isError: true };
    }
    return { content: [{ type: 'text', text: `Hash failed: ${String(err)}` }], isError: true };
  } finally {
    clearTimeout(timeout);
  }
});
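
The handler above only enforces a timeout. To also stop work when the caller cancels, the timeout can be linked to an upstream AbortSignal. This sketch assumes the handler can obtain such a signal; recent versions of the MCP TypeScript SDK appear to pass one as extra.signal in the handler's second argument, but verify that against your SDK version. linkedTimeoutSignal is a hypothetical helper, not part of piscina or the SDK.

```typescript
// Returns a signal that aborts when `parent` aborts or after `ms` milliseconds,
// whichever comes first. Call dispose() once the task has settled.
export function linkedTimeoutSignal(
  parent: AbortSignal | undefined,
  ms: number
): { signal: AbortSignal; dispose: () => void } {
  const controller = new AbortController();
  const timer = setTimeout(
    () => controller.abort(new Error(`Timed out after ${ms} ms`)),
    ms
  );
  const onParentAbort = () => controller.abort(parent?.reason);

  if (parent?.aborted) {
    onParentAbort(); // already cancelled before we started
  } else {
    parent?.addEventListener('abort', onParentAbort, { once: true });
  }

  return {
    signal: controller.signal,
    dispose: () => {
      clearTimeout(timer);
      parent?.removeEventListener('abort', onParentAbort);
    },
  };
}
```

In a handler, the returned signal would go where controller.signal goes in the example above, with dispose() taking the place of clearTimeout(timeout) in the finally block.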

SharedArrayBuffer for zero-copy large data

By default, data passed between the main thread and a worker is serialized (structured clone) — copying the data. For large payloads (image buffers, large arrays), this copy is expensive. SharedArrayBuffer lets both threads read and write the same memory without copying.

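// Pass a large Buffer to a worker through shared memory
import fs from 'node:fs/promises';

const imageBuffer = await fs.readFile('/path/to/image.jpg');

// Allocate shared memory of the same size and copy the bytes in once
const sharedBuffer = new SharedArrayBuffer(imageBuffer.byteLength);
const shared = new Uint8Array(sharedBuffer);
// A Buffer is a Uint8Array, so set() copies exactly byteLength bytes.
// (imageBuffer.buffer can be a larger pooled ArrayBuffer, so do not copy from it.)
shared.set(imageBuffer);

// Pass the SharedArrayBuffer: posting it to the worker does not copy it again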
const result = await imagePool.run({ sharedBuffer, width: 800, height: 600 });

// In the worker:
// export default function resize({ sharedBuffer, width, height }) {
//   const input = Buffer.from(sharedBuffer);
//   return sharp(input).resize(width, height).toBuffer();
// }

Use SharedArrayBuffer only when benchmarks confirm that serialization is a meaningful bottleneck. It adds complexity (shared-memory race conditions are possible if multiple workers write simultaneously) and requires a secure context in browsers (not relevant for server-side, but worth noting).
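
To make the sharing visible, here is a small sketch using only the built-in worker_threads module, in which a worker modifies bytes in place and the main thread sees the change without any data being sent back. The helper invertInWorker and the byte-inversion workload are illustrative, and the worker source is inlined as a string only to keep the sketch self-contained.

```typescript
import { Worker } from 'node:worker_threads';

// Worker source as plain JavaScript (eval workers run as CommonJS).
const invertWorkerSource = `
  const { parentPort, workerData } = require('node:worker_threads');
  // This view points at the same memory the main thread allocated.
  const view = new Uint8Array(workerData.shared);
  for (let i = 0; i < view.length; i++) view[i] = 255 - view[i];
  parentPort.postMessage('done'); // only a small completion message is sent
`;

// Hypothetical helper: inverts every byte of `shared` on a worker thread.
export function invertInWorker(shared: SharedArrayBuffer): Promise<void> {
  return new Promise((resolve, reject) => {
    const worker = new Worker(invertWorkerSource, {
      eval: true,
      workerData: { shared }, // a SharedArrayBuffer is shared, not cloned
    });
    worker.once('message', () => resolve());
    worker.once('error', reject);
  });
}
```

After invertInWorker resolves, a Uint8Array view over the same SharedArrayBuffer on the main thread holds the inverted bytes. Only one thread writes at a time here, which sidesteps the race conditions mentioned above.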

Graceful worker shutdown

Worker threads live inside the main process: when the process exits, any task still running in a worker is killed mid-flight. Shut the pool down explicitly in your graceful shutdown handler. Per piscina's documentation, pool.destroy() stops all workers immediately and rejects the Promises of pending tasks, while pool.close() waits for tasks that have already started before stopping the workers.

// Graceful shutdown: close the server transport first, then shut down worker pools
async function shutdown() {
  await server.close();         // stop accepting new MCP requests
  await hashPool.close();       // let in-flight tasks finish, then stop workers
  process.exit(0);
}

process.on('SIGTERM', shutdown);
process.on('SIGINT', shutdown);

If a worker is mid-task when close() is called, piscina waits for it to finish, up to the pool's closeTimeout option. Set closeTimeout to match your longest expected task time plus a safety margin, and fall back to destroy() when the process must exit immediately.

Error handling in workers

Worker thread errors propagate to the main thread as rejected Promises from pool.run(). Catch them in the tool handler and return isError: true responses. If the rejection escapes the handler, the MCP SDK turns it into a JSON-RPC error response (code -32603, Internal error) instead of a tool result, which gives an LLM client nothing actionable to recover from.

try {
  const result = await hashPool.run({ password });
  return { content: [{ type: 'text', text: JSON.stringify({ hash: result }) }] };
} catch (err) {
  // Worker threw — log for diagnostics, return isError so LLM can retry
  console.error(JSON.stringify({ event: 'worker_error', tool: 'hash_password', err: String(err) }));
  return {
    content: [{ type: 'text', text: 'Password hashing failed. Please retry.' }],
    isError: true,
  };
}

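When a worker fails with an uncaught exception or exceeds its configured resourceLimits (for example, the V8 heap limit), piscina replaces the worker thread automatically. The pending task fails with an error, which your handler catches and converts to an isError: true response. A hard crash in native code, such as a segfault, is different: worker threads share a process with the main thread, so it takes the whole server down. Code that can crash this way belongs in a child process.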
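
When several tools call into worker pools, the same try/catch can be factored into one wrapper. A minimal sketch under stated assumptions: runTool and the ToolResult type are hypothetical names that mirror the response shape used in this guide, and the task argument would typically be a closure around pool.run().

```typescript
// Shape of the tool responses shown in this guide (text content only).
type ToolResult = {
  content: { type: 'text'; text: string }[];
  isError?: boolean;
};

// Runs `task` and converts both outcomes into a tool result.
// A rejection is logged and reported with isError: true instead of being rethrown.
export async function runTool(
  toolName: string,
  task: () => Promise<unknown>
): Promise<ToolResult> {
  try {
    const value = await task();
    return { content: [{ type: 'text', text: JSON.stringify(value) }] };
  } catch (err) {
    // Log details for diagnostics; keep the client-facing message generic.
    console.error(
      JSON.stringify({ event: 'worker_error', tool: toolName, err: String(err) })
    );
    return {
      content: [{ type: 'text', text: `${toolName} failed. Please retry.` }],
      isError: true,
    };
  }
}
```

A handler could then return runTool('hash_password', () => hashPool.run({ password, rounds })), keeping the error policy in one place.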

Related questions

How many worker threads should I create?

Start with maxThreads = Math.max(1, os.cpus().length - 1), which leaves one CPU for the event loop. Worker threads compete with the event loop for CPU time; too many workers starve the event loop and increase context-switching overhead. For I/O-bound work (network calls inside a worker), you can have more threads than CPUs because they will frequently be waiting. For CPU-bound work (hashing, image processing), keep the thread count at or just below the CPU count. Then benchmark under realistic load, watching per-core CPU utilization and p99 latency, to find the right number for your workload.
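
The starting point above, written as a small helper. poolSize and defaultMaxThreads are hypothetical names; the formula is the one from the text, with a guard for machines that report a single CPU (or none).

```typescript
import { cpus } from 'node:os';

// Number of worker threads to start with: all CPUs minus the ones reserved
// for the event loop, but never fewer than one worker.
export function poolSize(cpuCount: number, reservedForEventLoop = 1): number {
  return Math.max(1, Math.floor(cpuCount) - reservedForEventLoop);
}

// Value for this machine, suitable as an initial maxThreads setting.
export const defaultMaxThreads = poolSize(cpus().length);
```

On an 8-core machine this gives 7 workers, and on a single-core container it still gives 1. Treat the result as a starting value to tune under load, not a final setting.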

Can I use ES modules in worker files?

Yes. All currently supported Node.js releases can load ES modules in worker threads. Pass the worker file path to piscina with the compiled .js extension (or .mjs for explicit ESM). If you're using TypeScript, compile the worker file separately or use a path that resolves to the compiled output. Piscina supports both CommonJS and ESM workers; the module type is determined by the file extension or by type: 'module' in package.json.

Should I use worker threads or child processes?

Worker threads (via the worker_threads module or piscina) are the right choice for CPU-intensive work inside a single server process. They run in the same process as the main thread, start faster than child processes, and can share memory through SharedArrayBuffer. Each worker still has its own V8 isolate and heap, so ordinary objects are copied between threads. Child processes (child_process.fork() or spawn()) are better for running external programs or for complete isolation where a crash in the child should not affect the parent. For MCP servers, worker threads are almost always the right choice for CPU offloading.

Does Sharp (image processing) need worker threads?

Sharp uses native bindings (libvips) that run in libuv's thread pool, so basic image operations do not block the Node.js event loop by themselves. However, the JavaScript wrapper code (constructing the Sharp pipeline, processing results) does run on the event loop. For very high-throughput image-processing MCP tools, offloading to a worker thread isolates the full Sharp pipeline — JS wrapper and all — from your event loop. Profile with 0x first to confirm the wrapper overhead is significant before adding worker-thread complexity.

Further reading