MCP server webhook

MCP servers interact with webhooks in two distinct roles: as a sender — firing outbound HTTP POST notifications from tool handlers when significant events occur — and as a receiver — exposing a separate HTTP endpoint that third-party services call to trigger tool execution or push data into the server. Both patterns are common in production MCP servers. The key architectural rule is that webhook logic lives outside the MCP session layer: outbound delivery happens after the tool handler returns its result; inbound delivery happens at a plain HTTP endpoint that never touches the initialize handshake.

TL;DR

For outbound webhooks from tool handlers: fire-and-forget with a durable retry queue (never await the HTTP call inside the handler — this blocks the MCP session and inflates probe latency). For inbound webhooks that trigger tool logic: expose a separate /webhook HTTP route, verify the HMAC signature before processing, and enqueue work rather than processing synchronously. AliveMCP monitors your MCP server's initialize endpoint — if outbound delivery failures are causing tool handlers to time out, the resulting latency spike will show up in probe metrics before users notice.

Outbound webhooks from tool handlers

A common pattern is notifying external systems when a tool call completes — for example, posting a Slack message when a document is processed, or calling a user-supplied callback URL when a long-running job finishes. The critical rule: never await the webhook delivery inside the tool handler. Blocking on an outbound HTTP call inflates tool-call latency and, if the target is slow or down, can cause the MCP session to time out.

The correct pattern is fire-and-forget with a background queue:

// webhook-queue.ts — simple in-process retry queue
import { setTimeout as sleep } from 'node:timers/promises';
import { createHmac } from 'node:crypto';

interface WebhookJob {
  url: string;
  payload: unknown;
  secret: string;
  attempts: number;
}

const queue: WebhookJob[] = [];
let running = false;

export function enqueueWebhook(url: string, payload: unknown, secret: string) {
  queue.push({ url, payload, secret, attempts: 0 });
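  if (!running) void drainQueue();
}

async function drainQueue() {
  running = true;
  try {
    while (queue.length > 0) {
      const job = queue.shift()!;
      try {
        await deliverWithRetry(job);
      } catch (err) {
        // For example, JSON.stringify threw on the payload: log it and keep draining
        console.error({ event: 'webhook_delivery_error', url: job.url, err });
      }
    }
  } finally {
    // Always reset the flag so a later enqueueWebhook call can restart the loop
    running = false;
  }
}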

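const MAX_ATTEMPTS = 5;

async function deliverWithRetry(job: WebhookJob) {
  const body = JSON.stringify(job.payload);
  // Sign the exact string that is sent so the receiver can verify the raw bytes
  const sig = createHmac('sha256', job.secret).update(body).digest('hex');

  for (let attempt = 0; attempt < MAX_ATTEMPTS; attempt++) {
    try {
      const res = await fetch(job.url, {
        method: 'POST',
        headers: {
          'Content-Type': 'application/json',
          'X-Webhook-Signature': `sha256=${sig}`,
          'X-Webhook-Attempt': String(attempt + 1),
        },
        body,
        signal: AbortSignal.timeout(10_000),
      });
      if (res.ok) return;
      // Most 4xx responses are permanent (bad URL, auth, schema mismatch): do not retry.
      // 408 (Request Timeout) and 429 (Too Many Requests) are transient, so they are retried.
      const transient4xx = res.status === 408 || res.status === 429;
      if (res.status >= 400 && res.status < 500 && !transient4xx) {
        console.error({ event: 'webhook_permanent_failure', url: job.url, status: res.status });
        return;
      }
      // 5xx, 408 and 429 fall through to the backoff below
    } catch (err) {
      // Network error or timeout: fall through and retry
    }
    // Exponential backoff (1s, 2s, 4s, 8s) plus up to 500 ms of jitter.
    // There is nothing to wait for after the final attempt.
    if (attempt < MAX_ATTEMPTS - 1) {
      await sleep(Math.min(1000 * 2 ** attempt, 30_000) + Math.random() * 500);
    }
  }
  console.error({ event: 'webhook_exhausted', url: job.url });
}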

In the tool handler, call enqueueWebhook after preparing the result and return immediately. The queue drains asynchronously without blocking the MCP session:

server.tool(
  'process_document',
  'Process a document and notify the callback URL when done',
  {
    document_url: z.string().url(),
    callback_url: z.string().url().optional(),
  },
  async (args) => {
    const result = await processDocument(args.document_url);

    // Enqueue outbound webhook — does NOT await delivery
    if (args.callback_url) {
      enqueueWebhook(args.callback_url, {
        event: 'document.processed',
        document_url: args.document_url,
        result,
        timestamp: new Date().toISOString(),
      }, process.env.WEBHOOK_SECRET!);
    }

    return { content: [{ type: 'text', text: JSON.stringify(result) }] };
  }
);

HMAC signature verification

Every outbound webhook should include an HMAC signature so receivers can verify that the payload came from your server and was not modified in transit. Compute HMAC-SHA256 over the raw request body with a shared secret. The receiver recomputes the signature over the bytes it received and compares the two values in constant time:

// Receiver-side signature verification (Express example)
import express from 'express';
import { createHmac, timingSafeEqual } from 'node:crypto';

const app = express();

export function verifyWebhookSignature(
  rawBody: Buffer,
  signatureHeader: string,
  secret: string
): boolean {
  const expected = Buffer.from(
    'sha256=' + createHmac('sha256', secret).update(rawBody).digest('hex')
  );
  const received = Buffer.from(signatureHeader);
  // timingSafeEqual throws if the byte lengths differ, so compare byte lengths first
  if (expected.length !== received.length) return false;
  return timingSafeEqual(expected, received);
}

app.post('/webhook/callback', express.raw({ type: 'application/json' }), (req, res) => {
  const sig = req.headers['x-webhook-signature'] as string;
  if (!sig || !verifyWebhookSignature(req.body, sig, process.env.WEBHOOK_SECRET!)) {
    return res.status(401).json({ error: 'invalid signature' });
  }
  const payload = JSON.parse(req.body.toString());
  // Process the verified payload
  res.status(200).json({ received: true });
});

Three signature mistakes to avoid: (1) verifying against the parsed JSON body instead of the raw bytes, because re-serializing JSON can change whitespace, key order, and number formatting, which changes the bytes being signed; (2) comparing signatures with string equality (===) instead of timingSafeEqual, because a comparison that exits at the first mismatch leaks timing information that an attacker can use to guess a valid signature byte by byte; (3) sharing one secret across multiple webhook consumers. Issue a separate secret per consumer, and rotate each one independently, so that a compromised consumer does not expose the rest.
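The first mistake is easy to reproduce. The sketch below signs a raw body and then signs the result of parsing and re-serializing the same JSON; the helper name `sign`, the sample body, and the secret are all placeholders chosen for illustration.

```typescript
import { createHmac } from 'node:crypto';

// Hypothetical helper: hex-encoded HMAC-SHA256 of a body under a shared secret.
export function sign(body: string | Buffer, secret: string): string {
  return createHmac('sha256', secret).update(body).digest('hex');
}

// The sender signs exactly these bytes (note the extra spaces).
const rawBody = Buffer.from('{"b": 1,  "a": 2}');
const secret = 'example-secret'; // placeholder value

// Signature over the bytes as received: this is what the sender computed.
const overRaw = sign(rawBody, secret);

// Parse-then-stringify normalizes whitespace, so the bytes are different
// and the signature over them no longer matches.
const reserialized = JSON.stringify(JSON.parse(rawBody.toString('utf8')));
const overReserialized = sign(reserialized, secret);

console.log(reserialized);                 // {"b":1,"a":2}
console.log(overRaw === overReserialized); // false
```

This is why the Express examples in this guide mount express.raw on webhook routes: the handler sees the original Buffer and parses it only after the signature check passes.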

Receiving inbound webhooks to trigger tool logic

Some MCP servers receive webhooks from external services — for example, a GitHub webhook that triggers a tool to run CI checks, or a Stripe webhook that updates a billing record. The inbound webhook endpoint is a plain Express route completely separate from the MCP transport — it never goes through the initialize handshake. This means inbound webhook processing does not affect probe metrics (AliveMCP probes /mcp, not /webhook) and webhook failures do not appear in session-level errors.

// server.ts — separate webhook route alongside MCP transport
import express from 'express';
import { McpServer } from '@modelcontextprotocol/sdk/server/mcp.js';
import { StreamableHTTPServerTransport } from '@modelcontextprotocol/sdk/server/streamableHttp.js';
import { verifyWebhookSignature } from './webhook-verify.js';
import { webhookEventQueue } from './event-queue.js';

const app = express();

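// MCP transport route (stateless: a fresh server and transport per request)
app.post('/mcp', async (req, res) => {
  const server = new McpServer({ name: 'my-server', version: '1.0.0' });
  // sessionIdGenerator: undefined disables session IDs, which suits a per-request transport
  const transport = new StreamableHTTPServerTransport({ sessionIdGenerator: undefined });
  // Release the per-request server and transport when the client disconnects
  res.on('close', () => {
    void transport.close();
    void server.close();
  });
  await server.connect(transport);
  await transport.handleRequest(req, res);
});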

// Inbound webhook route — completely separate from MCP
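app.post('/webhook/github', express.raw({ type: 'application/json' }), async (req, res) => {
  const sig = req.headers['x-hub-signature-256'];
  // Reject requests with a missing header before attempting verification
  if (typeof sig !== 'string' || !verifyWebhookSignature(req.body, sig, process.env.GITHUB_WEBHOOK_SECRET!)) {
    return res.status(401).end();
  }
  // Respond immediately: GitHub treats the delivery as failed if no response arrives within 10 seconds
  res.status(200).json({ received: true });

  // Process asynchronously after responding; errors here can no longer change the response
  try {
    const event = JSON.parse(req.body.toString());
    await webhookEventQueue.push({ source: 'github', event });
  } catch (err) {
    console.error({ event: 'webhook_enqueue_failed', source: 'github', err });
  }
});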

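Always respond to inbound webhooks within the platform's timeout (GitHub: 10s, Stripe: 30s, most others: 5-30s). Enqueue the work and process it after sending the response. Even if your processing logic is synchronous and fast, respond with 200 first and then process. Many platforms redeliver after a timeout or a non-2xx response, so a slow handler produces duplicate events; GitHub instead records the delivery as failed and does not redeliver it automatically, so a slow handler there shows up as a failed delivery even though your server received the event.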
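The route above imports `webhookEventQueue` from `./event-queue.js` without showing it. As a rough sketch of what such a module could look like, assuming an in-process queue is acceptable, the class below accepts events immediately and hands them to a handler one at a time. The names `InProcessEventQueue`, `WebhookEvent`, and `EventHandler` are hypothetical, and the handler body is a placeholder.

```typescript
// event-queue.ts (hypothetical): a minimal in-process event queue.
export interface WebhookEvent {
  source: string;
  event: unknown;
}

export type EventHandler = (item: WebhookEvent) => Promise<void>;

export class InProcessEventQueue {
  private readonly items: WebhookEvent[] = [];
  private readonly handler: EventHandler;
  private draining = false;

  constructor(handler: EventHandler) {
    this.handler = handler;
  }

  // Number of events accepted but not yet handed to the handler.
  get pending(): number {
    return this.items.length;
  }

  // Resolves as soon as the event is accepted. Processing starts
  // asynchronously, after the caller's synchronous code has finished.
  async push(item: WebhookEvent): Promise<void> {
    this.items.push(item);
    void Promise.resolve().then(() => this.drain());
  }

  private async drain(): Promise<void> {
    if (this.draining) return; // another drain loop is already running
    this.draining = true;
    try {
      while (true) {
        const next = this.items.shift();
        if (next === undefined) break;
        try {
          await this.handler(next);
        } catch (err) {
          // One failing event must not stop the rest of the queue.
          console.error({ event: 'webhook_event_failed', source: next.source, err });
        }
      }
    } finally {
      this.draining = false;
    }
  }
}

// Example wiring; replace the handler body with real processing.
export const webhookEventQueue = new InProcessEventQueue(async (item) => {
  console.log({ event: 'webhook_event_processing', source: item.source });
});
```

Like the outbound queue earlier in this guide, this loses pending events on process restart, so it only suits events you can afford to drop or re-fetch.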

Retry policy design

The retry policy for outbound webhooks should be proportional to the impact of missed deliveries. Use exponential backoff with jitter to avoid thundering-herd retries when a downstream system comes back online after an outage:

Attempt	Delay	Cumulative wait
1 (immediate)	—	0s
2	1s + jitter(0-500ms)	~1s
3	2s + jitter	~3s
4	4s + jitter	~7s
5	8s + jitter	~15s
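The schedule in the table can be expressed as a small pure function. This is a sketch under two assumptions: every retry uses the same 0-500 ms jitter range shown for attempt 2, and delays are capped at 30 seconds as in the queue code earlier. The function name `backoffDelayMs` and its parameters are illustrative.

```typescript
// Delay to wait before a given attempt (1-based), following the table above:
// attempt 1 is immediate, then 1s, 2s, 4s, 8s, each plus random jitter.
// `random` is injectable so the schedule can be checked deterministically.
export function backoffDelayMs(
  attempt: number,
  random: () => number = Math.random,
  baseMs = 1000,
  maxJitterMs = 500,
  capMs = 30_000,
): number {
  if (attempt <= 1) return 0; // the first attempt fires immediately
  const exponential = Math.min(baseMs * 2 ** (attempt - 2), capMs);
  const jitter = Math.floor(random() * maxJitterMs);
  return exponential + jitter;
}

// With jitter pinned to zero the schedule is 0, 1000, 2000, 4000, 8000 ms.
const schedule = [1, 2, 3, 4, 5].map((n) => backoffDelayMs(n, () => 0));
console.log(schedule);
```

Jitter matters most when many deliveries fail at the same moment: without it, every queued job retries on the same second boundaries and hits the recovering receiver in synchronized bursts.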

For production workloads where missed webhooks are a business problem, replace the in-process queue with a durable queue (Redis + BullMQ, or a managed queue service). The in-process queue above is fine for low-stakes notifications — it survives transient network blips but loses jobs on process restart. A durable queue survives restarts and lets you inspect failed jobs.

AliveMCP webhook alerts for your MCP server

AliveMCP's downtime alerting sends a webhook POST to a URL you configure whenever your server transitions between up and down. This is the cleanest way to wire MCP server downtime into your existing incident workflow without polling the AliveMCP dashboard. The payload includes the server slug, current status, previous status, downtime start time, and a link to the status page. You can pipe this into PagerDuty, Slack, or your own incident-management endpoint using the same signature verification pattern shown above.
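A receiver for these alerts might validate the parsed body and turn it into a one-line message for a chat or paging tool. The sketch below is illustrative only: the field names (`slug`, `status`, `previousStatus`, `downSince`, `statusPageUrl`) are placeholders for the fields described above, not AliveMCP's documented schema, so check the actual payload before relying on them.

```typescript
// Hypothetical payload shape. Field names are placeholders chosen for this
// sketch, not AliveMCP's documented schema.
export interface StatusAlert {
  slug: string;
  status: 'up' | 'down';
  previousStatus: 'up' | 'down';
  downSince: string | null; // timestamp string, or null when not provided
  statusPageUrl: string;
}

function isStatus(value: unknown): value is 'up' | 'down' {
  return value === 'up' || value === 'down';
}

// Validate an already-parsed JSON body; returns null when the shape does not match.
export function parseStatusAlert(value: unknown): StatusAlert | null {
  if (typeof value !== 'object' || value === null) return null;
  const { slug, status, previousStatus, downSince, statusPageUrl } =
    value as Record<string, unknown>;
  if (typeof slug !== 'string' || typeof statusPageUrl !== 'string') return null;
  if (!isStatus(status) || !isStatus(previousStatus)) return null;
  return {
    slug,
    status,
    previousStatus,
    downSince: typeof downSince === 'string' ? downSince : null,
    statusPageUrl,
  };
}

// Build a one-line summary suitable for a chat message or incident title.
export function formatStatusAlert(alert: StatusAlert): string {
  if (alert.status === 'down') {
    return `[DOWN] ${alert.slug} since ${alert.downSince ?? 'unknown'}: ${alert.statusPageUrl}`;
  }
  return `[UP] ${alert.slug} recovered: ${alert.statusPageUrl}`;
}
```

Run the signature check on the raw body first and call `parseStatusAlert` only on a verified payload.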