Protocol guide · 2026-06-14 · MCP server protocol surface

Beyond Tool Calls: MCP's Full Protocol Surface

Most introductions to MCP center on the same three-step model: the client sends tools/call, the server handler runs, the handler returns a result. That model is accurate but incomplete. MCP servers do considerably more. A long-running tool sends intermediate progress updates while it executes. A client that changes its mind cancels the call mid-flight and expects the server to roll back any partial state. A screenshot tool returns base64-encoded image bytes, not text. Each client connection has a lifecycle with initialization, reconnection, and cleanup distinct from any individual tool call. And a single MCP endpoint may itself aggregate tools from multiple child servers behind a proxy layer. This post synthesizes five deep-dives (progress notifications, cancellation, binary content, session lifecycle, and multi-server aggregation) into a unified picture of what MCP servers do when they are not simply executing and returning.

The five protocol capabilities at a glance

Each of the five capabilities extends the protocol surface in a different direction. Together they describe the full lifecycle of an MCP session: from connection establishment through active tool execution to teardown — and beyond a single server to composed systems.

Capability	What it adds	Key implementation requirement	Monitoring blind spot
Progress notifications	Intermediate status updates during a running tool call	Check `params._meta?.progressToken`; only send if present; rate-limit to ~500ms; send final notification on error too	A bug in the notification loop silences progress without breaking the tool's return value — the tool still succeeds but the client's progress bar never moves
Cancellation	Client-initiated interruption of an in-flight tool call	Read `extra.signal`; propagate to `fetch()` and database queries; roll back writes in `finally`; return clean non-error result on abort	Leaking database connections and unreleased file locks only appear under load; a healthy probe never sends a cancellation signal
Binary content	Non-text tool results: images, PDFs, and arbitrary files	Return `{ type: 'image', data: base64, mimeType: 'image/png' }`; send text description before image for LLM context; thumbnail large images	A misconfigured image pipeline returns a broken base64 string that passes content-length checks but displays as a corrupted image — invisible to any probe that doesn't decode the payload
Session lifecycle	Stateful connection with initialization, reconnection, and cleanup phases	Pair every `sessionContextMap.set()` with a `transport.onclose` delete; TTL eviction for zombie sessions; `MAX_SESSIONS` cap	Zombie sessions accumulate memory silently; a healthy probe never triggers an `onclose` with a missing delete
Multi-server aggregation	A single MCP endpoint composing tools from multiple child servers	Namespace tool names with child prefix (`github__search_repos`); use `Promise.allSettled` at startup; return `isError: true` on child failure	One child server going down silently disables its tools while the aggregator's own `initialize` path stays healthy — standard health checks show green for a partially broken system

Progress notifications: the side channel during execution

A tool call is synchronous from the client's perspective. The client sends tools/call and waits. For tools that complete in under a second this is invisible. For tools that run for 10, 30, or 120 seconds — a bulk data export, an LLM-assisted analysis pass, a database migration dry-run — the client has no signal distinguishing "running normally" from "hung and will never return."

Progress notifications solve this with an explicitly opt-in side channel. A client that wants updates includes a _meta.progressToken field in the tools/call request. Clients that omit it receive nothing — the feature is invisible to any client that doesn't ask for it, making it backward-compatible. The server reads the token at the top of the handler and sends notifications/progress messages via server.notification() throughout execution:

// Recent TypeScript SDK versions expose request metadata on the handler's
// `extra` argument; with a low-level setRequestHandler it is
// request.params._meta instead. Adjust to the SDK version in use.
const progressToken = extra._meta?.progressToken;

async function sendProgress(progress: number, total: number, message: string) {
  // Compare against undefined: a numeric token of 0 is valid but falsy
  if (progressToken === undefined) return;
  await server.notification({
    method: 'notifications/progress',
    params: { progressToken, progress, total, message },
  });
}

try {
  // Called at each logical checkpoint during the long operation
  await sendProgress(1, 4, 'Fetching records…');
  const rows = await db.query(/* … */);
  await sendProgress(2, 4, 'Processing 8,432 rows…');
  // … etc.
} catch (err) {
  // On the error path, so the client's progress bar resolves:
  await sendProgress(4, 4, 'Export failed, see error below');
  return { isError: true, content: [{ type: 'text', text: String(err) }] };
}

Three rules matter most. First, echo the progressToken verbatim — both strings and numbers are valid, and the client uses the token to match notifications to the specific call that requested them. Second, rate-limit notifications to roughly one every 500 milliseconds for tight loops; sending one per row in a 100,000-row export will flood the connection. Third, always send a final notification on both success and error paths — otherwise a client that opened a progress dialog will never know to close it.
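The three rules above can be folded into one small helper. The following is a minimal sketch, not SDK API: `createProgressReporter`, its `send` callback, and the injectable `now` clock are hypothetical names introduced here for illustration. It echoes the token verbatim (including a numeric token of 0), drops intermediate updates that arrive less than 500 ms after the previous one, and always lets a final update through.

```typescript
// Hypothetical helper: names and shapes are illustrative, not part of any SDK.
type ProgressToken = string | number;

interface ProgressNotification {
  method: 'notifications/progress';
  params: { progressToken: ProgressToken; progress: number; total: number; message: string };
}

function createProgressReporter(
  progressToken: ProgressToken | undefined,
  send: (notification: ProgressNotification) => void,
  minIntervalMs = 500,
  now: () => number = Date.now,
) {
  let lastSentAt = Number.NEGATIVE_INFINITY;

  // Returns true when a notification was actually sent.
  return function report(progress: number, total: number, message: string, final = false): boolean {
    // Rule 1: opt-in only. Compare against undefined so a token of 0 still counts.
    if (progressToken === undefined) return false;

    // Rule 2: rate-limit intermediate updates.
    const t = now();
    if (!final && t - lastSentAt < minIntervalMs) return false;
    lastSentAt = t;

    // Rule 3 is the caller's job: pass final = true on both success and error paths.
    send({ method: 'notifications/progress', params: { progressToken, progress, total, message } });
    return true;
  };
}
```

In a handler, `send` would wrap whatever notification call the server exposes, and the `finally` or `catch` path would call `report(total, total, '…', true)` so the client's progress dialog always resolves.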

The proxy-buffering requirement is non-obvious: SSE-based MCP connections require proxy_buffering off in nginx or flush_interval -1 in Caddy at the gateway. Without it, progress notifications accumulate in the proxy buffer and arrive in a burst at the end of the operation — defeating the entire purpose.

Cancellation: the missing path in almost every handler

When a user closes a chat window mid-operation, the MCP client sends a notifications/cancelled message to the server. The MCP SDK translates this into an AbortSignal that fires on the extra.signal property available in every tool handler. Without explicit handling, the handler continues running — holding database connections, making external API calls, and writing to disk — after the client has already moved on.

Correct cancellation handling requires three things. First, pass extra.signal to every downstream async operation that accepts it:

// fetch() honors AbortSignal natively
const response = await fetch(apiUrl, { signal: extra.signal });

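// node-postgres via pg-query-stream: QueryStream has no documented `signal`
// option, so attach the signal to the stream itself
// (addAbortSignal is imported from 'node:stream')
const stream = addAbortSignal(extra.signal, client.query(new QueryStream(sql, params)));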

// Combine cancellation with a per-tool timeout
const timeoutController = new AbortController();
const timeoutId = setTimeout(() => timeoutController.abort(), 30_000);
const combined = AbortSignal.any([extra.signal, timeoutController.signal]);

Second, release all resources in a finally block regardless of how the handler exits — including on the cancellation path:

const client = await pool.connect();
try {
  // … operation
} finally {
  client.release();  // runs on success, error, AND cancellation
  clearTimeout(timeoutId);
}

Third, for operations with write-side effects, use a database transaction and roll back on abort:

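await client.query('BEGIN');
try {
  await client.query('INSERT INTO …', values);
  // Check once more before committing: the client may have cancelled mid-write
  if (extra.signal.aborted) {
    await client.query('ROLLBACK');
    return { isError: false, content: [{ type: 'text', text: 'Cancelled.' }] };
  }
  await client.query('COMMIT');
} catch (err) {
  await client.query('ROLLBACK');
  // An abort that surfaced as an exception is still a cancellation, not a failure
  if (extra.signal.aborted) {
    return { isError: false, content: [{ type: 'text', text: 'Cancelled.' }] };
  }
  throw err;
}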

The return value on cancellation should be a clean non-error result, not an exception. The MCP session stays open after a cancellation — the user may immediately issue a different tool call — and an unhandled exception would surface as an error in the client's UI for what was a deliberate user action.

SSE disconnects fire the same extra.signal as explicit client cancellations. A user who closes the browser tab mid-operation does not send a polite notifications/cancelled message; the SSE transport detects the disconnect and fires the signal. Cancellation handling therefore also covers abrupt disconnection.
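One way to keep the "clean result on cancellation, error result on real failure" distinction in a single place is a small mapping function called from the handler's catch block. This is a sketch under stated assumptions: `resultForFailure` and the `ToolResult` interface are hypothetical names, and the result shape simply mirrors the `{ isError, content }` objects used throughout this post.

```typescript
// Hypothetical helper; the result shape mirrors the objects shown in this post.
interface ToolResult {
  isError: boolean;
  content: Array<{ type: 'text'; text: string }>;
}

function resultForFailure(err: unknown, signal: AbortSignal): ToolResult {
  // If the signal has fired, whatever was thrown is a consequence of the
  // cancellation (for example an AbortError from fetch), not a tool failure.
  if (signal.aborted) {
    return { isError: false, content: [{ type: 'text', text: 'Cancelled.' }] };
  }
  const text = err instanceof Error ? err.message : String(err);
  return { isError: true, content: [{ type: 'text', text }] };
}
```

A handler would use it as `catch (err) { return resultForFailure(err, extra.signal); }` after its rollback and resource release have run.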

Binary content: returning images and files from tool handlers

MCP tool results carry a content array whose items are typed. The three types covered here are text, image, and resource; newer revisions of the specification also define audio and resource links. Most tools return text. Some tools (screenshot capture, chart generation, document export, image transformation) produce binary output that is meaningless as text and must be returned as encoded image or file data.

The image content type uses base64 encoding:

import { readFile } from 'node:fs/promises';
import sharp from 'sharp';

// Read image, resize to keep payload manageable, encode as base64
const raw = await readFile(screenshotPath);
const thumbnail = raw.byteLength > 512_000
  ? await sharp(raw).resize({ width: 1280, withoutEnlargement: true }).png().toBuffer()
  : raw;  // assumes the capture on disk is already a PNG

// Report the dimensions of what is actually sent, not of the original capture
const { width, height } = await sharp(thumbnail).metadata();

return {
  content: [
    { type: 'text', text: `Screenshot captured: ${width}×${height} px` },  // LLM context
    { type: 'image', data: thumbnail.toString('base64'), mimeType: 'image/png' },
  ],
};

Always send a text description before the image. LLMs process the content array sequentially, and a description before the image gives the model context for what it is about to see. Without it, the model receives raw pixel data with no framing.

Client support varies in ways that matter for tool design. Claude Desktop and Cursor render PNG and JPEG inline. SVG and PDF are typically downloaded rather than rendered. Audio and video are not rendered in any current MCP host. Designing tools that return binary content therefore requires knowing which hosts your users connect from and choosing output formats accordingly.

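For large files that are neither images nor another supported binary type, a tool can return a resource reference (a URI plus MIME type) instead of inline bytes, and the client fetches the file separately. In spec revisions that define it, the resource_link content type serves this purpose; an embedded resource item carries the contents inline. Returning a reference avoids base64 inflation on the MCP transport channel for files the client would need to download anyway.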
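The cost of base64 inflation is easy to estimate: every 3 bytes become 4 characters, so a payload grows by about a third. The sketch below shows the arithmetic and one possible inline-versus-reference decision. The function names and the 1,000,000-character inline limit are assumptions made for illustration, not values from any specification.

```typescript
// Base64 encodes each 3-byte group as 4 characters, padding the last group.
function base64Length(byteLength: number): number {
  return 4 * Math.ceil(byteLength / 3);
}

// Hypothetical policy: inline small images, hand everything else back as a reference.
// maxInlineChars is an arbitrary illustrative threshold.
function deliveryFor(
  byteLength: number,
  mimeType: string,
  maxInlineChars = 1_000_000,
): 'inline' | 'reference' {
  const isImage = mimeType.startsWith('image/');
  return isImage && base64Length(byteLength) <= maxInlineChars ? 'inline' : 'reference';
}
```

For the 512,000-byte threshold used in the thumbnail example, `base64Length(512_000)` is 682,668 characters on the wire.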

Session lifecycle: connections are stateful, not stateless

Each MCP client connection is a session — a stateful association between a transport and its context. Sessions are created during the initialize handshake and destroyed when the client disconnects. Between those points, the server may maintain per-session state: the authenticated user's identity, their tenant ID, a permission set, a rate-limit bucket, a conversation history, or a queue of pending async operations.

The critical invariant in session lifecycle management is pairing: every sessionContextMap.set(id, context) must be matched by a transport.onclose handler that calls sessionContextMap.delete(id). Missing the delete creates zombie sessions: entries that hold memory and possibly open database connections long after the client has gone. The failure is invisible at startup; it only surfaces under load as memory grows without bound.

const sessions = new Map<string, SessionContext>();

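// On new session: called when initialize completes.
// Sketch only: the exact hook shape depends on the transport. The SDK's Streamable HTTP
// transport, for example, takes onsessioninitialized(sessionId) as a constructor option.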
transport.onsessioninitialized = (sessionId, clientCapabilities) => {
  const ctx = buildSessionContext(sessionId, clientCapabilities);
  sessions.set(sessionId, ctx);

  // THE CRITICAL PAIR — must exist for every set()
  transport.onclose = () => {
    sessions.delete(sessionId);
    ctx.cleanup?.();
  };
};

// TTL eviction for zombie sessions (belt + suspenders)
setInterval(() => {
  const cutoff = Date.now() - SESSION_TTL_MS;
  for (const [id, ctx] of sessions) {
    if (ctx.lastActiveAt < cutoff) {
      sessions.delete(id);
      ctx.cleanup?.();  // release the same resources onclose would have
    }
  }
}, SESSION_TTL_MS / 4);

The keep-alive pattern matters for fast disconnection detection. SSE connections that are silently dropped by a NAT device, proxy, or mobile network switch do not trigger an immediate onclose event. A server that writes an SSE comment (: keep-alive\n\n) every 30 seconds forces traffic onto the connection, so a dead connection surfaces as a write error instead of sitting idle until the underlying TCP timeout, which can be minutes.

Session lifecycle is also where reconnection and resumption state live. For long-running async operations started in one session that a client may disconnect from and reconnect to, the server must maintain a durable queue associated with a session token — not the transport object, which is ephemeral. The session lifecycle layer is where that queue is created, consulted on reconnect, and cleaned up on explicit close.
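The pairing invariant, the TTL sweep, and the MAX_SESSIONS cap are easier to keep correct when they live in one object rather than in scattered callbacks. The class below is a minimal sketch of that idea. `SessionRegistry` and its method names are hypothetical, and the `SessionContext` shape is reduced to the two fields the earlier snippet relies on (`lastActiveAt` and an optional `cleanup`).

```typescript
// Hypothetical registry; reduces SessionContext to the fields used in this post.
interface SessionContext {
  lastActiveAt: number;
  cleanup?: () => void;
}

class SessionRegistry<C extends SessionContext> {
  private readonly sessions = new Map<string, C>();
  private readonly maxSessions: number;
  private readonly ttlMs: number;
  private readonly now: () => number;

  constructor(maxSessions: number, ttlMs: number, now: () => number = Date.now) {
    this.maxSessions = maxSessions;
    this.ttlMs = ttlMs;
    this.now = now;
  }

  // Returns false when the cap is reached; the caller should refuse the connection.
  open(id: string, ctx: C): boolean {
    this.close(id); // replace any stale entry under the same id, running its cleanup
    if (this.sessions.size >= this.maxSessions) return false;
    this.sessions.set(id, ctx);
    return true;
  }

  // Call on every request so idle time is measured from the last activity.
  touch(id: string): void {
    const ctx = this.sessions.get(id);
    if (ctx) ctx.lastActiveAt = this.now();
  }

  // The single removal path: delete and cleanup always happen together.
  close(id: string): boolean {
    const ctx = this.sessions.get(id);
    if (!ctx) return false;
    this.sessions.delete(id);
    ctx.cleanup?.();
    return true;
  }

  // TTL sweep; returns how many idle sessions were evicted.
  evictIdle(): number {
    const cutoff = this.now() - this.ttlMs;
    let evicted = 0;
    for (const [id, ctx] of this.sessions) {
      if (ctx.lastActiveAt < cutoff && this.close(id)) evicted++;
    }
    return evicted;
  }

  get size(): number {
    return this.sessions.size;
  }
}
```

With this shape, `transport.onclose` only needs to call `registry.close(sessionId)`, and a `setInterval` can call `registry.evictIdle()`. Because both go through `close`, an evicted zombie releases the same resources as a clean disconnect.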

Multi-server aggregation: composing MCP servers

When an application needs tools from multiple independent MCP servers — a GitHub integration, a Jira integration, a company knowledge base, a data warehouse — the deployment architecture has two options: configure the MCP host to connect to each server independently, or run an aggregator that proxies all of them through a single endpoint.

Client-side multi-server configuration puts the composition in the host application. Each client connects to each server; tool calls route directly to the owning server. This is simple for small numbers of servers and has no single point of failure. It scales poorly when server count is large, when servers require uniform auth handling, or when the same tool name exists in multiple servers.

Aggregator pattern puts the composition server-side. The aggregator connects to each child using SSEClientTransport at startup, collects their tool lists, prefixes each tool name with the child's namespace, and registers prefixed tools on its own server. A client connecting to the aggregator sees a unified tool list and does not need to know how many child servers exist or where they run:

import { Client as McpClient } from '@modelcontextprotocol/sdk/client/index.js';
import { SSEClientTransport } from '@modelcontextprotocol/sdk/client/sse.js';

const childClients = await Promise.allSettled(
  CHILD_SERVERS.map(async (cfg) => {
    const transport = new SSEClientTransport(new URL(cfg.url));
    const client = new McpClient({ name: 'aggregator', version: '1.0.0' }, {});
    await client.connect(transport);
    const { tools } = await client.listTools();
    return { prefix: cfg.prefix, client, tools };
  })
);

// Register each child's tools with its namespace prefix
for (const result of childClients) {
  if (result.status === 'rejected') {
    // Child down: log and move on
    console.warn('child server unavailable at startup:', result.reason);
    continue;
  }
  const { prefix, client, tools } = result.value;
  for (const tool of tools) {
    server.tool(
      `${prefix}__${tool.name}`,
      // tool.inputSchema is JSON Schema; if the server API in use expects
      // another schema format, convert it here
      tool.inputSchema,
      async (args, extra) => {
        try {
          return await client.callTool({ name: tool.name, arguments: args });
        } catch (err) {
          return { isError: true, content: [{ type: 'text', text: String(err) }] };
        }
      }
    );
  }
}

Three design choices matter. First, use Promise.allSettled at startup, not Promise.all — if one child is temporarily unavailable, the aggregator should register the remaining children's tools and serve partial capability rather than failing entirely. Second, return isError: true from tool handlers when a child call fails, rather than letting exceptions propagate; a child network error should not crash the aggregator's own MCP session. Third, authenticate using a service account for child connections rather than forwarding the upstream client's JWT — the aggregator is an internal infrastructure component and should not hold client credentials.
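The `prefix__tool` naming convention only routes reliably if the aggregator can always split a qualified name back into its two parts. The sketch below is one way to guarantee that. `prefixToolName` and `splitToolName` are hypothetical helpers, and the rule that prefixes contain only letters, digits, and hyphens is an assumption added here so that the first `__` in a qualified name is always the separator.

```typescript
const SEPARATOR = '__';

// Assumed rule: prefixes never contain underscores, so the first "__" in a
// qualified name is unambiguously the separator, even if the tool name has its own.
const PREFIX_PATTERN = /^[A-Za-z0-9-]+$/;

function prefixToolName(prefix: string, toolName: string): string {
  if (!PREFIX_PATTERN.test(prefix)) {
    throw new Error(`invalid child prefix "${prefix}": use letters, digits, and hyphens only`);
  }
  return `${prefix}${SEPARATOR}${toolName}`;
}

function splitToolName(qualified: string): { prefix: string; toolName: string } | undefined {
  const index = qualified.indexOf(SEPARATOR);
  if (index <= 0) return undefined; // no prefix present
  return {
    prefix: qualified.slice(0, index),
    toolName: qualified.slice(index + SEPARATOR.length),
  };
}
```

Validating prefixes at startup turns a subtle routing bug (a prefix such as `data_` producing `data___export`) into an immediate configuration error.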

The notifications/tools/list_changed mechanism handles dynamic aggregation: when a child server adds or removes tools at runtime, it emits this notification to the aggregator, which re-fetches the child's tool list and updates its own registrations. This allows the tool universe exposed to clients to change without restarting the aggregator.
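When a list_changed notification arrives, the aggregator needs to know which prefixed tools to register and which to retire. A minimal sketch of that reconciliation step follows. `diffToolNames` is a hypothetical helper that compares only tool names; a real aggregator would probably also compare schemas to catch tools whose inputs changed.

```typescript
// Compare the child's previous tool names with the freshly fetched list.
function diffToolNames(
  previous: readonly string[],
  current: readonly string[],
): { added: string[]; removed: string[] } {
  const previousSet = new Set(previous);
  const currentSet = new Set(current);
  return {
    added: current.filter((name) => !previousSet.has(name)),
    removed: previous.filter((name) => !currentSet.has(name)),
  };
}
```

The aggregator would register each name in `added` under the child's prefix, unregister each name in `removed`, and then emit its own list_changed notification so connected clients refresh their view.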

How the five capabilities interconnect

The five capabilities are not independent. They share infrastructure and build on each other in ways that affect implementation order and testing strategy.

Sessions are the substrate for all other capabilities. Progress tokens, cancellation signals, binary content, and child client handles all belong to a session. The session context object established during onsessioninitialized is where per-user identity, tenant ID, and rate-limit state live. It is the same identity that a tool-approval audit log records and that context propagation carries into tool handlers. A session lifecycle bug, such as missing cleanup on transport.onclose, causes every other per-session resource to leak. This makes session lifecycle the first capability to get right.

Progress and cancellation share the extra parameter. Both arrive in the handler via the second argument of a tool handler function: extra._meta?.progressToken for progress; extra.signal for cancellation. A long-running tool that sends progress notifications should also handle cancellation on the same AbortSignal, because they are complementary behaviors for the same class of operation. Each progress checkpoint is a natural place to also check signal.aborted and exit early, whether or not a progressToken was supplied.
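A batch loop makes the shared checkpoint concrete. In the sketch below, `runBatches` is a hypothetical helper: `handleBatch` stands in for the real work and `onCheckpoint` for a progress sender. Neither is SDK API. The loop checks the signal before starting each batch and reports progress after finishing it, so cancellation takes effect at batch boundaries.

```typescript
// Hypothetical batch runner: cancellation is checked before each batch,
// progress is reported after each batch.
async function runBatches<T>(
  items: readonly T[],
  batchSize: number,
  signal: AbortSignal,
  handleBatch: (batch: readonly T[]) => Promise<void>,
  onCheckpoint: (done: number, total: number) => Promise<void>,
): Promise<{ processed: number; cancelled: boolean }> {
  let processed = 0;
  for (let start = 0; start < items.length; start += batchSize) {
    // Awaiting inside the loop yields to the event loop, so an abort that
    // arrived during the previous batch is visible here.
    if (signal.aborted) return { processed, cancelled: true };

    const batch = items.slice(start, start + batchSize);
    await handleBatch(batch);
    processed += batch.length;
    await onCheckpoint(processed, items.length);
  }
  return { processed, cancelled: false };
}
```

The caller decides what a cancelled outcome means: typically a rollback followed by a clean non-error result, as described in the cancellation section.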

Binary content and session context interact through authorization. A screenshot tool that returns { type: 'image', data, mimeType } should enforce the same per-tenant authorization that text-returning tools enforce. Binary content is still content: if the session context carries a tenant ID that restricts which resources are visible, the image tool must honor that restriction. Image payloads are also larger; a server returning megabyte-scale base64 blobs to many sessions concurrently needs the MAX_SESSIONS cap described under session lifecycle, or memory pressure from in-flight binary responses can exhaust the process heap.

Multi-server aggregation multiplies all the others. An aggregator proxying tools from four child servers inherits the monitoring surface of all four: if any child has a broken progress notification path, a missing cancellation handler, or a session leak, the aggregator's tool calls expose those bugs. The aggregator's own health_check tool — a convention where the aggregator calls each child's health endpoint and returns a structured status — makes this visible:

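server.tool('health_check', {}, async (_args, _extra) => {
  // childClients holds Promise.allSettled results; keep only children that connected.
  // Children that failed at startup are absent here and should be reported separately.
  const connected = childClients.flatMap((r) => (r.status === 'fulfilled' ? [r.value] : []));
  const statuses = await Promise.all(
    connected.map(async ({ prefix, client }) => {
      const start = Date.now();
      try {
        await client.listTools();
        return { prefix, latency_ms: Date.now() - start, ok: true };
      } catch (err) {
        // Catch per child so the failing child's prefix is reported, not lost
        return { prefix, ok: false, error: String(err) };
      }
    })
  );
  return { content: [{ type: 'text', text: JSON.stringify(statuses, null, 2) }] };
});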

What these five capabilities mean for monitoring

Every MCP server exposes an initialize path and a tools/list path. These are the paths that a basic health check probes: does the server respond to the MCP handshake? Are tools registered? A server can answer both questions correctly — "yes, I'm alive; yes, here are my tools" — while every extended protocol capability is silently broken.

Consider what breaks invisibly per capability:

Progress notifications broken: The gateway's proxy buffering setting was changed and now holds notifications until the response completes. The tool still succeeds — the final result returns correctly — but _meta.progressToken in the request produces no intermediate updates. The initialize probe passes. The tool call probe passes. Only a probe that sends a real tools/call with a progressToken and waits for at least one intermediate notification would detect the regression.
Cancellation broken: The handler doesn't read extra.signal and doesn't propagate it to database queries. Under normal load this is invisible — queries complete before any client cancels. Under a load spike, thirty concurrent clients each cancel a 30-second query, and the database connection pool exhausts within minutes. The initialize probe is green. The tool call probe is green. The first sign is latency climbing as new tool calls queue for a connection.
Binary content broken: The screenshot tool's Puppeteer dependency was updated and now returns a WebP buffer instead of PNG. The mimeType field still says image/png. Tools that return image content still return isError: false. Any probe that decodes the base64 payload and checks the magic bytes — 0x89 0x50 0x4E 0x47 for PNG — would fail immediately. A probe that only checks HTTP 200 and isError: false would not.
Session lifecycle broken: The transport.onclose handler was accidentally removed in a refactor. Each client connect adds an entry to the session map; no entry is ever removed. Over 24 hours of reconnects, the session map holds a stale entry for every session ever opened, along with their associated database connections. The initialize probe opens a fresh session each check and immediately closes it, but it never checks whether the session map grew.
Multi-server aggregator child down: One child server's upstream API subscription lapsed. Its tools are registered in the aggregator's tool list but return isError: true on every call. A client calling those tools gets errors; they don't know the aggregator is involved. The aggregator's own initialize path is healthy. Any probe that calls a tool belonging to the affected child would detect the failure. The standard initialize probe would not.
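The binary-content failure above is the kind a probe can catch with a few lines of decoding. The following is a rough sketch of such a check, assuming the probe has the image content item in hand. `sniffImageMime` and `imageMatchesDeclaredType` are hypothetical names, and the signature list covers only PNG, JPEG, and WebP.

```typescript
// Decode base64 into bytes; returns null when the string is not valid base64.
function decodeBase64(data: string): Uint8Array | null {
  try {
    const binary = atob(data);
    const bytes = new Uint8Array(binary.length);
    for (let i = 0; i < binary.length; i++) bytes[i] = binary.charCodeAt(i);
    return bytes;
  } catch {
    return null;
  }
}

function hasBytesAt(bytes: Uint8Array, signature: readonly number[], offset = 0): boolean {
  return signature.every((value, i) => bytes[offset + i] === value);
}

// Identify the image format from its leading bytes rather than trusting mimeType.
function sniffImageMime(base64: string): string | null {
  const bytes = decodeBase64(base64);
  if (bytes === null) return null;
  if (hasBytesAt(bytes, [0x89, 0x50, 0x4e, 0x47])) return 'image/png';
  if (hasBytesAt(bytes, [0xff, 0xd8, 0xff])) return 'image/jpeg';
  // WebP: "RIFF" at offset 0 and "WEBP" at offset 8
  if (hasBytesAt(bytes, [0x52, 0x49, 0x46, 0x46]) && hasBytesAt(bytes, [0x57, 0x45, 0x42, 0x50], 8)) {
    return 'image/webp';
  }
  return null;
}

function imageMatchesDeclaredType(item: { data: string; mimeType: string }): boolean {
  return sniffImageMime(item.data) === item.mimeType;
}
```

A probe that asserts `imageMatchesDeclaredType(item)` on the returned image item would flag both the WebP-labelled-as-PNG regression and an undecodable base64 string.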

The shared pattern: each capability adds a new observable behavior that a health check must exercise to verify it works. The more of these capabilities a server implements, the larger its protocol surface — and the more ways it can fail silently if monitoring only probes the initialization path.

This is exactly the monitoring gap that AliveMCP is designed to close. Rather than sending an HTTP GET to /health and checking for 200, AliveMCP probes speak full MCP: executing the initialize handshake, listing tools, and calling real tool endpoints. AliveMCP monitors run on a configurable interval — typically 60 seconds — and alert within a single check interval when a tool starts failing. For servers that implement binary content tools, configuring the probe to call that tool and check the response payload validates the full round-trip. For multi-server aggregators, AliveMCP can be configured with one monitor per child endpoint in addition to the aggregator itself — so a child going down triggers an alert before users encounter the error.

Internal tools — hot reload, test suites, CLI health scripts, pre-deploy smoke tests — verify correctness before deploy. They do not keep watching after deploy. The five protocol capabilities described here are runtime behaviors; they can break at any point after deployment due to environment changes, dependency updates, or traffic patterns. External monitoring is the only mechanism that catches them when they do.