
How to monitor an MCP server

You shipped an MCP server. Now you need to know the moment it goes down, stops responding, or quietly breaks its own tool schema — before a downstream agent starts failing silently.

TL;DR

A TCP ping or plain HTTP 200 check isn't enough for MCP. You need to send a real initialize request, call tools/list, and hash the tool schema — every minute. If the hash changes or the response is malformed, you alert. AliveMCP does this automatically for every public MCP endpoint we discover; join the waitlist to add private endpoints and Slack alerts.

Why this is harder than it looks

MCP servers look like HTTP endpoints, so the obvious approach is to point UptimeRobot or a cron curl at them. That misses three common failure modes we've seen across 2,181 scanned endpoints in our April 2026 audit:

The TCP socket opens but the JSON-RPC body is malformed. HTTP returns 200, your probe is happy, clients still fail.
Auth is half-configured. /mcp answers, but every tool call returns -32001 Unauthorized. The server is "up" and useless.
Schema drift. A deploy removes a tool or changes its input shape. Nothing 500s; agents just start getting the wrong answers.

None of these show up on a standard uptime probe. You have to speak the protocol.
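The first two failure modes above can be told apart from a healthy response with a few lines of parsing. What follows is a minimal sketch, assuming the server replies with a plain JSON body rather than an SSE stream. The function name classify_response and its status labels are hypothetical, and -32001 is simply the error code cited above.

```python
import json


def classify_response(status_code: int, body: str) -> str:
    """Classify one HTTP response from an MCP endpoint (labels are illustrative)."""
    if status_code != 200:
        return "down"
    try:
        message = json.loads(body)
    except json.JSONDecodeError:
        # HTTP 200 with a body that is not JSON at all
        return "malformed"
    if not isinstance(message, dict) or message.get("jsonrpc") != "2.0":
        # Valid JSON, but not a JSON-RPC 2.0 message
        return "malformed"
    error = message.get("error")
    if isinstance(error, dict):
        # Assumption: -32001 signals the auth failure described above
        return "unauthorized" if error.get("code") == -32001 else "error"
    if "result" not in message:
        return "malformed"
    return "ok"
```

A plain uptime check would report every one of these cases as healthy, because it only ever looks at status_code.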

The minimum probe that actually works

Send an MCP initialize request over HTTP (or over SSE if the server streams its responses). Check that the result in the response has protocolVersion, serverInfo, and capabilities keys. If any are missing, mark the server down, not degraded.
Follow with tools/list. This is the real liveness test — it forces the server to walk its tool registry. A dead handler or broken DB binding shows up here, not on a static route.
Hash the tool schema. SHA-256 the sorted list of (name, inputSchema) pairs and store the last-seen hash. If it changes, record schema drift and alert. This is the silent-breakage case HTTP probes can't catch.
Run it every 60 seconds. Five-minute intervals are fine for a marketing site but far too slow for agent infrastructure: an MCP server that goes down during a customer demo can stay broken for up to five minutes before anyone is alerted.
Persist response times. The p50/p95 latency trend matters: a server creeping from 200ms to 3s hasn't "gone down" but is about to.
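The building blocks of these steps can be sketched in a few small functions. This is an illustration under stated assumptions, not a reference implementation: the helper names, the example-probe client name, and the protocolVersion string are all placeholders, and the version should match one your server actually supports.

```python
import hashlib
import json
import statistics

REQUIRED_KEYS = ("protocolVersion", "serverInfo", "capabilities")


def initialize_request(request_id: int = 1) -> dict:
    """Build a JSON-RPC initialize message (version string is an assumption)."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "initialize",
        "params": {
            "protocolVersion": "2025-03-26",  # assumption: adjust to your server
            "capabilities": {},
            "clientInfo": {"name": "example-probe", "version": "0.1.0"},
        },
    }


def missing_initialize_keys(result: dict) -> list:
    """Return the required keys absent from an initialize result."""
    return [key for key in REQUIRED_KEYS if key not in result]


def tool_schema_hash(tools: list) -> str:
    """SHA-256 over the sorted (name, inputSchema) pairs from tools/list."""
    pairs = sorted(
        (tool["name"], json.dumps(tool.get("inputSchema", {}), sort_keys=True))
        for tool in tools
    )
    canonical = json.dumps(pairs, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def latency_percentiles(samples_ms: list) -> tuple:
    """Return (p50, p95) from stored response times; needs at least 2 samples."""
    cuts = statistics.quantiles(samples_ms, n=100)  # 99 cut points
    return cuts[49], cuts[94]
```

Serializing each inputSchema with sort_keys=True keeps the hash stable when a server merely reorders JSON keys, so only a real change to a tool's name or input shape registers as drift.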

Where to run the probe

You have three real options in 2026:

1. Roll your own

A ~50-line Node or Python script run from cron or a GitHub Actions schedule. Fine for one server. Breaks down when you have three: no deduped alerts, no dashboard, no historical data past seven days of log retention. It becomes a second product you have to maintain.
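For a sense of scale, a roll-your-own probe might look like the sketch below. It is hedged in several ways: run_probe and http_post are hypothetical names, the transport assumes a plain JSON reply, and servers that enforce the full handshake may also expect a notifications/initialized message and a session header, both of which are omitted here. Scheduling, alerting, and hash storage are left to the caller.

```python
import hashlib
import json
import urllib.request
from typing import Callable, Optional


def http_post(url: str, payload: dict) -> dict:
    """POST one JSON-RPC message and parse a plain JSON reply (no SSE handling)."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Accept": "application/json, text/event-stream",
        },
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.loads(response.read().decode("utf-8"))


def run_probe(url: str, last_hash: Optional[str],
              post: Callable[[str, dict], dict] = http_post) -> dict:
    """Run initialize + tools/list once and compare the schema hash."""
    init_msg = {
        "jsonrpc": "2.0", "id": 1, "method": "initialize",
        "params": {
            "protocolVersion": "2025-03-26",  # assumption: adjust to your server
            "capabilities": {},
            "clientInfo": {"name": "example-probe", "version": "0.1.0"},
        },
    }
    try:
        result = post(url, init_msg).get("result", {})
        if any(k not in result for k in ("protocolVersion", "serverInfo", "capabilities")):
            return {"status": "down", "reason": "initialize response incomplete"}
        listing = post(url, {"jsonrpc": "2.0", "id": 2, "method": "tools/list"})
        tools = listing["result"]["tools"]
        pairs = sorted(
            (t["name"], json.dumps(t.get("inputSchema", {}), sort_keys=True))
            for t in tools
        )
    except Exception as exc:  # network errors, bad JSON, missing fields
        return {"status": "down", "reason": repr(exc)}
    digest = hashlib.sha256(json.dumps(pairs).encode("utf-8")).hexdigest()
    drifted = last_hash is not None and digest != last_hash
    return {"status": "drift" if drifted else "up", "hash": digest}
```

The post parameter exists so the probe can be exercised against a fake transport in tests. In production the default http_post would be used, and the returned hash written somewhere durable between runs.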

2. Generic uptime tools

UptimeRobot, BetterStack, Pingdom. They speak HTTP, not MCP, so you are back to the limitation described at the top of this page: a 200 response says nothing about whether the server still speaks the protocol correctly. Useful as a first layer, not sufficient on its own. See our UptimeRobot for MCP servers comparison for the honest tradeoffs.

3. MCP-aware monitoring

AliveMCP. We send real initialize + tools/list calls, hash schemas, flag drift, and feed every public endpoint into a free dashboard so the ecosystem can see the health of the servers it depends on. Private endpoints + Slack + webhook alerts are on the Author ($9/mo) and Team ($49/mo) tiers.

How AliveMCP helps

Every public MCP endpoint we discover in MCP.so, Glama, PulseMCP, Smithery, the Official Registry, and GitHub gets a free /status/<server-slug> page with 90-day uptime, response-time history, and schema-drift events — whether the author signs up or not. Claim your listing to add custom alert webhooks, Slack integration, and a verified-author badge. See the full capability set on the AliveMCP home page or review the pricing tiers.
