Guide · Model Context Protocol
How to monitor an MCP server
You shipped an MCP server. Now you need to know the moment it goes down, stops responding, or quietly breaks its own tool schema — before a downstream agent starts failing silently.
TL;DR
A TCP ping or plain HTTP 200 check isn't enough for MCP. You need to send a real initialize request, call tools/list, and hash the tool schema — every minute. If the hash changes or the response is malformed, you alert. AliveMCP does this automatically for every public MCP endpoint we discover; join the waitlist to add private endpoints and Slack alerts.
Why this is harder than it looks
MCP servers look like HTTP endpoints, so the obvious approach is to point UptimeRobot or a cron curl at them. That misses three common failure modes we've seen across 2,181 scanned endpoints in our April 2026 audit:
- The TCP socket opens but the JSON-RPC body is malformed. HTTP returns 200, your probe is happy, clients still fail.
- Auth is half-configured. /mcp answers, but every tool call returns -32001 Unauthorized. The server is "up" and useless.
- Schema drift. A deploy removes a tool or changes its input shape. Nothing 500s; agents just start getting the wrong answers.
None of these show up on a standard uptime probe. You have to speak the protocol.
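As a rough illustration of the first two failure modes, the sketch below classifies a single reply by looking at the JSON-RPC body instead of the HTTP status alone. The function name `classify` and the three labels are hypothetical choices for this example; the -32001 code is the one quoted in the list above, and a real server may use different error codes.

```python
import json


def classify(status_code: int, body: str) -> str:
    """Label one probe reply as 'ok', 'unauthorized', or 'down'."""
    if status_code != 200:
        return "down"
    try:
        message = json.loads(body)
    except json.JSONDecodeError:
        return "down"  # HTTP 200, but the body is not JSON at all
    if not isinstance(message, dict) or message.get("jsonrpc") != "2.0":
        return "down"  # valid JSON, but not a JSON-RPC 2.0 envelope
    error = message.get("error")
    if isinstance(error, dict):
        # -32001 is the code the list above associates with Unauthorized.
        return "unauthorized" if error.get("code") == -32001 else "down"
    return "ok" if "result" in message else "down"
```

A plain HTTP check would report all of these 200 replies as healthy; only the last branch corresponds to a reply a client can actually use.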
The minimum probe that actually works
- Send an MCP initialize request over HTTP (or SSE if the server is streaming). Check the response has protocolVersion, serverInfo, and capabilities keys. If any are missing: down, not degraded.
- Follow with tools/list. This is the real liveness test: it forces the server to walk its tool registry. A dead handler or broken DB binding shows up here, not on a static route.
- Hash the tool schema. SHA-256 the sorted list of (name, inputSchema) pairs. Store last-seen hash. If it changes, record schema drift and alert. This is the silent-breakage case HTTP probes can't catch.
- Run it every 60 seconds. Five-minute intervals are fine for a marketing site but far too slow for agent infrastructure: a downed MCP during a customer demo is a 5-minute-long embarrassment.
- Persist response times. p50/p95 latency trend matters: a server creeping from 200ms to 3s hasn't "gone down" but is about to.
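The checks in the list above can be sketched as a few pure functions. This is a minimal illustration, not AliveMCP's implementation: the helper names (`initialize_ok`, `schema_hash`, `drifted`, `p50_p95`) and the exact canonical JSON form fed to SHA-256 are assumptions made for the example.

```python
import hashlib
import json
import statistics

REQUIRED_INIT_KEYS = ("protocolVersion", "serverInfo", "capabilities")


def initialize_ok(result: dict) -> bool:
    """True only if the initialize result carries all three required keys."""
    return all(key in result for key in REQUIRED_INIT_KEYS)


def schema_hash(tools: list) -> str:
    """SHA-256 over the sorted (name, inputSchema) pairs from tools/list."""
    pairs = sorted(
        # Serialize each schema with sorted keys so key order cannot change the hash.
        (tool["name"], json.dumps(tool.get("inputSchema"), sort_keys=True))
        for tool in tools
    )
    return hashlib.sha256(json.dumps(pairs).encode("utf-8")).hexdigest()


def drifted(last_hash, tools: list) -> bool:
    """True if a previous hash exists and the current tool list no longer matches it."""
    return last_hash is not None and schema_hash(tools) != last_hash


def p50_p95(samples_ms: list) -> tuple:
    """Median and 95th-percentile latency; needs at least two samples."""
    cuts = statistics.quantiles(samples_ms, n=100, method="inclusive")
    return cuts[49], cuts[94]
```

Sorting the pairs before hashing means a server that returns the same tools in a different order does not trigger a false drift alert, while a removed tool or a changed input shape does.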
Where to run the probe
You have three real options in 2026:
1. Roll your own
A ~50-line Node or Python script in a cron or GitHub Actions schedule. Fine for one server. Breaks down when you have three: no deduped alerts, no dashboard, no historical data past seven days of log retention. Becomes a second product you have to maintain.
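A roll-your-own probe might look roughly like the sketch below, using only the Python standard library. Several details are assumptions to verify against your server and the MCP revision it implements: the protocol version string, the client name, the `Mcp-Session-Id` header handling, and the simplification that an SSE reply carries the JSON-RPC response in its first `data:` line and that the server closes the stream afterwards.

```python
import json
import time
import urllib.request

# Assumptions for illustration: protocol revision and client identity.
PROTOCOL_VERSION = "2025-06-18"
CLIENT_INFO = {"name": "example-probe", "version": "0.1"}


def rpc(method, msg_id=None, params=None):
    """Build a JSON-RPC 2.0 message; omit the id to send a notification."""
    message = {"jsonrpc": "2.0", "method": method}
    if msg_id is not None:
        message["id"] = msg_id
    if params is not None:
        message["params"] = params
    return message


def parse_body(content_type, text):
    """Decode a plain JSON body, or the first `data:` line of an SSE body."""
    if "text/event-stream" in content_type:
        for line in text.splitlines():
            if line.startswith("data:"):
                return json.loads(line[len("data:"):].strip())
        raise ValueError("event stream contained no data line")
    return json.loads(text)


def post(url, message, session_id=None, timeout=10):
    """POST one message and return (response headers, body text)."""
    headers = {
        "Content-Type": "application/json",
        "Accept": "application/json, text/event-stream",
    }
    if session_id:
        headers["Mcp-Session-Id"] = session_id
    request = urllib.request.Request(
        url, data=json.dumps(message).encode("utf-8"), headers=headers, method="POST"
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.headers, response.read().decode("utf-8")


def probe(url):
    """Run initialize, then tools/list; any failure is reported as down."""
    started = time.monotonic()
    try:
        params = {
            "protocolVersion": PROTOCOL_VERSION,
            "capabilities": {},
            "clientInfo": CLIENT_INFO,
        }
        headers, text = post(url, rpc("initialize", 1, params))
        init = parse_body(headers.get("Content-Type", ""), text)["result"]
        session_id = headers.get("Mcp-Session-Id")
        post(url, rpc("notifications/initialized"), session_id)
        headers, text = post(url, rpc("tools/list", 2), session_id)
        tools = parse_body(headers.get("Content-Type", ""), text)["result"]["tools"]
        latency_ms = (time.monotonic() - started) * 1000
        return {"up": True, "latency_ms": latency_ms,
                "server": init.get("serverInfo"), "tools": tools}
    except (OSError, ValueError, KeyError, TypeError, AttributeError) as exc:
        return {"up": False, "error": repr(exc)}
```

Calling `probe("https://example.com/mcp")` (a placeholder URL) from a cron job or scheduled workflow gives a single up/down result with latency; alert routing, history, and dashboards are the parts this sketch leaves out.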
2. Generic uptime tools
UptimeRobot, BetterStack, Pingdom. They speak HTTP, not MCP, so you are back to the plain HTTP 200 limitation described at the top of this page. Useful as a first layer, not sufficient on their own. See our UptimeRobot for MCP servers comparison for the honest tradeoffs.
3. MCP-aware monitoring
AliveMCP. We send real initialize + tools/list calls, hash schemas, flag drift, and feed every public endpoint into a free dashboard so the ecosystem can see the health of the servers it depends on. Private endpoints + Slack + webhook alerts are on the Author ($9/mo) and Team ($49/mo) tiers.
How AliveMCP helps
Every public MCP endpoint we discover in MCP.so, Glama, PulseMCP, Smithery, the Official Registry, and GitHub gets a free /status/<server-slug> page with 90-day uptime, response-time history, and schema-drift events — whether the author signs up or not. Claim your listing to add custom alert webhooks, Slack integration, and a verified-author badge. See the full capability set on the AliveMCP home page or review the pricing tiers.
Related questions
Does a 200 OK from my MCP server mean it's healthy?
No. HTTP 200 means the socket is open and some handler answered — it says nothing about whether the JSON-RPC body is well-formed, whether initialize succeeds, or whether any tool actually works. You need a protocol-level probe.
How often should I probe?
Every 60 seconds is the sweet spot for agent-facing infrastructure: fast enough to catch a flap before a user-visible failure, slow enough not to look like a DDoS. Marketing pages can go longer; an MCP your agents call in real time cannot.
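One way to hold a steady 60-second cadence is to schedule against a fixed grid rather than sleeping 60 seconds after each probe, so slow probes do not push later runs back. The sketch below assumes a caller-supplied `probe_once` callable; `next_tick` and `run_every` are hypothetical names for this example.

```python
import time


def next_tick(start, now, interval=60.0):
    """Return the first grid time strictly after `now`, counting from `start`."""
    periods_elapsed = int((now - start) // interval)
    return start + (periods_elapsed + 1) * interval


def run_every(probe_once, interval=60.0, max_runs=None):
    """Call probe_once on a fixed grid; max_runs=None loops forever."""
    start = time.monotonic()
    runs = 0
    while max_runs is None or runs < max_runs:
        probe_once()
        runs += 1
        # Sleep until the next grid point; a probe that overruns skips to the following one.
        delay = next_tick(start, time.monotonic(), interval) - time.monotonic()
        time.sleep(max(0.0, delay))
```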
Can I self-host the monitoring stack?
You can write the roll-your-own probe script described above. For anything more than one server, the operational cost (dedup, dashboards, paging) usually outweighs paying $9/mo for a hosted monitor. Enterprise on-prem is available for teams with strict egress rules.
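To give a sense of the dedup part of that cost, here is a minimal sketch that alerts only when a server's state changes, instead of on every failing probe. The class name `AlertDeduper` and the state strings are hypothetical; a production version would also need persistence and flap damping.

```python
class AlertDeduper:
    """Remember the last state per server and alert only on transitions."""

    def __init__(self):
        self._last_state = {}

    def should_alert(self, server: str, state: str) -> bool:
        previous = self._last_state.get(server)
        self._last_state[server] = state
        if previous is None:
            # First observation: alert only if the server starts out unhealthy.
            return state != "ok"
        return state != previous
```

With a 60-second probe, a ten-minute outage then produces two notifications (down, then recovered) rather than ten.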