Topic · Uptime & observability
MCP server uptime monitoring
Uptime monitoring for MCP servers means continuously verifying not just that the host answers, but that the protocol handshake works, tools/list returns the schemas you last shipped, and response times stay in range — all at the 60-second cadence that agent-facing infrastructure demands.
TL;DR
Three uptime signals matter for an MCP server: protocol liveness (a real initialize succeeds), tool health (tools/list returns without error and the schema hash matches the last-known-good), and latency (p50 and p95 aren't trending up). AliveMCP checks all three every 60 seconds, for every public MCP endpoint it can find, and surfaces the results on a free public dashboard. Join the waitlist to claim your server.
What "uptime" means when the server is an MCP
Classical uptime monitoring defines "up" as: a TCP connection succeeds and the target returns a 2xx or 3xx HTTP status within a timeout. That's the right definition for a static site. It's the wrong definition for an MCP server, because:
- The server can be up on HTTP but broken on MCP — e.g. the JSON-RPC deserializer panics on every request and returns a 500-wrapped-as-200 with an empty body.
- The server can be up on MCP but wrong — a deployment removed a tool or changed its input schema, so downstream agents now send arguments the server rejects as malformed.
- The server can be up and correct but too slow — a 4-second first-token latency pushes agents into timeout, which the user experiences as "the product is broken."
Real uptime monitoring covers all three. That's the baseline — not a premium feature.
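To make the distinction concrete, here is a minimal MCP-aware liveness sketch in Python. The function names are illustrative (not AliveMCP's actual probe code), and `2025-03-26` is just one published MCP protocol version — pin whatever your clients expect. The key point from the list above: an HTTP 200 with an empty or malformed body counts as down.

```python
import json

def initialize_request(request_id: int = 1) -> dict:
    """Build a JSON-RPC 2.0 `initialize` request, per the MCP handshake."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "initialize",
        "params": {
            "protocolVersion": "2025-03-26",
            "capabilities": {},
            "clientInfo": {"name": "uptime-probe", "version": "0.1"},
        },
    }

def is_up_on_mcp(response_body: str, expected_version: str) -> bool:
    """MCP-level liveness: a 200 with an empty or unparseable body is DOWN."""
    try:
        msg = json.loads(response_body)
    except json.JSONDecodeError:
        return False
    result = msg.get("result") or {}
    # The handshake must advertise the protocol version the client pinned.
    return result.get("protocolVersion") == expected_version
```

A classical HTTP monitor would pass the 500-wrapped-as-200 case; this check fails it, which is the behavior you want.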
The three signals every monitor should emit
- Protocol liveness. An `initialize` request completes and the response advertises the `protocolVersion` you expect. No version match? Flag it — the client is talking to the wrong server.
- Tool health + schema hash. Immediately after `initialize`, call `tools/list`. Hash the returned tool definitions and compare against last-known-good. A change is not always bad, but it's always something someone should see.
- Latency envelope. Track p50 and p95 response times over the last 1h / 24h / 7d windows. Alert on sustained p95 > 3× baseline, not on a single slow call.
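The three signals fold naturally into one evaluation step per probe cycle. Here is a sketch in Python — the function names and result fields are hypothetical, not AliveMCP's API — showing canonical schema hashing and the per-signal checks:

```python
import hashlib
import json

def schema_hash(tools: list) -> str:
    """Canonically hash tools/list output: sort tools by name, sort keys."""
    canonical = json.dumps(sorted(tools, key=lambda t: t["name"]), sort_keys=True)
    return hashlib.sha256(canonical.encode()).hexdigest()

def classify_probe(result: dict, expected_version: str,
                   last_good_hash: str, baseline_p95_ms: float) -> list:
    """Return the list of signals that failed in this probe cycle."""
    failures = []
    # Signal 1: protocol liveness — initialize echoed the expected version.
    if result.get("protocolVersion") != expected_version:
        failures.append("protocol")
    # Signal 2: tool health — tools/list hash matches last-known-good.
    if schema_hash(result.get("tools", [])) != last_good_hash:
        failures.append("schema-drift")
    # Signal 3: latency envelope — sustained p95 above 3x baseline.
    if result.get("p95_ms", 0) > 3 * baseline_p95_ms:
        failures.append("latency")
    return failures
```

Hashing the canonical JSON (sorted keys, stable tool order) matters: two byte-different but semantically identical `tools/list` responses should not raise a drift event.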
Why it matters right now
In April 2026 we scanned 2,181 remote MCP endpoints across the major public registries. Only 9% were fully healthy. The other 91% were either hard-down, returning malformed responses, or had broken auth. Most authors had no idea — because nothing was watching. This is the baseline AliveMCP was built to fix.
How AliveMCP helps
AliveMCP runs a 60-second probe against every MCP endpoint it discovers in MCP.so, Glama, PulseMCP, Smithery, and the Official Registry. For each server, you get a public /status/<server-slug> page with 90-day uptime, response-time trend, and a schema-drift event feed — as shown on the home page. Claim your listing on the Author tier for custom alert webhooks and a verified-author badge; upgrade to Team for private endpoints and a status-page subdomain. See full pricing.
Related questions
Is MCP server uptime the same as API uptime?
Partially. Protocol liveness overlaps with classic API monitoring, but MCP adds schema drift and tool-level health to the definition of "up" — a working API with the wrong tool schema is down as far as its agent callers are concerned.
What SLA should I target?
For agent-facing infrastructure, 99.5% is a realistic indie target — it tolerates ~3.6 hours of downtime per month. Enterprise MCPs typically commit to 99.9% with a documented recovery plan. Your customers will ask, and having a public status page is the honest answer.
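The downtime numbers above are just the complement of the SLA applied to a 720-hour month. A one-line sketch:

```python
def monthly_downtime_budget_hours(sla_pct: float,
                                  hours_in_month: float = 720.0) -> float:
    """Hours of tolerated downtime per month for a given SLA percentage."""
    return (1.0 - sla_pct / 100.0) * hours_in_month
```

99.5% yields ~3.6 hours per month; 99.9% yields ~0.72 hours, about 43 minutes.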
Does AliveMCP monitor SSE servers?
Yes. We auto-detect the transport (HTTP POST vs SSE vs stdio exposed via a bridge) and adapt the probe shape accordingly. A server that switches transports between deploys looks the same on the dashboard.
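Transport detection can start from the `Content-Type` of the first response: SSE endpoints answer `text/event-stream`, while a plain streamable-HTTP exchange answers `application/json`. A rough sketch of that classification — illustrative only, not AliveMCP's actual detection logic:

```python
def classify_transport(content_type: str) -> str:
    """Map a response Content-Type onto a probable MCP transport."""
    ct = content_type.split(";")[0].strip().lower()
    if ct == "text/event-stream":
        return "sse"        # streaming response: SSE or a streamable-HTTP stream
    if ct == "application/json":
        return "http-post"  # plain JSON-RPC response over HTTP POST
    return "unknown"        # e.g. an HTML error page: treat as a probe failure
```

Anything outside those two media types (an HTML error page, a redirect to a login form) is a strong signal the endpoint is no longer speaking MCP at all.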