Deep dive · 2026-04-25 · Probe design
JSON-RPC health checks vs HTTP probes — what an MCP server health check actually checks
A plain HTTP probe answers one question: does a TCP socket on port 443 accept a connection and return a 2xx? That question is necessary but not nearly sufficient for an MCP server, where the protocol layer above HTTP can fail in at least four ways while the socket keeps answering 200. This post walks through what each kind of probe actually verifies, why a JSON-RPC-aware probe is the smallest unit that catches more than half of MCP failures, and how the canonical 50-line probe sequence is structured. The intended reader is anyone responsible for an MCP server who has been on the receiving end of "your tool stopped working last Wednesday and we only noticed today."
TL;DR
An HTTP probe checks the transport. An MCP server health check has to also check (1) the JSON-RPC 2.0 envelope, (2) the MCP protocol version, (3) the tool list shape against the spec, and (4) the tool list hash across probes. We have measured this empirically: of 2,181 public MCP endpoints, 26.7% returned HTTP 200 while failing one of the four MCP-layer checks — invisible to any HTTP-only monitor. The rest of this post is what each layer actually checks, code for the canonical probe, and why we run it every 60 seconds.
The two questions a health check has to answer
"Is the server up?" is a category mistake for any protocol-bearing service. The honest version of the question has two halves: is the transport reachable, and is the protocol still working. They look like one question because for a static-HTML site the second half is trivial — if Caddy returned the file, the protocol works. For an RPC service, the second half is the entire job.
HTTP probes were designed for the static-HTML world. They walk the transport: DNS resolves, TCP connects, TLS handshakes, an HTTP request gets back an HTTP response. If the response is 2xx, the probe says "up." That's a complete answer to "is the transport reachable" and a partial answer to "is the protocol working" — partial because for static HTML, transport == protocol. For everything else, including MCP, it isn't.
JSON-RPC health checks ride on top of an HTTP probe. They reuse the transport answer (DNS, TCP, TLS, HTTP all pass), then they ask the second-half question: does the response, parsed as JSON-RPC 2.0, contain the fields that mean "this server understands the protocol"? An MCP-aware probe extends that one more rung: not just "JSON-RPC works," but "this server's MCP-specific shapes are the shapes the spec defines today." The full layer-by-layer diagnostic ladder for a server that's failing one of these checks is laid out in MCP endpoint not responding; this post is the design rationale behind the ladder, not the runbook.
What an HTTP probe actually verifies
Concretely, the probe sequence behind the green dot on most uptime tools — UptimeRobot, Pingdom, Better Uptime's free tier — is some variation of this:
- Resolve the hostname to an IP. Fail if NXDOMAIN.
- Open a TCP connection on port 443. Fail if the connection is refused.
- Complete a TLS handshake. Fail if the certificate is expired, untrusted, or hostname-mismatched.
- Send an HTTP request. Most tools default to `HEAD /`; the better ones let you configure `GET /`.
- Read the status line. Fail if the code is not in the 2xx range; sometimes also "fail if 3xx and you didn't enable redirect-following."
- Optionally, search the body for a substring. Fail if the substring is missing.
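In code, that sequence is roughly the following. This is a minimal sketch using Node 18+'s built-in fetch; the substring check is optional, and real uptime tools layer retries, timeouts, and redirect handling on top.

```javascript
// Minimal HTTP-only probe: DNS, TCP, TLS, and the status line are all
// exercised by a single fetch(); nothing below inspects the body's structure.
async function probeHttp(url, expectSubstring) {
  try {
    const res = await fetch(url, { method: "GET", redirect: "manual" });
    if (res.status < 200 || res.status >= 300) {
      return { up: false, reason: `status ${res.status}` };
    }
    if (expectSubstring) {
      const body = await res.text();
      if (!body.includes(expectSubstring)) {
        return { up: false, reason: "substring missing" };
      }
    }
    return { up: true };
  } catch (err) {
    // DNS failures, refused connections, and TLS errors all surface here
    // as a rejected promise.
    return { up: false, reason: String(err.message ?? err) };
  }
}
```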
What this catches: about everything that breaks below the application layer. DNS lapsed (step 1). TCP socket dead — port closed, host firewalled, hosting platform reaped the project (step 2). TLS expired or chain broken (step 3). 404 on the configured URL because the route moved (step 5). For each of these the probe is correct and useful, and it is correct to run it.
What it doesn't catch: anything that lives in the body of the response. If the route returns 200 with a "service unavailable" HTML page from Render's free-tier sleep handler, the probe says up. If the route returns 200 with a JSON object but the JSON is missing required JSON-RPC envelope fields, the probe says up. If the route returns the right shape but the tools have all silently been removed, the probe says up. The body-substring matcher in step 6 is the escape hatch — but it only works if you know in advance what string to look for, and it gives you a binary "string present / not present" answer, not a structural one.
What a JSON-RPC probe adds
JSON-RPC 2.0 is simple — a request envelope and a response envelope, both JSON. The minimum any compliant server has to do, on every response, is echo back three things: the jsonrpc version string (literally "2.0"), the id from the request, and either a result object on success or an error object with code + message on failure. That's the whole envelope.
A JSON-RPC probe sends a real method call, parses the response as JSON, and asserts on the envelope. The most common shape we use in practice is a no-op method like ping, or — for protocols that don't define a no-op — the cheapest discovery call the protocol offers. For MCP, that's initialize: every conforming MCP server has to implement it, it's parameterless in the simplest invocation, and the response is bounded in size.
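For reference, the request side is small. Below is a minimal sketch of the kind of rpc helper the probe later in this post assumes: it POSTs a JSON-RPC 2.0 envelope to the endpoint and parses the reply as JSON. It deliberately skips session negotiation and streamed (SSE) responses, which a production probe against the Streamable HTTP transport would need to handle.

```javascript
// Minimal JSON-RPC 2.0 transport helper: one request, one parsed response.
// Assumes the endpoint accepts plain JSON POSTs; a production helper would
// also check the HTTP status and Content-Type before parsing.
let nextId = 1;

async function rpc(endpoint, { method, params = {} }) {
  const id = nextId++;
  const res = await fetch(endpoint, {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Accept: "application/json, text/event-stream",
    },
    body: JSON.stringify({ jsonrpc: "2.0", id, method, params }),
  });
  // Parsing (not string-matching) the body is the point of this probe class.
  // A stricter probe would also verify the response id echoes this request id.
  return res.json();
}
```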
Here's what the probe verifies once it has the response in hand:
- The body parses as JSON. (HTTP-with-body-substring monitors don't actually parse — they string-match.)
- The top-level object has `jsonrpc: "2.0"`. Servers that drift to `"jsonrpc": 2` as a number, or omit the field, fail here.
- The `id` echoes the request's id. We have seen servers that always return `id: 1` regardless of what was sent — a pattern that breaks any client multiplexing more than one in-flight call on the same connection.
- Exactly one of `result` or `error` is present. Both, neither, or a non-object value in either, all violate the spec.
That's the JSON-RPC half. It catches every server that returns 200-with-malformed-body and every server that returns 200-with-valid-JSON-but-wrong-envelope. Across our Q2 audit (full numbers in the registry report) that was 200 servers — 9.2% of the dataset — that no HTTP-only monitor would have caught. Of those, the most common single defect was tools returned as a top-level array rather than nested in result.tools, a shape from an early MCP draft that never got loud about being deprecated.
What an MCP-aware probe adds on top of JSON-RPC
JSON-RPC envelope verification is necessary but not sufficient for MCP. The protocol adds structure inside result that is MCP-specific, and the probe has to know about that structure to catch the failures that live in it. There are four MCP-specific things to check on each probe; the four together are what people usually mean when they say "MCP server health check" rather than just "HTTP health check."
1. The protocolVersion field
An initialize response includes a protocolVersion field that names the MCP spec version the server implements. The probe asserts that the version is on your client's list of acceptable versions. A server speaking a deprecated version (e.g. an early 2024-prefixed string) is not actively broken, but it is on a glide path: when the next breaking-change version cuts over, the server stops working with the latest clients without having changed its own behaviour.
This isn't strictly an uptime signal; it's a survival signal. We surface it as a yellow-state warning rather than a red-state alert, but we do surface it. Authors who haven't touched a server in nine months almost always learn something from a yellow flag here.
2. The tool list shape
Every healthy MCP server exposes a tools/list method whose response is structured: result.tools is an array, each element is an object with name, description, and inputSchema, and inputSchema is itself a JSON Schema object (not a string). The probe validates each of those shapes. A server that returns inputSchema as a JSON-stringified value rather than a parsed object fails here — that single defect was 38 of the 200 schema-malformed servers in the audit.
A more subtle one: tools as a top-level array. Easy to test for; easy to alert on; easy to fix in the server. The probe catches it the first time it runs.
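To make those two defects concrete, here is what the probe sees in each case; the search tool and its schema are invented for illustration.

```javascript
// Spec-shaped response: tools nested under result, inputSchema as a parsed
// JSON Schema object.
const good = {
  jsonrpc: "2.0",
  id: 2,
  result: {
    tools: [
      {
        name: "search",
        description: "Full-text search",
        inputSchema: { type: "object", properties: { q: { type: "string" } } },
      },
    ],
  },
};

// Defect 1: tools as a top-level array (the early-draft shape), so
// result.tools is undefined and the probe's first tools assertion throws.
const draftShape = { jsonrpc: "2.0", id: 2, tools: [{ name: "search" }] };

// Defect 2: inputSchema serialized as a string; schema-aware clients choke on it.
const stringifiedSchema = {
  jsonrpc: "2.0",
  id: 2,
  result: { tools: [{ name: "search", inputSchema: "{\"type\":\"object\"}" }] },
};
```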
3. The capabilities advertised vs the capabilities responding
The initialize response advertises capabilities — a small object listing which optional MCP features the server claims to support, like resources, prompts, and logging. A self-consistent server, when probed for any advertised capability's discovery method (e.g. resources/list if it advertised resources), returns a valid response. A server that advertises a capability and then 404s on the matching method is half-implementing the spec — usually a sign of an SDK upgrade that pulled in new capability flags but didn't wire the handlers.
For the public-tier probe we don't currently exercise every advertised capability on every cycle; we sample. For Author-tier and above we exercise all of them.
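A sketch of that consistency check, assuming the rpc helper from earlier; the capability-to-method mapping below covers the common optional features and is easy to extend.

```javascript
// For each capability the server advertised in initialize, call the matching
// discovery method and confirm it answers with a JSON-RPC result, not an error.
const DISCOVERY_METHODS = {
  tools: "tools/list",
  resources: "resources/list",
  prompts: "prompts/list",
};

async function checkCapabilities(endpoint, capabilities) {
  const failures = [];
  for (const [cap, method] of Object.entries(DISCOVERY_METHODS)) {
    if (!(cap in (capabilities ?? {}))) continue; // not advertised, nothing to check
    const res = await rpc(endpoint, { method });
    if (!res.result) failures.push({ cap, method, error: res.error ?? "no result" });
  }
  return failures; // empty array means the advertised surface is self-consistent
}
```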
4. The tool list hash, across probes
The single biggest reason a JSON-RPC probe doesn't go far enough on its own: schema drift. The tool list shape can be valid on every individual probe and still change in a breaking way between probes. Today the server has a search tool that takes q; tomorrow it takes query; the probe is happy on both days; the agent platform that integrated against q last quarter is silently broken.
The fix is structural: hash the tools/list response on each probe, compare the hash to the previous probe, and treat any change as an event. Treat structural diffs (a tool added, a tool removed, a parameter renamed, a parameter moved from optional to required) as red events. Treat documentation diffs (a description string changed) as informational. The probe needs memory; that's the meaningful design constraint.
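One way to implement that split is to keep two hashes per probe: one over the canonical structure with description strings stripped, and one over the full list. A structural change moves both; a copy edit moves only the second. A minimal sketch, assuming a small key-value store with get and set:

```javascript
// Compare this probe's tool-list hashes against the previous probe's.
// structuralHash: descriptions stripped before hashing (red on change).
// fullHash: everything included (informational on change).
async function classifyDrift(store, endpoint, structuralHash, fullHash) {
  const prev = await store.get(endpoint); // { structuralHash, fullHash } or undefined
  await store.set(endpoint, { structuralHash, fullHash });
  if (!prev) return { kind: "baseline" }; // first probe, nothing to compare against
  if (prev.structuralHash !== structuralHash) {
    return { kind: "structural-drift" }; // tool or parameter added, removed, renamed
  }
  if (prev.fullHash !== fullHash) {
    return { kind: "documentation-drift" }; // only description strings changed
  }
  return { kind: "unchanged" };
}
```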
We saw a 7.1% drift rate across 48 hours among the 196 healthy servers in the audit — extrapolated naively to 30 days, almost half of healthy servers will have shifted some tool surface in the time between when an integration was tested and when it next gets exercised in production. The Slack alert payload format for drift events is the same shape as the uptime payload — same hook URL, just a different alert type — which keeps the integration cost low.
The probe in 50 lines
Concretely, here is the loop we run on each endpoint. It is deliberately compact — anyone who wants to wire their own can ship something like this on a Friday afternoon. The interesting parts are the assertions, not the IO.
// Run once per endpoint, every 60 seconds.
// Returns a probe-result row to write into the database.
async function probeMcp(endpoint) {
const startedAt = Date.now();
const probeId = crypto.randomUUID();
// 1. initialize — verify the server speaks MCP at all
const init = await rpc(endpoint, {
method: "initialize",
params: {
protocolVersion: "2025-03-26",
capabilities: {},
clientInfo: { name: "alivemcp-probe", version: "1.0.0" },
},
});
assert(init.jsonrpc === "2.0", "envelope: jsonrpc");
assert(typeof init.id !== "undefined", "envelope: id missing");
assert(init.result, "result missing on initialize");
assert(init.result.protocolVersion, "no protocolVersion advertised");
// 2. tools/list — verify the tool surface
const tools = await rpc(endpoint, { method: "tools/list" });
assert(tools.result?.tools, "tools field missing or in wrong shape");
assert(Array.isArray(tools.result.tools), "tools is not an array");
for (const t of tools.result.tools) {
assert(t.name && t.inputSchema, "tool missing name or inputSchema");
assert(typeof t.inputSchema === "object", "inputSchema must be parsed");
}
// 3. tool list hash — for drift detection across probes
const canonical = canonicalize(tools.result.tools); // sort keys, strip docs
const hash = sha256(canonical);
return {
probeId,
endpoint,
ok: true,
latencyMs: Date.now() - startedAt,
protocolVersion: init.result.protocolVersion,
toolsHash: hash,
toolCount: tools.result.tools.length,
};
}
The assertions throw on failure. The caller catches each throw, classifies the failure (transport vs envelope vs MCP-shape vs drift), and writes a row that says exactly which class of check failed. From those rows you build the per-endpoint health timeline; from the timeline you build the alerts.
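A sketch of that caller. The class names, and the convention of reading the assertion message prefix, are this sketch's own heuristics rather than anything the probe above enforces; drift is classified separately, from the stored hash comparison.

```javascript
// Wrap each probe run: a thrown assertion becomes a row that records which
// class of check failed. The classification below is a heuristic for the sketch.
async function runProbe(endpoint) {
  try {
    return await probeMcp(endpoint);
  } catch (err) {
    const msg = String(err?.message ?? err);
    let failureClass = "mcp-shape";
    if (err instanceof TypeError) failureClass = "transport"; // fetch failed: DNS, TCP, TLS
    else if (msg.startsWith("envelope:")) failureClass = "jsonrpc-envelope";
    return { endpoint, ok: false, failureClass, detail: msg, probedAt: Date.now() };
  }
}
```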
The canonicalize step is where the schema drift work lives. It strips human-facing description strings (we don't want a description copy edit to fire as a drift event), sorts object keys deterministically, and emits the canonical bytes that get hashed. The reason to canonicalize before hashing rather than hash the raw response is that tool list responses don't have a guaranteed key order, and an SDK upgrade that reorders fields would otherwise look like drift.
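A minimal version of those two helpers, using Node's built-in crypto. Sorting the tools by name is this sketch's own addition, on the assumption that registration order isn't a guarantee you want to alert on.

```javascript
import { createHash } from "node:crypto";

// Recursively drop human-facing description strings and emit JSON with
// deterministically sorted object keys, so cosmetic reordering by an SDK
// upgrade doesn't register as drift.
function canonicalize(tools) {
  const strip = (value) => {
    if (Array.isArray(value)) return value.map(strip);
    if (value && typeof value === "object") {
      return Object.fromEntries(
        Object.keys(value)
          .filter((k) => k !== "description")
          .sort()
          .map((k) => [k, strip(value[k])])
      );
    }
    return value;
  };
  // Sort tools by name as well, so the server's registration order doesn't matter.
  const sorted = [...tools].sort((a, b) => a.name.localeCompare(b.name));
  return JSON.stringify(sorted.map(strip));
}

function sha256(text) {
  return createHash("sha256").update(text).digest("hex");
}
```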
Cadence: why 60 seconds, and why memory matters
You can run this probe at any cadence. The interesting choices are 5 minutes (most uptime tools' default), 60 seconds (what we run), and 10 seconds (what some platform-internal monitoring uses). The trade-offs:
- 5-minute cadence on a single endpoint: cheap, ~288 probes/day. Misses most of a 15-minute deploy regression. Fine for a monitor that's a backstop on top of platform-level alerting; not fine as your only monitor.
- 60-second cadence on every endpoint in every public registry: more expensive (~8.6M probes/day across the ecosystem), but matches the latency at which agent platforms route around a broken tool. If a regression ships at 14:00, you want to know by 14:01, not 14:05.
- 10-second cadence: only valuable for endpoints that are part of a synchronous user-facing flow. For most MCP endpoints, the cost outweighs the benefit.
The deeper design point is that drift detection requires memory. Stateless probes — even at 1-second cadence — cannot detect schema drift, because there is nothing to compare against. They only see the current response. The minimum unit of usefulness for drift detection is "two consecutive probes plus a hash store" — and once you have a hash store, you also have free history for every other signal: latency p50, response-size variance, capability set, protocol version, anything you decide to capture. Statelessness is the wrong default for protocol monitors, and most uptime tools default to it.
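A sketch of that minimum unit: the 60-second loop plus the hash store, reusing runProbe from above. writeRow and ENDPOINTS are stand-ins for whatever storage and endpoint list back your deployment, and the in-memory Map would be a persistent table in practice.

```javascript
// One stateful monitoring loop: probe every endpoint each minute, and keep
// just enough memory (the previous hash) to turn "valid now" into
// "valid and unchanged since the last probe".
const lastHash = new Map(); // endpoint -> previous toolsHash

async function tick(endpoints) {
  for (const endpoint of endpoints) {
    const row = await runProbe(endpoint);
    if (row.ok) {
      const prev = lastHash.get(endpoint);
      row.drift = prev !== undefined && prev !== row.toolsHash;
      lastHash.set(endpoint, row.toolsHash);
    }
    await writeRow(row); // append to whatever database backs the timeline
  }
}

setInterval(() => tick(ENDPOINTS).catch(console.error), 60_000);
```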
For the day-zero tradeoff between writing your own probe and running a hosted one, see how to monitor an MCP server — the rough rule we suggest is "wire your own if you operate one or two servers; pay someone the moment that count grows or you can't afford to be the one waking up to fix the probe."
The capability matrix, restated
Mapping the four kinds of probe against what each catches in practice. Same shape as the failure-mode taxonomy, restated as "what a probe of each kind would see":
| Failure | Plain HTTP probe | HTTPS + body parse | JSON-RPC probe | MCP-aware probe + drift hashing |
|---|---|---|---|---|
| DNS / TCP / TLS | Yes | Yes | Yes | Yes |
| Hosting platform sleep page (200 + HTML body) | No | Yes (string match) | Yes (JSON parse fails) | Yes |
| Route moved without redirect (404) | Yes | Yes | Yes | Yes |
| Half-configured auth (200 + tools/list works, tools/call 401) | No | No | Partial | Yes |
| Malformed JSON-RPC envelope | No | No | Yes | Yes |
| Wrong tool shape (tools top-level, inputSchema as string) | No | No | Partial | Yes |
| Schema drift between probes | No | No | No | Yes |
| Protocol version drift below client minimum | No | No | No | Yes |
The pattern is the same as last week's failure-mode post: each rung up catches a strict superset of the previous rung's failures. The 53% of "alive but broken" endpoints in the Q2 audit map almost exactly to the modes the bottom three monitor classes miss. If you've been running an HTTP-only monitor on an MCP server, you are running a monitor that cannot, by construction, see half of the things that go wrong with it. That isn't a bug in the monitor; it's a bug in the choice of monitor for the workload.
For why this gap is worth the price difference between an HTTP-only tool and an MCP-aware one, see UptimeRobot vs AliveMCP; the comparison is honest about when UptimeRobot is still the right choice (you operate a static site or a generic HTTP API and don't need protocol awareness).
What to do this week if you ship an MCP
Three concrete steps, ordered by cost-of-effort:
- Run the probe on yourself. The 50 lines above will run against any MCP endpoint with one HTTP client and a SHA-256 routine. Run it once, manually, against your production URL — the URL listed in the registries you've submitted to, not your dev server. Most authors find at least one of the four MCP-layer assertions failing the first time they run it.
- Add tool list hashing to whatever you already have. If you operate a cron-based health check today, store the hash of the canonical tool list alongside each probe row. The day a hash changes is a day worth a paragraph in your changelog. The cost is trivial; the value is the exact bug class no static probe can see.
- Decide explicitly between rolling your own and paying for hosted. If you run one or two MCPs, the 50 lines above are cheap enough to own yourself; the moment that count grows, the $9/mo Author tier costs less than your time, and a hosted probe gets you per-server history and a public status badge for free. The curl-only test is the absolute floor — fine for a one-shot diagnostic, not fine as a continuous monitor.
What we'll cover next
The next planned post is the MCP authentication primer — what the auth-walled 16.8% bucket from the Q2 audit actually tells us about how to publish a private MCP without losing it from public discovery in the process. After that, the Q3 2026 registry audit (mid-July) will re-run the same probe at the same cadence and report which of the seven failure modes shifted.
If you want a heads-up the moment your own server fails one of these probes, the easiest path is to claim it on the public dashboard. Free for the public-tier alert; $9/mo if you want Slack or webhook delivery on drift events.
Further reading
- MCP server health check — probe sequence explained — the operator's-eye view of the same probe, with example runbooks.
- MCP endpoint not responding — diagnostic ladder — what to do when a probe fails one of the layers above.
- Check if an MCP server is alive — curl-only test — the absolute floor for ad-hoc diagnostic.
- MCP server Slack alerts — alert tiers + payload shape — the format that drift events ship in.
- Why MCP servers die silently — 7 failure modes — the taxonomy this probe is designed against.
- State of the MCP Registry — Q2 2026 — the dataset behind the headline numbers in this post.
- UptimeRobot vs AliveMCP — when an HTTP-only probe is still the right choice.