Deep dive · 2026-04-25 · Probes

Running a credentialed MCP health check, end to end

An unauthenticated probe can tell you that a Posture C MCP server politely refuses a stranger — handshake clean, WWW-Authenticate header populated, the right kind of 401. It cannot tell you whether the server is actually usable with credentials. For the 366 auth-walled servers in the Q2 2026 audit, that distinction is the entire game. This post is the practical follow-up to the authentication primer. It covers the end-to-end routine for running a credentialed health check on your own authenticated MCP server: the scoped probe-credential design that makes it safe, the eight steps the probe runs every 60 seconds, the canonical-JSON tool-list hash that catches schema drift on authenticated tool lists too, the token-expiry detection that fires three days before your monitoring breaks instead of three days after, and a copy-pasteable shell recipe you can run from a CI box this afternoon.

TL;DR

Mint a single-purpose probe credential with read-only scope and a known expiry. Run an eight-step probe every 60 seconds: DNS, TLS, unauthenticated initialize (must 401 with header populated, not a bare 401), OAuth discovery if published, authenticated initialize, tools/list, a tools/call against a designated read-only health-check tool, and a canonical-JSON SHA-256 hash of the tool list compared against the previous probe. Track three states per probe: healthy (all eight pass), auth-walled (steps 1–3 pass, steps 5–7 fail or unconfigured), broken (steps 1–3 fail, or step 8 hash drifts unexpectedly, or token returns a refresh-needed signal). Add a token-expiry watchdog that pages 72 hours before the probe credential expires — not after. The whole probe is ~120 lines of bash + curl + jq; the recipe at the end of this post is the one we run on the AliveMCP collector.

Why this is a different probe from the unauthenticated one

The probe sequence covered in JSON-RPC health checks vs HTTP probes assumes a server that speaks MCP to anyone who shows up. For a Posture A server (truly public) or even a Posture B server (demo-token public, with the token in the listing), an unauthenticated probe is almost the whole story. Step 1 is TCP + TLS, step 2 is initialize with a real JSON-RPC envelope, step 3 is tools/list, step 4 is a real tools/call against the first listed tool, step 5 is the canonical-JSON hash for drift detection. Five steps, no credential, the server is either healthy or it isn't.

For a Posture C server (sign-up gated) or a Posture D server (truly private, mistakenly listed publicly or correctly hosted on a private registry) that calculus collapses. Steps 3 and 4 will always fail without a credential, regardless of whether the server is healthy. The signal an unauthenticated probe is reading at that point is "is this server polite to strangers" — which is worth knowing, but it is not "is this server alive for its actual users." If the only monitor on a Posture C server is an unauthenticated probe, the server can return a textbook 401 forever while every authenticated request is timing out, and the dashboard will say green.

The credentialed probe is the same idea taken one layer further. A real MCP client — an agent runner, a Claude Desktop installation, a CI pipeline that calls into the server — has a credential. The probe should too. The probe is a lightweight, automated, scoped MCP client whose only job is to verify the server it's pointed at would behave correctly for a real client carrying the same kind of credential.

The four pieces you need before you start

Before any code runs there are four artefacts to assemble. Skipping any of them is how teams end up with a credentialed probe that's worse than an unauthenticated one — either because the credential has too much scope (and the probe becomes a security liability), or because the probe credential's failure looks identical to a server failure (and every alert is ambiguous), or because the credential expires and nobody finds out until a real outage isn't reported.

1. A scoped probe credential

The probe credential is a token — bearer, API key, or OAuth refresh token, depending on the server's auth pattern — that is intentionally less powerful than a real user's credential. Read-only scope. No write access to anything mutable. No access to data that's expensive to read or sensitive to leak. If the server's auth model can't express "read-only", it should at least be able to express "rate-limited" or "scoped to a specific tool" — either is acceptable. The probe credential is going to be presented every 60 seconds for the lifetime of the monitor. If the credential has full write scope, that is a full-write-scope token sitting on a probe host and crossing the wire every 60 seconds — a threat model monitoring is not supposed to introduce.

For OAuth 2.1 servers this is a service-account or client-credentials-flow token with a specific scope (e.g., mcp:health:read). For bearer-token servers this is typically a separate API key issued specifically for monitoring with a documented expiry. For mTLS servers it's a client certificate issued to the probe with a non-overlapping common name, so the probe's traffic shows up distinct in the server's logs.
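
As a concrete sketch of minting that credential for an OAuth 2.1 server via the client-credentials flow — the token endpoint, client ID, and scope name here are placeholders for whatever your auth server actually issues:

# Hypothetical client-credentials exchange; substitute your issuer's endpoint and IDs.
curl -fsS https://auth.example.com/oauth/token \
  -d grant_type=client_credentials \
  -d client_id=alivemcp-probe \
  -d client_secret="$PROBE_CLIENT_SECRET" \
  -d scope=mcp:health:read \
  | jq -r .access_token > /var/lib/alivemcp/probe-token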

2. A designated health-check tool

The probe needs a tool to call that is safe to invoke every 60 seconds and whose output is small and deterministic. Options, in order of preference: a server-defined health or ping tool that returns a fixed string and does no I/O; the smallest read-only tool the server already exposes (typically a get_metadata or list_resources kind of thing); the first listed tool with a documented "safe to invoke without side effects" property. Avoid tools that hit a paid downstream API on every call — a 1,440-call-per-day probe against a $0.001-per-call OpenAI wrapper is $1.44/day, roughly $43/month, to monitor a single server, which is unacceptable. If the server doesn't expose a free read-only tool, add one specifically for monitoring; "this server exposes a health tool the registry probes against" is a feature, not overhead.
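
For reference, a hypothetical descriptor for that kind of tool as it would appear in a tools/list response — the name and wording are illustrative, not a spec requirement:

{
  "name": "health",
  "description": "Monitoring target. Returns a fixed string; no I/O, no side effects, free to call.",
  "inputSchema": { "type": "object", "properties": {}, "required": [] }
}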

3. A token-expiry calendar entry

Every probe credential has an expiry. The probe needs to know when. The simplest pattern is a metadata file — a JSON or YAML next to the probe config — with a credential_expires_at field as an ISO 8601 timestamp. The watchdog fires 72 hours before that timestamp. Alternative pattern: if the credential is a JWT, parse the exp claim on every probe and use that as the source of truth. The mistake to avoid is treating credential rotation as a thing humans will remember. They will not. The probe is the most reliable place to enforce it.
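
If the credential is a JWT, the exp claim is readable with nothing but coreutils and jq — a minimal sketch, with no signature verification because the watchdog only needs the timestamp:

# Decode the JWT's payload segment (base64url → base64, restore padding), read exp.
payload=$(printf '%s' "$PROBE_TOKEN" | cut -d. -f2 | tr '_-' '/+')
case $(( ${#payload} % 4 )) in 2) payload="${payload}==" ;; 3) payload="${payload}=" ;; esac
exp=$(printf '%s' "$payload" | base64 -d | jq -r .exp)
echo "credential expires: $(date -u -d "@$exp" +%FT%TZ)"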

4. An alert path that distinguishes credentialed-probe failure from server failure

When the credentialed probe fires an alert, the alert payload should make it explicit which kind of failure happened. "Server returned 401 to authenticated probe" and "server returned 503 to authenticated probe" are different incidents — the first is probably a credential problem, the second is probably a server problem. The alert format covered in MCP server Slack alerts includes a probe_step field on every payload for exactly this reason: the on-call doesn't have to grep the probe code to figure out which step failed.
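
A sketch of what that payload can look like when posted to a webhook — the field names mirror the recipe's emit output below, and the webhook URL is a placeholder:

# One alert event with the failed step named explicitly.
jq -n --arg step init.auth --arg status fail --arg detail 'credential rejected (401)' \
  '{server:"mcp.example.com", probe_step:$step, status:$status, detail:$detail, ts:(now|todate)}' \
  | curl -fsS -X POST -H 'content-type: application/json' --data @- "$WEBHOOK_URL"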

The eight-step probe sequence

The credentialed probe is the unauthenticated five-step probe with three steps inserted between step 1 (transport) and the protocol-layer checks. Each step is independent — a failure at step n is a specific diagnosis, not a generic "probe failed" signal.

Step 1 — DNS resolution

Resolve the server's hostname to an IP. Cache it with a short TTL (60 seconds is fine; longer than a minute and you'll miss DNS-flap incidents). If the hostname doesn't resolve, the server is in the DNS-or-transport-dead bucket from the audit — every other step is moot. Failure here trips a different alert than auth failures and almost always means a registrar lapse or a Cloudflare-record purge, not a server problem.

Step 2 — TLS handshake

Open a TCP connection on port 443 and complete a TLS handshake. Verify the certificate chain. Read the certificate's notAfter and emit a certificate-expiry watchdog event 30 days before expiry, regardless of probe outcome. TLS expiry is the third-most-common cause of a server falling out of the registry's healthy bucket — always preventable, almost always a missed renewal, and the probe is the cheapest place to catch it.

Step 3 — Unauthenticated initialize

Send a JSON-RPC initialize request with no Authorization header. Two acceptable outcomes: a clean initialize response (the server allows unauthenticated handshakes), or an HTTP 401 with a populated WWW-Authenticate header pointing at either a realm or a discovery URL. The unacceptable outcome is a bare 401 with no header — it's the third-most-common shape in the auth-walled bucket and it tells the probe the server's spec compliance has regressed. Mark the unauthenticated handshake as auth-required-with-discovery, auth-required-no-discovery, or handshake-open based on what came back. This three-state result is what the dashboard shows on the public listing.

Step 4 — OAuth discovery (if applicable)

If step 3 returned a WWW-Authenticate header pointing at a discovery URL — typically /.well-known/oauth-authorization-server — fetch it and verify three things: the issuer field matches the server's host, the authorization_endpoint resolves and returns a 200 or 302, and the token_endpoint resolves and accepts the probe credential's grant type. The output of this step is either "discovery published and intact" or "discovery published but broken" or "no discovery." The first and third are compatible with a healthy server; "discovery published but broken" is the failure mode — the server is advertising a token flow that doesn't work, which will mislead any spec-compliant client.

Step 5 — Authenticated initialize

Re-send the initialize request with the probe credential attached — either a bearer token via Authorization: Bearer ..., an API key in the server's documented custom header, or an OAuth access token obtained by exchanging the probe's refresh token at the discovery endpoint. The expected response is a normal MCP initialize result with a protocolVersion field, a serverInfo object, and a capabilities object. Any deviation from this shape — a 401 still, a 403, a malformed envelope, an HTTP 200 with an empty body — is a different specific diagnosis. 401 with the credential is "credential rejected" (rotate the probe credential or fix the server's auth path). 403 is "credential authenticated but not authorized" (probably a scope mismatch). Empty body is "server crashed mid-response" (a server-side bug). Each gets a distinct alert.

Step 6 — tools/list

Issue a tools/list request on the same authenticated session. Verify the response shape: an array of tool descriptors, each with a name, a description, and an inputSchema. Empty arrays are a soft failure — a server with no tools is not technically broken but is also not useful, and we surface it as a warning. Verify the designated health-check tool from the four-pieces list above is present — if it isn't, the probe credential's scope has drifted or the server's tool list has dropped the health tool. Either is its own incident.

Step 7 — tools/call against the health-check tool

Invoke the designated health-check tool with its documented "safe" arguments. Verify the response: a content array with at least one element, no isError: true field, a body that matches the tool's documented output shape. This is the only step in the sequence that actually exercises the server's authenticated request path with the probe credential's scope. Steps 5 and 6 verify the auth handshake; step 7 verifies the auth machinery handles a real call. The number of times step 5 passes and step 7 fails in production is much higher than it should be — typically a downstream-rate-limit or a permission-system mismatch where the credential has the right scope on paper and the wrong scope when the actual tool resolves.

Step 8 — Tool-list canonical-JSON hash

Take the tools/list response from step 6, run it through a canonical-JSON encoder (sorted keys, no whitespace, fixed array order), SHA-256 the result, and compare against the previous probe's hash. If the hash matches, no drift. If it doesn't, the tool list has changed shape — a tool added, removed, renamed, or its inputSchema rewritten — and an alert fires with a diff payload. This is the same drift-detection layer covered in schema drift in MCP tool definitions; the only difference for authenticated servers is that the hash is taken on the credentialed tool list, which can be a different list than the one an unauthenticated visitor would see (some servers expose a richer tool list to authenticated callers).

The probe-credential watchdog (the step everyone forgets)

The eight steps above run on a 60-second cadence and cover the server's behaviour. They do not cover the probe's own credential. Every authenticated probe must have a separate watchdog whose only job is "alert when this credential is about to expire." Without it, the credential expires, the probe starts returning step-5 failures forever, the on-call assumes the server is auth-broken, and the actual server failure that happens three days later is invisible because the dashboard has already been red for three days.

The watchdog runs on a daily cadence (not 60-second — there's no point), reads the credential's expiry from one of three sources in priority order: the exp claim if the credential is a JWT, the credential_expires_at field in the probe's metadata file, or the OAuth introspection endpoint's response if the server publishes one. It fires three escalating alerts: 30 days before expiry (info, "rotate this when convenient"), 7 days before (warning, "rotate this week"), and 72 hours before (critical, "rotate now or the probe goes blind"). The 72-hour alert should page on the same channel as a real server outage, because a blind probe is functionally identical to a real outage from a monitoring perspective.
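
A sketch of that escalation logic, reusing the recipe's emit helper and assuming the expiry has already been resolved to a Unix timestamp (cred_exp_epoch) from one of the three sources above:

# Daily watchdog: escalate as the credential's remaining lifetime shrinks.
days_left=$(( (cred_exp_epoch - $(date +%s)) / 86400 ))
if   [ "$days_left" -lt 3 ];  then emit credential.expiry critical '"rotate now or the probe goes blind"'
elif [ "$days_left" -lt 7 ];  then emit credential.expiry warning  '"rotate this week"'
elif [ "$days_left" -lt 30 ]; then emit credential.expiry info     '"rotate when convenient"'
fi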

One more piece worth wiring in: if the server publishes an OAuth refresh-token grant, the probe should refresh proactively when the access token's remaining lifetime drops below 25% — not when it expires. Refreshing late is how probes have hour-long blind windows during token-issuance outages on the auth server.
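
In code, "refresh below 25% remaining lifetime" is a two-line check — a sketch assuming you recorded the token's issue and expiry timestamps when you obtained it, and that refresh_access_token is your own helper that posts the refresh grant to the token endpoint:

lifetime=$(( expires_at - issued_at ))                # both recorded at mint time
remaining=$(( expires_at - $(date +%s) ))
if [ $(( remaining * 4 )) -lt "$lifetime" ]; then     # remaining < 25% of lifetime
  refresh_access_token                                # hypothetical helper: POST the refresh grant
fi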

State machine — three outcomes the probe reports per cycle

Every probe cycle resolves to one of three states; anything else is a bug in the probe. Healthy (green): all eight steps pass. Auth-walled (amber): steps 1–3 pass and the authenticated steps 5–7 fail or are unconfigured — the server is polite to strangers but unverified behind the wall. Broken (red): steps 1–3 fail, the step-8 hash drifts unexpectedly, or the credential comes back rejected.

The amber state is the one most generic uptime monitors don't have. UptimeRobot reports green or red on every endpoint, which collapses "actually healthy" and "polite-but-not-monitorable" into the same colour. For an MCP fleet that includes Posture C servers, that collapse hides exactly the failure mode the credentialed probe is built to surface.

The shell recipe — ~120 lines you can run today

What follows is the structure of the probe, written as a single bash script whose only dependencies are curl, jq, openssl, and standard coreutils. It runs against a single endpoint; production probes run hundreds of these in parallel from a worker pool. Substitute the variables at the top with your server's URL and credential. The script is idempotent — running it once gives one health check; running it on a 60-second cron gives a continuous monitor.

#!/usr/bin/env bash
# credentialed-mcp-probe.sh — minimal end-to-end MCP health check.
# Dependencies: bash 4+, curl 7.75+, jq 1.6+, openssl, sha256sum.
set -euo pipefail

# --- config -----------------------------------------------------------------
SERVER_URL="${SERVER_URL:-https://mcp.example.com/}"
PROBE_TOKEN="${PROBE_TOKEN:-}"               # bearer or API key
if [ -n "$PROBE_TOKEN" ]; then
  TOKEN_HEADER="${TOKEN_HEADER:-Authorization: Bearer ${PROBE_TOKEN}}"
else
  TOKEN_HEADER="${TOKEN_HEADER:-X-AliveMCP-Probe: 1}"  # harmless marker header when no credential is configured
fi
HEALTH_TOOL="${HEALTH_TOOL:-health}"         # designated read-only tool
HASH_STATE_FILE="${HASH_STATE_FILE:-/var/lib/alivemcp/$(echo "$SERVER_URL" | sha256sum | cut -c1-16).hash}"
EXPIRY_FILE="${EXPIRY_FILE:-/var/lib/alivemcp/credential-expires-at}"
TIMEOUT="${TIMEOUT:-10}"

emit() { printf '{"step":"%s","status":"%s","detail":%s}\n' "$1" "$2" "${3:-null}"; }

# --- step 1: DNS ------------------------------------------------------------
host=$(printf '%s' "$SERVER_URL" | awk -F/ '{print $3}')
getent hosts "$host" >/dev/null || { emit dns fail '"NXDOMAIN"'; exit 21; }
emit dns pass '"resolved"'

# --- step 2: TLS handshake + cert expiry ------------------------------------
not_after=$(echo | openssl s_client -servername "$host" -connect "$host":443 -verify_return_error 2>/dev/null \
  | openssl x509 -noout -enddate | cut -d= -f2) \
  || { emit tls fail '"handshake or certificate verification failed"'; exit 22; }
days_to_expiry=$(( ($(date -d "$not_after" +%s) - $(date +%s)) / 86400 ))
emit tls pass "$(jq -n --arg d "$days_to_expiry" '{days_to_cert_expiry:($d|tonumber)}')"
[ "$days_to_expiry" -lt 30 ] && emit tls warn '"cert expires in <30 days"'

# --- step 3: unauthenticated initialize -------------------------------------
init_payload='{"jsonrpc":"2.0","id":1,"method":"initialize","params":{"protocolVersion":"2025-03-26","capabilities":{},"clientInfo":{"name":"alivemcp-probe","version":"1.0"}}}'
http_code=$(curl -sS --max-time "$TIMEOUT" -o /tmp/init.body -D /tmp/init.hdrs -w '%{http_code}' \
  -H 'content-type: application/json' --data "$init_payload" "$SERVER_URL" || true)
http_code="${http_code:-000}"
case "$http_code" in
  200) emit init.unauth handshake-open '"server allows unauthenticated initialize"'; auth_needed=false ;;
  401) www_auth=$(grep -i '^www-authenticate:' /tmp/init.hdrs | head -n1 | cut -d' ' -f2- | tr -d '\r' || true)
       if [ -n "$www_auth" ]; then emit init.unauth auth-required-with-discovery "$(jq -n --arg w "$www_auth" '{header:$w}')"
       else emit init.unauth auth-required-no-discovery '"bare 401, spec regression"'; fi
       auth_needed=true ;;
  *)   emit init.unauth fail "$(jq -n --arg c "$http_code" '{http:$c}')"; exit 23 ;;
esac

# --- step 4: OAuth discovery (best-effort) ----------------------------------
disco_url="$(printf '%s' "$SERVER_URL" | sed -E 's|^(https?://[^/]+).*|\1/.well-known/oauth-authorization-server|')"
if curl -fsS --max-time 5 "$disco_url" -o /tmp/disco.json 2>/dev/null; then
  iss=$(jq -r .issuer < /tmp/disco.json)
  ae=$(jq -r .authorization_endpoint < /tmp/disco.json)
  te=$(jq -r .token_endpoint < /tmp/disco.json)
  emit oauth.discovery pass "$(jq -n --arg i "$iss" --arg a "$ae" --arg t "$te" '{issuer:$i,authz:$a,token:$t}')"
else
  emit oauth.discovery skip '"no discovery published"'
fi

# --- step 5: authenticated initialize ---------------------------------------
[ "$auth_needed" = false ] && [ -z "$PROBE_TOKEN" ] && { emit init.auth skip '"server is open"'; }
if [ -n "$PROBE_TOKEN" ]; then
  auth_init=$(curl -fsS --max-time "$TIMEOUT" -H "$TOKEN_HEADER" -H 'content-type: application/json' \
    --data "$init_payload" "$SERVER_URL" || echo "")
  pv=$(printf '%s' "$auth_init" | jq -r '.result.protocolVersion // empty')
  [ -z "$pv" ] && { emit init.auth fail '"no protocolVersion in response"'; exit 25; }
  emit init.auth pass "$(jq -n --arg p "$pv" '{protocolVersion:$p}')"
fi

# --- step 6: tools/list -----------------------------------------------------
tl_payload='{"jsonrpc":"2.0","id":2,"method":"tools/list"}'
tl=$(curl -fsS --max-time "$TIMEOUT" -H "$TOKEN_HEADER" -H 'content-type: application/json' \
  --data "$tl_payload" "$SERVER_URL") || { emit tools.list fail '"request failed"'; exit 26; }
tool_count=$(printf '%s' "$tl" | jq '.result.tools | length')
[ "$tool_count" -eq 0 ] && { emit tools.list warn '"empty tool list"'; }
has_health=$(printf '%s' "$tl" | jq --arg n "$HEALTH_TOOL" '[.result.tools[]? | select(.name==$n)] | length')
[ "$has_health" -eq 0 ] && { emit tools.list fail "$(jq -n --arg t "$HEALTH_TOOL" '{missing_tool:$t}')"; exit 26; }
emit tools.list pass "$(jq -n --argjson c "$tool_count" '{tool_count:$c}')"

# --- step 7: tools/call against the health tool ----------------------------
call_payload=$(jq -n --arg n "$HEALTH_TOOL" '{jsonrpc:"2.0",id:3,method:"tools/call",params:{name:$n,arguments:{}}}')
call=$(curl -fsS --max-time "$TIMEOUT" -H "$TOKEN_HEADER" -H 'content-type: application/json' \
  --data "$call_payload" "$SERVER_URL") || { emit tools.call fail '"request failed"'; exit 27; }
is_error=$(printf '%s' "$call" | jq -r '.result.isError // false')
[ "$is_error" = "true" ] && { emit tools.call fail "$(printf '%s' "$call" | jq -c .result)"; exit 27; }
emit tools.call pass '"isError=false"'

# --- step 8: canonical-JSON hash of tool list ------------------------------
mkdir -p "$(dirname "$HASH_STATE_FILE")"
new_hash=$(printf '%s' "$tl" | jq -cS '.result.tools | sort_by(.name) | map({name,description,inputSchema})' | sha256sum | cut -d' ' -f1)
old_hash=$(cat "$HASH_STATE_FILE" 2>/dev/null || echo none)
echo "$new_hash" > "$HASH_STATE_FILE"
if   [ "$old_hash" = none ];        then emit tools.hash baseline '"first run, no previous hash"'
elif [ "$new_hash" = "$old_hash" ]; then emit tools.hash same '"no drift"'
else emit tools.hash drift "$(jq -n --arg o "$old_hash" --arg n "$new_hash" '{old:$o,new:$n}')"; fi

# --- watchdog: probe credential expiry ------------------------------------
if [ -f "$EXPIRY_FILE" ]; then
  cred_exp=$(cat "$EXPIRY_FILE")
  hours_left=$(( ($(date -d "$cred_exp" +%s) - $(date +%s)) / 3600 ))
  [ "$hours_left" -lt 72 ] && emit credential.expiry critical "$(jq -n --arg h "$hours_left" '{hours_left:($h|tonumber)}')"
fi

echo '{"overall":"healthy"}'

The script emits one JSON line per step — trivially pipeable into jq, into a SQLite events table, into a webhook payload, or into the alert formatter described in MCP server Slack alerts. Every emit call is one event with an explicit step and status, so on-call dashboards don't have to grep stack traces. The exit codes (21 for DNS, 22 for TLS, 23 for unauth-init, 25–27 for failures in the authenticated steps) let cron and systemd timer units distinguish failure classes without parsing stdout.
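
A usage sketch — one crontab line plus an ad-hoc triage pipe; paths are yours to substitute:

# crontab: run every minute, keep one JSON line per step, let exit codes drive paging
* * * * *  /opt/alivemcp/credentialed-mcp-probe.sh >> /var/log/alivemcp/probe.jsonl 2>&1

# ad-hoc triage: show only the events worth looking at
jq -c 'select(.status == "fail" or .status == "warn" or .status == "drift" or .status == "critical")' \
  /var/log/alivemcp/probe.jsonl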

The failure modes that catch teams the first time they wire it up

These are the failure modes we hit against real servers in the AliveMCP collector's first three months. None of them are theoretical — each one cost an on-call alert.

The probe credential has too much scope

The most common mistake. The team mints a probe credential by copying a developer's personal access token. The token has full write scope. The probe is now sending a full-write-scope token from a probe host every 60 seconds. The fix is straightforward — mint a separate scoped credential — but it has to be in the runbook because the temptation to "just use my token to get the probe working" is universal.

The health-check tool isn't actually free

Common in MCPs that wrap paid SaaS APIs. The team picks list_repositories as the health-check tool because it's read-only and has a small payload. Two weeks later the GitHub bill goes up by $40 because the probe is hitting the GitHub API every 60 seconds, the cache layer doesn't apply to the health-check tool's specific call shape, and nobody notices until the monthly bill lands. The fix is a server-side health tool that returns a fixed string and does no I/O — the recipe above defaults to one. The general rule: if the health-check tool's downstream cost per call is non-zero, the probe will be expensive at scale.

The OAuth discovery endpoint is on a different host than the MCP server

Some servers proxy MCP through a CDN edge but run the OAuth issuer on a separate hostname. The recipe above derives the discovery URL from the server's host — wrong, in this case. The fix is to read the discovery URL from the WWW-Authenticate header in step 3 instead of inferring it. Production probes do exactly that; the recipe above keeps the inference for clarity and falls back gracefully when the inferred URL 404s.
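
What reading it from the header looks like in practice — a sketch that pulls the first URL out of the step-3 header dump (/tmp/init.hdrs in the recipe) and falls back to the inferred well-known path when the header carries none:

hdr=$(grep -i '^www-authenticate:' /tmp/init.hdrs | head -n1 | tr -d '\r' || true)
disco_url=$(printf '%s' "$hdr" | grep -oE 'https://[^",]+' | head -n1 || true)
disco_url="${disco_url:-$(printf '%s' "$SERVER_URL" \
  | sed -E 's|^(https?://[^/]+).*|\1/.well-known/oauth-authorization-server|')}"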

The tool list is unstable in a way that's not actually drift

A small fraction of servers reorder their tool list non-deterministically between requests — typically because the underlying tool registry is a hashmap and the language's iteration order isn't stable. The canonical-JSON hash will flip between two values forever, the drift alert will fire on every other probe, the on-call will get tired of it, and someone will disable drift alerts entirely. The fix is to sort the tool array by name before hashing: jq's -S flag sorts object keys but never reorders arrays, which is why step 8 in the recipe pipes the tool list through an explicit sort_by(.name) before taking the hash. Worth wiring in on day one.
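
Two lines at a shell prompt make the distinction concrete — -S sorts keys inside each object but leaves array order alone:

printf '%s' '[{"name":"b"},{"name":"a"}]' | jq -cS .                  # → [{"name":"b"},{"name":"a"}]
printf '%s' '[{"name":"b"},{"name":"a"}]' | jq -cS 'sort_by(.name)'   # → [{"name":"a"},{"name":"b"}]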

The probe credential's failure looks identical to a server failure on the first probe after rotation

When the team rotates the probe credential, the probe runs once with the new credential and fires a step-5 alert if the rotation accidentally pasted the wrong token. The on-call gets paged for what looks like a server outage. The fix is a one-shot dry-run mode for the probe: when the credential changes, the next 5 probes are emitted as candidate events, and only after 5 consecutive successes does the credential become live. Adds a 5-minute deployment delay; saves several false-page incidents per quarter.
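
A sketch of the candidate-counter gate, assuming the probe tracks a fingerprint of the current credential in a small state file and that probe_succeeded is a hypothetical wrapper around steps 5–7:

# Gate a freshly rotated credential behind 5 consecutive successes.
CRED_STATE=/var/lib/alivemcp/credential-streak
fingerprint=$(printf '%s' "$PROBE_TOKEN" | sha256sum | cut -c1-16)
read -r known_fp streak < "$CRED_STATE" 2>/dev/null || { known_fp=""; streak=0; }
[ "$fingerprint" != "$known_fp" ] && streak=0          # rotation detected: restart the streak
if probe_succeeded; then streak=$((streak + 1)); else streak=0; fi
echo "$fingerprint $streak" > "$CRED_STATE"
[ "$streak" -lt 5 ] && emit credential.dryrun candidate "$(jq -n --argjson s "$streak" '{successes:$s}')"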

Step 8's hash file gets accidentally checked into git

If the hash state file lives next to the probe code in the repo, a careless git add . commits it, the next probe sees a stale hash from the old branch, and a drift alert fires for "drift" that's actually just a stale checked-in hash file. The fix is to put the hash state file in /var/lib/alivemcp/ (as the recipe does) and add a .gitignore entry for any hash-state file pattern. Embarrassing the first time; cheap to never do again.

What "AliveMCP Author tier" handles for you

The shell recipe above is intentionally minimal so anyone can audit it and run it themselves. It is also approximately the same logic that runs in the AliveMCP collector for every endpoint we monitor, with a few extras that the indie-author tier wires in by default rather than asking the author to script: the credential vault, refresh-token rotation with proactive refresh, multi-region probes, drift diffs on every hash change, and the expiry watchdog.

The honest summary: the eight-step probe is straightforward to build, but production-grade authenticated monitoring has half a dozen edge cases that take a quarter to find and another to wire correctly. The Author tier exists for indie authors who would rather pay $9/mo than spend that quarter. The shell recipe exists for everyone who wants to start tonight and decide later whether the $9/mo is worth it.

How this fits the rest of the AliveMCP probe stack

The credentialed probe is one layer of a four-layer stack. The 30-second curl test is the human-grade liveness check. The JSON-RPC vs HTTP probe is the canonical unauthenticated probe. The drift detector is the tool-list-shape probe. The credentialed probe described here is the authenticated-execution probe — specifically built for Posture C and Posture D servers. The full monitoring page walks through which signals belong on the same dashboard panel; the short version is "all four, on the same row, with the credentialed probe state taking precedence on Posture C/D listings."

For the Q3 2026 audit re-run (mid-July), we're going to run the credentialed probe against every server whose author has claimed their listing and supplied a probe credential through the Author tier. The expectation is that the auth-walled bucket from Q2 splits roughly two ways: the 14% of the bucket that's Posture D mistakes will move to "delisted from public registry" or stay auth-walled, and the rest will either move to "credentialed-probe healthy" (good outcome — the server was always usable, just unverifiable from outside) or to "credentialed-probe broken" (bad outcome but a true positive — we surface a server that's been broken for users the whole time). Either way the bucket gets less ambiguous, which is the point.

What we'll cover next

This is post #6 in the Q2-audit-driven series. The first five posts established the audit, the seven failure modes, the JSON-RPC probe layer, the schema-drift detector, and the authentication primer. This post is the first of the practical-routine series — the playbooks that turn each failure-mode finding into a thing an indie author can wire up in an afternoon. Up next: the Q3 2026 registry audit (mid-July re-run, with bucket-by-bucket movement vs Q2 — including whether the credentialed-probe rollout shrinks the auth-walled bucket as expected), and a follow-up walkthrough on multi-region probe deployment for teams running on a single cloud region.

If you operate an authenticated MCP server and want the credentialed probe wired up without writing the bash, claim your listing on the public dashboard. The Author tier covers the credential vault, the refresh-token rotation, multi-region probes, drift diffs, and the expiry watchdog. $9/mo for Slack or webhook delivery the moment any of the eight steps fail, with the failed step on the payload so on-call doesn't have to guess.

Join the waitlist

Further reading