Report · Q3 2026 · Model Context Protocol

State of the MCP Registry — Q3 2026

Between 14 and 21 July 2026 we re-ran the full MCP registry audit across 2,414 unique public endpoints from all five probe regions. It is the first multi-region run of the audit; the single-region Q2 2026 baseline found 91% of endpoints dead. The global headline improved: 11.9% of endpoints are now globally healthy, up from 9.0%. Three new measurement buckets (regionally degraded, schema drift confirmed, and credentialed-probe degraded) appear in a quarterly registry report for the first time. The 88% that are still broken mostly died the same ways as in Q2.

TL;DR

2,414 unique remote MCP endpoints probed (up from 2,181 in Q2 — a 10.7% corpus growth). Each probed three times, 24 hours apart, from us-east, us-west, eu-west, ap-southeast, and sa-east in parallel.
Globally healthy (stable schema): 10.3% (248 endpoints). Globally healthy (schema drift confirmed): 1.6% (39 endpoints). Combined: 11.9% — up from Q2's 9.0% single-region healthy. Net improvement: +2.9 percentage points in one quarter.
Regionally degraded: 3.6% (88 endpoints) — the new bucket that proves a single-origin probe overstates health. Asia-Pacific degradation dominates: 46.6% of regionally degraded endpoints failed consistently from Singapore.
Auth-walled shrank from 16.8% to 12.9% — the biggest single-bucket move. Driven by registry metadata improvements (explicit auth_required flags) and a large batch of listing deactivations by the Official MCP Registry.
DNS / transport dead (36.1%), HTTP alive / MCP dead (26.9%), and schema-malformed (7.3%) are all broadly stable — the structural rot of the ecosystem is unchanged.
The multi-tenant probe collector ran the audit end-to-end. The cross-tenant suppression rule fired on three incidents during the window, absorbing what would have been 48 individual paging events for registered authors into four consolidated notices. The three incidents affected 101 servers in total.

Methodology: what changed from Q2

The full methodology update was published before the audit window opened, in How We Run the Quarterly MCP Registry Audit. Brief summary of changes relevant to reading the Q3 numbers:

Five probe regions, not one. Q2 probed from a single origin (us-east). Q3 ran concurrent probes from us-east, us-west, eu-west, ap-southeast, and sa-east. The two-of-N aggregation rule sets the floor: an endpoint counts as reachable only when at least two regions return a valid probe on all three 24-hour-apart rounds. A reachable endpoint that answers from all five regions is globally healthy; one that fails consistently from at least one region is classified as regionally degraded, a category Q2 could not populate.
Schema drift is a first-class bucket. Q2 measured schema drift as a separate sub-study on the healthy bucket. Q3 makes it a primary outcome: globally healthy endpoints whose canonical-JSON SHA-256 tool-list hash changes between any two of the three probe rounds are classified as schema drift confirmed, not simply healthy. The availability story is unchanged (the endpoint answered on every attempt), but the schema stability story is not.
Credentialed-probe degraded is new. Q2 had an auth-walled bucket (initialize OK, all tool calls 401). Q3 adds a distinct bucket for endpoints where unauthenticated initialize + tools/list succeeds, the registry listing has a published demo token, but the authenticated probe with that demo token fails. The server is technically reachable; the credentials the registry advertises are broken.
Three-probe window unchanged. Each endpoint got one initialize + tools/list probe per day, repeated on three days 24 hours apart. An endpoint must pass all three rounds to be counted in any "healthy" bucket. An endpoint that passes only two of three rounds is not healthy; it lands in its dominant failure bucket based on the two-round probe shape.
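The aggregation rules above can be sketched in a few lines. This is a minimal illustration rather than the audit's actual code: the function name, the label strings, and the "unclassified" fallback for mixed per-round results are assumptions, and "globally healthy" follows the bucket table's all-five-regions definition.

```python
from typing import Dict, List

REGIONS = ["us-east", "us-west", "eu-west", "ap-southeast", "sa-east"]


def classify_reachability(results: Dict[str, List[bool]]) -> str:
    """Classify one endpoint from per-region results.

    `results` maps each region to three booleans, one per 24-hour-apart round
    (True means the initialize + tools/list probe was valid).
    """
    # A region passes only if it returned a valid probe on every round.
    passing = [r for r in REGIONS if all(results[r])]
    # A region fails consistently if it failed on every round.
    failing = [r for r in REGIONS if not any(results[r])]

    if len(passing) < 2:
        return "not_reachable"        # goes on to one of the failure buckets
    if len(passing) == len(REGIONS):
        return "globally_healthy"     # schema-drift check happens separately
    if failing:
        return "regionally_degraded"
    # Assumption: the report does not say how a flapping region is handled.
    return "unclassified"
```

For example, an endpoint that answers on every round from four regions but never from ap-southeast comes back as "regionally_degraded".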

Full probe script and request shape: MCP server health check — the probe sequence explained.

The Q3 headline numbers

All 2,414 unique endpoints, by primary outcome bucket. Endpoints listed in multiple registries are counted once each in this table; see the per-registry table below for the cross-registry breakdown.

Bucket	Count	Q3 Share	Q2 Share	Change
Globally healthy, stable schema (all 5 regions, all 3 probes, tool-list hash unchanged)	248	10.3%	n/a	New bucket; see comparison note below
Globally healthy, schema drift confirmed (all 5 regions, all 3 probes, hash changed ≥1 time)	39	1.6%	n/a	New bucket
Regionally degraded (passes 2-of-5 rule, fails ≥1 region consistently)	88	3.6%	n/a	New bucket
Credentialed-probe degraded (anon init OK, demo-token probe fails)	31	1.3%	n/a	New bucket
Auth-walled on every tool call: `initialize` OK, every tool call 401/-32001	311	12.9%	16.8%	−3.9pp
DNS / transport dead: host unresolvable, connection refused, or TLS handshake failed	872	36.1%	38.3%	−2.2pp
HTTP alive, MCP dead: server answered HTTP but returned wrong-protocol or non-JSON body	649	26.9%	26.7%	+0.2pp
Schema-malformed: response parsed as JSON-RPC but violated MCP shape	176	7.3%	9.2%	−1.9pp
Total unique endpoints	2,414	100%	100% (2,181 endpoints)	+233 (+10.7%)

The Q2-to-Q3 headline comparison on the healthy bucket requires a note. Q2's "healthy" (9.0%, 196 endpoints) was a single-region measurement: if an endpoint answered from one origin on all three probes, it was healthy. Q3's "globally healthy" (11.9%, 287 endpoints) requires answering from all five regions. These are not identical measurements: Q3 is stricter on geographic availability, while Q2 needed no multi-region data to pass. The most apples-to-apples comparison is Q2's 9.0% versus the sum of Q3's globally healthy (11.9%) plus regionally degraded (3.6%), which gives 15.5% reachable from at least two regions. The net story: the ecosystem is meaningfully more reachable than it was in April, even under a stricter probe standard.

Quarter-over-quarter movement

Four things moved, one did not.

Auth-walled shrank the most (−3.9pp)

The auth-walled bucket fell from 16.8% to 12.9% — the largest single-bucket improvement and better than the 13–15% range predicted in the methodology update. Two drivers. First, the Official MCP Registry and Smithery both added an explicit auth_required: true flag to their listing schemas between Q2 and Q3, allowing them to filter or visually tag listings that require credentials — which reduces the population of listings that appear public but are actually gated. Second, both registries ran a batch of listing reviews following the Q2 report that deactivated or corrected several hundred listings where the author's stated access level didn't match what the probe saw. The 311 auth-walled endpoints that remain are, in most cases, genuinely private servers that were listed without the flag before the schema update — the listing metadata hadn't caught up.

DNS dead fell modestly (−2.2pp)

38.3% to 36.1%: a 2.2-point improvement across a bucket that regenerates continuously as new listings age and their infrastructure lapses. The corpus grew by 10.7% over the same quarter, so the absolute count of DNS-dead endpoints still rose slightly, from roughly 835 in Q2 (38.3% of 2,181) to 872 in Q3. That is well below the roughly 925 that Q2's rate would predict for the Q3 corpus (38.3% of 2,414), so the improvement is real in relative terms. The interpretation is encouraging but limited: the ecosystem is adding more living servers than it is losing to DNS rot, for now. Whether that holds into Q4 depends on whether the Q2 cohort's free-tier hosting continues to survive into the six-to-nine-month post-publish window where the reaping tends to happen.

HTTP alive / MCP dead barely moved (+0.2pp)

This is the bucket that matters for protocol-aware monitoring. 26.9% of all endpoints — 649 servers — answer HTTP but fail on the JSON-RPC layer. That is nearly identical to Q2's 26.7%. It was the highest-information finding in the Q2 report (a generic HTTP monitor would report all of these as green), and it is still the highest-information finding in Q3: more than one-in-four of all public MCP listings answer 200 OK on HTTP and would pass an UptimeRobot check, but they cannot execute a real MCP probe. Nothing in a quarter has moved that number. It is structural — the gap between "the web server is up" and "the MCP protocol layer is working" is a permanent architectural distinction, not a temporary ecosystem immaturity that a few months of cleanup will resolve.

Schema-malformed improved (−1.9pp)

9.2% to 7.3%: a meaningful improvement, primarily attributable to SDK version updates. Several of the most-used MCP server frameworks shipped breaking-change-free updates in the April–July window that corrected the three most common shape violations from Q2 (missing protocolVersion in the initialize response, a null or missing tools field where an array is required, and tool items missing the required inputSchema). The 176 still in this bucket are predominantly hand-rolled servers and very early SDK versions that have not been updated.
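To make the boundary between "HTTP alive, MCP dead" and "schema-malformed" concrete, here is a minimal sketch of how the two could be told apart from raw response bodies. The function names and return labels are hypothetical, the check assumes the tools field is expected to be a JSON array, and JSON-RPC error responses (such as the auth-walled case) are set aside under a separate label rather than bucketed here.

```python
import json
from typing import List, Optional


def parse_jsonrpc(body: str) -> Optional[dict]:
    """Return the decoded message, or None if the body is not a JSON-RPC 2.0 object."""
    try:
        msg = json.loads(body)
    except ValueError:  # json.JSONDecodeError is a subclass of ValueError
        return None
    if isinstance(msg, dict) and msg.get("jsonrpc") == "2.0":
        return msg
    return None


def shape_violations(init_result: dict, tools_result: dict) -> List[str]:
    """Check for the three common shape violations named in the report."""
    problems = []
    if "protocolVersion" not in init_result:
        problems.append("initialize: missing protocolVersion")
    tools = tools_result.get("tools")
    if not isinstance(tools, list):
        problems.append("tools/list: tools is not an array")
    else:
        for i, tool in enumerate(tools):
            if not isinstance(tool, dict) or "inputSchema" not in tool:
                problems.append(f"tools/list: tool #{i} missing inputSchema")
    return problems


def classify(init_body: str, tools_body: str) -> str:
    init_msg, tools_msg = parse_jsonrpc(init_body), parse_jsonrpc(tools_body)
    if init_msg is None or tools_msg is None:
        return "http_alive_mcp_dead"   # answered HTTP, but not with JSON-RPC
    if "error" in init_msg or "error" in tools_msg:
        return "jsonrpc_error"         # handled by other buckets, not here
    init_result, tools_result = init_msg.get("result"), tools_msg.get("result")
    if not isinstance(init_result, dict) or not isinstance(tools_result, dict):
        return "schema_malformed"
    if shape_violations(init_result, tools_result):
        return "schema_malformed"
    return "well_formed"
```

An HTML landing page returned with 200 OK fails at `parse_jsonrpc` and lands in "http_alive_mcp_dead", which is exactly the case a generic HTTP monitor reports as green.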

Corpus grew by 10.7% (+233 endpoints)

2,414 total unique endpoints, up from 2,181. The ecosystem grew substantially in one quarter. At the Q2 dead rate (91%), 233 new listings would add ~212 dead endpoints, and the actual Q3 numbers reflect roughly that: the endpoint pool is bigger, and the new additions mostly follow the same failure distribution as the existing pool. The Official Registry and Smithery both grew their listing counts, but with active curation driving their health rates up, their contributions to the new listings skew healthier than the ecosystem average.

Registry-by-registry breakdown

Per-registry comparison. An endpoint listed on multiple registries counts once per registry. Health rate is globally-healthy (stable + schema-drift) out of total listings for that registry — it does not include regionally degraded.

Registry	Q2 Listings	Q2 Healthy	Q2 Rate	Q3 Listings	Q3 Healthy	Q3 Rate	Change
Official MCP Registry	412	71	17.2%	451	97	21.5%	+4.3pp
Smithery	987	108	10.9%	1,124	158	14.1%	+3.2pp
MCP.so	1,314	129	9.8%	1,487	163	11.0%	+1.2pp
PulseMCP	641	58	9.0%	712	69	9.7%	+0.7pp
Glama	889	74	8.3%	931	84	9.0%	+0.7pp
GitHub topic: `mcp-server`	1,157	51	4.4%	1,319	63	4.8%	+0.4pp

Every registry improved. The improvement gradient matches the curation intensity gradient: the Official Registry (+4.3pp) improved fastest because it has active curators who ran a listing review in response to Q2. Smithery (+3.2pp) improved second-fastest — it added a real-time health badge to its listing pages in May, which gives authors a visible incentive to keep their servers alive and updated. MCP.so, PulseMCP, and Glama improved modestly (0.7–1.2pp), consistent with natural ecosystem churn: some dead servers were cleaned up, some new healthy ones were added. GitHub topic feeds improved least (+0.4pp, now 4.8%) because they have no curation mechanism — any repo with the tag and a README URL appears in the feed regardless of whether the server ever worked.

The Official Registry's 21.5% is still the best health rate by a wide margin and is roughly four and a half times the GitHub topic rate. The practical advice from Q2 is unchanged: if you are an agent platform pulling registry feeds, pulling from the Official Registry only gives you a 21.5% floor versus a 4.8% floor; mixing all six registries without a live-health layer gives you the blended ecosystem average of 11.9%, which is better than Q2's 9.0% but still means 88% of what you show your users is broken.

The three new Q3 buckets

Regionally degraded: 3.6% (88 endpoints)

88 endpoints pass the two-of-five rule — they return a valid response from at least two probe regions — but fail consistently from at least one region on all three 24-hour-apart rounds. These are not flapping servers; they are servers with a systematic geographic blind spot.

The distribution of which region the failure comes from:

Failing region	Count	Share of regionally degraded
ap-southeast (Singapore)	41	46.6%
sa-east (São Paulo)	28	31.8%
eu-west (London)	12	13.6%
us-west (Oregon)	7	8.0%
us-east (N. Virginia)	0	—

Asia-Pacific dominates. 46.6% of regionally degraded endpoints fail from Singapore; 31.8% fail from São Paulo. The pattern is consistent with what the multi-region probe deployment analysis predicted: most affected servers are deployed on single-region US-East infrastructure without a CDN or geographic distribution layer. A probe from N. Virginia succeeds because the server is close; a probe from Singapore times out because the latency budget the server's TCP stack was configured for assumes a US client. Zero endpoints failed exclusively from us-east, which tracks: us-east is where most servers are deployed, so a server that responds from anywhere will respond from there.

The practical consequence: if you run a single-region uptime monitor pointed at a US endpoint, you will mark all 88 of these servers as healthy. Your users in Singapore and São Paulo are experiencing failures you cannot see. This is the failure category that motivated the five-region probe architecture, and the 3.6% figure confirms it is material: larger than the schema-drift bucket (1.6%), larger than the credentialed-probe-degraded bucket (1.3%), and larger than those two combined (2.9%), though still about half the size of the schema-malformed bucket (7.3%).

Schema drift confirmed: 1.6% (39 endpoints)

39 of the 287 globally healthy endpoints (13.6% of the globally healthy cohort) had their tool-list canonical-JSON SHA-256 hash change between at least two of the three 24-hour-apart probe rounds. Breakdown by drift type:

Drift type	Count	Notes
New tool added (backward-compatible)	28	Tool list grew; existing tools unchanged; downstream agents calling existing tools are unaffected
Schema-shape change to existing tool	8	Input schema for an existing tool changed — parameter renamed, type changed, required field added or dropped
Tool removed	3	Tool present on probe 1 absent on probe 3; downstream agents caching the probe-1 tool list will error

The distribution matters for triage. The 28 tool-addition events are generally benign: an agent caching the tool list from probe one will not break on probe three, because the tools it cached still exist and work. The 8 schema-shape changes are riskier: an agent that cached the input schema from probe one will produce wrong-shaped calls on probe three if the parameter it was sending is now typed differently or required. The 3 tool-removal events are the highest-impact drift type: an agent that cached the tool list from probe one and built a workflow around a specific tool will produce hard errors on probe three when it tries to call a tool that no longer exists.

The Q2 sub-study found 7.1% of healthy servers experienced schema drift within 48 hours. Q3's 13.6% of the globally healthy cohort across a 72-hour window is consistent with that. The difference in observation window (72h vs 48h) and the stricter Q3 "healthy" definition (five regions vs one) make direct percentage comparison misleading, but the underlying phenomenon is the same: roughly one-in-eight healthy MCP servers will change its tool list within a 24-to-72-hour window. For the full analysis of why drift happens and what it costs downstream agents, see schema drift in MCP tool definitions.
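A minimal sketch of the hash-and-compare step follows. The canonical form used here (tools sorted by name, keys sorted, compact separators) is an assumption, since the report does not spell out its canonicalisation, and the function names and drift labels are illustrative only.

```python
import hashlib
import json
from typing import List


def tool_list_hash(tools: List[dict]) -> str:
    """SHA-256 over a canonical JSON rendering of the tool list."""
    # Assumed canonical form: sort tools by name, sort keys, drop whitespace.
    ordered = sorted(tools, key=lambda t: t["name"])
    canonical = json.dumps(ordered, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def drift_types(before: List[dict], after: List[dict]) -> List[str]:
    """Name the kinds of drift between two probe rounds, highest impact first."""
    old = {t["name"]: t for t in before}
    new = {t["name"]: t for t in after}
    found = []
    if set(old) - set(new):
        found.append("tool_removed")
    if any(old[n].get("inputSchema") != new[n].get("inputSchema")
           for n in set(old) & set(new)):
        found.append("schema_shape_change")
    if set(new) - set(old):
        found.append("tool_added")
    return found
```

Comparing hashes is the cheap first pass; `drift_types` only needs to run when two rounds disagree, and its output maps onto the three rows of the drift table.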

Credentialed-probe degraded: 1.3% (31 endpoints)

31 endpoints have a published demo token in their registry listing, pass the anonymous initialize + tools/list probe, but fail when the authenticated probe uses the listed demo token. The server is technically alive; the credentials the registry advertises are broken. Breakdown:

Failure mode	Count	Notes
Expired demo token	19	Token was rotated or revoked; registry listing was not updated; probe returns 401 on every tool call
Demo token scope change	9	Token was re-issued but with reduced scope; authenticated initialize succeeds but tools/list returns empty or partial result
Scope accepted, tool discovery fails	3	Token accepted for session setup, but the demo scope does not include tool discovery; tools/list returns an empty array rather than the full tool set

The expired-token class is the most common and the most surprising: 19 server maintainers created a demo credential, published it in a registry listing, and then rotated or revoked the credential without updating the listing. From the outside the server looks public and working; from the perspective of anyone trying to actually use it with the advertised credentials, it is broken. This is the Q3 report's most actionable finding for registry operators: a credential-freshness check on listings that carry demo tokens would surface all 19 of these immediately. The fix is a five-minute listing update, not a server fix.
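A credential-freshness check of the kind suggested above could look roughly like this. It is a sketch under assumptions: the inputs are taken to be already collected by a probe, the function name and labels are hypothetical, and it does not try to separate the two scope-related rows of the table, which look the same from the outside.

```python
from typing import List


def classify_demo_token(anon_tools: List[str],
                        tool_call_statuses: List[int],
                        auth_tools: List[str]) -> str:
    """Compare an anonymous probe with a demo-token probe of the same server.

    anon_tools         -- tool names from the unauthenticated tools/list
    tool_call_statuses -- HTTP statuses of tool calls made with the demo token
    auth_tools         -- tool names from tools/list made with the demo token
    """
    # Every authenticated tool call rejected: the token was rotated or revoked.
    if tool_call_statuses and all(s == 401 for s in tool_call_statuses):
        return "token_rejected"
    # Token accepted, but it exposes fewer tools than the anonymous listing.
    if set(anon_tools) - set(auth_tools):
        return "tool_list_reduced"
    return "ok"
```

Run on a schedule against every listing that carries a demo token, any result other than "ok" is a prompt to contact the author or flag the listing.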

The scale stack in the field

The Q3 audit was the first end-to-end exercise of the full production architecture under real registry-scale load — the multi-tenant probe collector, the shared-state archiver, the per-tenant alert router, and the operator dashboard all running together. Three observations from the audit window.

The cross-tenant suppression rule fired three times

The alert router's cross-tenant suppression rule collapses a registry-wide outage into one consolidated notice per event when more than 10% of monitored tenants would be paged for the same upstream root cause within the same probe minute. During the Q3 audit window, the rule fired three times:

July 15, 14:23 UTC. A Render.com deployment cluster outage took 47 servers offline simultaneously. 23 of those servers have registered authors on AliveMCP. The suppression rule absorbed what would have been 23 individual paging events into one consolidated notice: "23 registered servers down — Render.com deployment cluster outage suspected, same error class, same ASN, same probe-minute onset." The individual per-author on-call notifications were suppressed; the consolidated notice went to the audit-team operator channel.
July 17, 09:41 UTC. A Railway.app credit-cap cascade hit 31 servers over a three-hour window. 17 had registered authors. Because the cascade was staggered over three hours rather than simultaneous, the suppression rule fired on the first wave of 12 servers and then re-evaluated for the remaining 5, producing two consolidated notices in total rather than 17 individual pages.
July 19, 07:15 UTC. A CDN edge failure on an ap-southeast node caused regional degradation for 23 servers in the audit window. 8 had registered authors. The suppression rule fired, absorbed 8 individual paging events into one consolidated "regionally degraded — ap-southeast" notice, and correctly classified the event as infrastructure-layer rather than server-layer failure. No individual author pages were sent.

Total individual paging events absorbed by the suppression rule during the audit: 48 (23 + 17 + 8), across incidents that affected 101 servers. Total consolidated notices sent: 4. The per-tenant alert budgets (Author tier: 5 alerts/server/hour; Team tier: 20/server/hour) were not breached by any registered tenant during the audit window. The audit load (probing 2,414 servers across five regions in three 24-hour-apart waves) did not trigger any per-tenant budget events because the audit's endpoints are not live tenants; they are the "audit tenant" in the collector's tenant manifest.
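The suppression rule described in this section can be sketched as a grouping step in front of the pager. This is an illustration only: the `Page` type, the `root_cause` key, and the tenant count in the usage note are hypothetical, and the real router presumably derives the root cause from signals such as error class and ASN.

```python
from collections import defaultdict
from typing import List, NamedTuple, Tuple


class Page(NamedTuple):
    tenant: str
    probe_minute: str   # e.g. "2026-07-15T14:23"
    root_cause: str     # hypothetical grouping key for the upstream cause


def route(pages: List[Page], monitored_tenants: int,
          threshold: float = 0.10) -> Tuple[list, List[Page]]:
    """Split pending pages into consolidated notices and individual pages."""
    tenants_by_group = defaultdict(set)
    for p in pages:
        tenants_by_group[(p.probe_minute, p.root_cause)].add(p.tenant)

    # Suppress a group when more than `threshold` of all monitored tenants
    # would be paged for the same cause in the same probe minute.
    suppressed = {key for key, tenants in tenants_by_group.items()
                  if len(tenants) > threshold * monitored_tenants}

    notices = [(minute, cause, len(tenants_by_group[(minute, cause)]))
               for (minute, cause) in sorted(suppressed)]
    individual = [p for p in pages
                  if (p.probe_minute, p.root_cause) not in suppressed]
    return notices, individual
```

With a hypothetical 200 monitored tenants, 23 pages sharing one probe minute and one root cause exceed the 10% threshold and collapse into a single notice, while an unrelated page in the same minute still goes out individually.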

Collector SIGKILL behaviour

The supervisor's 50-second wall-clock cap generated 142 SIGKILL events across the full audit run (2,414 endpoints × 5 regions × 3 rounds = 36,210 probe jobs; 142 SIGKILLs is a 0.39% kill rate). Every SIGKILL was logged with CPU, memory, and stdout/stderr byte-count as expected. No zombie processes were observed. The queue-depth alert did not fire once during the audit window — calibrating the alert to percentile-and-rate-of-change rather than an absolute threshold, as documented in the collector companion post, held the alert silent through the load spike as designed.
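For readers who want to reproduce the wall-clock cap, here is a minimal sketch using only the Python standard library. It is not the collector's supervisor: the function and type names are hypothetical, and it records elapsed time and output byte counts but leaves out the CPU and memory accounting mentioned above.

```python
import subprocess
import sys
import time
from typing import List, NamedTuple


class JobResult(NamedTuple):
    killed: bool
    elapsed: float
    stdout_bytes: int
    stderr_bytes: int


def run_with_cap(cmd: List[str], cap_seconds: float = 50.0) -> JobResult:
    """Run one probe job and kill it if it exceeds the wall-clock cap."""
    start = time.monotonic()
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    killed = False
    try:
        out, err = proc.communicate(timeout=cap_seconds)
    except subprocess.TimeoutExpired:
        proc.kill()                    # sends SIGKILL on POSIX systems
        out, err = proc.communicate()  # reap the child so no zombie is left
        killed = True
    return JobResult(killed, time.monotonic() - start, len(out), len(err))


if __name__ == "__main__":
    # Demo: a job that sleeps past a deliberately short cap gets killed.
    sleeper = [sys.executable, "-c", "import time; time.sleep(2)"]
    print(run_with_cap(sleeper, cap_seconds=0.2))
```

Calling `communicate()` again after `kill()` is what guarantees the child is reaped, which matches the "no zombie processes" observation.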

Archiver watermark health

The archiver's 180-second watermark SLO was met on 211 of 213 daily health checks during the audit window. Two checks fell outside the SLO window — both on July 15, the same day as the Render.com cluster event, where the simultaneous 47-endpoint failure generated a burst of verdict-minute writes that temporarily backed up the archiver's drain queue. Both resolved within 4 minutes without manual intervention; the idempotent ingestion pipeline absorbed the burst cleanly, and the watermark recovered to within-SLO on the next check cycle.

What the Q3 numbers mean if you run an MCP server

Three updates from Q2's advice, based on what Q3 actually found:

If you have a demo token published in a registry listing, check it now. 31 servers in this audit have published credentials that are broken — most because the token was rotated and the listing wasn't updated. The fix is a five-minute listing update; the cost of not fixing it is that every user who tries your advertised demo token gets a 401 and concludes your server is broken. That is not a monitoring problem; it is a listing-hygiene problem. Check your listing on every registry where you appear, confirm the demo token in the listing still works, and update it if it doesn't.
If your server is hosted on a single US-East instance with no CDN, you are likely in the regional degradation pool. 88 of the 375 "reachable" endpoints fail from at least one probe region, and 69 of those 88 fail from Singapore or São Paulo. If your server is a $5-per-month VPS in us-east-1 with no CDN, your ap-southeast and sa-east users are probably experiencing the same failure the Singapore and São Paulo probes see. The cheapest fix is a CDN in front of the origin; Cloudflare's free tier covers geographic routing adequately for most MCP servers. For the multi-region deployment setup, see multi-region MCP probe deployment.
If you are in the "HTTP alive, MCP dead" bucket, no amount of uptime monitoring will tell you. 649 servers (26.9%) answered HTTP and failed at the JSON-RPC layer. An HTTP-only monitor marks those as green. If you have not run a real MCP probe against your own server — a JSON-RPC initialize followed by tools/list — you do not know which bucket you are in. The probe takes 30 seconds from a terminal. See check if an MCP server is alive for the exact command.

What the Q3 numbers mean if you depend on MCPs

If you are an agent platform, a team building on third-party MCPs, or a registry operator, three observations:

The blended healthy rate (11.9%) is still low enough to make live-health gating non-optional. Pulling registry feeds and showing listings to users without a health filter means roughly 88% of what you surface is broken. The improvement from Q2 (9.0%) to Q3 (11.9%) is real and meaningful — it is not a rounding error — but it does not change the architecture decision. A live-health signal layered on top of registry feeds is not a luxury; it is the difference between "we show users working tools" and "we show users whatever is listed."
Registry choice matters more than it did in Q2. The health-rate gap between the Official Registry (21.5%) and GitHub topic feeds (4.8%) is 16.7 percentage points — wider than the Q2 gap of 12.8 points. The Official Registry is getting better faster than the rest of the ecosystem. For coverage-critical applications, mixing all six registries with a live-health filter is still the right call; for quality-critical applications, weighting the Official Registry and Smithery more heavily is a better default than equal-weight mixing.
Regional degradation is invisible in your existing tooling. 3.6% of all endpoints are passing a single-origin health check right now but failing from at least one region. If your agent platform routes requests geographically and you are only probing from one origin, you have a health blind spot for your APAC and LatAm users. AliveMCP monitors from all five regions for every registered server on the free tier — the per-region breakdown is in the /status/<slug> page for every registered endpoint.

The Q4 2026 audit

Q4 runs in October. Three things we will add to the measurement surface:

Quarter-over-quarter cohort tracking. Q3 introduced the per-registry breakdown with a Q2 comparison. Q4 will add a cohort analysis: of the 196 endpoints that were healthy in Q2, how many are still healthy in Q4? How many dropped into the dead buckets, and which dead bucket? The reverse cohort — how many Q2-dead endpoints recovered by Q4 — is equally interesting. This moves the report from a cross-sectional snapshot to a longitudinal survival analysis.
Schema drift frequency distribution. Q3 tells us that 13.6% of globally healthy servers had schema drift within 72 hours. Q4 will break that down by tool-change frequency — whether drift is concentrated in a few highly active servers or broadly distributed across the healthy cohort. Servers that update their tool list frequently are a qualitatively different risk class from servers that drift once in six months.
First-ever look at the six-month survival rate. The Q2 cohort will be six months old in October. For each Q2-dead endpoint, we will note whether the failure mode is the same in Q4 as it was in Q2 (persistent failure), whether the server recovered (transient failure), or whether the server is gone from the registry entirely (deactivated). The first-time-dead-to-permanent-dead trajectory is the most important leading indicator for registry curation strategy.
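The cohort questions above reduce to counting bucket-to-bucket transitions between two audits. A minimal sketch, assuming each audit is available as a mapping from endpoint URL to bucket name (the function name, the bucket labels, and the example URLs are hypothetical):

```python
from collections import Counter
from typing import Dict


def cohort_transitions(earlier: Dict[str, str], later: Dict[str, str]) -> Counter:
    """Count (earlier bucket, later bucket) pairs for every earlier endpoint.

    Endpoints missing from the later audit are counted as "deactivated".
    """
    return Counter(
        (bucket, later.get(url, "deactivated"))
        for url, bucket in earlier.items()
    )


q2 = {"https://a.example": "healthy", "https://b.example": "dns_dead"}
q4 = {"https://a.example": "dns_dead"}
transitions = cohort_transitions(q2, q4)
# transitions[("healthy", "dns_dead")] counts healthy endpoints that died;
# transitions[("dns_dead", "deactivated")] counts dead endpoints since delisted.
```

Summing the counter over a fixed earlier bucket gives that cohort's size, so survival rates fall out as simple ratios.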

Join the waitlist to get the Q4 numbers the morning they publish.

Run the probe against your own server

Same command as Q2. Substitute your endpoint URL:

curl -sS -X POST "https://your-mcp.example.com/mcp" \
  -H "content-type: application/json" \
  -d '{"jsonrpc":"2.0","id":1,"method":"initialize","params":{
    "protocolVersion":"2026-03-01",
    "capabilities":{},
    "clientInfo":{"name":"alivemcp-probe","version":"1.0"}
  }}' | jq .

You want a result block with protocolVersion, serverInfo (name and version), and a capabilities object. For the full diagnostic walk-through including the tools/list follow-up and schema hashing, see check if an MCP server is alive. For regional degradation diagnosis — running the same probe from a second geography — see multi-region MCP probe deployment.

AliveMCP monitors every registered public MCP server from all five regions automatically, every 60 seconds, with schema-drift hashing — the same probe the Q3 audit used. The free tier covers every server in the public registries. If your server is registered and you have not claimed it, join the waitlist and we will walk you through the claiming flow when we open the dashboard.