GraphQL Integration · 2026-07-02 · MCP GraphQL Integration arc

MCP Servers and GraphQL: From Raw Queries to Apollo Client, Hasura, and Real-Time Subscriptions

GraphQL's typed schema is a natural source of MCP tool definitions: every query argument maps to a tool input parameter, every operation's description becomes a tool description, and introspection gives you the full surface in machine-readable form. On paper, wrapping a GraphQL API as MCP tools looks like a mechanical translation problem. In practice, three things break the naive 1:1 mapping — and all three have the same root cause. This guide synthesizes the full GraphQL + MCP integration surface: why the pairing works so naturally, what breaks it, and how to build a production-grade integration across raw queries, Apollo Client, Hasura, real-time subscriptions, and the two-layer monitoring stack that keeps everything healthy.

TL;DR

Map one GraphQL operation to one MCP tool. Flatten nested input objects to top-level parameters. Return isError: true for GraphQL errors[] responses, because HTTP 200 does not mean success. Cap list responses (default 10 items, maximum 50). For subscriptions, use the poll-and-snapshot pattern (server-managed subscription → in-memory or Redis cache → MCP tools read on demand). For Apollo Client, choose shared-client vs per-session based on whether data is user-specific. For Hasura, expose 10–20 curated tools, not the raw 40–60 auto-generated operations. Monitor the MCP protocol layer with AliveMCP and the GraphQL backend with tool-level error rate metrics; neither alone is sufficient.

Why GraphQL and MCP are a natural pairing

GraphQL was designed around the insight that clients should declare exactly what data they need rather than receiving a fixed server-determined shape. MCP tool definitions express the same idea from the other direction: the tool declares its inputs, its outputs, and what it does, so language models can reason about whether to call it and how to interpret the result.

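The structural alignment is strong: query arguments map to tool input parameters, operation descriptions become tool descriptions, and introspection exposes the full surface in machine-readable form.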

These properties are real advantages. An MCP server wrapping a well-designed GraphQL API can expose a rich, typed tool surface with less hand-coding than wrapping a REST API of equivalent complexity. The friction point is not the happy path — it is three specific failure modes in the translation layer.

The three things that break naive 1:1 mapping

If you take a GraphQL API and mechanically generate one MCP tool per operation, you will end up with a working integration that produces unreliable results in production. The three failure modes are not edge cases — they affect almost every GraphQL-backed MCP server that doesn't explicitly address them.

1. The error model: HTTP 200 ≠ success

GraphQL has a unique error convention: a request that contains errors returns HTTP 200 with a JSON body that includes both a data field (potentially null or partial) and an errors array. An authorization failure, a validation error, a resolver exception — all of these can come back as HTTP 200 responses.

This breaks every HTTP-level monitor. Your uptime checker, your load balancer, your basic curl-based health script — they all see HTTP 200 and conclude the server is healthy. The GraphQL errors are invisible to them. More critically for MCP: if your tool handler does not explicitly check for errors[] in the GraphQL response and map them to isError: true, the LLM receives a nominally successful tool response that contains an error message embedded in the text. The LLM interprets isError: false as confirmation that the operation succeeded — regardless of the text content. This is the most common silent bug in GraphQL-backed MCP servers.

The fix is explicit: catch ClientError from graphql-request (which throws when the response contains errors[]), extract the error messages, and return them as isError: true MCP responses. Do the same for null results from queries where a non-null return was expected. See MCP server with GraphQL for the complete error handling pattern with code.
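As a minimal sketch of that mapping, the function below works on an already-parsed GraphQL response body rather than on a specific client library. The names GqlResponseBody, McpToolResult, and toMcpToolResult are illustrative assumptions, as is the select callback used to detect a null result where a record was expected.

```typescript
// Illustrative shapes; these names are assumptions, not part of any SDK.
interface GqlResponseBody<T> {
  data?: T | null;
  errors?: { message: string }[];
}

interface McpToolResult {
  content: { type: "text"; text: string }[];
  isError?: boolean;
}

// Converts a GraphQL response body into an MCP tool result.
// `select` picks the record the tool expects to be non-null.
function toMcpToolResult<T, R>(
  body: GqlResponseBody<T>,
  select: (data: T) => R | null | undefined,
  notFoundMessage: string,
): McpToolResult {
  // errors[] can arrive with HTTP 200, so check it before anything else.
  if (body.errors && body.errors.length > 0) {
    const messages = body.errors.map((e) => e.message).join("; ");
    return { content: [{ type: "text", text: "GraphQL error: " + messages }], isError: true };
  }
  // A null result where a record was expected is also a failure.
  const selected = body.data == null ? null : select(body.data);
  if (selected == null) {
    return { content: [{ type: "text", text: notFoundMessage }], isError: true };
  }
  return { content: [{ type: "text", text: JSON.stringify(selected) }] };
}
```

The design choice here is that the failure signal lives in isError, never only in the text, so the model cannot mistake an embedded error string for a successful result.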

2. The payload problem: nested schemas blow context budgets

GraphQL makes it trivial to return deeply nested, richly populated responses. A query for a single order can include the order's line items, each line item's product, the product's category, the shipping address with all its fields, tracking information, and audit timestamps — all in a single response with 50+ fields across nested objects. In a browser or mobile client, this is a feature. In an MCP context, it is a context budget problem.

Every byte of a tool's response goes into the LLM's context window. A 50KB nested JSON response does two things that hurt: it consumes context that the model needs for reasoning and future tool call results, and it forces the model to parse and hold a complex nested structure in its reasoning chain. Empirically, tool responses that exceed roughly 2,000 tokens (~8,000 characters of JSON) start degrading model performance on multi-tool reasoning tasks.

The fix is a combination of tight selection sets (return only the fields the LLM can act on), capped list sizes (default 10, max 50 results), and the summary+detail pattern (a fast list tool returns ID + key identifiers, a separate detail tool returns the full record when needed). This is not about being stingy with data — it is about respecting the LLM's context budget so it can reason well across an entire session.

3. The event model: subscriptions and stateless tools are fundamentally different

GraphQL subscriptions are stateful, long-lived WebSocket connections that push events to the client. MCP tools are stateless request-response operations invoked one at a time. These models do not compose naturally. If you open a WebSocket subscription per tool call, you pay connection setup overhead every time and immediately close it — losing the entire reason to use subscriptions. If you open it server-wide on startup, you need infrastructure to buffer events and serve them to concurrent MCP tool calls that arrive asynchronously.

The fix is the poll-and-snapshot pattern, described in detail in the subscriptions section below: your MCP server maintains subscriptions server-side, buffers incoming events in memory or Redis, and exposes MCP tools that read the latest snapshot on demand. The subscription is decoupled from the tool call cycle.

The unifying insight across all three failure modes: treat the MCP server as a curated translation layer, not a transparent proxy. A transparent proxy would expose the raw GraphQL surface to the LLM, complete with HTTP 200 errors, context-busting payloads, and streaming events that don't fit the request-response model. A curated translation layer intercepts each failure mode and converts it into something the LLM can work with reliably.

Query-to-tool mapping: the fundamentals

The mapping rules that make a GraphQL-backed MCP tool work reliably in practice:

One operation per tool

Map each GraphQL query or mutation to exactly one MCP tool. query GetUser($id: ID!) becomes get_user. mutation CreateOrder(...) becomes create_order. Avoid combining multiple GraphQL operations into one tool (the tool becomes hard for the LLM to choose correctly) or splitting one operation across multiple tools (the LLM has to orchestrate a setup step before the main call).

Flatten nested input objects

GraphQL mutations commonly use complex input types: createOrder(input: OrderInput!) where OrderInput has five fields. At the MCP layer, flatten these to top-level parameters. LLMs construct tool call arguments more reliably when all parameters sit at the top level rather than inside a nested object.

// Assumes `server` (an McpServer), `z` (zod), `client` (a graphql-request GraphQLClient),
// and CREATE_ORDER_MUTATION are defined elsewhere. The two registrations are alternatives;
// register only one of them under the name "create_order".

// BAD: nested input object in MCP tool
server.tool("create_order", "...", {
  input: z.object({ productId: z.string(), quantity: z.number(), addressId: z.string() }),
}, async ({ input }) => { /* reconstructs from nested object */ });

// GOOD: flattened top-level parameters
server.tool("create_order", "...", {
  productId: z.string().describe("Product ID from list_products"),
  quantity: z.number().int().min(1).max(100),
  addressId: z.string().describe("Shipping address ID from list_addresses"),
}, async ({ productId, quantity, addressId }) => {
  // Reconstruct OrderInput internally for the GraphQL call
  const data = await client.request(CREATE_ORDER_MUTATION, {
    input: { productId, quantity, addressId }
  });
  /* ... */
});

Use the summary + detail pattern for lists

A list_orders tool that returns order ID, status, and creation date is fast and cheap. A get_order_detail tool that returns the full record (line items, shipping address, tracking) is slower and larger. Separate them. The LLM calls the list tool to scan what exists, then calls the detail tool only for the record it actually needs. This is the primary mechanism for keeping tool responses within context budget.
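A minimal sketch of the list side of this pattern, assuming a hypothetical OrderRecord shape. It clamps the requested limit to the default of 10 and maximum of 50 used in this guide, and projects each record down to the three summary fields. In a real handler you would also pass the clamped limit into the GraphQL query so the backend never returns the extra rows.

```typescript
// Hypothetical record shape; field names are assumptions for illustration.
interface OrderRecord {
  id: string;
  status: string;
  createdAt: string;
  lineItems: unknown[];
  shippingAddress: unknown;
}

interface OrderSummary {
  id: string;
  status: string;
  createdAt: string;
}

const DEFAULT_LIST_LIMIT = 10;
const MAX_LIST_LIMIT = 50;

// Clamp whatever the model asks for into the range [1, MAX_LIST_LIMIT].
function clampListLimit(requested?: number): number {
  if (requested === undefined || !Number.isFinite(requested)) return DEFAULT_LIST_LIMIT;
  return Math.min(MAX_LIST_LIMIT, Math.max(1, Math.floor(requested)));
}

// Summary view for the list tool: ID plus the key identifiers only.
function summarizeOrders(orders: OrderRecord[], requestedLimit?: number): OrderSummary[] {
  return orders
    .slice(0, clampListLimit(requestedLimit))
    .map(({ id, status, createdAt }) => ({ id, status, createdAt }));
}
```

The detail tool then takes a single id from this list and returns the full record, so the large payload is paid for only when the model actually needs it.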

Explicit error mapping

Every tool handler should have a try/catch that converts GraphQL errors to isError: true responses. Three categories require explicit handling: errors[] in the GraphQL response (catch ClientError in graphql-request), null data where non-null was expected (check before returning success), and HTTP-level errors like 429 or 500 (include a retry hint in the error text when the API provides one). Full error handling code is in the foundational guide.
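For the HTTP-level category, a sketch of what the catch branch might do is shown below. The UpstreamHttpFailure shape (status, message, optional retryAfterSeconds) is an assumption about what your HTTP or GraphQL client attaches to thrown errors; adapt the type guard to whatever your client actually throws.

```typescript
interface McpToolResult {
  content: { type: "text"; text: string }[];
  isError?: boolean;
}

// Assumed shape of a thrown HTTP-level failure; not taken from a specific library.
interface UpstreamHttpFailure {
  status: number;
  message: string;
  retryAfterSeconds?: number;
}

function isUpstreamHttpFailure(err: unknown): err is UpstreamHttpFailure {
  if (typeof err !== "object" || err === null) return false;
  const candidate = err as { status?: unknown; message?: unknown };
  return typeof candidate.status === "number" && typeof candidate.message === "string";
}

// Call this from the catch block of a tool handler.
function mapThrownError(err: unknown): McpToolResult {
  let text: string;
  if (isUpstreamHttpFailure(err)) {
    text = "Upstream HTTP " + err.status + ": " + err.message + ".";
    // Pass the retry hint through so the model can decide whether to try again.
    if (typeof err.retryAfterSeconds === "number") {
      text += " Retry after " + err.retryAfterSeconds + " seconds.";
    }
  } else if (err instanceof Error) {
    text = err.message;
  } else {
    text = String(err);
  }
  return { content: [{ type: "text", text }], isError: true };
}
```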

Apollo Client in MCP servers: the key decisions

Apollo Client is the most capable GraphQL client for Node.js: normalized caching, in-flight deduplication, link-based middleware, and subscription support. When you're building a multi-session MCP server where the LLM may request the same data multiple times within one reasoning chain, the InMemoryCache is a genuine win. But Apollo was designed for a single browser instance with one user — its assumptions need adaptation for multi-session server context. See MCP server with Apollo Client for the full setup including Node.js configuration, cross-fetch, and error links.

Shared client vs per-session client

This is the most important architectural decision when using Apollo in an MCP server. Get it wrong and you either get cache isolation failures (user A's data leaks to user B) or unnecessary memory overhead.

| Pattern | When to use | Risk |
| --- | --- | --- |
| Single shared client | Tools only query public or tenant-agnostic data | Cache pollution: same cache key for different users |
| Per-session client | Tools return user-specific results (orders, profile, account) | Memory leak if sessions aren't cleaned up on close |
| No cache (network-only everywhere) | Data changes frequently; cache adds confusion | Higher upstream query volume, no deduplication benefit |

For per-session clients, always wire cleanup to the MCP session close event: call client.stop() and delete the session entry from your Map. Orphaned Apollo clients accumulate connections and memory. If your MCP server framework doesn't expose a session close hook, implement TTL-based eviction: evict any client that hasn't been used in 30 minutes.
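One way to sketch that lifecycle is a small registry keyed by session ID. It is generic over anything with a stop() method, which is the only part of the client it relies on. SessionClientRegistry and its method names are hypothetical; the timestamps are injectable so the eviction logic can be tested without waiting.

```typescript
interface Stoppable {
  stop(): void;
}

class SessionClientRegistry<C extends Stoppable> {
  private readonly entries = new Map<string, { client: C; lastUsed: number }>();
  private readonly createClient: (sessionId: string) => C;
  private readonly ttlMs: number;

  constructor(createClient: (sessionId: string) => C, ttlMs: number = 30 * 60 * 1000) {
    this.createClient = createClient;
    this.ttlMs = ttlMs;
  }

  // Returns the session's client, creating it on first use, and refreshes its timestamp.
  get(sessionId: string, now: number = Date.now()): C {
    let entry = this.entries.get(sessionId);
    if (!entry) {
      entry = { client: this.createClient(sessionId), lastUsed: now };
      this.entries.set(sessionId, entry);
    }
    entry.lastUsed = now;
    return entry.client;
  }

  // Call from the session close hook: stop the client and drop the entry.
  release(sessionId: string): void {
    const entry = this.entries.get(sessionId);
    if (!entry) return;
    entry.client.stop();
    this.entries.delete(sessionId);
  }

  // Fallback when no close hook exists: evict clients idle longer than the TTL.
  evictIdle(now: number = Date.now()): number {
    let evicted = 0;
    for (const [sessionId, entry] of Array.from(this.entries)) {
      if (now - entry.lastUsed > this.ttlMs) {
        this.release(sessionId);
        evicted += 1;
      }
    }
    return evicted;
  }

  activeCount(): number {
    return this.entries.size;
  }
}
```

In a running server you would call evictIdle() from a timer (for example once a minute) and release() from whatever session close hook your framework provides.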

InMemoryCache in MCP context

Apollo's cache normalizes results by __typename + id. If the LLM calls get_user("123") three times in one session, the second and third calls return instantly from cache — a genuine benefit when the LLM re-reads a record during multi-step reasoning. The risk is mutation staleness: if a mutation tool changes User:123, the cache still holds the old version. Subsequent reads return pre-mutation state unless you explicitly call cache.evict() and cache.gc() in the mutation's update function.

One default to change from the browser config: set fetchPolicy: "network-only" as the default for queries. In a browser, returning cached results fast is usually right. In MCP context, the LLM expects current data every time it calls a tool — stale cache hits produce inconsistent reasoning.

ApolloError mapping to MCP error responses

Apollo Client throws ApolloError on query failures. An ApolloError has two error sources that require separate handling: graphQLErrors (resolver failures, validation errors, permission denials — returned inside HTTP 200) and networkError (server unreachable, TLS failure, timeout). Map both to isError: true with a descriptive message. The rule: never return a success response with an error message in the text. The LLM reads isError: false as confirmation of success, regardless of the text.
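A sketch of that two-source mapping follows. To stay self-contained it uses a structural ApolloErrorLike type that mirrors only the graphQLErrors and networkError properties named above, instead of importing from Apollo; the function name and message format are assumptions.

```typescript
interface McpToolResult {
  content: { type: "text"; text: string }[];
  isError?: boolean;
}

// Structural stand-in for the parts of ApolloError this mapping reads.
interface ApolloErrorLike {
  message: string;
  graphQLErrors?: ReadonlyArray<{ message: string }>;
  networkError?: { message: string } | null;
}

function apolloErrorToToolResult(err: ApolloErrorLike): McpToolResult {
  const parts: string[] = [];
  const gqlErrors = err.graphQLErrors ?? [];
  // Resolver, validation, and permission failures (delivered inside HTTP 200).
  if (gqlErrors.length > 0) {
    parts.push("GraphQL errors: " + gqlErrors.map((e) => e.message).join("; "));
  }
  // Transport failures: unreachable server, TLS failure, timeout.
  if (err.networkError) {
    parts.push("Network error: " + err.networkError.message);
  }
  // Fall back to the top-level message if neither source is populated.
  if (parts.length === 0) parts.push(err.message);
  return { content: [{ type: "text", text: parts.join(" | ") }], isError: true };
}
```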

Hasura as MCP backend: controlling the auto-generated surface

Hasura is a GraphQL API generator: point it at a PostgreSQL schema and it instantly generates queries, mutations, subscriptions, and aggregations for every table. For three tables, that is roughly 40–60 operations. For ten tables, it can exceed 150. Exposing the raw Hasura schema to an LLM as MCP tools — one tool per operation — creates an overwhelming tool selection problem that produces inconsistent results. See Hasura MCP server for the complete integration guide; here is the decision framework.

Pattern 1: Curated wrapper (recommended)

Write 10–20 hand-crafted MCP tools that each wrap a specific Hasura operation for a specific use case. The LLM calls search_users with a simple query string parameter — not a users_bool_exp object with _or, _ilike, and other Hasura-specific filter operators. The tool handler constructs the Hasura query internally. This is the right pattern for most Hasura + MCP integrations.
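To make the search_users example concrete, here is a sketch of the internal translation from a plain query string to a Hasura filter. The users table and its name and email columns are hypothetical; the _or and _ilike operators are the Hasura filter operators mentioned above.

```typescript
interface IlikeFilter {
  _ilike: string;
}

// Subset of a users_bool_exp; `name` and `email` are assumed column names.
interface UsersBoolExp {
  _or: Array<{ name?: IlikeFilter; email?: IlikeFilter }>;
}

// Hypothetical query the tool handler would send with the generated filter.
const SEARCH_USERS_QUERY =
  "query SearchUsers($where: users_bool_exp!, $limit: Int!) { users(where: $where, limit: $limit) { id name email } }";

function buildUserSearchWhere(query: string): UsersBoolExp {
  const trimmed = query.trim();
  if (trimmed === "") {
    // An empty pattern would match every row, so reject it up front.
    throw new Error("query must not be empty");
  }
  // Escape LIKE wildcards so user text is matched literally.
  const escaped = trimmed.replace(/[\\%_]/g, (ch) => "\\" + ch);
  const pattern = "%" + escaped + "%";
  return { _or: [{ name: { _ilike: pattern } }, { email: { _ilike: pattern } }] };
}
```

The LLM only ever sees a single string parameter; the Hasura-specific operator vocabulary stays inside the handler.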

Pattern 2: Introspection-based generation with an allowlist

Use Hasura's GraphQL introspection to enumerate the schema at startup and auto-generate MCP tools only for operations in an explicit allowlist. Skip *_aggregate, *_stream, and *_by_pk where you already have curated equivalents. This works well for data teams who need to expose many tables without hand-writing every tool, but requires curation of the allowlist and manual tool description improvement — auto-generated descriptions from Hasura docstrings are developer-targeted, not LLM-targeted.
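The filtering step can be sketched as a pure function over the operation names returned by introspection. The function name and the curated set (operations that already have a hand-written tool) are assumptions made for illustration.

```typescript
// Suffixes of Hasura-generated operations that are skipped when a curated tool covers them.
const SKIPPABLE_SUFFIXES = ["_aggregate", "_stream", "_by_pk"];

function selectGeneratedOperations(
  introspected: string[],
  allowlist: ReadonlySet<string>,
  curated: ReadonlySet<string>,
): string[] {
  return introspected.filter((op) => {
    // Nothing is exposed unless it is explicitly allowlisted.
    if (!allowlist.has(op)) return false;
    // Skip aggregate/stream/by_pk variants that a curated tool already covers.
    const skippable = SKIPPABLE_SUFFIXES.some((suffix) => op.endsWith(suffix));
    return !(skippable && curated.has(op));
  });
}
```

Because the allowlist is checked first, a new table added by a migration never becomes a tool by accident; someone has to add its operations to the list.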

Pattern 3: Hasura Actions as MCP tools

Hasura Actions are custom business logic endpoints (REST handlers or serverless functions) that Hasura wraps as GraphQL mutations. They represent intentional, business-level operations rather than raw CRUD — exactly the kind of tool surface that works well for LLMs. charge_subscription, send_order_confirmation, calculate_shipping_estimate are all good candidates for direct MCP tool exposure with minimal curation.

Hasura permissions as MCP authorization

Hasura's role-based permission system uses session variables — x-hasura-role and x-hasura-user-id — passed in request headers to scope every query to that user's rows. If an MCP session has a user auth token, extract the user ID and role, pass them as Hasura session variables in the GraphQL client's headers, and Hasura enforces row-level access control at the database level on every query. You don't need to re-implement access control in each tool handler — Hasura does it for you, consistently, across every operation that reaches the database.
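A small sketch of the header construction, assuming the user ID and role have already been extracted from the session's auth token. SessionIdentity is a hypothetical type. How Hasura is configured to trust these values (admin secret, JWT, or auth webhook) depends on your deployment and is not shown here.

```typescript
// Hypothetical identity extracted from the MCP session's auth token.
interface SessionIdentity {
  userId: string;
  role: string;
}

function hasuraSessionHeaders(identity: SessionIdentity): Record<string, string> {
  // Fail closed: never send a request with a blank user ID or role.
  if (identity.userId.trim() === "" || identity.role.trim() === "") {
    throw new Error("Refusing to build Hasura headers without a user ID and role");
  }
  return {
    "x-hasura-role": identity.role,
    "x-hasura-user-id": identity.userId,
  };
}
```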

Schema migrations and tool surface drift

Hasura regenerates its GraphQL schema automatically when database migrations run. A column rename (total_cents → total_amount_cents) breaks every MCP tool handler that selects that column in its query. Catch these breakages before production: run MCP tool integration tests as part of your migration pipeline, keep GraphQL queries in named files (not inline gql template literals) so PR diffs surface affected queries, and use hasura migrate status and hasura metadata diff to review schema changes before applying them.

GraphQL subscriptions in MCP: three patterns

GraphQL subscriptions deliver a stream of events over a persistent WebSocket connection. MCP tools are request-response operations invoked one at a time. These models have a fundamental mismatch, and every real-time data integration must confront it. See MCP server GraphQL subscriptions for the full implementation guide including code for all three patterns. Here is the decision framework.

Pattern 1: Poll-and-snapshot (recommended for most cases)

Your MCP server maintains subscriptions server-side using graphql-ws with retryAttempts: Infinity. Incoming events write to an in-memory store (a Map keyed by topic). MCP tools read from the store on demand and return the latest snapshot. The subscription is server-managed — it starts when the process starts, reconnects automatically on disconnect, and is completely decoupled from individual tool calls.

This pattern is simple, robust, and sufficient for most use cases. Its limitation: in-memory state is local to one process instance. If you run multiple MCP server instances for horizontal scaling, each instance has its own event store — a client connected to instance A does not see events buffered by instance B.
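The in-memory store at the center of this pattern can be sketched in a few lines. SnapshotStore is a hypothetical name; the subscription wiring is shown only as a comment because it depends on your graphql-ws setup, and ORDER_EVENTS there is a placeholder.

```typescript
interface TopicSnapshot<T> {
  value: T;
  ageMs: number;
}

class SnapshotStore<T> {
  private readonly latest = new Map<string, { value: T; receivedAt: number }>();

  // Called from the subscription's event callback; keeps only the newest value per topic.
  record(topic: string, value: T, now: number = Date.now()): void {
    this.latest.set(topic, { value, receivedAt: now });
  }

  // Called from an MCP tool handler; undefined means no event has arrived yet.
  read(topic: string, now: number = Date.now()): TopicSnapshot<T> | undefined {
    const entry = this.latest.get(topic);
    return entry ? { value: entry.value, ageMs: now - entry.receivedAt } : undefined;
  }
}

// Hypothetical wiring with graphql-ws (not executed here):
//   const wsClient = createClient({ url, retryAttempts: Infinity });
//   wsClient.subscribe({ query: ORDER_EVENTS }, {
//     next: (msg) => store.record("orders", msg.data),
//     error: (err) => console.error(err),
//     complete: () => {},
//   });
```

Returning ageMs alongside the value lets the tool tell the model how fresh the snapshot is, which matters when the subscription has been quiet for a while.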

Pattern 2: Session-scoped subscription management tools

When subscription data is user-specific — watching one user's notifications, tracking a specific order's status — expose three MCP tools: start_watching_order, check_order_updates, and stop_watching_order. The LLM calls start, polls check on a schedule, and calls stop when done. The MCP server manages per-session subscription state keyed by session ID.

Critical requirement: wire cleanup to the MCP session close event (written here as server.onSessionClose(); the exact hook name depends on your SDK or framework). If you don't unsubscribe and delete the session state when the session ends, you accumulate orphaned WebSocket connections that leak file descriptors. With high session churn this becomes a serious operational problem within hours.
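The per-session bookkeeping behind the three tools might look like the sketch below. SessionWatches and SubscribeFn are hypothetical names; the subscribe argument stands in for whatever opens the real GraphQL subscription and returns an unsubscribe function. The pending buffer is unbounded here, so a production version would cap it.

```typescript
type Unsubscribe = () => void;
type SubscribeFn<E> = (onEvent: (update: E) => void) => Unsubscribe;

interface Watch<E> {
  pending: E[];
  unsubscribe: Unsubscribe;
}

class SessionWatches<E> {
  private readonly sessions = new Map<string, Map<string, Watch<E>>>();

  // Backs the start_watching_* tool.
  startWatching(sessionId: string, key: string, subscribe: SubscribeFn<E>): void {
    let watches = this.sessions.get(sessionId);
    if (!watches) {
      watches = new Map<string, Watch<E>>();
      this.sessions.set(sessionId, watches);
    }
    if (watches.has(key)) return; // already watching this key
    const pending: E[] = [];
    const unsubscribe = subscribe((update) => { pending.push(update); });
    watches.set(key, { pending, unsubscribe });
  }

  // Backs the check_*_updates tool: returns and clears buffered updates.
  checkUpdates(sessionId: string, key: string): E[] {
    const watch = this.sessions.get(sessionId)?.get(key);
    return watch ? watch.pending.splice(0) : [];
  }

  // Backs the stop_watching_* tool.
  stopWatching(sessionId: string, key: string): void {
    const watches = this.sessions.get(sessionId);
    const watch = watches?.get(key);
    if (!watches || !watch) return;
    watch.unsubscribe();
    watches.delete(key);
  }

  // Call from the session close hook; returns how many subscriptions were closed.
  closeSession(sessionId: string): number {
    const watches = this.sessions.get(sessionId);
    if (!watches) return 0;
    let closed = 0;
    watches.forEach((watch) => { watch.unsubscribe(); closed += 1; });
    this.sessions.delete(sessionId);
    return closed;
  }
}
```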

Pattern 3: Redis Streams as the shared event bus

For multi-instance MCP servers, the subscription handler writes events to a Redis Stream (xAdd) with a length-based trim policy (Redis XTRIM with MAXLEN, keeping the last 1,000 entries). MCP tools read from the stream with xRevRange. All instances read from the same Redis Stream, so it doesn't matter which instance a client is connected to; every instance sees the same event history.

Redis Streams also survive MCP server restarts without losing event history. After a deployment, the server reconnects the WebSocket subscription and resumes writing to the stream; MCP tools immediately return the full recent history from Redis without a cold-start gap. See MCP server GraphQL schema design for how schema drift affects subscription tool handlers when the event payload shape changes.

When to skip subscriptions entirely

Before building subscription infrastructure, ask whether the use case genuinely requires real-time streaming. Most MCP tool interactions are discrete queries. If the event frequency is less than once per minute, if the upstream API has a since: DateTime query argument, or if your MCP server is stateless (no session close hook), a polling tool is simpler and more reliable. Reserve subscriptions for data that changes at high frequency — multiple times per minute — where polling latency would be noticeable in the user experience.

Schema design for MCP tool compatibility

If you control the GraphQL schema (rather than wrapping a third-party API), there are schema design choices that directly affect how well the auto-generated MCP tool surface works. See MCP server GraphQL schema design for the full reference; the most impactful decisions:

Flatten input types at the GraphQL level

An operation like createOrder(input: OrderInput!) forces you to flatten the nested type at the MCP wrapper layer. If you own the schema, consider defining the operation with scalar arguments instead: createOrder(productId: ID!, quantity: Int!, addressId: ID!). This makes the MCP tool mapping mechanical — no flattening required in the wrapper.

Lowercase enum values

GraphQL convention is ALL_CAPS enum values (FREE, AUTHOR, TEAM). LLMs produce them more reliably as lowercase (free, author, team). If you control the schema, use lowercase enum names. If you don't, convert at the MCP layer: accept lowercase inputs, uppercase them before the GraphQL call.
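When you don't own the schema, the conversion at the MCP layer can be as small as the sketch below. The allowed values reuse the FREE, AUTHOR, TEAM example; the function name is an assumption.

```typescript
// Lowercase values the MCP tool accepts, mirroring the example enum above.
const PLAN_INPUTS: readonly string[] = ["free", "author", "team"];

// Accepts a lowercase (or mixed-case) input and returns the ALL_CAPS GraphQL enum value.
function toGraphQLEnumValue(input: string, allowed: readonly string[] = PLAN_INPUTS): string {
  const normalized = input.trim().toLowerCase();
  if (allowed.indexOf(normalized) === -1) {
    // Listing the valid options gives the model what it needs to retry correctly.
    throw new Error('Invalid value "' + input + '". Expected one of: ' + allowed.join(", "));
  }
  return normalized.toUpperCase();
}
```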

Union types and polymorphism

A SearchResult union that returns User | Product | Order requires the LLM to interpret __typename and handle three different shapes. Avoid this at the MCP tool layer by either providing type-specific search tools (search_users, search_products) or normalizing the union to a common envelope: { type: "user", id: "...", name: "..." } with consistent outer structure regardless of the concrete type returned.
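The envelope approach can be sketched with a discriminated union. The title and orderNumber fields are hypothetical stand-ins for whatever each concrete type uses as its display name.

```typescript
// Hypothetical concrete shapes behind the SearchResult union.
type SearchHit =
  | { __typename: "User"; id: string; name: string }
  | { __typename: "Product"; id: string; title: string }
  | { __typename: "Order"; id: string; orderNumber: string };

// Common envelope: same outer structure regardless of the concrete type.
interface SearchEnvelope {
  type: "user" | "product" | "order";
  id: string;
  name: string;
}

function toSearchEnvelope(hit: SearchHit): SearchEnvelope {
  switch (hit.__typename) {
    case "User":
      return { type: "user", id: hit.id, name: hit.name };
    case "Product":
      return { type: "product", id: hit.id, name: hit.title };
    case "Order":
      return { type: "order", id: hit.id, name: "Order " + hit.orderNumber };
  }
}
```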

Schema drift monitoring

Any production GraphQL-backed MCP server should have a startup schema compatibility check: introspect the live API at startup and verify that every field your tool handlers select still exists in the schema. A field removal or argument type change that breaks your queries shows up as tool-level errors, not as MCP protocol failures — initialize and tools/list succeed even when every tool call fails. AliveMCP monitors the tools/list response on every probe and alerts when the tool surface hash changes unexpectedly — catching unintended tool changes from bad deploys before users encounter broken tool calls in their agent pipelines.
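The comparison step of a startup schema check can be sketched as follows. It assumes you have already run an introspection query and extracted the list of types with their field names; the required map (type name to the fields your handlers select) is something you would maintain alongside your queries. Argument type changes are not covered by this sketch.

```typescript
// Subset of an introspection result: fields is null for scalars and enums.
interface IntrospectedType {
  name: string;
  fields: { name: string }[] | null;
}

// Returns "Type.field" entries that the tool handlers need but the live schema lacks.
function findMissingFields(
  types: IntrospectedType[],
  required: Record<string, string[]>,
): string[] {
  const fieldsByType = new Map<string, Set<string>>();
  for (const t of types) {
    fieldsByType.set(t.name, new Set((t.fields ?? []).map((f) => f.name)));
  }
  const missing: string[] = [];
  for (const typeName of Object.keys(required)) {
    const available = fieldsByType.get(typeName);
    for (const field of required[typeName] ?? []) {
      if (!available || !available.has(field)) missing.push(typeName + "." + field);
    }
  }
  return missing;
}
```

At startup, a non-empty result is a reason to log the missing entries and exit, so the deploy fails fast instead of serving tools that will error on every call.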

Monitoring the two-layer stack

This is the section where most GraphQL-backed MCP servers have a blind spot. The MCP protocol layer and the GraphQL backend layer have different failure modes that require different monitoring strategies. Neither layer can see the other's failures.

| Layer | What can fail | What an HTTP monitor sees | What you need |
| --- | --- | --- | --- |
| MCP protocol | Server crash, initialize fails, TLS expired, tools/list empty, protocol regression after bad deploy | TCP refused or HTTP 5xx (catches only hard failures) | External protocol probe that runs the actual initialize + tools/list sequence |
| GraphQL backend | API key expired, schema drift, resolver errors, upstream service down, Hasura permission misconfiguration | HTTP 200: the MCP server responds normally because initialize and tools/list don't call GraphQL | Tool-level error rate monitoring from structured logs |

The GraphQL layer is the invisible one. Your MCP server's initialize and tools/list endpoints respond correctly even when every GraphQL query behind them is failing — they don't execute queries. External monitoring reports your server as healthy. Users experience errors only when the LLM actually calls a tool.

Monitoring this two-layer stack in production takes three monitoring layers (the third applies only if you use subscriptions):

Layer 1: External MCP protocol monitoring

AliveMCP sends a real initialize request to your server every 60 seconds and verifies that tools/list returns a non-empty array. This catches the MCP-layer failures: server crashes, TLS expiry, bad deploys that break the initialize handler, refactors that accidentally empty the tool list. If your server is listed on Smithery, Glama, PulseMCP, or other MCP registries, this external monitoring is what protects your listing from being flagged as unhealthy between weekly re-crawl cycles.

Layer 2: Tool-level error rate metrics

Emit a structured log event on every tool call: tool_name, graphql_errors (boolean), duration_ms, success. Alert when the GraphQL error rate for any tool exceeds 5% over a 10-minute window. This catches the GraphQL-layer failures that AliveMCP cannot see: API keys that expired overnight, schema drift from a backend team's migration, Hasura permission misconfigurations that return empty results for every user query.
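The alert rule can be sketched as a function over recent log events. The four log fields come from the description above; timestamp_ms is an added assumption so the 10-minute window can be computed. In practice this rule would usually live in your log or metrics platform rather than in the server.

```typescript
interface ToolCallLog {
  tool_name: string;
  graphql_errors: boolean;
  duration_ms: number;
  success: boolean;
  timestamp_ms: number; // assumed field, needed to apply the time window
}

// Returns the tools whose GraphQL error rate exceeds the threshold within the window.
function toolsOverErrorRate(
  logs: ToolCallLog[],
  now: number,
  windowMs: number = 10 * 60 * 1000,
  threshold: number = 0.05,
): string[] {
  const counts = new Map<string, { total: number; errors: number }>();
  for (const log of logs) {
    // Ignore events outside the window (now - windowMs, now].
    if (log.timestamp_ms <= now - windowMs || log.timestamp_ms > now) continue;
    const c = counts.get(log.tool_name) ?? { total: 0, errors: 0 };
    c.total += 1;
    if (log.graphql_errors) c.errors += 1;
    counts.set(log.tool_name, c);
  }
  const flagged: string[] = [];
  counts.forEach((c, tool) => {
    if (c.errors / c.total > threshold) flagged.push(tool);
  });
  return flagged.sort();
}
```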

Layer 3: Subscription health (if using subscriptions)

A dropped WebSocket subscription is invisible to both AliveMCP's protocol probe and your tool-level error metrics — subscription-backed tools return stale cached data, not errors. Track subscription connection state explicitly ("connected" | "disconnected" | "error"), record the timestamp of the last received event, and expose a /healthz/subscriptions endpoint that returns this state. Alert when the subscription has been disconnected for more than 5 minutes or when no events have arrived in longer than the expected event frequency.
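The evaluation behind such an endpoint might be sketched like this. SubscriptionStatus and evaluateSubscription are hypothetical names; the 5-minute disconnect limit comes from the text above, and the expected event interval is something you supply per subscription.

```typescript
type SubscriptionState = "connected" | "disconnected" | "error";

interface SubscriptionStatus {
  state: SubscriptionState;
  stateSince: number; // when the current state began (ms since epoch)
  lastEventAt: number | null; // null until the first event arrives
}

interface SubscriptionHealthReport {
  healthy: boolean;
  reasons: string[];
}

function evaluateSubscription(
  status: SubscriptionStatus,
  now: number,
  expectedEventIntervalMs: number,
  maxDisconnectedMs: number = 5 * 60 * 1000,
): SubscriptionHealthReport {
  const reasons: string[] = [];
  // Rule 1: not connected for longer than the allowed window.
  if (status.state !== "connected" && now - status.stateSince > maxDisconnectedMs) {
    reasons.push("subscription " + status.state + " for more than " + maxDisconnectedMs + " ms");
  }
  // Rule 2: no events for longer than the expected event frequency.
  const silentForMs = now - (status.lastEventAt ?? status.stateSince);
  if (silentForMs > expectedEventIntervalMs) {
    reasons.push("no events for " + silentForMs + " ms");
  }
  return { healthy: reasons.length === 0, reasons };
}
```

A /healthz/subscriptions handler can return this report as JSON and use the healthy flag to choose between a 200 and a 503 status.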

For Hasura specifically, add a fourth layer: poll Hasura's /healthz endpoint with an external HTTP monitor. A Hasura process crash or PostgreSQL connection loss is visible to /healthz but not to AliveMCP's MCP protocol probe (which only checks the MCP server process, not the database behind it). See Hasura MCP server for the complete monitoring gap table showing exactly which failures each layer catches.

The monitoring stack for an Apollo Client-backed MCP server has the same structure: AliveMCP for the protocol layer, structured logs for the GraphQL error rate, and Apollo's own error link for development-time observability. Apollo Client's error tracking covers tool-call failures; AliveMCP covers server-level reachability. Neither alone is sufficient. See MCP server with Apollo Client for the complete monitoring gap table showing what Apollo sees vs what AliveMCP sees across six failure modes.

The integration checklist

Before deploying a GraphQL-backed MCP server to production:

  1. Error handling verified. Every tool handler has explicit GraphQL error mapping to isError: true. Null results from expected-non-null queries return isError: true with a descriptive message, not a success response with an embedded "not found" string.
  2. Payload sizes bounded. Every list tool has a limit parameter with a default of 10 and a max of 50. No tool response routinely exceeds 2,000 tokens. Large detail tools are separated from list tools.
  3. Input schemas flattened. No MCP tool accepts a nested object as a parameter. All tool inputs sit at the top level in inputSchema.
  4. Apollo client isolation correct. If any tool returns user-specific data, per-session clients are used, and session cleanup is wired to onSessionClose.
  5. Subscription cleanup confirmed. If using subscriptions, every per-session subscription is unsubscribed on session close. Load test with rapid session open/close to verify no WebSocket leak.
  6. Startup schema check. Server startup validates that all required GraphQL fields still exist in the live schema. Deploy fails fast rather than deploying a server whose tool handlers will silently error.
  7. Two-layer monitoring wired. AliveMCP is monitoring the MCP protocol endpoint. Structured logs per tool call are being emitted. Alert threshold is set for tool-level GraphQL error rate above 5%.

Further reading