MCP server gRPC
Teams with existing gRPC microservices want to expose them to AI agents without rewriting service logic. The standard approach is an MCP adapter: a thin Node.js server where each MCP tool handler calls one or more gRPC methods, converts the protobuf response to JSON, and returns it as a tool result. The MCP server is the public interface; the gRPC services are internal implementation details. This pattern lets you add AI-agent access to existing infrastructure incrementally, one service at a time, without touching the gRPC service code.
TL;DR
Create one gRPC client (and its underlying channel) per service at module scope using @grpc/grpc-js and reuse it across all tool calls. Use @grpc/proto-loader to load .proto files at startup. In tool handlers, call gRPC methods via the client stub and serialize the response with JSON.stringify (with proto-loader, decoded messages are plain JS objects). Map gRPC status codes to MCP error patterns: NOT_FOUND, PERMISSION_DENIED, INVALID_ARGUMENT → isError: true; UNAVAILABLE, DEADLINE_EXCEEDED, RESOURCE_EXHAUSTED → isError: true with a retry hint; INTERNAL, UNIMPLEMENTED → let it propagate (triggers -32603 and AliveMCP alert). Forward the MCP session ID as a gRPC metadata key for end-to-end tracing. AliveMCP monitors the MCP initialize endpoint: if a gRPC service is down, tool calls return isError: true but the server stays marked as up (initialize still passes). Add a dedicated health_check tool that probes all gRPC dependencies, and use a synthetic monitor for end-to-end dependency health.
Setting up gRPC clients at module scope
Create gRPC clients once at module startup, not inside tool handlers. Each client owns a channel that manages its HTTP/2 connections and reconnects automatically. Creating a new client per tool call throws that state away and pays connection setup (TCP, TLS, and HTTP/2 handshakes) on every call:
// grpc-clients.ts: shared gRPC client instances
import * as grpc from '@grpc/grpc-js';
import * as protoLoader from '@grpc/proto-loader';
import { fileURLToPath } from 'node:url';
import { join, dirname } from 'node:path';

const __dirname = dirname(fileURLToPath(import.meta.url));

// Load proto definitions at startup
const documentsDef = protoLoader.loadSync(
  join(__dirname, '../proto/documents.proto'),
  { keepCase: true, longs: String, enums: String, defaults: true, oneofs: true }
);
const searchDef = protoLoader.loadSync(
  join(__dirname, '../proto/search.proto'),
  { keepCase: true, longs: String, enums: String, defaults: true, oneofs: true }
);
const documentsProto = grpc.loadPackageDefinition(documentsDef) as any;
const searchProto = grpc.loadPackageDefinition(searchDef) as any;

// Create one channel per service, reused across all tool calls
export const documentsClient = new documentsProto.documents.DocumentsService(
  process.env.DOCUMENTS_SERVICE_ADDR ?? 'localhost:50051',
  grpc.credentials.createInsecure()
);
export const searchClient = new searchProto.search.SearchService(
  process.env.SEARCH_SERVICE_ADDR ?? 'localhost:50052',
  grpc.credentials.createInsecure()
);

// Graceful shutdown: close channels before process exit
export function closeGrpcClients() {
  documentsClient.close();
  searchClient.close();
}
Use grpc.credentials.createSsl() in production for services on separate hosts. For internal Kubernetes pod-to-pod calls on the same cluster network without external exposure, createInsecure() is acceptable — TLS is handled at the service mesh layer (Istio, Linkerd) if required.
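The TLS-versus-plaintext decision can be driven by configuration rather than hard-coded. The sketch below is one possible way to do that, not a prescribed pattern: `resolveEndpoint`, `ServiceEndpoint`, and the `*_SERVICE_TLS` environment variable are hypothetical names introduced for illustration (only the `*_SERVICE_ADDR` convention comes from the example above). It returns a plain descriptor so the logic can be tested without a gRPC dependency.

```typescript
// Hypothetical helper: decide address and TLS mode per service from env vars.
// The *_SERVICE_TLS variable is an assumed convention, not a gRPC feature.
interface ServiceEndpoint {
  address: string;
  useTls: boolean;
}

function resolveEndpoint(
  service: string,
  defaultAddress: string,
  env: Record<string, string | undefined>
): ServiceEndpoint {
  const prefix = service.toUpperCase();
  const address = env[`${prefix}_SERVICE_ADDR`] ?? defaultAddress;
  const flag = env[`${prefix}_SERVICE_TLS`];
  // An explicit flag wins; otherwise assume TLS for anything that is not localhost
  const isLocal = /^(localhost|127\.0\.0\.1)(:\d+)?$/.test(address);
  const useTls = flag !== undefined ? flag === 'true' : !isLocal;
  return { address, useTls };
}
```

When constructing a client, the descriptor would map to `endpoint.useTls ? grpc.credentials.createSsl() : grpc.credentials.createInsecure()`. A service behind a mesh that terminates TLS itself would set the flag to `false` explicitly.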
Tool handler with gRPC call and error mapping
Wrap gRPC calls in a promisified helper that maps gRPC status codes to MCP error patterns. gRPC uses a callback-based API in @grpc/grpc-js, so promisify at the call site or write a thin wrapper:
// grpc-call.ts: promisified gRPC call with error mapping
import * as grpc from '@grpc/grpc-js';

export interface GrpcResult<T> {
  data?: T;
  isError: boolean;
  errorText?: string;
  retryable: boolean;
}

// Transient failures: returned as tool errors with a retry hint
const RETRYABLE: grpc.status[] = [
  grpc.status.UNAVAILABLE,
  grpc.status.DEADLINE_EXCEEDED,
  grpc.status.RESOURCE_EXHAUSTED,
];
// Unexpected states: rejected so the tool handler throws
const FATAL: grpc.status[] = [grpc.status.INTERNAL, grpc.status.UNIMPLEMENTED];

export function grpcCall<T>(
  fn: (metadata: grpc.Metadata, callback: (err: grpc.ServiceError | null, res: T) => void) => void,
  metadata: grpc.Metadata
): Promise<GrpcResult<T>> {
  return new Promise((resolve, reject) => {
    fn(metadata, (err, res) => {
      if (!err) {
        resolve({ data: res, isError: false, retryable: false });
        return;
      }
      // INTERNAL and UNIMPLEMENTED propagate as exceptions instead of tool errors
      if (FATAL.includes(err.code)) {
        reject(err);
        return;
      }
      // Every other status becomes an isError result, with a hint when transient
      const retryable = RETRYABLE.includes(err.code);
      resolve({
        isError: true,
        errorText: `${err.message} (gRPC ${err.code})${retryable ? ' (transient, retry in a few seconds)' : ''}`,
        retryable,
      });
    });
  });
}
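// Tool handler using the wrapper.
// Assumes `server` (an McpServer), `contextStore` (an AsyncLocalStorage holding
// { requestId }), and the SearchResponse type are defined elsewhere; adjust the
// import paths to your project layout.
import * as grpc from '@grpc/grpc-js';
import { z } from 'zod';
import { grpcCall } from './grpc-call.js';
import { searchClient } from './grpc-clients.js';

server.tool(
  'search_documents',
  'Full-text search across all documents',
  { query: z.string().min(1), limit: z.number().int().min(1).max(50).default(10) },
  async (args, extra) => {
    const meta = new grpc.Metadata();
    // The SDK passes the transport session ID as extra.sessionId (undefined on stateless transports)
    meta.set('mcp-session-id', extra.sessionId ?? '');
    meta.set('x-request-id', contextStore.getStore()?.requestId ?? '');
    const result = await grpcCall<SearchResponse>(
      (m, cb) => searchClient.search({ query: args.query, limit: args.limit }, m, cb),
      meta
    );
    if (result.isError) {
      return { isError: true, content: [{ type: 'text', text: result.errorText! }] };
    }
    return { content: [{ type: 'text', text: JSON.stringify(result.data!.results) }] };
  }
);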
The metadata forwarding (mcp-session-id and x-request-id) propagates the MCP session correlation ID into the gRPC service's logs, enabling end-to-end trace reconstruction across the MCP adapter and the gRPC microservice.
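One practical detail when forwarding IDs: gRPC text metadata (keys without a `-bin` suffix) only accepts printable ASCII values, and to my understanding `Metadata.set` in @grpc/grpc-js throws on illegal characters. The sketch below shows one way to normalize correlation IDs before copying them into metadata. `buildTraceHeaders` and `toPrintableAscii` are hypothetical helper names, and omitting empty IDs (rather than sending an empty string) is an assumption about what your log pipeline prefers.

```typescript
// Hypothetical helper: build trace headers that are safe to copy into grpc.Metadata.
// Assumes text (non "-bin") keys, whose values must be printable ASCII.
type TraceHeaders = Record<string, string>;

function toPrintableAscii(value: string): string {
  // Drop anything outside 0x20-0x7E (control characters and non-ASCII)
  return value.replace(/[^\x20-\x7E]/g, '');
}

function buildTraceHeaders(ids: { sessionId?: string; requestId?: string }): TraceHeaders {
  const headers: TraceHeaders = {};
  const candidates: Array<[string, string | undefined]> = [
    ['mcp-session-id', ids.sessionId],
    ['x-request-id', ids.requestId],
  ];
  for (const [key, raw] of candidates) {
    const value = toPrintableAscii(raw ?? '').trim();
    // Omit the key entirely rather than sending an empty correlation ID
    if (value.length > 0) headers[key] = value;
  }
  return headers;
}
```

In a tool handler, the result would be copied over with something like `for (const [k, v] of Object.entries(headers)) meta.set(k, v);`.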
gRPC status code to MCP error mapping
gRPC has 17 status codes. Not all require the same MCP handling:
| gRPC status | MCP handling | Include retry hint? |
|---|---|---|
| NOT_FOUND | isError: true (resource missing) | No |
| ALREADY_EXISTS | isError: true (duplicate operation) | No |
| INVALID_ARGUMENT | isError: true (bad input; Zod should catch this first) | No |
| PERMISSION_DENIED | isError: true (auth failure) | No |
| UNAUTHENTICATED | isError: true (missing/expired credential) | No |
| RESOURCE_EXHAUSTED | isError: true (rate limited) | Yes (back off) |
| UNAVAILABLE | isError: true (service down, transient) | Yes (retry) |
| DEADLINE_EXCEEDED | isError: true (timeout) | Yes (retry or reduce scope) |
| INTERNAL | Propagate as exception → -32603 | N/A (unexpected state) |
| UNIMPLEMENTED | Propagate as exception (deployment mismatch) | N/A |
The key distinction: UNAVAILABLE and DEADLINE_EXCEEDED are transient — return isError: true with a retry hint so the AI client can retry without human intervention. INTERNAL is unexpected state that signals a bug in the gRPC service — let it propagate as an unhandled exception so your global exception handler logs it at error level and AliveMCP's tool-error-rate alert fires.
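The mapping can be expressed as a small pure function, which makes it easy to unit-test independently of any gRPC client. This is a sketch: `GrpcStatus`, `Handling`, and `classifyStatus` are hypothetical names, the numeric values mirror the standard gRPC status codes (with @grpc/grpc-js you would use `grpc.status.*` instead of the local copy), and the fallback for codes the table does not list is an assumption.

```typescript
// Local copy of the standard gRPC status code numbers, for a dependency-free sketch
const GrpcStatus = {
  INVALID_ARGUMENT: 3,
  DEADLINE_EXCEEDED: 4,
  NOT_FOUND: 5,
  ALREADY_EXISTS: 6,
  PERMISSION_DENIED: 7,
  RESOURCE_EXHAUSTED: 8,
  UNIMPLEMENTED: 12,
  INTERNAL: 13,
  UNAVAILABLE: 14,
  UNAUTHENTICATED: 16,
} as const;

type Handling = 'tool_error' | 'tool_error_with_retry_hint' | 'throw';

function classifyStatus(code: number): Handling {
  switch (code) {
    // Transient: the AI client can retry without human intervention
    case GrpcStatus.UNAVAILABLE:
    case GrpcStatus.DEADLINE_EXCEEDED:
    case GrpcStatus.RESOURCE_EXHAUSTED:
      return 'tool_error_with_retry_hint';
    // Unexpected state or deployment mismatch: surface as an exception
    case GrpcStatus.INTERNAL:
    case GrpcStatus.UNIMPLEMENTED:
      return 'throw';
    // Listed caller-side failures, plus (by assumption) any code the table omits
    default:
      return 'tool_error';
  }
}
```

A wrapper around the gRPC callback could reject its promise for `'throw'` and resolve an `isError: true` result for the other two outcomes.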
Monitoring a gRPC-backed MCP server
AliveMCP probes initialize + tools/list — both succeed even when downstream gRPC services are down. This means AliveMCP correctly reports the MCP server as up when gRPC backends are degraded. To monitor end-to-end health (MCP layer + gRPC dependencies), add a dedicated health_check tool that probes all dependencies and configure AliveMCP or a synthetic monitor to call it:
// Assumes each service defines a lightweight Ping RPC in its .proto file
server.tool(
  'health_check',
  'Check the health of all downstream gRPC dependencies',
  {},
  async () => {
    // allSettled: one failing dependency must not hide the status of the others
    const checks = await Promise.allSettled([
      grpcCall((m, cb) => documentsClient.ping({}, m, cb), new grpc.Metadata()),
      grpcCall((m, cb) => searchClient.ping({}, m, cb), new grpc.Metadata()),
    ]);
    const results = checks.map((r, i) => ({
      service: ['documents', 'search'][i],
      healthy: r.status === 'fulfilled' && !r.value.isError,
      error: r.status === 'rejected' ? String(r.reason) :
             r.value.isError ? r.value.errorText : undefined,
    }));
    const allHealthy = results.every(r => r.healthy);
    return {
      isError: !allHealthy,
      content: [{ type: 'text', text: JSON.stringify({ status: allHealthy ? 'ok' : 'degraded', checks: results }) }],
    };
  }
);
Configure AliveMCP to call health_check on a schedule using the tool-call probe feature. A failing health_check tool result (with isError: true) changes the server's status to degraded in the AliveMCP dashboard and triggers your configured alerts — alerting you to a gRPC backend failure before users hit tool errors.
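A scheduled probe is only useful if it returns promptly, and a dependency that accepts connections but never answers could stall health_check indefinitely. One way to bound each probe is a generic timer race, sketched below. `withTimeout` is a hypothetical helper, not part of any library mentioned here. Note that it does not cancel the underlying call; @grpc/grpc-js also accepts a per-call `deadline` option, which does.

```typescript
// Hypothetical helper: reject if `promise` has not settled within `ms` milliseconds.
function withTimeout<T>(promise: Promise<T>, ms: number, label: string): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(
      () => reject(new Error(`${label} timed out after ${ms} ms`)),
      ms
    );
    promise.then(
      (value) => {
        clearTimeout(timer); // settled in time: cancel the pending timeout
        resolve(value);
      },
      (error) => {
        clearTimeout(timer);
        reject(error);
      }
    );
  });
}
```

Wrapping each probe, for example `withTimeout(grpcCall(...), 2000, 'documents')`, inside `Promise.allSettled` turns a hung dependency into a rejected entry, which a mapping that treats rejected entries as unhealthy would then report.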
Related questions
Should I use gRPC-web for the MCP-to-gRPC connection?
No. gRPC-web is a browser-to-gRPC proxy protocol designed for environments that can't use HTTP/2 directly. Your MCP server (a Node.js process) can use the full @grpc/grpc-js library over HTTP/2 directly — no proxy needed. gRPC-web adds latency and complexity for server-to-server calls. Use @grpc/grpc-js directly in your MCP adapter.
How do I handle gRPC streaming responses (server streaming, bidirectional)?
For server-streaming gRPC calls, collect all chunks before returning the tool result, or use MCP progress notifications to stream them. The simplest pattern is to collect all chunks into an array inside the tool handler and return the full array as JSON. If the stream is too large to buffer, use MCP's notifications/progress to send chunks as they arrive, and return an empty final result. See the streaming guide for the progress notification pattern.
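The collect-then-return pattern can be sketched as follows. It assumes the server-streaming call object is async-iterable, which holds for Node.js Readable streams; `collectStream` and the `maxItems` cap are hypothetical additions, with the cap there to keep an unexpectedly long stream from exhausting memory.

```typescript
// Collect items from a stream into an array, stopping once `maxItems` is reached.
// Assumes the stream is async-iterable (Node.js Readable streams are).
async function collectStream<T>(
  stream: AsyncIterable<T>,
  maxItems: number
): Promise<{ items: T[]; truncated: boolean }> {
  const items: T[] = [];
  for await (const item of stream) {
    if (items.length >= maxItems) {
      // Returning here exits the loop early and closes the iterator
      return { items, truncated: true };
    }
    items.push(item);
  }
  return { items, truncated: false };
}
```

The tool handler would then return `JSON.stringify(items)` as the text content, and could mention `truncated` in the result so the AI client knows to narrow its query.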
Can I use gRPC service reflection to auto-generate MCP tools?
Yes, with caveats. gRPC server reflection lets you query a running service for its proto definitions. In principle, you could enumerate all RPCs and auto-register them as MCP tools. In practice, auto-generated tool names and descriptions are poor — AI clients need human-readable descriptions to call tools correctly. Use reflection for discovery during development, but write explicit server.tool() registrations with manually crafted descriptions for tools you intend for production use.
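To see why generated registrations need manual work, consider what a reflection-driven generator has to go on: little more than the fully qualified method name. The sketch below uses a hypothetical naming scheme (`toToolName` and `draftTool` are illustrative, not part of any reflection library) and shows that the best description it can produce automatically is a placeholder.

```typescript
// Hypothetical naming scheme: "documents.DocumentsService/GetDocument" -> "documents_get_document"
function toToolName(fullMethod: string): string {
  const [servicePath, method] = fullMethod.split('/');
  if (!servicePath || !method) {
    throw new Error(`Expected "package.Service/Method", got "${fullMethod}"`);
  }
  const pkg = servicePath.split('.')[0];
  // Insert an underscore at each lowercase-to-uppercase boundary, then lowercase
  const snake = method.replace(/([a-z0-9])([A-Z])/g, '$1_$2').toLowerCase();
  return `${pkg}_${snake}`;
}

function draftTool(fullMethod: string): { name: string; description: string } {
  return {
    name: toToolName(fullMethod),
    // Placeholder only: a human still has to explain when and why to call this tool
    description: `Calls ${fullMethod}`,
  };
}
```

A draft like this can serve as a checklist during development; the description is the part that has to be rewritten by hand before the tool is registered for production use.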
Does the gRPC connection add to the MCP server's initialize latency?
No. The gRPC channel is created at module scope (before app.listen) and connects lazily on the first call. The initialize handshake only registers tools and returns server metadata — it does not make any gRPC calls. AliveMCP's initialize probe latency reflects the MCP layer only. gRPC call latency appears in tool-call structured logs (duration_ms per tool invocation), not in probe metrics.
Further reading
- MCP server error handling — isError vs thrown exceptions and retry classification
- MCP server timeout — deadline propagation from MCP to gRPC calls
- MCP server streaming — using progress notifications for gRPC server-streaming responses
- MCP server logging — propagating session_id into gRPC metadata for end-to-end tracing
- MCP server health check — adding dependency health to the probe surface
- AliveMCP — uptime monitoring for MCP servers with gRPC backend dependency tracking