Guide · MCP Tool Implementation

MCP server API wrapper tools

The most common MCP server pattern is wrapping an existing REST API — Stripe, GitHub, Slack, Notion, Salesforce, your own backend service — and exposing it as typed MCP tools that an LLM can call directly. The MCP layer handles auth injection, argument validation, response transformation, error mapping, rate limiting, and retry logic so the LLM sees a clean, well-described interface rather than raw HTTP. This guide covers the core patterns for building production-quality API wrapper MCP servers.

TL;DR

Inject API keys from environment variables at the server level — never accept them as tool arguments (LLMs leak secrets through logs and context). Use a shared fetch wrapper that handles authentication headers, timeout, response parsing, and HTTP error mapping for every outbound call. Write one MCP tool per API operation, not one tool per API — create_github_issue beats call_github_api with a raw endpoint argument. Add a circuit breaker so a sustained outage of the upstream API returns fast isError responses instead of queuing requests that all time out.

Authentication injection: keep secrets at the server level

API keys must never be passed as tool arguments. When an LLM calls get_github_issues({ api_key: "ghp_...", repo: "..." }), the key appears in the tool call, gets logged by the MCP client, may be included in context sent to other models, and is visible in Claude Desktop's activity panel. Server-level injection keeps the secret in the process environment, invisible to the LLM:

import { McpServer } from '@modelcontextprotocol/sdk/server/mcp.js';
import { z } from 'zod';

const server = new McpServer({ name: 'github-mcp', version: '1.0.0' });

// Inject once at startup — LLM never sees these
const GITHUB_TOKEN = process.env.GITHUB_TOKEN;
const GITHUB_API_BASE = 'https://api.github.com';
const REQUEST_TIMEOUT_MS = 15_000;

// Shared fetch wrapper for all GitHub API calls
async function githubFetch(path: string, options: RequestInit = {}): Promise<Response> {
  if (!GITHUB_TOKEN) throw new Error('GITHUB_TOKEN environment variable not set');

  const res = await fetch(`${GITHUB_API_BASE}${path}`, {
    ...options,
    headers: {
      'Authorization': `Bearer ${GITHUB_TOKEN}`,
      'Accept': 'application/vnd.github+json',
      'X-GitHub-Api-Version': '2022-11-28',
      'Content-Type': 'application/json',
      ...options.headers,
    },
    signal: AbortSignal.timeout(REQUEST_TIMEOUT_MS),
  });

  return res;
}

If your API wrapper serves multiple users (multi-tenant MCP server), store per-user API keys in a secrets store keyed by the user's session token — never hardcode multi-tenant credentials into the MCP server binary.
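For the multi-tenant case, a minimal sketch of per-session credential lookup might look like the following. The `SecretStore` interface, the `InMemorySecretStore` class, the `authHeaderForSession` helper, and the `github-token:` key prefix are all hypothetical names introduced for illustration; a real deployment would back the interface with whatever secrets service it already uses.

```typescript
// Hypothetical interface: stands in for whatever secrets service you use.
interface SecretStore {
  get(key: string): Promise<string | undefined>;
}

// In-memory store for illustration only; not suitable for production secrets.
class InMemorySecretStore implements SecretStore {
  private readonly secrets: Map<string, string>;

  constructor(secrets: Map<string, string>) {
    this.secrets = secrets;
  }

  async get(key: string): Promise<string | undefined> {
    return this.secrets.get(key);
  }
}

// Resolve the caller's token at request time instead of reading one global env var.
// The session ID comes from the transport layer, never from tool arguments.
async function authHeaderForSession(
  store: SecretStore,
  sessionId: string
): Promise<Record<string, string>> {
  const token = await store.get(`github-token:${sessionId}`);
  if (!token) {
    throw new Error('No GitHub credentials are configured for this session.');
  }
  return { Authorization: `Bearer ${token}` };
}
```

The resulting header object can be spread into the `headers` of the shared fetch wrapper, so the token still never appears in tool arguments or tool output.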

Per-operation tools with typed schemas

Define one MCP tool per API operation rather than a single generic "call API" tool. Specific tools give the LLM a typed schema to fill in, produce cleaner logs, and make it easier to add per-operation rate limits and circuit breakers:

server.tool(
  'list_github_issues',
  'List issues in a GitHub repository (open by default)',
  {
    owner: z.string().describe('Repository owner (user or org)'),
    repo: z.string().describe('Repository name'),
    state: z.enum(['open', 'closed', 'all']).default('open'),
    labels: z.string().optional().describe('Comma-separated label filter'),
    per_page: z.number().int().min(1).max(100).default(30),
  },
  async ({ owner, repo, state, labels, per_page }) => {
    const params = new URLSearchParams({ state, per_page: String(per_page) });
    if (labels) params.set('labels', labels);

    const res = await githubFetch(`/repos/${encodeURIComponent(owner)}/${encodeURIComponent(repo)}/issues?${params}`);

    if (!res.ok) return mapGithubError(res);

    const issues = await res.json() as Array<{ number: number; title: string; state: string; labels: Array<{ name: string }>; created_at: string }>;
    if (issues.length === 0) return { content: [{ type: 'text', text: 'No issues found.' }] };

    const text = issues.map(i =>
      `#${i.number} [${i.state}] ${i.title}\n  Labels: ${i.labels.map(l => l.name).join(', ') || 'none'}  Created: ${i.created_at.split('T')[0]}`
    ).join('\n\n');

    return { content: [{ type: 'text', text: text }] };
  }
);

server.tool(
  'create_github_issue',
  'Create a new issue in a GitHub repository',
  {
    owner: z.string(),
    repo: z.string(),
    title: z.string().min(1).max(256),
    body: z.string().max(65_536).optional(),
    labels: z.array(z.string()).max(10).default([]),
  },
  async ({ owner, repo, title, body, labels }) => {
    const res = await githubFetch(
      `/repos/${encodeURIComponent(owner)}/${encodeURIComponent(repo)}/issues`,
      { method: 'POST', body: JSON.stringify({ title, body, labels }) }
    );
    if (!res.ok) return mapGithubError(res);
    const issue = await res.json() as { number: number; html_url: string };
    return { content: [{ type: 'text', text: `Created issue #${issue.number}: ${issue.html_url}` }] };
  }
);

Error mapping: translate HTTP errors to LLM-readable messages

Raw HTTP error responses — {"message":"Not Found","documentation_url":"..."} — are not immediately useful to an LLM trying to decide what to do next. Map common HTTP status codes to actionable guidance:

async function mapGithubError(res: Response): Promise<{ isError: true; content: Array<{ type: 'text'; text: string }> }> {
  let body: Record<string, unknown> = {};
  try { body = await res.json(); } catch { /* body may not be JSON */ }

  // body.message is typed unknown, so narrow it before calling string methods
  const message = typeof body.message === 'string' ? body.message : '';
  const resetHeader = res.headers.get('X-RateLimit-Reset'); // Unix epoch seconds
  const resetAt = resetHeader ? new Date(Number(resetHeader) * 1000).toISOString() : 'unknown';

  // Typed as Record<number, string> so indexing by res.status compiles in strict mode
  const hints: Record<number, string> = {
    401: 'GITHUB_TOKEN is missing or expired. Check the server environment variable.',
    403: message.toLowerCase().includes('rate limit')
      ? `GitHub API rate limit hit. Resets at: ${resetAt}`
      : 'Access denied. Check repository permissions for the token.',
    404: 'Repository or resource not found. Check owner and repo name spelling.',
    422: `Validation failed: ${JSON.stringify(body.errors ?? body.message)}`,
    429: 'Secondary rate limit triggered. Wait a few minutes before retrying.',
    500: 'GitHub API internal server error. This is a GitHub-side issue, not your request.',
  };

  const hint = hints[res.status] ?? `HTTP ${res.status}: ${message || res.statusText}`;
  return { isError: true, content: [{ type: 'text', text: hint }] };
}

Error messages should tell the LLM what went wrong and what to do about it. "401 Unauthorized" is useless; "GITHUB_TOKEN is missing or expired: ask the user to update the token in the server environment" gives the LLM a recovery path without inviting the user to paste a secret into the conversation.
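HTTP status codes are only half of the failure surface: a `fetch` call guarded by `AbortSignal.timeout()` rejects with an error named `TimeoutError` instead of returning a response, so there is no status to map. A minimal sketch of the same idea applied to thrown errors follows; the `toToolError` helper and its wording are illustrative assumptions, not part of any SDK.

```typescript
type ToolError = { isError: true; content: Array<{ type: 'text'; text: string }> };

// Convert a thrown value (timeout, network failure, etc.) into an isError result
// that tells the LLM what happened and what to try next.
function toToolError(e: unknown): ToolError {
  // Read the name structurally so this also works for DOMException-style errors.
  const name =
    typeof e === 'object' && e !== null && 'name' in e
      ? String((e as { name: unknown }).name)
      : '';
  const detail = e instanceof Error ? e.message : String(e);

  let text: string;
  if (name === 'TimeoutError') {
    text = 'The upstream API did not respond in time. Retry once; if it fails again, tell the user the service may be down.';
  } else if (name === 'AbortError') {
    text = 'The request was cancelled before it completed.';
  } else {
    text = `Request failed before a response arrived: ${detail}`;
  }
  return { isError: true, content: [{ type: 'text', text }] };
}
```

One way to use it is to wrap the body of each tool handler in `try { ... } catch (e) { return toToolError(e); }`, so timeouts read as guidance and not as raw exception text.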

Per-API rate limiting

MCP tools can be called in rapid loops by an LLM reasoning through a multi-step task. Without rate limiting, an agent that calls list_github_issues in a loop can exhaust your GitHub API quota in seconds. Add a token bucket or sliding window limiter per API:

class RateLimiter {
  private tokens: number;
  private lastRefill: number;

  constructor(private readonly limit: number, private readonly windowMs: number) {
    this.tokens = limit;
    this.lastRefill = Date.now();
  }

  // Add whole tokens earned since the last refill, capped at the bucket size
  private refill(): void {
    const now = Date.now();
    const earned = Math.floor(((now - this.lastRefill) / this.windowMs) * this.limit);
    if (earned > 0) {
      this.tokens = Math.min(this.limit, this.tokens + earned);
      this.lastRefill = now;
    }
  }

  async acquire(): Promise<void> {
    this.refill();
    // Loop and re-check after each sleep: if several callers wait at once,
    // resetting the bucket after a single sleep would let all of them through.
    while (this.tokens <= 0) {
      await new Promise(r => setTimeout(r, this.windowMs / this.limit));
      this.refill();
    }
    this.tokens--;
  }
}

// GitHub allows 5000 requests/hour with a token
const githubLimiter = new RateLimiter(60, 60_000); // 60 requests per minute (conservative)

// Wrap githubFetch to go through the limiter
async function rateLimitedGithubFetch(path: string, options?: RequestInit): Promise<Response> {
  await githubLimiter.acquire();
  return githubFetch(path, options);
}
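The sliding window alternative mentioned above can be sketched as follows. The `SlidingWindowLimiter` class and its `tryAcquire` method are hypothetical names; the clock is injectable purely so the behavior can be checked without real delays. Unlike the token bucket, this version does not sleep: it reports how long the caller would have to wait.

```typescript
class SlidingWindowLimiter {
  private timestamps: number[] = [];
  private readonly limit: number;
  private readonly windowMs: number;
  private readonly now: () => number;

  constructor(limit: number, windowMs: number, now: () => number = () => Date.now()) {
    this.limit = limit;
    this.windowMs = windowMs;
    this.now = now;
  }

  // Returns 0 and records the call if it is allowed;
  // otherwise returns the number of milliseconds until the next slot frees up.
  tryAcquire(): number {
    const t = this.now();
    // Drop timestamps that have aged out of the window.
    while (this.timestamps.length > 0 && t - this.timestamps[0] >= this.windowMs) {
      this.timestamps.shift();
    }
    if (this.timestamps.length < this.limit) {
      this.timestamps.push(t);
      return 0;
    }
    return this.timestamps[0] + this.windowMs - t;
  }
}
```

Returning the wait time lets a tool respond immediately with an isError message such as "rate limit reached, retry in N seconds" so the LLM can decide whether to wait or do something else, which is a design choice and not a requirement.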

Circuit breaker for sustained outages

When an upstream API is down, every tool call that hits it waits for the full request timeout (15 seconds) before failing. An LLM retrying 5 times means 75 seconds of blocked calls. A circuit breaker detects sustained failures and fails fast, returning an immediate isError response until the API recovers:

type CircuitState = 'closed' | 'open' | 'half-open';

class CircuitBreaker {
  private state: CircuitState = 'closed';
  private failures = 0;
  private lastFailureAt = 0;

  constructor(
    private readonly failureThreshold = 5,
    private readonly recoverMs = 30_000
  ) {}

  async call<T>(fn: () => Promise<T>): Promise<T> {
    if (this.state === 'open') {
      if (Date.now() - this.lastFailureAt < this.recoverMs) {
        throw new Error('Service temporarily unavailable (circuit open). Try again in a moment.');
      }
      this.state = 'half-open';
    }
    try {
      const result = await fn();
      this.failures = 0;
      this.state = 'closed';
      return result;
    } catch (e) {
      this.failures++;
      this.lastFailureAt = Date.now();
      if (this.failures >= this.failureThreshold) this.state = 'open';
      throw e;
    }
  }
}

const githubCircuit = new CircuitBreaker(5, 30_000);

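// Use inside a tool handler, where owner and repo are the validated arguments:
const res = await githubCircuit.call(() =>
  rateLimitedGithubFetch(`/repos/${encodeURIComponent(owner)}/${encodeURIComponent(repo)}/issues`)
);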
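One caveat: `fetch` resolves normally for HTTP 5xx responses, so a breaker that only counts thrown errors will see timeouts and network failures but not an upstream that answers every request with 503. A minimal sketch of one way to close that gap is below; `UpstreamServerError` and `failOnServerError` are hypothetical names introduced here.

```typescript
// Hypothetical error type so callers can tell upstream 5xx apart from network failures.
class UpstreamServerError extends Error {
  readonly status: number;

  constructor(status: number) {
    super(`Upstream API returned HTTP ${status}`);
    this.name = 'UpstreamServerError';
    this.status = status;
  }
}

// Pass 2xx-4xx responses through unchanged; throw on 5xx so that a
// circuit breaker wrapping the call records it as a failure.
// 4xx is deliberately not counted: it usually means a bad request, not an outage.
function failOnServerError<R extends { status: number }>(res: R): R {
  if (res.status >= 500) throw new UpstreamServerError(res.status);
  return res;
}

// Intended use with the breaker and fetch wrapper from this guide:
//   const res = await githubCircuit.call(async () =>
//     failOnServerError(await rateLimitedGithubFetch(path))
//   );
```

Whether 429 responses should also trip the breaker depends on the API; that is left out of this sketch.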

Generalized wrapper for multi-API servers

If your MCP server wraps many APIs, extract the shared fetch logic into a factory function to avoid repeating auth and error handling boilerplate:

interface ApiClientConfig {
  baseUrl: string;
  authHeader: () => string;   // function to allow token refresh
  timeoutMs?: number;
  rateLimitPerMinute?: number;
}

function createApiClient({ baseUrl, authHeader, timeoutMs = 15_000, rateLimitPerMinute = 60 }: ApiClientConfig) {
  const limiter = new RateLimiter(rateLimitPerMinute, 60_000);
  const circuit = new CircuitBreaker();

  return async function apiFetch(path: string, init: RequestInit = {}): Promise<Response> {
    await limiter.acquire();
    return circuit.call(() => fetch(`${baseUrl}${path}`, {
      ...init,
      headers: { 'Authorization': authHeader(), 'Content-Type': 'application/json', ...init.headers },
      signal: AbortSignal.timeout(timeoutMs),
    }));
  };
}

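// A few lines to configure a new API client:
const stripeApi = createApiClient({
  baseUrl: 'https://api.stripe.com/v1',
  authHeader: () => `Bearer ${process.env.STRIPE_SECRET_KEY}`,
});
// Stripe expects form-encoded request bodies, so POST calls should override the default:
//   stripeApi('/customers', { method: 'POST', body: params.toString(),
//     headers: { 'Content-Type': 'application/x-www-form-urlencoded' } })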

Monitoring API wrapper MCP servers

API wrapper servers fail when the upstream API changes: a deprecated endpoint returns 410 Gone, an API key expires and every call returns 401, or a rate limit drops to a stricter tier. These failures are invisible at the MCP transport layer: initialize succeeds and tools/list returns all tools, but every tools/call returns isError: true. An LLM working through a multi-step task then fails on every tool call with no transport-level signal that anything is wrong.

Add a health check tool that makes a lightweight canary call to the upstream API (GET /rate_limit for GitHub, GET /v1/balance for Stripe) and verify it succeeds. AliveMCP probes your MCP endpoint every 60 seconds using the full protocol handshake, detecting handler-level failures that transport-only health checks miss — including upstream API outages that surface as isError in tool calls but not as HTTP failures on /health.
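A minimal sketch of the canary logic behind such a health check tool is shown below. The `checkUpstream` function and the `MinimalFetch` type are hypothetical; the fetch function is passed in so that the same helper works with any of the wrappers in this guide and can be exercised without network access.

```typescript
type ToolResult = { isError?: true; content: Array<{ type: 'text'; text: string }> };

// Structural type: anything that takes a path and resolves to an object
// with ok and status, such as the githubFetch wrapper, fits here.
type MinimalFetch = (path: string) => Promise<{ ok: boolean; status: number }>;

// Make one lightweight call to the upstream API and report the outcome
// in a form an LLM or a monitoring probe can read.
async function checkUpstream(apiFetch: MinimalFetch, canaryPath: string): Promise<ToolResult> {
  const started = Date.now();
  try {
    const res = await apiFetch(canaryPath);
    const ms = Date.now() - started;
    if (!res.ok) {
      const text = `Upstream unhealthy: canary ${canaryPath} returned HTTP ${res.status} after ${ms} ms.`;
      return { isError: true, content: [{ type: 'text', text }] };
    }
    const text = `Upstream healthy: canary ${canaryPath} returned HTTP ${res.status} in ${ms} ms.`;
    return { content: [{ type: 'text', text }] };
  } catch (e) {
    // Timeouts and network failures land here instead of producing a status code.
    const reason = e instanceof Error ? e.message : String(e);
    return { isError: true, content: [{ type: 'text', text: `Upstream unreachable: ${reason}` }] };
  }
}
```

Registered as a tool with no arguments (for example under an assumed name such as `health_check`, with a handler that returns `checkUpstream(githubFetch, '/rate_limit')`), this gives an external probe a single tools/call whose isError flag reflects upstream health.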
