
MCP server idempotency

When an LLM agent calls a tool and the network drops before the response arrives, the agent retries. If your tool is not idempotent (meaning that executing the same call twice has the same effect as executing it once), that retry sends a second email, charges a card twice, or creates a duplicate database record. Idempotency keys make side-effecting tools safe to retry: every logical operation carries a unique key, so the server can recognize a re-execution and suppress it. They are the difference between a safe retry loop and a data integrity disaster.

TL;DR

Accept an idempotencyKey argument on every tool with side effects. On first call, execute the operation and store the response keyed on clientId:toolName:idempotencyKey in Redis (or SQLite) with a 24-hour TTL. On repeat calls with the same key, return the stored response without re-executing. Return a duplicate: true field so the agent knows the result is cached. Never re-execute on a duplicate — even if the stored response is an error.

Why MCP tool calls need idempotency

HTTP APIs have long used idempotency keys (Stripe pioneered the pattern), but MCP tool calls face a more demanding version of the problem:

Agent retry loops are automatic: agents built on frameworks like LangChain, AutoGen, or the Claude tool use API typically retry on timeouts and network errors, often 3–5 times before giving up. Unlike a human who might notice a duplicate charge, the agent has no awareness of side effects.
MCP sessions can resume across restarts — a long-running agentic task may be checkpointed and resumed after a process restart, replaying tool calls from the last checkpoint without the agent knowing they already executed.
SSE transport has no built-in acknowledgment — if the server sends the tool result over an SSE stream but the connection closes before the client reads it, the agent cannot distinguish "never executed" from "executed but response lost". It retries. Without idempotency, the operation runs twice.
Distributed callbacks — if your tool triggers an async job (queue message, webhook dispatch), the job may complete even when the HTTP response fails. A retry re-queues the same work.
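
The lost-response case is easy to reproduce. The sketch below is a hypothetical simulation with no real network or MCP SDK involved; `sendEmail`, `flakyCall`, and `callWithRetry` are names invented for illustration. The side effect runs, the response is dropped once, and a naive retry runs it again.

```typescript
// Hypothetical simulation: no real network or MCP SDK involved.
let emailsSent = 0;

// The side effect we want to happen exactly once.
function sendEmail(): string {
  emailsSent += 1;
  return `email #${emailsSent} sent`;
}

// Executes the tool, then "loses" the response on the first attempt only,
// mimicking a connection that closes after the server has done the work.
let attempts = 0;
function flakyCall(): string {
  attempts += 1;
  const response = sendEmail(); // the side effect happens before the failure
  if (attempts === 1) {
    throw new Error('connection closed before response was read');
  }
  return response;
}

// Naive retry loop: retries on any error, with no idempotency key.
function callWithRetry(fn: () => string, maxAttempts: number): string {
  let lastError: unknown = new Error('maxAttempts must be at least 1');
  for (let i = 0; i < maxAttempts; i++) {
    try {
      return fn();
    } catch (err) {
      lastError = err;
    }
  }
  throw lastError;
}

const naiveResult = callWithRetry(flakyCall, 3);
console.log(naiveResult, '| total emails sent:', emailsSent);
// prints: email #2 sent | total emails sent: 2
```

The caller sees a single successful result, yet two emails went out. Nothing in the response reveals the duplicate.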

Which tools need idempotency

Not every tool needs an idempotency layer. Use the MCP tool annotations (readOnlyHint) to classify your tools and apply the idempotency pattern only where it matters:

Tool type	readOnlyHint	Idempotency needed?
Query / read-only	true	No — reads have no side effects to deduplicate
Idempotent by nature (HTTP PUT)	false	Optional — but still useful to avoid redundant work
Non-idempotent writes (HTTP POST, send email, charge card)	false	Yes — critical
Destructive operations (delete, drop, archive)	false	Yes — double-delete must be detected and suppressed
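
One way to apply the table mechanically is to derive a key policy from each tool's annotations. This is a sketch under assumptions: `readOnlyHint`, `destructiveHint`, and `idempotentHint` are annotation hints defined by MCP, but the `KeyPolicy` type and the `idempotencyKeyPolicy` helper are hypothetical, and treating an unannotated tool as requiring a key is a conservative choice made here rather than something MCP prescribes.

```typescript
// Mirrors the MCP tool annotation hints used for classification; all optional.
interface ToolAnnotations {
  readOnlyHint?: boolean;
  destructiveHint?: boolean;
  idempotentHint?: boolean;
}

type KeyPolicy = 'none' | 'optional' | 'required';

// Hypothetical helper: maps annotation hints to the three outcomes in the table.
function idempotencyKeyPolicy(annotations: ToolAnnotations): KeyPolicy {
  if (annotations.readOnlyHint === true) return 'none';        // reads: nothing to deduplicate
  if (annotations.destructiveHint === true) return 'required'; // delete, drop, archive
  if (annotations.idempotentHint === true) return 'optional';  // PUT-style writes
  return 'required';                                           // POST-style writes, emails, charges
}

console.log(idempotencyKeyPolicy({ readOnlyHint: true }));                          // none
console.log(idempotencyKeyPolicy({ readOnlyHint: false, idempotentHint: true }));   // optional
console.log(idempotencyKeyPolicy({ readOnlyHint: false, destructiveHint: true })); // required
console.log(idempotencyKeyPolicy({}));                                              // required
```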

Idempotency key design

An idempotency key must uniquely identify a logical operation. Two strategies:

Client-generated UUID — the agent generates a UUID before calling the tool and passes it as an argument. Simple and reliable. The agent can persist the UUID in its state so that if the task is replayed from a checkpoint, the same UUID is reused and the tool call is recognized as a duplicate.

// Agent side (TypeScript agent using MCP client)
import { randomUUID } from 'crypto';

const idempotencyKey = randomUUID(); // generated once per logical operation
// The official TypeScript SDK client takes { name, arguments }; adapt if your client differs
await mcpClient.callTool({
  name: 'send_invoice',
  arguments: {
    customerId: 'cus_abc123',
    amountCents: 4900,
    idempotencyKey,
  },
});

// If the agent retries, it REUSES the same idempotencyKey
// The server returns the cached result, so no second invoice is sent

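Operation hash: hash the tool name and deterministic arguments together. Works when the agent cannot be modified to pass explicit keys. Less reliable for two reasons: argument objects with floating-point timestamps or random fields defeat deduplication, and two deliberately identical operations (the same invoice sent twice on purpose) hash to the same key and are treated as duplicates.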

import { createHash } from 'crypto';

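function hashKey(toolName: string, args: Record<string, unknown>, volatileFields: string[] = []): string {
  // Only hash stable, deterministic fields: drop the caller-listed volatile ones
  // (timestamps, random IDs) and sort the rest so property order cannot change the hash.
  // Nested objects are serialized as-is, so keep their key order stable too.
  const stable = Object.keys(args)
    .filter((name) => !volatileFields.includes(name))
    .sort()
    .map((name) => [name, args[name]]);
  return createHash('sha256').update(JSON.stringify([toolName, stable])).digest('hex').slice(0, 32);
}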

Prefer client-generated UUIDs. Hash-based keys are a fallback when the caller cannot be modified.
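
If you do fall back to hashing, note that `JSON.stringify` preserves property insertion order, so the same arguments built in a different order serialize differently. A possible remedy is to canonicalize the arguments recursively before hashing. The sketch below is illustrative only: `canonicalize`, `operationKey`, and the `volatileFields` parameter are hypothetical names, and the handling of `undefined` values (dropped from objects) is a choice made here.

```typescript
import { createHash } from 'crypto';

// Recursively serializes a value with object keys sorted, so two argument
// objects with the same content always produce the same string.
function canonicalize(value: unknown): string {
  if (value === null || typeof value !== 'object') {
    // Primitives; values JSON cannot represent (undefined, functions) become null.
    return JSON.stringify(value) ?? 'null';
  }
  if (Array.isArray(value)) {
    return `[${value.map((item) => canonicalize(item)).join(',')}]`;
  }
  const obj = value as Record<string, unknown>;
  const entries = Object.keys(obj)
    .filter((key) => obj[key] !== undefined)
    .sort()
    .map((key) => `${JSON.stringify(key)}:${canonicalize(obj[key])}`);
  return `{${entries.join(',')}}`;
}

// Hashes the tool name plus every argument not listed as volatile.
function operationKey(
  toolName: string,
  args: Record<string, unknown>,
  volatileFields: string[] = [],
): string {
  const stable: Record<string, unknown> = {};
  for (const key of Object.keys(args)) {
    if (!volatileFields.includes(key)) stable[key] = args[key];
  }
  return createHash('sha256')
    .update(canonicalize([toolName, stable]))
    .digest('hex')
    .slice(0, 32);
}

const keyA = operationKey(
  'send_invoice',
  { customerId: 'cus_abc123', amountCents: 4900, requestedAt: 1 },
  ['requestedAt'],
);
const keyB = operationKey(
  'send_invoice',
  { requestedAt: 2, amountCents: 4900, customerId: 'cus_abc123' },
  ['requestedAt'],
);
console.log(keyA === keyB); // true
```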

Deduplication storage

Store idempotency records in a fast key-value store with TTL support. Redis is the standard choice for HTTP-transport MCP servers; SQLite works for single-process servers.

Redis implementation

import { createClient } from 'redis';

const redis = createClient({ url: process.env.REDIS_URL });
await redis.connect();

const IDEMPOTENCY_TTL_SECONDS = 86400; // 24 hours

interface IdempotencyRecord {
  status: 'in_flight' | 'complete';
  response?: unknown;
  error?: string;
  createdAt: string;
}

async function withIdempotency<T>(
  clientId: string,
  toolName: string,
  idempotencyKey: string,
  execute: () => Promise<T>
): Promise<{ result: T; duplicate: boolean }> {
  const storeKey = `idempotency:${clientId}:${toolName}:${idempotencyKey}`;

  // Atomically claim the key. SET ... NX succeeds only if no record exists yet,
  // which closes the race between a separate GET and SET when two duplicates
  // arrive at the same time.
  const inFlight: IdempotencyRecord = { status: 'in_flight', createdAt: new Date().toISOString() };
  const claimed = await redis.set(storeKey, JSON.stringify(inFlight), {
    NX: true,
    EX: IDEMPOTENCY_TTL_SECONDS,
  });

  if (claimed === null) {
    // Someone else holds the key: this call is a duplicate
    const existing = await redis.get(storeKey);
    if (existing === null) {
      // The record expired between the SET and the GET
      throw new Error('Idempotency record expired during lookup; retry the call');
    }
    const record = JSON.parse(existing) as IdempotencyRecord;

    // If still in_flight, another request is executing: reject so the caller can wait
    if (record.status === 'in_flight') {
      throw new Error('Concurrent request with same idempotency key is still in progress');
    }

    // Return stored response without re-executing
    if (record.error) throw new Error(record.error);
    return { result: record.response as T, duplicate: true };
  }

  try {
    const result = await execute();
    const complete: IdempotencyRecord = {
      status: 'complete',
      response: result,
      createdAt: inFlight.createdAt,
    };
    await redis.set(storeKey, JSON.stringify(complete), { EX: IDEMPOTENCY_TTL_SECONDS });
    return { result, duplicate: false };
  } catch (err) {
    // Store error response so duplicates also receive the error — not a retry opportunity
    const errRecord: IdempotencyRecord = {
      status: 'complete',
      error: err instanceof Error ? err.message : String(err),
      createdAt: inFlight.createdAt,
    };
    await redis.set(storeKey, JSON.stringify(errRecord), { EX: IDEMPOTENCY_TTL_SECONDS });
    throw err;
  }
}
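
For single-process servers or unit tests, the same three states (absent, in_flight, complete) can be kept in memory. The following is a minimal sketch and not a replacement for the Redis version: `withIdempotencyInMemory`, `records`, and the millisecond TTL parameter are names invented here, and every record is lost when the process restarts.

```typescript
interface StoredRecord {
  status: 'in_flight' | 'complete';
  response?: unknown;
  error?: string;
  expiresAt: number; // epoch milliseconds
}

const records = new Map<string, StoredRecord>();

async function withIdempotencyInMemory<T>(
  storeKey: string,
  ttlMs: number,
  execute: () => Promise<T>,
): Promise<{ result: T; duplicate: boolean }> {
  const now = Date.now();
  const existing = records.get(storeKey);
  if (existing !== undefined && existing.expiresAt > now) {
    if (existing.status === 'in_flight') {
      throw new Error('Concurrent request with same idempotency key is still in progress');
    }
    if (existing.error !== undefined) throw new Error(existing.error);
    return { result: existing.response as T, duplicate: true };
  }

  // There is no await between the lookup above and this write, so within one
  // Node.js process no other caller can interleave and claim the same key.
  records.set(storeKey, { status: 'in_flight', expiresAt: now + ttlMs });
  try {
    const result = await execute();
    records.set(storeKey, { status: 'complete', response: result, expiresAt: now + ttlMs });
    return { result, duplicate: false };
  } catch (err) {
    // Store the error so duplicates receive it instead of re-executing
    const message = err instanceof Error ? err.message : String(err);
    records.set(storeKey, { status: 'complete', error: message, expiresAt: now + ttlMs });
    throw err;
  }
}

async function demo(): Promise<void> {
  let executions = 0;
  const run = async () => {
    executions += 1;
    return { invoiceId: 'inv_1' };
  };
  const first = await withIdempotencyInMemory('client1:send_invoice:key1', 60000, run);
  const second = await withIdempotencyInMemory('client1:send_invoice:key1', 60000, run);
  console.log(first.duplicate, second.duplicate, executions); // false true 1
}

demo().catch((err) => console.error(err));
```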

Tool integration

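import { z } from 'zod';

// Assumes `server` is an McpServer instance and that `withIdempotency` and
// `invoiceService` are in scope
server.tool(
  'send_invoice',
  'Send an invoice email to a customer',
  {
    customerId: z.string(),
    amountCents: z.number().int().positive(),
    idempotencyKey: z.string().uuid().describe('Client-generated UUID; reuse on retry to prevent duplicate sends'),
  },
  async ({ customerId, amountCents, idempotencyKey }, context) => {
    // `context.actor` is assumed to be populated by your auth layer. Use an identifier that is
    // stable across reconnects; the 'anonymous' fallback puts all unauthenticated callers in one namespace.
    const clientId = context.actor?.id ?? 'anonymous';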
    const { result, duplicate } = await withIdempotency(
      clientId,
      'send_invoice',
      idempotencyKey,
      () => invoiceService.send(customerId, amountCents)
    );

    return {
      content: [{
        type: 'text',
        text: JSON.stringify({ ...result, duplicate }),
      }],
    };
  }
);

TTL window selection

The idempotency TTL determines how long duplicate protection holds. Choose based on your agent's retry behavior:

Scenario	Recommended TTL	Rationale
Interactive agent (human in loop)	1 hour	Human confirms before long-running retries
Automated agent task	24 hours	Covers task checkpoint / resume cycles within a day
Batch processing job	7 days	Jobs may be retried days after initial failure
Financial transactions	30 days	Chargeback window; regulation may require longer

Do not set the TTL so long that you accumulate gigabytes of Redis keys. For financial workloads with retention requirements, copy idempotency records to a durable store before they expire from Redis.
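
To see how quickly a long TTL adds up, the table can be turned into constants and combined with a back-of-envelope size estimate. This is a rough sketch: `TTL_SECONDS` and `estimateStoreBytes` are hypothetical names, and the call rate and 500-byte record size in the example are illustrative assumptions that you should replace with measured values.

```typescript
type Scenario = 'interactive' | 'automated' | 'batch' | 'financial';

const HOUR_SECONDS = 60 * 60;
const DAY_SECONDS = 24 * HOUR_SECONDS;

// TTLs from the table above, in seconds (the unit the Redis EX option expects).
const TTL_SECONDS: Record<Scenario, number> = {
  interactive: 1 * HOUR_SECONDS,
  automated: 1 * DAY_SECONDS,
  batch: 7 * DAY_SECONDS,
  financial: 30 * DAY_SECONDS,
};

// Rough steady-state size of the store: every distinct key written during one
// TTL window still has a live record. bytesPerRecord is an assumption to measure.
function estimateStoreBytes(
  keyedCallsPerSecond: number,
  ttlSeconds: number,
  bytesPerRecord: number,
): number {
  return keyedCallsPerSecond * ttlSeconds * bytesPerRecord;
}

// Example: 50 keyed calls per second, 24-hour TTL, 500-byte records
const estimatedBytes = estimateStoreBytes(50, TTL_SECONDS.automated, 500);
console.log((estimatedBytes / 1e9).toFixed(2), 'GB'); // 2.16 GB
```

At the same rate, the 30-day TTL would hold thirty times as many records, which is why long retention is better served by a durable archive than by Redis itself.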

Handling errors idempotently

Store error responses, not just successes. If the first execution fails (e.g., payment gateway down), the idempotency record should store the error. Subsequent duplicates within the TTL window return the same error rather than retrying the operation.

This is counterintuitive but important: idempotency keys deduplicate attempts, not outcomes. If the agent wants to retry after a transient error, it must generate a new idempotency key. This forces explicit intent — the agent acknowledges it is making a new attempt, not replaying an old one.

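The exception is the in_flight state: if the server crashes while executing, the record stays in_flight until its TTL expires, and every duplicate is rejected in the meantime. Design your agent to treat an in_flight duplicate as a transient error and retry with a new key after a reasonable wait (e.g., 60 seconds). Because the crashed attempt may already have completed its side effect, have the agent verify the outcome where it can (for example, by querying for the invoice) before issuing the new attempt.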
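
On the agent side, these rules reduce to a small policy: replay transport failures with the same key, and mint a new key only for a deliberate new attempt. The sketch below assumes the agent can tell transport failures apart from tool errors. The `TransportError` name check, `callWithSameKey`, and `callAsNewAttempt` are hypothetical and not part of any MCP SDK.

```typescript
import { randomUUID } from 'crypto';

// A tool call reduced to its essentials: takes a key, resolves with a result.
type ToolCall = (idempotencyKey: string) => Promise<string>;

// Hypothetical marker for "the response never arrived" failures.
function transportError(message: string): Error {
  const err = new Error(message);
  err.name = 'TransportError';
  return err;
}

function isTransportError(err: unknown): boolean {
  return err instanceof Error && err.name === 'TransportError';
}

// Replays transport failures with the SAME key, which is safe because the
// server deduplicates. Any other error is assumed to be a stored tool error:
// replaying it would only return the same error, so it is rethrown at once.
async function callWithSameKey(
  call: ToolCall,
  key: string,
  maxAttempts = 3,
): Promise<string> {
  let lastError: unknown = new Error('maxAttempts must be at least 1');
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await call(key);
    } catch (err) {
      if (!isTransportError(err)) throw err;
      lastError = err;
    }
  }
  throw lastError;
}

// A deliberate new attempt after a stored error: mint a fresh key.
function callAsNewAttempt(call: ToolCall): Promise<string> {
  return callWithSameKey(call, randomUUID());
}
```

Keeping the two entry points separate makes the intent visible in the agent's code: a call to `callAsNewAttempt` is an explicit decision to risk a second execution.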

Idempotency and agent checkpointing

Modern agent frameworks checkpoint their state so long-running tasks can resume after process restarts. When a task resumes, it replays tool calls from the last checkpoint. For this to be safe, every tool call must include an idempotency key that was generated before the checkpoint was saved:

1. Agent generates idempotencyKey = uuid()
2. Agent saves the key to its checkpoint state
3. Agent calls the tool with the key
4. If the process restarts and the task resumes from step 2, the same key is loaded from the checkpoint and reused in step 3
5. Server recognizes the duplicate and returns the cached response

Within the TTL window, this pattern gives at-most-once execution of side effects across process restarts, network failures, and agent retry loops simultaneously, as long as the agent reuses the saved key rather than minting a new one.
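
The ordering (generate, persist, then call) can be sketched as follows. This is illustrative only: `checkpointStore` is an in-memory stand-in for whatever durable storage your framework provides, and `loadOrCreateKey` and the `Checkpoint` shape are hypothetical names.

```typescript
import { randomUUID } from 'crypto';

interface Checkpoint {
  step: string;
  idempotencyKey: string;
}

// Stand-in for durable storage (a file, a database row, a framework checkpoint).
const checkpointStore = new Map<string, string>();

// Returns the key saved for this task step, creating and persisting one first
// if none exists. The key is always persisted BEFORE the tool is called.
function loadOrCreateKey(taskId: string, step: string): string {
  const id = `${taskId}:${step}`;
  const saved = checkpointStore.get(id);
  if (saved !== undefined) {
    return (JSON.parse(saved) as Checkpoint).idempotencyKey;
  }
  const checkpoint: Checkpoint = { step, idempotencyKey: randomUUID() };
  checkpointStore.set(id, JSON.stringify(checkpoint));
  return checkpoint.idempotencyKey;
}

// First run: a key is generated and saved before the tool call
const firstRunKey = loadOrCreateKey('task-42', 'send_invoice');
// After a simulated restart: the same key is loaded, so the replayed call is a duplicate
const resumedKey = loadOrCreateKey('task-42', 'send_invoice');
console.log(firstRunKey === resumedKey); // true
```

Scoping the key to a task and step, instead of one key per task, keeps separate tool calls within the same task from colliding.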

Monitoring idempotency in production

Track your duplicate rate as an operational metric. A high duplicate rate indicates excessive retries, which could signal that your server is responding slowly or returning errors that the agent considers transient. Emit the duplicate flag in your audit log so you can query it:

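-- Duplicate rate by tool over the last 24 hours (SQLite 3.38+ syntax)
-- Assumes the duplicate flag is logged in the args JSON column. SQLite's ->> returns
-- a JSON true as the integer 1, so match both 1 and the string 'true'
SELECT tool,
       COUNT(*) AS total_calls,
       SUM(CASE WHEN args->>'duplicate' IN (1, 'true') THEN 1 ELSE 0 END) AS duplicate_calls,
       ROUND(100.0 * SUM(CASE WHEN args->>'duplicate' IN (1, 'true') THEN 1 ELSE 0 END) / COUNT(*), 1) AS duplicate_pct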
FROM audit_log
WHERE timestamp > datetime('now', '-1 day')
GROUP BY tool
ORDER BY duplicate_pct DESC;

If a tool consistently shows >5% duplicate rate, investigate: slow response times, agent misconfiguration, or a bug in the idempotency key lifecycle are the most common causes. AliveMCP external probes detect slowness before it drives the agent's retry behavior — pair uptime monitoring with this metric.
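
If your audit entries are available in application code instead of SQL, the same check can be expressed directly. The sketch below assumes each entry carries a tool name and the duplicate flag; `AuditEntry`, `duplicateRatePct`, and `toolsToInvestigate` are hypothetical names, and the 5% default mirrors the threshold mentioned above.

```typescript
interface AuditEntry {
  tool: string;
  duplicate: boolean;
}

// Percentage of calls per tool that were served from the idempotency store.
function duplicateRatePct(entries: AuditEntry[]): Map<string, number> {
  const counts = new Map<string, { total: number; duplicates: number }>();
  for (const entry of entries) {
    const count = counts.get(entry.tool) ?? { total: 0, duplicates: 0 };
    count.total += 1;
    if (entry.duplicate) count.duplicates += 1;
    counts.set(entry.tool, count);
  }
  const rates = new Map<string, number>();
  counts.forEach((count, tool) => {
    rates.set(tool, (100 * count.duplicates) / count.total);
  });
  return rates;
}

// Tools whose duplicate rate is strictly above the threshold.
function toolsToInvestigate(entries: AuditEntry[], thresholdPct = 5): string[] {
  const flagged: string[] = [];
  duplicateRatePct(entries).forEach((pct, tool) => {
    if (pct > thresholdPct) flagged.push(tool);
  });
  return flagged;
}
```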