MCP server idempotency
When an LLM agent calls a tool and the network drops before the response arrives, the agent retries. If your tool is not idempotent (meaning the same call executed twice has the same effect as executing it once), that retry sends a second email, charges a card twice, or creates a duplicate database record. An idempotency key gives each logical tool call a unique identifier so that re-execution is recognized and suppressed. It is the difference between a safe retry loop and a data integrity disaster.
TL;DR
Accept an idempotencyKey argument on every tool with side effects. On first call, execute the operation and store the response keyed on clientId:toolName:idempotencyKey in Redis (or SQLite) with a 24-hour TTL. On repeat calls with the same key, return the stored response without re-executing. Return a duplicate: true field so the agent knows the result is cached. Never re-execute on a duplicate — even if the stored response is an error.
Why MCP tool calls need idempotency
HTTP APIs have long used idempotency keys (Stripe pioneered the pattern), but MCP tool calls face a more demanding version of the problem:
- Agent retry loops are automatic: agents built on frameworks like LangChain, AutoGen, or the Claude tool use API typically retry on a timeout, network error, or 5xx response, often 3–5 times before giving up. Unlike a human who might notice a duplicate charge, the agent has no awareness of side effects that have already happened.
- MCP sessions can resume across restarts — a long-running agentic task may be checkpointed and resumed after a process restart, replaying tool calls from the last checkpoint without the agent knowing they already executed.
- SSE transport has no built-in acknowledgment — if the server sends the tool result over an SSE stream but the connection closes before the client reads it, the agent cannot distinguish "never executed" from "executed but response lost". It retries. Without idempotency, the operation runs twice.
- Distributed callbacks — if your tool triggers an async job (queue message, webhook dispatch), the job may complete even when the HTTP response fails. A retry re-queues the same work.
Which tools need idempotency
Not every tool needs an idempotency layer. Use the MCP tool annotations (readOnlyHint) to classify your tools and apply the idempotency pattern only where it matters:
| Tool type | readOnlyHint | Idempotency needed? |
|---|---|---|
| Query / read-only | true | No — reads have no side effects to deduplicate |
| Idempotent by nature (HTTP PUT) | false | Optional — but still useful to avoid redundant work |
| Non-idempotent writes (HTTP POST, send email, charge card) | false | Yes — critical |
| Destructive operations (delete, drop, archive) | false | Yes — double-delete must be detected and suppressed |
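The table can be expressed as a small classification helper. This is a minimal sketch, assuming annotations shaped like MCP tool annotations (`readOnlyHint`, plus the `idempotentHint` field the specification also defines). The helper name `requiresIdempotencyKey` and its return labels are hypothetical, chosen for illustration only.

```typescript
// Subset of MCP tool annotations that matters for deduplication.
interface ToolAnnotations {
  readOnlyHint?: boolean;
  idempotentHint?: boolean;
}

type IdempotencyNeed = 'none' | 'optional' | 'required';

// Hypothetical helper mirroring the table: reads need nothing, naturally
// idempotent writes may skip the layer, every other write requires a key.
function requiresIdempotencyKey(annotations: ToolAnnotations = {}): IdempotencyNeed {
  if (annotations.readOnlyHint === true) return 'none';
  if (annotations.idempotentHint === true) return 'optional';
  return 'required'; // missing hints are treated as non-idempotent writes
}

console.log(requiresIdempotencyKey({ readOnlyHint: true }));
console.log(requiresIdempotencyKey({ readOnlyHint: false, idempotentHint: true }));
console.log(requiresIdempotencyKey({ readOnlyHint: false }));
```

Treating missing hints as `required` is the conservative default: a tool that was never annotated gets the protection rather than silently skipping it.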
Idempotency key design
An idempotency key must uniquely identify a logical operation. Two strategies:
Client-generated UUID — the agent generates a UUID before calling the tool and passes it as an argument. Simple and reliable. The agent can persist the UUID in its state so that if the task is replayed from a checkpoint, the same UUID is reused and the tool call is recognized as a duplicate.
// Agent side (TypeScript agent using MCP client)
// Assumes mcpClient is a Client from the official TypeScript SDK,
// whose callTool takes a single { name, arguments } object
import { randomUUID } from 'crypto';

const idempotencyKey = randomUUID(); // generated once per logical operation
await mcpClient.callTool({
  name: 'send_invoice',
  arguments: {
    customerId: 'cus_abc123',
    amountCents: 4900,
    idempotencyKey,
  },
});
// If the agent retries, it REUSES the same idempotencyKey
// The server returns the cached result, so no second invoice is sent
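Operation hash — hash the tool name and deterministic arguments together. Works when the agent cannot be modified to pass explicit keys. Less reliable for two reasons: arguments that carry timestamps or random fields produce a different hash on every retry and defeat deduplication, and two intentionally separate calls with identical arguments collapse into a single operation.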
import { createHash } from 'crypto';

function hashKey(toolName: string, args: Record<string, unknown>, stableFields: string[]): string {
  // Hash only the named stable fields (no timestamps or random IDs), in sorted
  // order, so neither property order nor volatile fields can change the digest
  const stable = [toolName, ...[...stableFields].sort().map((field) => [field, args[field]])];
  return createHash('sha256').update(JSON.stringify(stable)).digest('hex').slice(0, 32);
}
Prefer client-generated UUIDs. Hash-based keys are a fallback when the caller cannot be modified.
Deduplication storage
Store idempotency records in a fast key-value store with TTL support. Redis is the standard choice for HTTP-transport MCP servers; SQLite works for single-process servers.
Redis implementation
import { createClient } from 'redis';

const redis = createClient({ url: process.env.REDIS_URL });
await redis.connect();

const IDEMPOTENCY_TTL_SECONDS = 86400; // 24 hours

interface IdempotencyRecord {
  status: 'in_flight' | 'complete';
  response?: unknown;
  error?: string;
  createdAt: string;
}

async function withIdempotency<T>(
  clientId: string,
  toolName: string,
  idempotencyKey: string,
  execute: () => Promise<T>
): Promise<{ result: T; duplicate: boolean }> {
  const storeKey = `idempotency:${clientId}:${toolName}:${idempotencyKey}`;

  // Atomically claim the key. NX makes the SET succeed only when no record exists,
  // so two concurrent duplicates cannot both pass a separate GET-then-SET check.
  const inFlight: IdempotencyRecord = { status: 'in_flight', createdAt: new Date().toISOString() };
  const claimed = await redis.set(storeKey, JSON.stringify(inFlight), { NX: true, EX: IDEMPOTENCY_TTL_SECONDS });

  if (claimed === null) {
    const existing = await redis.get(storeKey);
    // The record may expire between SET and GET; treat that rare case as in_flight
    const record = existing ? (JSON.parse(existing) as IdempotencyRecord) : inFlight;
    // If still in_flight, another request is executing: reject so the caller can wait
    if (record.status === 'in_flight') {
      throw new Error('Concurrent request with same idempotency key is still in progress');
    }
    // Return stored response without re-executing
    if (record.error) throw new Error(record.error);
    return { result: record.response as T, duplicate: true };
  }

  try {
    const result = await execute();
    const complete: IdempotencyRecord = {
      status: 'complete',
      response: result,
      createdAt: inFlight.createdAt,
    };
    await redis.set(storeKey, JSON.stringify(complete), { EX: IDEMPOTENCY_TTL_SECONDS });
    return { result, duplicate: false };
  } catch (err) {
    // Store the error so duplicates also receive it; a duplicate is not a retry opportunity
    const errRecord: IdempotencyRecord = {
      status: 'complete',
      error: err instanceof Error ? err.message : String(err),
      createdAt: inFlight.createdAt,
    };
    await redis.set(storeKey, JSON.stringify(errRecord), { EX: IDEMPOTENCY_TTL_SECONDS });
    throw err;
  }
}
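For unit tests, or for a single-process server that does not run Redis, the same claim, execute, and store flow can be exercised against an in-memory store. The sketch below is an illustration under that single-process assumption, not a replacement for the Redis version. `MemoryStore`, `claim`, `complete`, and `runOnce` are hypothetical names.

```typescript
// In-memory stand-in for the Redis store. Assumption: a single Node.js process,
// where a synchronous check-and-set cannot interleave with another request.
interface StoredRecord {
  status: 'in_flight' | 'complete';
  response?: unknown;
  error?: string;
  expiresAt: number; // epoch milliseconds
}

class MemoryStore {
  private records = new Map<string, StoredRecord>();
  private ttlMs: number;

  constructor(ttlMs: number) {
    this.ttlMs = ttlMs;
  }

  // Returns null when the caller won the claim, otherwise the live existing record.
  claim(key: string): StoredRecord | null {
    const existing = this.records.get(key);
    if (existing && existing.expiresAt > Date.now()) return existing;
    this.records.set(key, { status: 'in_flight', expiresAt: Date.now() + this.ttlMs });
    return null;
  }

  // Overwrites the in_flight marker with the final outcome (success or error).
  complete(key: string, outcome: { response?: unknown; error?: string }): void {
    this.records.set(key, { status: 'complete', ...outcome, expiresAt: Date.now() + this.ttlMs });
  }
}

async function runOnce<T>(
  store: MemoryStore,
  key: string,
  execute: () => Promise<T>
): Promise<{ result: T; duplicate: boolean }> {
  const existing = store.claim(key);
  if (existing) {
    if (existing.status === 'in_flight') throw new Error('Request with this key is still in progress');
    if (existing.error) throw new Error(existing.error); // replay the stored error
    return { result: existing.response as T, duplicate: true };
  }
  try {
    const result = await execute();
    store.complete(key, { response: result });
    return { result, duplicate: false };
  } catch (err) {
    store.complete(key, { error: err instanceof Error ? err.message : String(err) });
    throw err;
  }
}
```

Calling `runOnce` twice with the same key runs `execute` once. The second call returns the stored response with `duplicate: true`, or rethrows the stored error message if the first attempt failed.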
Tool integration
import { z } from 'zod';

server.tool(
  'send_invoice',
  'Send an invoice email to a customer',
  {
    customerId: z.string(),
    amountCents: z.number().int().positive(),
    idempotencyKey: z.string().uuid().describe('Client-generated UUID; reuse on retry to prevent duplicate sends'),
  },
  async ({ customerId, amountCents, idempotencyKey }, context) => {
    // Assumes an auth layer has attached the caller identity as context.actor;
    // unauthenticated callers all share the 'anonymous' key namespace
    const clientId = context.actor?.id ?? 'anonymous';
    const { result, duplicate } = await withIdempotency(
      clientId,
      'send_invoice',
      idempotencyKey,
      () => invoiceService.send(customerId, amountCents)
    );
    return {
      content: [{
        type: 'text',
        text: JSON.stringify({ ...result, duplicate }),
      }],
    };
  }
);
TTL window selection
The idempotency TTL determines how long duplicate protection holds. Choose based on your agent's retry behavior:
| Scenario | Recommended TTL | Rationale |
|---|---|---|
| Interactive agent (human in loop) | 1 hour | Human confirms before long-running retries |
| Automated agent task | 24 hours | Covers task checkpoint / resume cycles within a day |
| Batch processing job | 7 days | Jobs may be retried days after initial failure |
| Financial transactions | 30 days | Chargeback window; regulation may require longer |
Do not set the TTL so long that you accumulate gigabytes of Redis keys. For financial workloads with retention requirements, archive expired idempotency records to a durable store before they expire from Redis.
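One way to keep these choices in one place is a per-tool TTL lookup. The sketch below uses the values from the table. The scenario labels, the `TOOL_SCENARIOS` mapping, and the `export_report` tool name are hypothetical examples.

```typescript
// TTLs from the table above, in seconds. Scenario names are illustrative labels.
const TTL_SECONDS = {
  interactive: 60 * 60,          // 1 hour
  automated: 24 * 60 * 60,       // 24 hours
  batch: 7 * 24 * 60 * 60,       // 7 days
  financial: 30 * 24 * 60 * 60,  // 30 days
} as const;

type Scenario = keyof typeof TTL_SECONDS;

// Hypothetical per-tool mapping; tools without an entry fall back to 24 hours.
const TOOL_SCENARIOS: Record<string, Scenario | undefined> = {
  send_invoice: 'financial',
  export_report: 'batch',
};

function ttlSecondsFor(toolName: string): number {
  return TTL_SECONDS[TOOL_SCENARIOS[toolName] ?? 'automated'];
}

console.log(ttlSecondsFor('send_invoice'));
console.log(ttlSecondsFor('some_other_tool'));
```

The result of `ttlSecondsFor(toolName)` can then be passed as the `EX` value when the idempotency record is written.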
Handling errors idempotently
Store error responses, not just successes. If the first execution fails (e.g., payment gateway down), the idempotency record should store the error. Subsequent duplicates within the TTL window return the same error rather than retrying the operation.
This is counterintuitive but important: idempotency keys deduplicate attempts, not outcomes. If the agent wants to retry after a transient error, it must generate a new idempotency key. This forces explicit intent — the agent acknowledges it is making a new attempt, not replaying an old one.
The exception is the in_flight state: if the server crashes while executing, the record stays in_flight until its TTL expires, and the server cannot tell whether the side effect happened. Design your agent to treat an in_flight duplicate as a transient error and retry with a new key after a reasonable wait (e.g., 60 seconds).
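On the agent side, these rules reduce to one decision per failure: reuse the key when the outcome is unknown, and mint a new key only when the caller deliberately starts a new attempt. The following is a minimal sketch of that policy. The `failure` helper, the `kind` field, and `callWithRetry` are hypothetical, and backoff or waiting between attempts is omitted for brevity.

```typescript
import { randomUUID } from 'crypto';

// Hypothetical failure kinds. 'transport': no response arrived, so the outcome
// is unknown. 'tool': the server answered with an error stored under the key.
type FailureKind = 'transport' | 'tool';

function failure(kind: FailureKind, message: string): Error & { kind: FailureKind } {
  return Object.assign(new Error(message), { kind });
}

async function callWithRetry<T>(
  callTool: (idempotencyKey: string) => Promise<T>,
  isTransient: (err: Error) => boolean,
  maxAttempts = 3
): Promise<T> {
  let key = randomUUID();
  let lastError: unknown = new Error('maxAttempts must be at least 1');
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await callTool(key);
    } catch (err) {
      lastError = err;
      const kind = (err as { kind?: FailureKind }).kind;
      if (kind === 'transport') continue; // outcome unknown: replay with the same key
      if (kind === 'tool' && isTransient(err as Error)) {
        key = randomUUID(); // known failure: a retry is a new attempt, so mint a new key
        continue;
      }
      throw err; // permanent tool errors and unknown failures are not retried
    }
  }
  throw lastError;
}
```

The `isTransient` callback keeps the "new attempt" decision explicit. The caller, not the retry loop, decides which stored errors justify a fresh key.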
Idempotency and agent checkpointing
Modern agent frameworks checkpoint their state so long-running tasks can resume after process restarts. When a task resumes, it replays tool calls from the last checkpoint. For this to be safe, every tool call must include an idempotency key that was generated before the checkpoint was saved:
1. Agent generates idempotencyKey = uuid()
2. Agent saves the key to its checkpoint state
3. Agent calls the tool with the key
4. If the process restarts and the task resumes from step 2, the same key is loaded from the checkpoint and reused in step 3
5. Server recognizes the duplicate and returns the cached response
Provided the key is persisted before the call is made and the replay arrives within the TTL window, this pattern gives at-most-once execution of side effects across process restarts, network failures, and agent retry loops alike.
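The ordering of those steps can be sketched as follows. The `Checkpoint` shape, `keyForStep`, and the simulated `disk` storage are assumptions for illustration; a real agent framework will have its own checkpoint API.

```typescript
import { randomUUID } from 'crypto';

// Hypothetical checkpoint shape: one idempotency key per logical step.
interface Checkpoint {
  keys: Record<string, string>;
}

// Returns the key for a step, minting and persisting it on first use.
// `save` must complete before the tool call is made.
function keyForStep(
  checkpoint: Checkpoint,
  stepId: string,
  save: (c: Checkpoint) => void
): string {
  const existing = checkpoint.keys[stepId];
  if (existing) return existing;
  const key = randomUUID();
  checkpoint.keys[stepId] = key;
  save(checkpoint); // persist BEFORE calling the tool
  return key;
}

// Simulated durable storage: JSON text that survives a "restart".
let disk = JSON.stringify({ keys: {} });
const save = (c: Checkpoint): void => { disk = JSON.stringify(c); };
const load = (): Checkpoint => JSON.parse(disk) as Checkpoint;

const firstRun = keyForStep(load(), 'send_invoice:cus_abc123', save);
const afterRestart = keyForStep(load(), 'send_invoice:cus_abc123', save);
console.log(firstRun === afterRestart); // true
```

If the key were generated after the checkpoint was written, a resumed task would mint a fresh UUID and the server would treat the replayed call as a new operation.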
Monitoring idempotency in production
Track your duplicate rate as an operational metric. A high duplicate rate indicates excessive retries, which could signal that your server is responding slowly or returning errors that the agent considers transient. Emit the duplicate flag in your audit log so you can query it:
-- Duplicate rate by tool over the last 24 hours (SQLite syntax)
-- Assumes the duplicate flag is logged inside the args JSON column.
-- SQLite's ->> returns 1 for JSON true, so match both 1 and the string 'true'.
SELECT tool,
       COUNT(*) AS total_calls,
       SUM(CASE WHEN args->>'duplicate' IN (1, 'true') THEN 1 ELSE 0 END) AS duplicate_calls,
       ROUND(100.0 * SUM(CASE WHEN args->>'duplicate' IN (1, 'true') THEN 1 ELSE 0 END) / COUNT(*), 1) AS duplicate_pct
FROM audit_log
WHERE timestamp > datetime('now', '-1 day')
GROUP BY tool
ORDER BY duplicate_pct DESC;
If a tool consistently shows >5% duplicate rate, investigate: slow response times, agent misconfiguration, or a bug in the idempotency key lifecycle are the most common causes. AliveMCP external probes detect slowness before it drives the agent's retry behavior — pair uptime monitoring with this metric.
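The same check can run in application code, for example inside an alerting job. This sketch assumes audit entries have already been loaded as objects with a `tool` name and a `duplicate` flag. `AuditEntry`, `duplicateRates`, and `toolsToInvestigate` are hypothetical names.

```typescript
// Hypothetical audit-log entry: one row per tool call with the duplicate flag.
interface AuditEntry {
  tool: string;
  duplicate: boolean;
}

interface ToolDuplicateRate {
  tool: string;
  totalCalls: number;
  duplicateCalls: number;
  duplicatePct: number; // rounded to one decimal place
}

function duplicateRates(entries: AuditEntry[]): ToolDuplicateRate[] {
  const counts = new Map<string, { total: number; duplicates: number }>();
  for (const entry of entries) {
    const c = counts.get(entry.tool) ?? { total: 0, duplicates: 0 };
    c.total += 1;
    if (entry.duplicate) c.duplicates += 1;
    counts.set(entry.tool, c);
  }
  return Array.from(counts.entries())
    .map(([tool, c]) => ({
      tool,
      totalCalls: c.total,
      duplicateCalls: c.duplicates,
      duplicatePct: Math.round((1000 * c.duplicates) / c.total) / 10,
    }))
    .sort((a, b) => b.duplicatePct - a.duplicatePct); // worst offenders first
}

// Tools whose duplicate rate exceeds the threshold (5% by default).
function toolsToInvestigate(entries: AuditEntry[], thresholdPct = 5): string[] {
  return duplicateRates(entries)
    .filter((rate) => rate.duplicatePct > thresholdPct)
    .map((rate) => rate.tool);
}
```

The returned tool names can feed an alert or a dashboard.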