MCP server schema evolution
MCP tool schemas define the interface between your server and every agent that uses it. Unlike a REST API where you can pin clients to a specific version, MCP clients discover tools dynamically and generate calls based on the schema they see at runtime. When you change a tool's parameters — rename a field, tighten a constraint, remove an option — agents that cached the old schema break silently: they generate calls that your server now rejects. Schema evolution for MCP servers requires the same discipline as database schema migrations: plan every change for backward compatibility, deprecate rather than delete, and version explicitly when a clean break is unavoidable.
TL;DR
Only make additive changes to existing tool schemas: add optional parameters (never required ones), add enum values (never remove them), widen type constraints (never narrow them). To deprecate a parameter, flag it as deprecated in its description and keep it optional for at least one major version. When a breaking change is unavoidable, add a new versioned tool (search_v2) alongside the old one and remove the old one only after confirming that no active agents use it.
Why schema changes break agents differently from API changes
The MCP tool discovery model creates a specific fragility that API versioning strategies do not address:
- LLM-generated calls: agents do not call tools by writing code. The LLM reads the tool schema from tools/list and generates argument objects based on the schema description and its own inference. A renamed parameter does not appear in the LLM's tool call unless the schema explicitly shows it.
- Prompt caching: agent systems often cache the system prompt containing tool schemas for cost efficiency. An agent running with a cached schema from 6 hours ago will not see your parameter rename until its cache expires.
- Multi-agent systems: in a pipeline where agent A passes results to agent B, a schema change on the upstream tool may invalidate the structured output that B depends on, causing cascading failures across agents that have no explicit version dependency.
- No redeploy coupling: you can redeploy your server immediately; agent clients are deployed independently and cannot be force-upgraded. Some will use the old schema for days or weeks.
Safe vs breaking changes
| Change type | Safe? | Reason |
|---|---|---|
| Add optional parameter with default | Yes | Old agents omit the parameter; default ensures valid behavior |
| Expand enum (add new value) | Yes | Old agents never generate the new value; new behavior is opt-in |
| Widen a constraint (max: 100 → max: 1000) | Yes | Old agent calls still valid; new range only accessible by updated agents |
| Improve parameter description text | Yes | Better LLM prompting; no call structure change |
| Add new return field to response | Yes | Agents ignore unknown fields |
| Add required parameter | Breaking | Old agents omit it; server returns validation error |
| Remove parameter | Breaking | Old agents pass it; server either errors or ignores (unexpected behavior) |
| Rename parameter | Breaking | Old agents use old name; new name absent; validation fails |
| Narrow constraint (max: 1000 → max: 100) | Breaking | Old agents may pass values above new max |
| Remove enum value | Breaking | Old agents may pass the removed value |
| Change parameter type (string → number) | Breaking | Old agents generate wrong type; validation fails immediately |
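The rules in the table can be checked mechanically before a deploy. The following is a minimal sketch, assuming a simplified and hypothetical `ParamSpec` shape (it is not the MCP SDK's or JSON Schema's type, and it only models `max` and enum constraints), that compares two versions of a tool's parameters and lists the breaking changes:

```typescript
// Hypothetical, simplified parameter description used only for this sketch.
type ParamSpec = {
  type: 'string' | 'number' | 'boolean';
  required: boolean;
  enumValues?: string[];
  max?: number;
};
type ToolParams = Record<string, ParamSpec>;

// Returns a list of breaking changes; an empty list means the change is additive.
function findBreakingChanges(oldParams: ToolParams, newParams: ToolParams): string[] {
  const problems: string[] = [];

  for (const [name, oldSpec] of Object.entries(oldParams)) {
    const newSpec: ParamSpec | undefined = newParams[name];
    if (newSpec === undefined) {
      // A rename looks like a removal plus an addition, so it is reported here too.
      problems.push(`removed or renamed parameter: ${name}`);
      continue;
    }
    if (newSpec.type !== oldSpec.type) problems.push(`type changed: ${name}`);
    if (!oldSpec.required && newSpec.required) problems.push(`now required: ${name}`);

    // Adding a max, or lowering an existing one, narrows the accepted range.
    if (newSpec.max !== undefined && (oldSpec.max === undefined || newSpec.max < oldSpec.max)) {
      problems.push(`constraint narrowed: ${name}`);
    }

    const allowed = newSpec.enumValues;
    if (allowed !== undefined) {
      if (oldSpec.enumValues === undefined) {
        problems.push(`enum restriction added: ${name}`);
      } else {
        for (const value of oldSpec.enumValues) {
          if (!allowed.includes(value)) problems.push(`enum value removed: ${name}=${value}`);
        }
      }
    }
  }

  // New parameters are safe only when they are optional.
  for (const [name, newSpec] of Object.entries(newParams)) {
    if (!(name in oldParams) && newSpec.required) {
      problems.push(`new required parameter: ${name}`);
    }
  }
  return problems;
}
```

A check like this could run in CI against the previously released schema and fail the build when the list is non-empty, unless the change ships as a new versioned tool.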
Deprecating parameters
When you want to remove a parameter, deprecate it first. Keep it optional in the schema with a description that signals deprecation and names the replacement:
import { z } from 'zod';

// `server` is your MCP server instance and `db` your data-access layer (both defined elsewhere).
server.tool(
  'search_records',
  'Search customer records by query string',
  {
    query: z.string().describe('Search query'),
    // No schema-level default here: a default would always populate `limit`
    // and silently override a value sent under the deprecated name.
    limit: z.number().int().min(1).max(100).optional()
      .describe('Maximum number of results to return (default 20)'),
    // Renamed to `limit`; keep the old name for backward compatibility
    max_results: z.number().int().min(1).max(100).optional()
      .describe('DEPRECATED: use limit instead. Maximum number of results to return.'),
    // New parameter: optional with a default so old agents are unaffected
    includeArchived: z.boolean().default(false)
      .describe('Include archived records in results (default false)'),
  },
  async ({ query, limit, max_results, includeArchived }) => {
    // Accept either name, prefer the new one, then fall back to the default
    const effectiveLimit = limit ?? max_results ?? 20;
    const rows = await db.searchRecords(query, effectiveLimit, includeArchived);
    return { content: [{ type: 'text', text: JSON.stringify(rows) }] };
  }
);
Track how often the deprecated parameter arrives in calls. When it drops to zero for 30+ consecutive days, it is safe to remove. Use your audit logs to monitor:
-- Check if deprecated parameter is still being sent (last 30 days)
SELECT
DATE(timestamp) AS day,
COUNT(*) AS calls_with_deprecated_param
FROM audit_log
WHERE tool = 'search_records'
AND json_extract(args, '$.max_results') IS NOT NULL
AND timestamp > datetime('now', '-30 days')
GROUP BY day
ORDER BY day DESC;
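The query above only works if each call's arguments actually reach the audit log. A minimal sketch of building such a row inside a tool handler follows, assuming a hypothetical `AuditEntry` shape: `tool`, `timestamp`, and `args` mirror the columns the query reads, while `deprecatedParams` and the function name are illustrative additions.

```typescript
// Hypothetical audit row for this sketch.
type AuditEntry = {
  tool: string;
  timestamp: string;          // ISO 8601 here; use whatever format your audit table stores
  args: string;               // JSON-encoded arguments, readable by json_extract
  deprecatedParams: string[]; // deprecated names that were present in this call
};

function buildAuditEntry(
  tool: string,
  args: Record<string, unknown>,
  deprecatedNames: string[],
  now: Date = new Date(),
): AuditEntry {
  // A parameter counts as "sent" only when it carries a value.
  const deprecatedParams = deprecatedNames.filter(
    (name) => args[name] !== undefined && args[name] !== null,
  );
  return {
    tool,
    timestamp: now.toISOString(),
    args: JSON.stringify(args),
    deprecatedParams,
  };
}
```

Storing the list of deprecated names alongside the raw arguments is optional, but it makes it cheap to alert on deprecated usage without parsing JSON in every query.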
Versioned tool names
When a breaking change is unavoidable, add a new versioned tool rather than modifying the existing one. The MCP protocol allows tools with any name — use a _v2 suffix to signal the version:
// Keep the original tool running; never remove it until adoption is confirmed
server.tool(
  'create_order', // v1: amountCents as integer
  'DEPRECATED: use create_order_v2. Create a new order.',
  {
    customerId: z.string(),
    amountCents: z.number().int(),
  },
  async ({ customerId, amountCents }) => {
    const order = await orderService.create(customerId, { amountCents });
    // Tool handlers return MCP content, not the raw service result
    return { content: [{ type: 'text', text: JSON.stringify(order) }] };
  }
);
// New tool with improved schema; breaking change: amount is now a string decimal
server.tool(
  'create_order_v2',
  'Create a new order. amount is a decimal string (e.g. "49.00") to avoid floating-point precision issues.',
  {
    customerId: z.string(),
    amount: z.string().regex(/^\d+\.\d{2}$/).describe('Decimal amount, two decimal places (e.g. "49.00")'),
    currency: z.enum(['usd', 'eur', 'gbp']).default('usd'),
  },
  async ({ customerId, amount, currency }) => {
    // Convert with integer arithmetic; the regex guarantees "<digits>.<two digits>"
    const [whole, fraction] = amount.split('.');
    const amountCents = Number(whole) * 100 + Number(fraction);
    const order = await orderService.create(customerId, { amountCents, currency });
    return { content: [{ type: 'text', text: JSON.stringify(order) }] };
  }
);
Update the create_order description to point to create_order_v2. LLMs read tool descriptions and will prefer the non-deprecated tool in new agent sessions. Old agents with cached system prompts continue using v1 without errors.
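One way to keep the two registrations from drifting apart while both are live is to translate v1 arguments into the v2 shape and share a single implementation. A minimal sketch of that translation, using integer arithmetic only, is shown here; the helper names `centsToDecimal` and `decimalToCents` are hypothetical and not part of any SDK.

```typescript
// Hypothetical helpers for bridging create_order (v1) and create_order_v2.

// v1 -> v2 shape. Assumes a non-negative integer number of cents.
function centsToDecimal(amountCents: number): string {
  if (!Number.isInteger(amountCents) || amountCents < 0) {
    throw new RangeError('amountCents must be a non-negative integer');
  }
  const whole = Math.floor(amountCents / 100);
  const cents = amountCents % 100;
  // Always emit two fraction digits so the result matches the v2 regex.
  const fraction = cents < 10 ? `0${cents}` : `${cents}`;
  return `${whole}.${fraction}`;
}

// v2 shape -> integer cents for the order service.
function decimalToCents(amount: string): number {
  const match = /^(\d+)\.(\d{2})$/.exec(amount);
  if (match === null) {
    throw new RangeError('amount must have exactly two decimal places');
  }
  return Number(match[1]) * 100 + Number(match[2]);
}
```

With these in place, the v1 handler could call the same internal function as v2, passing `centsToDecimal(amountCents)` and the default currency.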
Migration timeline
A schema migration across a live agent fleet follows a 3-phase pattern:
- Phase 1 — Dual-write (ship day 0): Add the v2 tool. Keep v1 fully functional. Update v1 description to say "deprecated, use v2". Monitor which version agents are calling via audit logs.
- Phase 2 — Migration window (days 1–30): Communicate the deprecation to teams deploying agent systems. Confirm via audit logs that v2 call volume is growing and v1 is declining. Do not remove v1 until v1 call rate has been zero for ≥7 consecutive days.
- Phase 3 — Remove v1 (30+ days after phase 1): Delete the v1 tool registration. If any agents were still calling v1, they now receive a "tool not found" error, which is explicit and actionable — far better than a silent behavior change.
Never remove a tool in the same deploy as adding its replacement. Always ship the two-tool state first to give cached agent contexts time to expire.
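The removal gate from the phases above can be written down as a small check. This is a minimal sketch, assuming you can export per-day v1 call counts from your audit log (oldest day first); the function name is hypothetical, and the thresholds are parameters so they can match your own policy.

```typescript
// Returns true only when both conditions from the timeline hold:
//  - at least `minDays` have passed since the v2 tool shipped, and
//  - v1 received zero calls on each of the last `quietDays` days.
function canRemoveV1(
  daysSinceV2Shipped: number,
  v1CallsPerDay: number[], // oldest first, one entry per day
  minDays: number = 30,
  quietDays: number = 7,
): boolean {
  if (daysSinceV2Shipped < minDays) return false;
  // Without a full quiet window of data, the zero-call claim cannot be confirmed.
  if (v1CallsPerDay.length < quietDays) return false;
  return v1CallsPerDay.slice(-quietDays).every((calls) => calls === 0);
}
```

For example, `canRemoveV1(45, [3, 1, 0, 0, 0, 0, 0, 0, 0])` passes the gate, while a single v1 call anywhere in the last seven days keeps v1 in place.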
Schema evolution and the MCP server version
Increment your server version on every breaking change (new required parameter, removed tool, renamed parameter) and on every new tool addition: bump the major version for breaking changes and the minor version for new tools or parameters. Record the version with every tool call in the serverVersion field of your audit log, so you can correlate schema changes with behavioral shifts in your audit data. The version itself is declared alongside the server name:
{
  "name": "my-server",
  "version": "2.1.0"
}
Pair version bumps with canary deployments — send 5% of traffic to the new schema for 24 hours. If the duplicate rate, error rate, or call pattern looks anomalous in the canary, roll back before the new schema reaches all agents.
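Splitting traffic for a canary works best when each agent session sees one schema consistently. Below is a minimal sketch of deterministic percentage bucketing, assuming a stable per-session identifier is available (the `sessionId` parameter is hypothetical); it uses an FNV-1a style hash over UTF-16 code units, chosen here only because it needs no dependencies.

```typescript
// FNV-1a style 32-bit hash: small, dependency-free, and stable across processes.
function fnv1a(input: string): number {
  let hash = 0x811c9dc5;
  for (let i = 0; i < input.length; i++) {
    hash ^= input.charCodeAt(i);
    // Math.imul keeps the multiply in 32 bits; >>> 0 makes the result unsigned.
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash;
}

// Maps a session to a bucket in [0, 99] and admits buckets below `percent`.
function inCanary(sessionId: string, percent: number): boolean {
  return fnv1a(sessionId) % 100 < percent;
}
```

Calling `inCanary(sessionId, 5)` when serving tools/list would expose the new schema to roughly 5% of sessions, and the same session always gets the same answer, so an agent does not flip between schemas mid-conversation.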