MCP tool design

An MCP tool is not a REST endpoint and its primary consumer is not a human developer — it's an LLM reasoning about what to call and with what arguments. The tool's description string is a planning instruction read by the LLM before it decides whether to call the tool. The inputSchema field descriptions are guidance the LLM uses to choose argument values. The output structure determines whether the LLM can use the result in a subsequent tool call. Good MCP tool design is the difference between a server the LLM uses confidently and one it uses incorrectly, repeatedly, or not at all.

TL;DR

One tool per operation. Idempotent write tools (safe to retry). Verb-noun names (search_users, not getUsers). Tool descriptions as LLM instructions ("Use this when you need…", "Do not use this for…"). Field descriptions that state the expected format and give an example. Minimal required params with sensible defaults. Return structured data the LLM can reference in the next tool call. Require an explicit confirm: true for irreversible operations.

One tool, one responsibility

A tool that does two things forces the LLM to reason about which mode to invoke and increases the chance of a wrong call. A manage_users tool with an action: 'create' | 'update' | 'delete' parameter is harder for the LLM to use correctly than three separate tools: create_user, update_user, delete_user. The LLM selects from the tool list, and many small tools with clear names are easier to select correctly than a few large tools with complex inputs.

| Pattern | Example | LLM experience |
| --- | --- | --- |
| Multi-action "god tool" | manage_users(action, userId, ...) | Must reason about the action enum before reasoning about other params; wrong action produces wrong effect |
| Separate focused tools | search_users, get_user, update_user, delete_user | Tool name communicates intent; LLM selects correct tool without reasoning about an action field |

Exception: tools with a very small related set of operations (read-only list + get) can sometimes be combined when the fields are disjoint and the description is clear. Use judgement — the test is whether the description string can state the tool's purpose in one unambiguous sentence.

Idempotency for safe retries

LLMs retry tool calls when they receive no result, a network timeout, or an ambiguous error. A non-idempotent write tool (one that performs the operation again on retry) produces duplicate records, duplicate charges, or duplicate emails. Design write tools to be idempotent when possible.

| Operation | Non-idempotent | Idempotent |
| --- | --- | --- |
| Create record | POST creates a new record on every call | Accept a client-generated idempotencyKey (UUID); server deduplicates by key |
| Send email | Sends on every call | Accept a messageId; server tracks sent IDs and skips duplicates |
| Update field | Increments a counter on every call | Sets the field to an absolute value instead of a delta |
| Delete record | Errors on second call (already deleted) | Returns success (or a specific "already deleted" message) if the record doesn't exist |

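// Idempotent create with client-generated key
import { z } from 'zod';

const CreateUserSchema = z.object({
  idempotencyKey: z.string().uuid().describe(
    'Client-generated UUID for deduplication. Generate a new UUID for each intended creation. ' +
    'Re-sending the same key returns the previously created user.'
  ),
  name:  z.string().min(1).describe('Full name of the user'),
  email: z.string().email().describe('Email address (must be unique)'),
});

// In the handler (args is the raw tool input; db is the application's own data layer):
const parsed = CreateUserSchema.safeParse(args);
if (!parsed.success) {
  return { isError: true, content: [{ type: 'text', text: parsed.error.message }] };
}

// A retry with the same key returns the user created by the first call
const existing = await db.getByIdempotencyKey(parsed.data.idempotencyKey);
if (existing) return { content: [{ type: 'text', text: JSON.stringify(existing) }] };

// Create the user and record the key. In production, do both in one transaction
// (or rely on a unique constraint on the key) so concurrent retries cannot both insert.
const user = await db.createUser(parsed.data);
await db.setIdempotencyKey(parsed.data.idempotencyKey, user.id);
return { content: [{ type: 'text', text: JSON.stringify(user) }] };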
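
The create and delete rows of the table can be exercised end to end with an in-memory store. This is a minimal sketch: InMemoryUserStore, its method names, and the usr_ ID format are hypothetical stand-ins for a real database layer, not part of any MCP SDK.

```typescript
// Hypothetical in-memory stand-in for a database layer (not part of any MCP SDK).
interface User {
  id: string;
  name: string;
  email: string;
}

class InMemoryUserStore {
  private users = new Map<string, User>();
  private userIdByKey = new Map<string, string>();
  private nextId = 1;

  // Idempotent create: a repeated idempotencyKey returns the user from the first call.
  createUser(idempotencyKey: string, name: string, email: string): User {
    const existingId = this.userIdByKey.get(idempotencyKey);
    const existing = existingId === undefined ? undefined : this.users.get(existingId);
    if (existing !== undefined) return existing;

    const user: User = { id: `usr_${this.nextId++}`, name, email };
    this.users.set(user.id, user);
    this.userIdByKey.set(idempotencyKey, user.id);
    return user;
  }

  // Idempotent delete: a second call reports "already_deleted" instead of failing.
  deleteUser(id: string): { status: 'deleted' | 'already_deleted' } {
    return { status: this.users.delete(id) ? 'deleted' : 'already_deleted' };
  }

  count(): number {
    return this.users.size;
  }
}
```

Calling createUser twice with the same key leaves one record in the store, and calling deleteUser twice with the same ID returns a status the LLM can read rather than an error it might retry.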

Naming conventions

Tool names are the first thing the LLM reads when selecting a tool. Clear, consistent names reduce the chance of the wrong tool being called.

| Convention | Good | Avoid |
| --- | --- | --- |
| Verb-noun format | search_users, send_email, create_invoice | users, emailSend, invoice_maker |
| snake_case | get_user_by_email | getUserByEmail, get-user-by-email |
| No abbreviations | list_organisations | list_orgs, ls_org |
| Consistent resource naming | create_user, update_user, delete_user | add_user, modify_user, remove_user (three different verbs for the same resource) |
| Read vs write is clear | search_users (read), create_user (write) | process_user (ambiguous) |
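
Some of these conventions can be checked mechanically, for example in a unit test that runs over the server's tool list. The helper below is a rough sketch: checkToolName and the APPROVED_VERBS list are hypothetical, and the verb list is a project-level assumption rather than anything MCP requires. The regular expression also rejects digits, which a real project may want to allow.

```typescript
// Hypothetical lint helper. The approved verb list is a project choice, not an MCP rule.
const APPROVED_VERBS = ['search', 'list', 'get', 'create', 'update', 'delete', 'send'];

// Returns a list of problems; an empty list means the name passes.
function checkToolName(name: string): string[] {
  const problems: string[] = [];

  // snake_case with at least two lowercase words: verb_noun, verb_noun_by_field, ...
  if (!/^[a-z]+(_[a-z]+)+$/.test(name)) {
    problems.push('use snake_case with at least two words (verb_noun)');
    return problems;
  }

  // The first word must be one of the verbs the project has standardised on.
  const verb = name.slice(0, name.indexOf('_'));
  if (APPROVED_VERBS.indexOf(verb) === -1) {
    problems.push(`start with an approved verb (${APPROVED_VERBS.join(', ')})`);
  }
  return problems;
}
```

With this verb list, search_users and get_user_by_email pass, while getUserByEmail, users, and process_user each produce one problem.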

Writing tool descriptions as LLM instructions

The tool description is not a human-readable doc comment — it is the instruction the LLM uses to decide whether to call this tool and in what context. Write it as a directive: state the purpose, the primary use case, and when not to use it (especially if there's a similar tool that might be confused).

// Weak description — tells the LLM nothing useful
{
  name: 'search_users',
  description: 'Search for users.',
}

// Strong description — LLM instructions
{
  name: 'search_users',
  description:
    'Search for users by name, email, or partial match. Returns a paginated list of matching users. ' +
    'Use this to find a user ID before calling get_user or update_user. ' +
    'Do not use this to check if a specific user exists — use get_user with the known ID instead.',
}

// Destructive tool — explicit warning
{
  name: 'delete_user',
  description:
    'Permanently delete a user account and all associated data. This operation is irreversible. ' +
    'Always confirm the userId with the user before calling this tool. ' +
    'Requires confirm: true to prevent accidental deletion.',
}

"Do not use this for…" instructions are especially valuable when two tools have overlapping capabilities. The LLM reads all tool descriptions before selecting — explicit disambiguation prevents wrong-tool calls.

Field descriptions as LLM guidance

The description in each inputSchema property is displayed by MCP Inspector and read by LLM clients when choosing argument values. Write each as an instruction that tells the LLM exactly what value to provide.

const SearchUsersSchema = z.object({
  query: z.string().min(1).describe(
    'Search term to match against user name and email. Supports partial matches. ' +
    'Example: "alice" matches "Alice Smith" and "alice@example.com".'
  ),
  page: z.number().int().positive().default(1).describe(
    '1-based page number. First page is 1, not 0.'
  ),
  pageSize: z.number().int().min(1).max(100).default(20).describe(
    'Number of results per page. Default 20. Maximum 100.'
  ),
  status: z.enum(['active', 'suspended', 'deleted']).optional().describe(
    'Filter by account status. Omit to return all statuses.'
  ),
});

Callouts worth including in descriptions: 1-based vs 0-based pagination (LLMs often guess 0), enum values with brief explanations, expected format for dates or IDs, and what happens if a field is omitted.

Minimal required parameters

Every required parameter is a decision the LLM must make correctly or the tool call fails. Prefer optional parameters with sensible defaults over required parameters with obvious defaults.

| Approach | Example | LLM experience |
| --- | --- | --- |
| Required params only | search_users(query, page, pageSize, sortBy, sortOrder) | LLM must choose 5 values correctly; wrong choice returns validation error |
| Minimal required + defaults | search_users(query, page=1, pageSize=20, sortBy='createdAt', sortOrder='desc') | LLM only needs to provide query; defaults handle the rest for the common case |

Required parameters should be things the LLM cannot reasonably guess or default: the search query, the resource ID to act on, the content to create. Configuration (pagination, sorting, filtering) should have defaults that cover the most common case.
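
In a zod schema the defaults come from .default(). The same idea, written without a validation library, might look like the sketch below. SearchUsersArgs and resolveSearchArgs are illustrative names, and 'name' as a second sortBy value is an assumption added only to make the type non-trivial.

```typescript
// Only query is required; everything else is configuration with a default.
interface SearchUsersArgs {
  query: string;
  page?: number;
  pageSize?: number;
  sortBy?: 'createdAt' | 'name'; // 'name' is an assumed second option
  sortOrder?: 'asc' | 'desc';
}

// Fills in the defaults from the table above so the handler always sees a complete object.
function resolveSearchArgs(args: SearchUsersArgs): Required<SearchUsersArgs> {
  return {
    query: args.query,
    page: args.page ?? 1,
    pageSize: args.pageSize ?? 20,
    sortBy: args.sortBy ?? 'createdAt',
    sortOrder: args.sortOrder ?? 'desc',
  };
}
```

A call that supplies only { query: 'alice' } resolves to page 1, 20 results per page, sorted by createdAt descending; any value the LLM does supply is kept as given.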

Output structure for downstream tool calls

Tool results feed directly into subsequent tool calls in an agent loop. If a tool returns an unstructured prose string, the LLM must parse it to extract the IDs or values needed for the next call. Return structured data that the LLM can reference directly.

// Unstructured — LLM must parse to find the user ID for the next call
return {
  content: [{
    type: 'text',
    text: 'Found 3 users: Alice Smith (alice@example.com), Bob Jones...',
  }],
};

// Structured — user IDs are immediately accessible for downstream calls
return {
  content: [{
    type: 'text',
    text: JSON.stringify({
      users: [
        { id: 'usr_abc123', name: 'Alice Smith', email: 'alice@example.com' },
        { id: 'usr_def456', name: 'Bob Jones',   email: 'bob@example.com' },
        // third matching user omitted for brevity
      ],
      total: 3,
      page: 1,
      pageSize: 20,
    }),
  }],
};

The LLM can reference users[0].id in the next tool call (update_user, delete_user) without re-querying. Include pagination metadata so the LLM knows whether to request subsequent pages.
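
To make the chaining concrete, the sketch below builds the structured result, then reads it back the way a downstream step would. The type and function names (SearchUsersResult, toToolResult, hasMorePages, firstUserId) are hypothetical; only the content array with type: 'text' items comes from the examples above.

```typescript
interface UserSummary { id: string; name: string; email: string; }
interface SearchUsersResult { users: UserSummary[]; total: number; page: number; pageSize: number; }
interface ToolResult { content: { type: 'text'; text: string }[]; }

// Server side: serialise the structured result into a single text content item.
function toToolResult(result: SearchUsersResult): ToolResult {
  return { content: [{ type: 'text', text: JSON.stringify(result) }] };
}

// The pagination metadata is enough to decide whether another page exists.
function hasMorePages(result: SearchUsersResult): boolean {
  return result.page * result.pageSize < result.total;
}

// Consumer side: recover the first user ID for a follow-up call such as update_user.
function firstUserId(toolResult: ToolResult): string | undefined {
  const first = toolResult.content[0];
  if (first === undefined) return undefined;
  const parsed = JSON.parse(first.text) as SearchUsersResult;
  const user = parsed.users[0];
  return user ? user.id : undefined;
}
```

No string matching is involved: the ID comes out by key, and hasMorePages tells the caller whether requesting page + 1 is worthwhile.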

Confirmation for irreversible operations

Destructive operations (delete, archive, send, publish) should require explicit confirmation from the LLM to prevent accidental invocation. A confirm: true required field forces the LLM to reason about whether it wants to proceed before committing.

const DeleteUserSchema = z.object({
  userId:  z.string().uuid().describe('UUID of the user to delete'),
  confirm: z.literal(true).describe(
    'Must be exactly true to confirm permanent deletion. ' +
    'Verify with the user before setting this field.'
  ),
});

// In the handler: the schema validation already ensures confirm === true
// The LLM cannot call this tool without explicitly passing confirm: true

Requiring z.literal(true) (not z.boolean()) means the field must be true, not just any boolean. The description instructs the LLM to verify with the user first, which helps guard against prompt injection for tools that malicious external content could otherwise trigger destructively. It reduces the risk rather than eliminating it, since a compromised LLM can still pass confirm: true.
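
For a server that validates arguments by hand instead of through zod, the same rule could be sketched as follows. checkDeleteUserArgs and the DeleteCheck type are hypothetical names, and the UUID format check from the schema above is left out to keep the example short.

```typescript
type DeleteCheck =
  | { ok: true; userId: string }
  | { ok: false; error: string };

// Accepts the raw tool arguments and only passes when confirm is the boolean true.
function checkDeleteUserArgs(args: Record<string, unknown>): DeleteCheck {
  const userId = args['userId'];
  if (typeof userId !== 'string' || userId.length === 0) {
    return { ok: false, error: 'userId is required and must be a non-empty string.' };
  }
  // Strict comparison: the string "true", 1, or a missing field are all rejected.
  if (args['confirm'] !== true) {
    return {
      ok: false,
      error: 'Deletion not confirmed. Verify the userId with the user, then call again with confirm: true.',
    };
  }
  return { ok: true, userId };
}
```

The error text doubles as an instruction, so an LLM that forgot the field learns what to do before retrying.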

Tool granularity decisions

The right number of tools is not always "one per database operation." Overly fine-grained tools increase the number of round trips an agent needs; overly coarse tools are confusing and error-prone.

| Too fine-grained | Better | Why |
| --- | --- | --- |
| get_user_name, get_user_email, get_user_status | get_user (returns all fields) | Multiple round trips for a single user profile; LLM makes 3 calls when 1 suffices |
| list_users_page_1, list_users_page_2 | search_users(page: number) | Tools should be parameterised, not duplicated |

| Too coarse | Better | Why |
| --- | --- | --- |
| manage_users(action, ...) | create_user, update_user, delete_user | Action enum forces the LLM to reason about the action before the arguments; separate tools are unambiguous |
| do_everything(task) | Focused operation-specific tools | Freeform text arguments make the tool's contract undefined; LLM cannot predict what the tool will do |

Backward-compatible tool evolution

LLM clients cache tool lists and system prompts. Renaming or removing a tool breaks existing agent prompts that reference the old name. Changing a required parameter to optional is safe; changing an optional to required is not.
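
These rules are simple enough to check automatically before a release. The sketch below compares two versions of a tool list and reports the two breaking changes named above. ToolSpec and findBreakingChanges are hypothetical names, and the flat required/optional lists are a simplification of a real inputSchema.

```typescript
// Simplified view of a tool: its name plus the names of its parameters.
interface ToolSpec {
  name: string;
  required: string[];
  optional: string[];
}

// Reports removed or renamed tools and parameters that have become required.
function findBreakingChanges(oldTools: ToolSpec[], newTools: ToolSpec[]): string[] {
  const issues: string[] = [];
  for (const oldTool of oldTools) {
    const next = newTools.filter((t) => t.name === oldTool.name)[0];
    if (next === undefined) {
      issues.push(`tool removed or renamed: ${oldTool.name}`);
      continue;
    }
    for (const param of next.required) {
      // Required in the new version but not required before: existing callers may omit it.
      if (oldTool.required.indexOf(param) === -1) {
        issues.push(`${oldTool.name}: parameter "${param}" is now required`);
      }
    }
  }
  return issues;
}
```

Moving a parameter from required to optional produces no issue, which matches the rule that relaxing a requirement is safe.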

See MCP server versioning for strategies when breaking changes are necessary.

Tool design checklist

| Check | Test |
| --- | --- |
| Does each tool do one thing? | Can you state its purpose in one unambiguous sentence? |
| Are write tools idempotent? | Does calling the same tool twice produce the same result? |
| Is the name verb-noun and clear? | Would a new developer understand what it does from the name alone? |
| Does the description tell the LLM when to use and not use this tool? | Does it mention similar tools and when to prefer them? |
| Do field descriptions include format, example, and edge cases? | Especially: 1-based vs 0-based, optional vs required, enum values |
| Are required params truly required? | Could any required param have a sensible default? |
| Does the output include IDs needed for next tool calls? | Can the LLM chain this with update/delete without re-querying? |
| Do destructive tools require confirmation? | Is confirm: true required for delete/archive/send operations? |
| Are input validation errors LLM-readable? | Does the error message tell the LLM what to change? |
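
The last row of the checklist deserves an illustration. One possible shape for an LLM-readable validation error is sketched below: formatValidationError and FieldIssue are hypothetical names, while isError and the text content item are the fields an MCP tool result uses to report a failed call.

```typescript
interface FieldIssue { field: string; expected: string; received: unknown; }
interface ToolErrorResult { isError: true; content: { type: 'text'; text: string }[]; }

// Turns a list of field problems into one message that says what to change.
function formatValidationError(toolName: string, issues: FieldIssue[]): ToolErrorResult {
  const lines = issues.map(
    (i) => `- ${i.field}: expected ${i.expected}, received ${JSON.stringify(i.received)}`
  );
  const text =
    `Invalid arguments for ${toolName}. Fix the fields below and call the tool again:\n` +
    lines.join('\n');
  return { isError: true, content: [{ type: 'text', text }] };
}
```

Naming the field, the expected value, and the received value gives the LLM what it needs to correct the call on the next attempt instead of repeating it unchanged.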

What tool design cannot guarantee

Even a perfectly designed tool can fail to run reliably in production. Network outages, database connection exhaustion, deployment failures, and infrastructure misconfiguration are invisible to the tool's design. An LLM that calls a well-designed tool and receives no response because the server is down has no way to distinguish "tool design problem" from "server unavailable." AliveMCP probes your live MCP server's initialize and tools/list methods every 60 seconds, so tool availability failures can be detected before LLM clients encounter them.

Related questions

How many tools should an MCP server expose?

There's no hard limit, but LLM context windows are finite and large tool lists consume tokens. A server with 50 tools is unlikely to have all 50 used in a single agent session — consider grouping related operations into focused servers rather than one mega-server. Practical guidelines: keep each server focused on a single domain (user management, email, calendar); keep tool descriptions concise (they're all sent to the LLM upfront); if a server has more than 20 tools, consider whether it should be two servers.

Should tool outputs be prose or structured data?

Structured data (JSON) is generally better for tool results that the LLM will use in downstream calls — the LLM can reference specific fields by key. Prose is better for final-answer tools where the LLM will summarise the result for the user. Many tools should return both: a structured data object for machine use and a brief prose summary. Put the structured JSON in one TextContent item and the summary in another, so the LLM can choose which to surface.
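
A result that carries both forms might be assembled as in the sketch below. buildSearchResult and the wording of the summary are illustrative assumptions; the two-item layout is the one described above.

```typescript
interface TextContent { type: 'text'; text: string; }
interface UserSummary { id: string; name: string; email: string; }

// First item: structured JSON for downstream calls. Second item: prose for the user.
function buildSearchResult(users: UserSummary[], total: number): { content: TextContent[] } {
  const summary = `Found ${total} matching user(s); ${users.length} returned on this page.`;
  return {
    content: [
      { type: 'text', text: JSON.stringify({ users, total }) },
      { type: 'text', text: summary },
    ],
  };
}
```

Keeping the JSON in the first item gives consumers a predictable place to parse from, and the summary can be surfaced to the user without any reformatting.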

When should a tool return multiple content items?

Use multiple content items when the result has distinct parts: a structured data JSON for downstream use, plus a human-readable summary; or an image result plus a text description. Each content item in the array is a distinct piece the LLM can reference. Avoid more than 3-4 items: every item is added to the LLM's context, so long content arrays increase token cost without proportional benefit.

How do I handle tools that take a long time (>30 seconds)?

Long-running tools should use a job pattern: the tool initiates the operation and returns a job ID immediately; a separate get_job_status tool polls for completion. This prevents transport timeouts, lets the LLM do other work while waiting, and allows the user to cancel. The initial tool's description should explain that it starts an async job and that get_job_status(jobId) must be called to retrieve the result.
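
The bookkeeping behind the job pattern is small. The sketch below keeps job state in memory: JobRegistry, its method names, and the job_ ID format are hypothetical, and a real server would persist jobs and run the work in a background task or queue.

```typescript
type JobState =
  | { status: 'running' }
  | { status: 'completed'; result: string }
  | { status: 'failed'; error: string };

class JobRegistry {
  private jobs = new Map<string, JobState>();
  private nextId = 1;

  // Called by the initiating tool: registers the job and returns its ID immediately.
  start(): string {
    const jobId = `job_${this.nextId++}`;
    this.jobs.set(jobId, { status: 'running' });
    return jobId;
  }

  // Called by the background worker when the operation finishes.
  complete(jobId: string, result: string): void {
    if (this.jobs.has(jobId)) this.jobs.set(jobId, { status: 'completed', result });
  }

  // Called by the background worker when the operation fails.
  fail(jobId: string, error: string): void {
    if (this.jobs.has(jobId)) this.jobs.set(jobId, { status: 'failed', error });
  }

  // Backs the get_job_status tool; unknown IDs get an explicit status, not an exception.
  getStatus(jobId: string): JobState | { status: 'not_found' } {
    return this.jobs.get(jobId) ?? { status: 'not_found' };
  }
}
```

The initiating tool returns the value from start() as its result, and get_job_status returns whatever getStatus reports, so the LLM sees running until the worker calls complete or fail.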

Further reading