MCP Server Prompts — templated interactions, dynamic arguments, and reusable prompt patterns

MCP prompts are the third primitive alongside tools and resources. A prompt is a server-defined message template that clients can discover, expand with arguments, and inject into the conversation context. Where a tool performs an action and a resource exposes data for reading, a prompt scaffolds the interaction itself. A code review server might expose a review_pull_request prompt that accepts a PR number and returns a structured message sequence seeded with the diff, the review criteria, and an opening user turn asking for the review. Clients that support prompts surface them in slash-command menus, quick-action panels, or IDE context menus. This guide covers prompt definition, the ListPrompts and GetPrompt handlers, dynamic argument expansion, multi-turn message sequences, embedded resource content, and health monitoring for the prompt layer.

TL;DR

Define prompts with server.prompt(), specifying a name, description, and argument schema. Implement the GetPrompt handler to expand arguments into a messages array of user and assistant turns. Return embedded resource content alongside text messages when the prompt should pre-load data into context. Emit notifications/prompts/list_changed when your prompt catalog changes. Monitor your prompt layer with a /health/prompts endpoint that verifies argument-expansion works end-to-end — a broken data dependency inside a prompt template is invisible until a client tries to expand it.

What prompts are for

The canonical use cases for MCP prompts:

| Use case | Why a prompt, not a tool |
|---|---|
| Code review with context pre-loaded | Seeds the conversation with the diff + review criteria; the LLM produces the review in the completion, not via a tool call |
| Incident analysis starting point | Fetches log snippets, alert metadata, and runbook as embedded resources; the LLM begins analysis immediately |
| Customer support context scaffold | Loads customer history, open tickets, and KB articles before the agent handles the first message |
| Document summarisation template | Wraps the document URI in a structured turn with a specific summarisation instruction; reusable across all documents on the server |
| Multi-step interview flow | Pre-seeds multiple assistant turns to guide the LLM through a structured discovery process with the user |

If the action produces a side effect (creates a ticket, sends an email, writes to a database), use a tool. If the action populates context for a conversation that then uses tools, use a prompt.

Defining prompts with the SDK

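import { McpServer } from '@modelcontextprotocol/sdk/server/mcp.js';
import { z } from 'zod';

// server.prompt() is provided by the high-level McpServer class; the
// low-level Server class has no prompt() method and is driven through
// setRequestHandler() instead.
const server = new McpServer(
  { name: 'my-mcp-server', version: '1.0.0' },
  { capabilities: { prompts: { listChanged: true } } }
);

// fetchDocumentText(uri) is an application helper (not part of the SDK)
// that resolves a document URI to its plain-text body.

// Simple prompt with one required and one optional argument
server.prompt(
  'summarise_document',
  'Summarise a document by its URI into key points and action items',
  // Argument schema: a plain object of Zod string schemas (not wrapped in z.object)
  {
    document_uri: z.string().describe('The URI of the document to summarise'),
    detail_level: z.enum(['brief', 'detailed']).optional()
      .describe('Summary verbosity: brief (3-5 bullets) or detailed (full section breakdown). Default: brief.')
  },
  async ({ document_uri, detail_level = 'brief' }) => {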
    const instruction = detail_level === 'detailed'
      ? 'Provide a detailed section-by-section breakdown with key decisions, open questions, and action items.'
      : 'Provide a brief summary of 3-5 bullet points covering the main points and any action items.';

    return {
      description: `Summarise ${document_uri}`,
      messages: [
        {
          role: 'user',
          content: {
            type: 'resource',
            resource: {
              uri: document_uri,
              mimeType: 'text/plain',
              text: await fetchDocumentText(document_uri)
            }
          }
        },
        {
          role: 'user',
          content: {
            type: 'text',
            text: instruction
          }
        }
      ]
    };
  }
);

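The argument schema is written with Zod. The SDK validates incoming arguments against it before calling your handler, so the handler can assume valid types. MCP transmits prompt arguments as strings, so a numeric value such as a PR number arrives as a string and must be converted inside the handler. A missing required argument is rejected with a validation error before the handler runs; an omitted optional argument arrives as undefined.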
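
A numeric argument such as pr_number needs an explicit conversion step, because prompt arguments are delivered as strings. The sketch below shows one way to do that conversion defensively. parsePrNumber is a hypothetical helper, not part of the SDK.

```typescript
// Hypothetical helper (not part of the MCP SDK): converts a string prompt
// argument into a positive integer and rejects anything else.
function parsePrNumber(raw: string): number {
  const trimmed = raw.trim();
  // Accept only plain decimal digits, so "12abc", "1.5", and "" are rejected.
  if (!/^\d+$/.test(trimmed)) {
    throw new Error(`pr_number must be a positive integer, got "${raw}"`);
  }
  const value = Number(trimmed);
  if (!Number.isSafeInteger(value) || value < 1) {
    throw new Error(`pr_number is out of range: "${raw}"`);
  }
  return value;
}
```

Calling a helper like this at the top of a prompt handler turns malformed input into a clear error instead of passing NaN to a downstream API.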

Implementing ListPrompts

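import { ListPromptsRequestSchema } from '@modelcontextprotocol/sdk/types.js';

// Manual handler for the low-level Server class
// (from '@modelcontextprotocol/sdk/server/index.js').
// McpServer builds the prompts/list response automatically from the
// prompts registered with server.prompt(), so a manual handler is only
// needed when you manage the catalog yourself.
server.setRequestHandler(ListPromptsRequestSchema, async () => {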
  return {
    prompts: [
      {
        name: 'summarise_document',
        description: 'Summarise a document by URI into key points and action items',
        arguments: [
          {
            name: 'document_uri',
            description: 'URI of the document to summarise',
            required: true
          },
          {
            name: 'detail_level',
            description: 'brief or detailed. Default: brief.',
            required: false
          }
        ]
      },
      {
        name: 'review_pull_request',
        description: 'Seed a code review conversation with diff, review criteria, and CI status',
        arguments: [
          {
            name: 'pr_number',
            description: 'GitHub pull request number',
            required: true
          },
          {
            name: 'focus',
            description: 'Review focus: security, performance, style, or all. Default: all.',
            required: false
          }
        ]
      }
    ]
  };
});

Clients render the description and argument description strings in their prompt browser. Write them as a developer would read them — what does this prompt do, when should I use it, what argument values are valid. The argument description is the only documentation the user sees before invoking the prompt.
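
One way to keep these descriptions from drifting apart from the handler definitions is to generate the ListPrompts payload from a single catalog. The sketch below assumes a hypothetical PromptDefinition shape. Only the output fields (name, description, and arguments with name, description, required) mirror the listing shown above.

```typescript
// Hypothetical catalog entry: one source of truth for a prompt's metadata.
interface PromptArgumentDefinition {
  description: string;
  required: boolean;
}

interface PromptDefinition {
  name: string;
  description: string;
  args: Record<string, PromptArgumentDefinition>;
}

// Shape of one entry in a ListPrompts response.
interface PromptListing {
  name: string;
  description: string;
  arguments: { name: string; description: string; required: boolean }[];
}

// Converts catalog entries into the ListPrompts payload.
function toPromptListings(catalog: PromptDefinition[]): PromptListing[] {
  return catalog.map((def) => ({
    name: def.name,
    description: def.description,
    arguments: Object.entries(def.args).map(([name, arg]) => ({
      name,
      description: arg.description,
      required: arg.required,
    })),
  }));
}

const listings = toPromptListings([
  {
    name: 'summarise_document',
    description: 'Summarise a document by URI into key points and action items',
    args: {
      document_uri: { description: 'URI of the document to summarise', required: true },
      detail_level: { description: 'brief or detailed. Default: brief.', required: false },
    },
  },
]);
```

With this arrangement a description is edited in one place, and the listing a client sees cannot disagree with the definition the handler uses.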

Multi-turn prompts and conversation seeding

A prompt's messages array can contain multiple user and assistant turns. This seeds the conversation with context before the user's next message arrives. The pattern serves two purposes: front-loading expensive data fetches, so the LLM starts with the data already in context, and steering the LLM toward a specific response format with a pre-written assistant turn. The example below demonstrates the first purpose by pre-loading a PR diff and its CI status:

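// `server` is an McpServer instance; its prompt() method takes
// (name, description, argument shape, handler).
server.prompt(
  'review_pull_request',
  'Seed a code review conversation with diff, review criteria, and CI status',
  {
    // Prompt arguments arrive as strings; the handler converts pr_number to a number
    pr_number: z.string().describe('GitHub pull request number'),
    focus: z.enum(['security', 'performance', 'style', 'all']).optional()
      .describe('Review focus: security, performance, style, or all. Default: all.')
  },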
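  // Assumes `github` is an authenticated Octokit-style client and that
  // `owner` and `repo` are defined elsewhere in the server.
  async ({ pr_number, focus = 'all' }) => {
    const pull_number = Number(pr_number);

    // PR metadata and the changed files are independent, so fetch them in parallel
    const [pr, diff] = await Promise.all([
      github.pulls.get({ owner, repo, pull_number }),
      github.pulls.listFiles({ owner, repo, pull_number })
    ]);

    // The checks lookup needs the head SHA, so it must run after pulls.get resolves
    const ciStatus = await github.checks.listForRef({
      owner, repo, ref: pr.data.head.sha
    });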

    const focusInstruction = {
      security: 'Focus on security: injection risks, credential handling, access control gaps.',
      performance: 'Focus on performance: N+1 queries, large allocations, blocking I/O in hot paths.',
      style: 'Focus on readability: naming, comment quality, complexity, test coverage.',
      all: 'Cover security, performance, and style. Flag critical issues before minor ones.'
    }[focus];

    return {
      description: `Code review for PR #${pr_number}: ${pr.data.title}`,
      messages: [
        // Turn 1: seed context with the PR diff as a resource
        {
          role: 'user',
          content: {
            type: 'resource',
            resource: {
              uri: `https://github.com/${owner}/${repo}/pull/${pr_number}.diff`,
              mimeType: 'text/x-diff',
              text: diff.data.map(f => `--- ${f.filename}\n${f.patch ?? '(no textual patch available)'}`).join('\n\n')
            }
          }
        },
        // Turn 2: the review request
        {
          role: 'user',
          content: {
            type: 'text',
            text: [
              `PR #${pr_number}: ${pr.data.title}`,
              `CI status: ${ciStatus.data.check_runs.map(r => `${r.name}: ${r.conclusion ?? r.status}`).join(', ')}`,
              '',
              focusInstruction
            ].join('\n')
          }
        }
      ]
    };
  }
);
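
The example above uses only user turns. A pre-written assistant turn, as mentioned at the start of this section, can be sketched as follows. buildInterviewMessages and its wording are hypothetical; the message shape (a role plus a text content object) follows the prompts shown earlier.

```typescript
// Minimal local type mirroring the text-only subset of a prompt message.
interface TextPromptMessage {
  role: 'user' | 'assistant';
  content: { type: 'text'; text: string };
}

// Hypothetical builder: seeds a discovery interview in which the assistant
// has already committed to asking one question at a time.
function buildInterviewMessages(topic: string): TextPromptMessage[] {
  const text = (role: TextPromptMessage['role'], body: string): TextPromptMessage => ({
    role,
    content: { type: 'text', text: body },
  });

  return [
    text('user', `I need help scoping a project about ${topic}.`),
    // Pre-written assistant turn: establishes the format the LLM should keep following.
    text(
      'assistant',
      'Understood. I will ask one question at a time and summarise your answers at the end. First question: who is the primary user?'
    ),
    text('user', 'Please continue the interview from here.'),
  ];
}

const interview = buildInterviewMessages('log retention');
```

Because the assistant turn is already part of the context, the LLM tends to continue in the established format rather than choosing its own.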

Embedded resources in prompt messages

Prompt messages support the following content types for embedding data alongside text (newer revisions of the MCP specification add further types, such as audio):

| Content type | Use when | Fields |
|---|---|---|
| text | Plain instruction or context string | text: string |
| image | Screenshot, diagram, chart | data: base64, mimeType |
| resource | Structured data that matches a known resource URI | resource.uri, resource.text or resource.blob |

The resource content type signals to clients that the embedded content corresponds to a resource the server exposes. Clients that implement resource browsers may cross-link — clicking the embedded resource opens it in their resource panel. For non-resource data that isn't tracked as a URI on your server, use text content with the data inlined as a string.
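
The choice between resource and text content can be expressed as a small helper. This is a sketch that assumes the caller knows whether the data is backed by a resource URI; toPromptContent is a hypothetical name, and the local PromptContent type covers only the two variants discussed here.

```typescript
type PromptContent =
  | { type: 'text'; text: string }
  | { type: 'resource'; resource: { uri: string; mimeType: string; text: string } };

// Hypothetical helper: embeds data as a resource when it is backed by a URI
// the server exposes, and falls back to inline text otherwise.
function toPromptContent(
  data: string,
  resourceUri?: string,
  mimeType: string = 'text/plain'
): PromptContent {
  if (resourceUri) {
    return { type: 'resource', resource: { uri: resourceUri, mimeType, text: data } };
  }
  return { type: 'text', text: data };
}

const tracked = toPromptContent('port: 8080', 'config://mcp/server');
const untracked = toPromptContent('ad-hoc note');
```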

Prompt list change notifications

If your prompt catalog changes at runtime — prompts added or removed based on server state or feature flags — emit a notifications/prompts/list_changed notification. Clients re-issue a ListPrompts request when they receive this, keeping their prompt browser current:

// When a feature flag enables a new prompt template
featureFlags.on('change', async (flagName, enabled) => {
  if (FEATURE_FLAG_TO_PROMPT_MAP[flagName]) {
    if (enabled) {
      registerPrompt(FEATURE_FLAG_TO_PROMPT_MAP[flagName]);
    } else {
      unregisterPrompt(FEATURE_FLAG_TO_PROMPT_MAP[flagName]);
    }
    await server.sendPromptListChanged();
  }
});
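
The registerPrompt and unregisterPrompt helpers above are application code, not SDK functions. One possible shape for them is a small catalog that reports whether a call actually changed anything, so that a notification goes out only on real changes. PromptCatalog and its method names are hypothetical.

```typescript
// Hypothetical in-memory catalog of prompt names and descriptions.
class PromptCatalog {
  private readonly prompts = new Map<string, string>();
  private readonly onChange: () => void;

  // onChange is called once per effective change, for example to send
  // notifications/prompts/list_changed.
  constructor(onChange: () => void) {
    this.onChange = onChange;
  }

  // Adds or updates a prompt; returns true if the catalog changed.
  register(name: string, description: string): boolean {
    if (this.prompts.get(name) === description) return false;
    this.prompts.set(name, description);
    this.onChange();
    return true;
  }

  // Removes a prompt; returns true if it was present.
  unregister(name: string): boolean {
    if (!this.prompts.delete(name)) return false;
    this.onChange();
    return true;
  }

  names(): string[] {
    return [...this.prompts.keys()];
  }
}

let notifications = 0;
const catalog = new PromptCatalog(() => { notifications += 1; });
catalog.register('incident_analysis', 'Seed an incident analysis conversation');
catalog.register('incident_analysis', 'Seed an incident analysis conversation'); // no-op: unchanged
catalog.unregister('missing_prompt'); // no-op: not present
```

Skipping the callback for no-op calls keeps clients from re-fetching the prompt list when a flag is toggled to the value it already had.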

Health monitoring for the prompt layer

A healthy prompt is more than a registered handler: every data dependency it reads from must be reachable. A prompt that fetches a PR diff from GitHub fails at expansion time when the GitHub API is down, and nothing reports the problem before a client requests the prompt. Expose a /health/prompts endpoint that runs smoke expansions against your most critical prompts:

app.get('/health/prompts', async (req, res) => {
  const checks: Record<string, 'ok' | 'degraded' | 'down'> = {};

  // Smoke-expand each critical prompt with a known-good argument set
  const smokeTests = [
    {
      name: 'summarise_document',
      args: { document_uri: 'config://mcp/server', detail_level: 'brief' }
    },
    {
      name: 'review_pull_request',
      args: { pr_number: String(KNOWN_TEST_PR_NUMBER) }
    }
  ];

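  // promptHandlers is an application-level Map of prompt name to handler function
  for (const test of smokeTests) {
    let timer: ReturnType<typeof setTimeout> | undefined;
    try {
      const handler = promptHandlers.get(test.name);
      if (!handler) { checks[test.name] = 'down'; continue; }
      // Reject after 5 s so a hung dependency cannot stall the health check
      const timeout = new Promise<never>((_, reject) => {
        timer = setTimeout(() => reject(new Error('timeout')), 5000);
      });
      const result = await Promise.race([handler(test.args), timeout]);
      checks[test.name] = result.messages.length > 0 ? 'ok' : 'degraded';
    } catch {
      checks[test.name] = 'down';
    } finally {
      // Clear the timer so it does not linger after the handler settles
      clearTimeout(timer);
    }
  }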

  const overall = Object.values(checks).includes('down') ? 'down'
    : Object.values(checks).includes('degraded') ? 'degraded'
    : 'ok';

  res.status(overall === 'down' ? 503 : 200).json({ status: overall, checks });
});

Wire AliveMCP to /health/prompts. A prompt handler that depends on GitHub, a database, or an external API will degrade when those dependencies do — but the MCP session layer stays healthy, so there is no protocol-level signal that anything is wrong until a user tries to expand a prompt and gets an error.

Frequently asked questions

When should I use a prompt instead of a system prompt in the client?

Use a server-side MCP prompt when the template content is dynamic (depends on server-side data like a PR diff, customer history, or document content), when you want the template reusable across multiple clients without re-implementing it in each, or when you want to surface the template in the client's prompt browser UI. Use a static system prompt in the client when the instructions are static and client-specific. The distinction is dynamic vs static content, and single-client vs multi-client reuse.

Can I use sampling inside a prompt handler?

Yes, provided the client declared the sampling capability during initialization. The MCP sampling API lets your server request an LLM completion from the client, which you can use inside a prompt handler to classify input, generate a prompt variation, or summarise a large document before embedding the summary in the prompt's messages. Be careful about latency: a prompt handler that makes a sampling call adds a full LLM round-trip to the expansion time. Use it for value-adding classification, not for simple string formatting that could be done inline.

How do I handle prompts that require very long context?

The prompt's messages array is injected into the client's conversation context. If you embed a large document, the total token count may exceed the context window. Strategies: (1) use the detail_level argument to let users choose a truncated version; (2) summarise the document with a sampling call before embedding; (3) embed only the most relevant sections, with a fetch_section tool the LLM can call to get more; (4) implement context window management by token-counting the embedded content and truncating at a safe budget. The LLM should never receive a prompt that silently exceeds the context window — truncation should be explicit and communicated in a text message within the prompt.
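
Strategy (4) can be sketched as follows. The ratio of 4 characters per token is a rough assumption for illustration, not a real tokenizer, and truncateToBudget is a hypothetical helper. A production server would count tokens with the tokenizer of the target model.

```typescript
interface TruncationResult {
  text: string;
  truncated: boolean;
  notice?: string; // present only when content was cut
}

// Assumption: roughly 4 characters per token. Replace with a real tokenizer.
const APPROX_CHARS_PER_TOKEN = 4;

// Hypothetical helper: cuts text to an approximate token budget and produces
// a notice that can be sent as a separate text message in the prompt.
function truncateToBudget(text: string, maxTokens: number): TruncationResult {
  const maxChars = maxTokens * APPROX_CHARS_PER_TOKEN;
  if (text.length <= maxChars) {
    return { text, truncated: false };
  }
  return {
    text: text.slice(0, maxChars),
    truncated: true,
    notice: `Document truncated to the first ${maxChars} of ${text.length} characters.`,
  };
}

const shortDoc = truncateToBudget('hello', 10);
const longDoc = truncateToBudget('x'.repeat(100), 10);
```

When truncated is true, the notice can be appended to the prompt as its own text message so that both the LLM and the user know the content is incomplete.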

Do prompts work with all MCP clients?

Not all of them. Prompts are a server capability: the server advertises { prompts: { listChanged: true } } in its initialize result, and clients that implement prompts respond by calling ListPrompts and GetPrompt. Client support is optional and varies between applications. The client capabilities object has no prompts field, so a server cannot detect prompt support during the handshake; a client that does not implement prompts simply never requests them. notifications/prompts/list_changed is a one-way notification with no response, and it should only be sent when the server declared listChanged: true. See MCP Capabilities Negotiation for the full handshake pattern.
