Discord Alerts for MCP Server Downtime — webhook routing guide
Indie MCP server authors live in Discord. Their community is there, their users are there, and their co-contributors are there. Routing AliveMCP downtime alerts to Discord keeps the monitoring signal in the same space where the response happens — no context switch to a separate ops tool. This guide covers the Discord webhook payload format for MCP server incidents, how to use embed colors to communicate alert state at a glance, role pinging for critical failures, thread-based incident tracking, and the message-edit strategy that prevents alert flood from minute-by-minute re-checks.
TL;DR
Create a Discord webhook URL in your server's Integrations settings. Build a bridge that POSTs an embeds payload with a red embed for alert.triggered and a green embed for alert.resolved. Store the message ID that Discord returns for the initial alert (the id field of the response when you POST with ?wait=true). When AliveMCP sends update events (still down after N minutes) and the resolve event, PATCH the same message rather than posting a new one. Ping a @mcp-oncall role on the initial trigger; remove the ping on updates to prevent repeated pinging. For sustained outages, create a Discord thread on the alert message and post duration updates there to keep the main channel clean.
Why Discord for MCP server alerting
Discord is the de-facto community platform for the MCP ecosystem. The official MCP Discord has thousands of members. The r/ClaudeAI and r/MCP communities cross-post to Discord servers. If you have published an MCP server, there is a reasonable chance your early adopters are on a Discord server where you are also present.
This creates a monitoring advantage that PagerDuty and OpsGenie do not have: proximity to users. When your server goes down and AliveMCP fires an alert to your #ops-alerts channel, users on your community Discord who encounter issues can be pointed to that channel thread for status updates. You can respond to both the alert and the community inquiry in the same tool. The alert becomes a live status post, not a private ops notification.
| Use case | Discord best fit? | Why / why not |
|---|---|---|
| Solo indie author, side-project MCP server | Yes | Already in Discord, no extra tool, community-visible ops |
| Small dev team (2–5), no formal on-call | Yes | Role ping is sufficient escalation; team is responsive |
| Production service, SLA commitments | Partial | Use Discord for visibility + PagerDuty for guaranteed wake-up |
| Enterprise team with on-call rotation | No | Discord DND does not guarantee notification; use PagerDuty/OpsGenie |
The key limitation of Discord for production alerting: Discord respects Do Not Disturb mode on mobile. A phone on DND will not vibrate for Discord notifications. PagerDuty and OpsGenie bypass DND using high-priority notification channels on iOS and Android. For MCP servers where downtime during sleeping hours is a critical business issue, layer Discord over a dedicated on-call tool rather than using it as the sole alerting channel.
Creating a Discord webhook
In your Discord server, go to Server Settings → Integrations → Webhooks → New Webhook. Name it "AliveMCP Alerts" and assign it to your #ops-alerts channel (create the channel first if needed). Copy the webhook URL — it looks like https://discord.com/api/webhooks/{id}/{token}. This URL is a secret: anyone with it can post to your channel. Store it in an environment variable, never in source code.
If your team monitors many servers, consider creating a separate channel and webhook for each severity tier:
- #ops-critical: production MCP servers, immediate response expected
- #ops-dependency: third-party MCP servers you depend on
- #ops-log: all events, including resolved alerts and brief blips
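A minimal sketch of how a bridge might pick the webhook for each tier. The environment variable names, the TIER_BY_SLUG table, and the slugs in it are hypothetical; servers that are not listed fall back to the log channel.

```javascript
// Hypothetical env var names: one webhook URL per channel tier.
const WEBHOOKS_BY_TIER = {
  critical: process.env.DISCORD_WEBHOOK_CRITICAL,
  dependency: process.env.DISCORD_WEBHOOK_DEPENDENCY,
  log: process.env.DISCORD_WEBHOOK_LOG,
};

// Hypothetical hand-maintained mapping from server_slug to tier.
const TIER_BY_SLUG = {
  'my-prod-server': 'critical',
  'upstream-search': 'dependency',
};

// Unknown servers are treated as log-only.
function tierFor(serverSlug) {
  return TIER_BY_SLUG[serverSlug] ?? 'log';
}

// Returns the webhook URL for a server, or throws if that tier is not configured.
function webhookFor(serverSlug, webhooks = WEBHOOKS_BY_TIER) {
  const tier = tierFor(serverSlug);
  const url = webhooks[tier];
  if (!url) throw new Error(`No webhook configured for tier "${tier}"`);
  return url;
}
```

Failing loudly on a missing URL is a deliberate choice here: a silently dropped alert is harder to notice than a bridge that errors at the first event.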
Embed format for MCP server alerts
Discord embeds are the structured message format that adds color, fields, and metadata to a message. For MCP server alerts, the embed color conveys status at a glance: red for down, green for recovered, orange for degraded/warning.
```javascript
// discord-alert-bridge.js
// Assumes Node 18+ (global fetch) and an Express-style handler with a JSON body parser.
const DISCORD_WEBHOOK_URL = process.env.DISCORD_WEBHOOK_URL;
const messageIdStore = new Map(); // in production: persist in Redis or DB

async function handleAliveMCPWebhook(req, res) {
  const event = req.body;
  const { type, server_slug, server_name, failure_reason, check_url, downtime_seconds } = event;
  const storeKey = `discord-msg-${server_slug}`;

  try {
    if (type === 'alert.triggered') {
      const existingMsgId = messageIdStore.get(storeKey);
      const embed = {
        title: `🔴 MCP server down: ${server_name}`,
        description: failure_reason,
        color: 0xE53E3E, // red
        fields: [
          { name: 'Endpoint', value: `\`${check_url}\``, inline: true },
          { name: 'Status', value: 'DOWN', inline: true },
          { name: 'Dashboard', value: `[View on AliveMCP](https://alivemcp.com/status/${server_slug})`, inline: true },
        ],
        timestamp: new Date().toISOString(),
        footer: { text: 'AliveMCP · MCP endpoint monitoring' },
      };

      if (existingMsgId) {
        // Server is still down: update the existing message and drop the ping, don't spam
        await discordPatch(existingMsgId, { content: '', embeds: [embed] });
      } else {
        // First alert: post a new message and ping the on-call role
        const resp = await discordPost({
          content: '<@&YOUR_ONCALL_ROLE_ID> MCP server alert',
          embeds: [embed],
        });
        const data = await resp.json();
        messageIdStore.set(storeKey, data.id);
      }
    }

    if (type === 'alert.resolved') {
      const msgId = messageIdStore.get(storeKey);
      const embed = {
        title: `✅ MCP server recovered: ${server_name}`,
        description: `Downtime: ${Math.ceil(downtime_seconds / 60)} minutes`,
        color: 0x38A169, // green
        fields: [
          { name: 'Endpoint', value: `\`${check_url}\``, inline: true },
          { name: 'Status', value: 'UP', inline: true },
          { name: 'Dashboard', value: `[View on AliveMCP](https://alivemcp.com/status/${server_slug})`, inline: true },
        ],
        timestamp: new Date().toISOString(),
        footer: { text: 'AliveMCP · MCP endpoint monitoring' },
      };

      if (msgId) {
        await discordPatch(msgId, { content: '', embeds: [embed] }); // remove ping from resolved message
        messageIdStore.delete(storeKey);
      } else {
        await discordPost({ embeds: [embed] }); // no prior message to update
      }
    }

    res.status(200).json({ ok: true });
  } catch (err) {
    // Surface Discord failures instead of leaving an unhandled rejection
    console.error('Discord bridge error:', err);
    res.status(502).json({ ok: false });
  }
}

// Shared request helper: throws on any non-2xx response from Discord
async function discordRequest(url, method, body) {
  const resp = await fetch(url, {
    method,
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify(body),
  });
  if (!resp.ok) throw new Error(`Discord ${method} failed with HTTP ${resp.status}`);
  return resp;
}

async function discordPost(body) {
  // ?wait=true returns the created message object instead of an empty 204
  return discordRequest(`${DISCORD_WEBHOOK_URL}?wait=true`, 'POST', body);
}

async function discordPatch(messageId, body) {
  return discordRequest(`${DISCORD_WEBHOOK_URL}/messages/${messageId}`, 'PATCH', body);
}
```
Three critical details in this bridge:
- ?wait=true on the initial POST. By default Discord webhooks return an empty 204 response. Adding ?wait=true makes Discord return the full message object, including the id you need for subsequent PATCH calls.
- PATCH the existing message, don't POST new ones. AliveMCP sends alert.triggered every minute while the server is down. Without message deduplication, your channel would receive a new alert message every minute: 30 messages for a 30-minute outage. The PATCH approach updates the timestamp and failure reason on the existing message, creating a single running "incident ticket" in the channel.
- Remove the role ping on PATCH. The role ping (<@&ROLE_ID>) in the content field is only needed on the initial alert to notify the on-call person. On subsequent updates, set content: '' to avoid re-pinging every minute.
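Discord message payloads also accept an allowed_mentions object that limits which mentions in content are allowed to notify. The sketch below builds the initial alert body so that only the on-call role can be pinged, whatever other text ends up in content. The function name buildInitialAlert and the role ID passed to it are hypothetical.

```javascript
// Builds the body for the first alert message.
// allowed_mentions.parse = [] turns off automatic parsing of @everyone, user,
// and role mentions; the explicit roles list then permits only the on-call role.
function buildInitialAlert(embed, oncallRoleId) {
  return {
    content: `<@&${oncallRoleId}> MCP server alert`,
    embeds: [embed],
    allowed_mentions: { parse: [], roles: [oncallRoleId] },
  };
}

// Builds the body for later edits of the same message: no ping text at all.
function buildUpdate(embed) {
  return { content: '', embeds: [embed] };
}
```

Keeping the two builders side by side makes the intent visible in the bridge: the POST path pings, the PATCH path never does.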
Thread-based incident tracking
For outages lasting more than 5 minutes, create a Discord thread on the alert message to keep status updates out of the main channel. The main channel message becomes the incident summary; the thread contains the timeline of updates.
```javascript
// Create a thread on the alert message when downtime exceeds 5 minutes
// Assumed env var names: the ID of the channel the webhook posts to, and a bot token
const CHANNEL_ID = process.env.DISCORD_CHANNEL_ID;
const DISCORD_BOT_TOKEN = process.env.DISCORD_BOT_TOKEN;

async function createIncidentThread(messageId, serverName) {
  const resp = await fetch(
    `https://discord.com/api/v10/channels/${CHANNEL_ID}/messages/${messageId}/threads`,
    {
      method: 'POST',
      headers: {
        'Content-Type': 'application/json',
        'Authorization': `Bot ${DISCORD_BOT_TOKEN}`, // requires bot, not just webhook
      },
      body: JSON.stringify({
        name: `Incident: ${serverName} — ${new Date().toISOString().slice(0, 10)}`,
        auto_archive_duration: 1440, // archive after 24h of inactivity
      }),
    }
  );
  if (!resp.ok) throw new Error(`Thread creation failed with HTTP ${resp.status}`);
  return (await resp.json()).id; // thread channel ID
}

// Post a plain-text status update into the incident thread
async function postThreadUpdate(threadId, message) {
  const resp = await fetch(`https://discord.com/api/v10/channels/${threadId}/messages`, {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      'Authorization': `Bot ${DISCORD_BOT_TOKEN}`,
    },
    body: JSON.stringify({ content: message }),
  });
  if (!resp.ok) throw new Error(`Thread update failed with HTTP ${resp.status}`);
}
```
Thread creation requires a Discord bot token (not just a webhook URL) because the webhook API cannot start a thread from an existing message in a text channel. If you prefer not to create a bot, post duration updates as regular messages in a dedicated #incident-log channel and reference the alert message with a message link.
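A small sketch of the decision logic around the thread, assuming that repeated alert.triggered events carry downtime_seconds (the original bridge only reads that field on alert.resolved, so treat this as an assumption). Once a bot has created the thread, the webhook itself can post into it through the thread_id query parameter of the webhook execute endpoint. All function names here are hypothetical.

```javascript
const THREAD_THRESHOLD_SECONDS = 5 * 60;

// Open a thread only once per incident, and only after 5 minutes of downtime.
function shouldOpenThread(downtimeSeconds, existingThreadId) {
  return !existingThreadId && downtimeSeconds > THREAD_THRESHOLD_SECONDS;
}

// Returns the webhook URL that targets an existing thread instead of the channel.
function webhookUrlForThread(webhookUrl, threadId) {
  const url = new URL(webhookUrl);
  url.searchParams.set('thread_id', threadId);
  return url.toString();
}

// Text for a duration update posted inside the thread.
function formatDurationUpdate(downtimeSeconds) {
  const minutes = Math.floor(downtimeSeconds / 60);
  return `Still down after ${minutes} min`;
}
```

The thread ID would need to be stored next to the message ID so that later events for the same server can find it.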
Role setup for MCP server on-call
Create a Discord role named @mcp-oncall and assign it to yourself and any team members who should be paged for MCP server incidents. Role pings still pass through each member's own notification settings, so every on-call member should check their settings for the server: notifications set to at least "Only @mentions" and "Suppress All Role @mentions" turned off. On mobile, add the Discord app to your phone's Do Not Disturb exceptions so role pings can wake you up, but note this is less reliable than PagerDuty's guaranteed phone call.
For a team with rotation, manually update the @mcp-oncall role membership at the start of each shift. If your team uses Google Calendar for on-call rotation, you can automate role membership updates with a small script that reads the calendar event and calls the Discord API to add/remove the role.
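A sketch of the role-update half of such a script, assuming the calendar lookup has already produced the user IDs scheduled for the current shift. It uses Discord's guild member role endpoints (PUT and DELETE on /guilds/{guild}/members/{user}/roles/{role}), which require a bot with the Manage Roles permission. planRoleChanges, applyRoleChanges, and the config object are hypothetical names.

```javascript
// Computes which users need the role added or removed.
function planRoleChanges(currentHolderIds, scheduledIds) {
  const current = new Set(currentHolderIds);
  const scheduled = new Set(scheduledIds);
  return {
    add: [...scheduled].filter((id) => !current.has(id)),
    remove: [...current].filter((id) => !scheduled.has(id)),
  };
}

// Applies a plan through the Discord API. Not called here; needs a real bot token.
async function applyRoleChanges({ guildId, roleId, botToken }, plan) {
  const base = `https://discord.com/api/v10/guilds/${guildId}/members`;
  const headers = { Authorization: `Bot ${botToken}` };
  const steps = [
    ...plan.add.map((userId) => ({ userId, method: 'PUT' })),
    ...plan.remove.map((userId) => ({ userId, method: 'DELETE' })),
  ];
  for (const { userId, method } of steps) {
    const resp = await fetch(`${base}/${userId}/roles/${roleId}`, { method, headers });
    if (!resp.ok) throw new Error(`${method} role for ${userId} failed with HTTP ${resp.status}`);
  }
}
```

Adding the incoming person before removing the outgoing one means there is never a moment when the role has no members.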
Layering Discord with PagerDuty
The most robust MCP server alerting setup combines Discord for visibility with PagerDuty for guaranteed escalation. The pattern:
- AliveMCP fires the webhook to your bridge.
- Your bridge simultaneously posts to Discord (for team awareness and community communication) AND calls PagerDuty Events API (for guaranteed on-call notification).
- The on-call person acknowledges in PagerDuty (which stops the escalation) and posts a status update in the Discord incident thread (which keeps the community informed).
- When the server recovers, AliveMCP sends alert.resolved, and your bridge updates the Discord message (green embed) and resolves the PagerDuty incident.
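The steps above can be sketched as a fan-out function. The PagerDuty side uses the Events API v2 enqueue endpoint, where a stable dedup_key lets repeated triggers collapse into one incident and lets the resolve event close it. The dedup key format, the severity choice, and the notifyDiscord callback are assumptions made for illustration.

```javascript
const PAGERDUTY_EVENTS_URL = 'https://events.pagerduty.com/v2/enqueue';

// Maps an AliveMCP event to a PagerDuty Events API v2 body.
function toPagerDutyEvent(event, routingKey) {
  const base = {
    routing_key: routingKey,
    dedup_key: `alivemcp-${event.server_slug}`, // assumed key format, one per server
  };
  if (event.type === 'alert.resolved') {
    return { ...base, event_action: 'resolve' };
  }
  return {
    ...base,
    event_action: 'trigger',
    payload: {
      summary: `MCP server down: ${event.server_name}`,
      source: event.check_url,
      severity: 'critical',
    },
  };
}

// Sends to Discord and PagerDuty in parallel; one failing does not block the other.
async function fanOut(event, { notifyDiscord, routingKey }) {
  return Promise.allSettled([
    notifyDiscord(event),
    fetch(PAGERDUTY_EVENTS_URL, {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify(toPagerDutyEvent(event, routingKey)),
    }),
  ]);
}
```

Promise.allSettled is used so that a Discord outage cannot swallow the page, and the caller can inspect both results and log whichever side failed.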
This pattern is standard practice for teams that have community MCP servers: Discord for the public-facing status thread, PagerDuty for the private on-call page. The Discord thread also serves as the public incident timeline if users ask "what happened?"
Frequently asked questions
Do I need a Discord bot or is a webhook URL sufficient?
A webhook URL is sufficient for the basic alert posting and message editing flow. You only need a Discord bot (with a bot token) if you want to create threads programmatically, manage role membership automatically, or interact with messages in ways the webhook API doesn't support (like reacting with emoji to acknowledge an alert). For most indie MCP authors, a webhook URL is enough: post the alert, edit it on updates, edit again on resolve. If you want thread creation, consider using the discord.js library with a bot account, or accept the simpler workaround of posting duration updates as messages in a separate channel.
How do I store the Discord message ID between AliveMCP webhook calls?
In the code above, messageIdStore is an in-memory Map — fine for a single-process server but lost on restart. For a production bridge, store the message ID in a persistent store keyed by server_slug. If you are running the bridge as a Cloudflare Worker, use Cloudflare KV. If it is a Node.js process, use a SQLite file or Redis. If it is a serverless function (Vercel, AWS Lambda), use DynamoDB or an edge KV store. The message ID record should expire after 24 hours — if no alert.resolved event arrives within 24 hours, the server has been down for a very long time and the original Discord message has likely scrolled out of context anyway; the next trigger event should create a fresh message.
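The 24-hour expiry can be illustrated with an in-memory stand-in that has the same get, set, and delete shape as the Map in the bridge. ExpiringStore is a hypothetical class for illustration; a persistent store would normally get the same effect from its own TTL feature, such as an expiry on a Redis key or an expiration TTL on a KV entry.

```javascript
const DAY_MS = 24 * 60 * 60 * 1000;

// Map-like store whose entries disappear after ttlMs.
// The clock is injectable so the expiry can be tested without waiting.
class ExpiringStore {
  constructor(ttlMs = DAY_MS, now = () => Date.now()) {
    this.ttlMs = ttlMs;
    this.now = now;
    this.entries = new Map();
  }

  set(key, value) {
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });
  }

  // Returns undefined for missing or expired keys, so the bridge posts a fresh message.
  get(key) {
    const entry = this.entries.get(key);
    if (!entry) return undefined;
    if (this.now() >= entry.expiresAt) {
      this.entries.delete(key);
      return undefined;
    }
    return entry.value;
  }

  delete(key) {
    return this.entries.delete(key);
  }
}
```

Because an expired key reads as missing, the bridge needs no special case: the next alert.triggered takes the "first alert" path and posts a new message.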
What embed color scheme works best for MCP server alert states?
Use the traffic light convention: red (0xE53E3E) for down, green (0x38A169) for recovered, orange (0xED8936) for degraded or elevated error rate (if AliveMCP supports that signal), and blue (0x4299E1) for maintenance window or scheduled downtime. The color appears as the left sidebar stripe on the embed. The title emoji reinforces the state for users on monochrome displays or accessibility settings: 🔴 for down, ✅ for recovered, ⚠️ for degraded, 🔧 for maintenance. Both the color and emoji together create an unambiguous visual signal that can be parsed at a glance when scanning a channel with many messages.
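One way to keep the scheme consistent is a single lookup table that every embed is built from. The state names and the styledEmbed helper are hypothetical; the colors and emoji are the ones listed above.

```javascript
// Color and emoji per alert state, as described in the answer above.
const ALERT_STYLES = {
  down: { color: 0xE53E3E, emoji: '🔴', label: 'MCP server down' },
  recovered: { color: 0x38A169, emoji: '✅', label: 'MCP server recovered' },
  degraded: { color: 0xED8936, emoji: '⚠️', label: 'MCP server degraded' },
  maintenance: { color: 0x4299E1, emoji: '🔧', label: 'MCP server maintenance' },
};

// Returns the title and color for an embed; other fields are added by the caller.
function styledEmbed(state, serverName) {
  const style = ALERT_STYLES[state];
  if (!style) throw new Error(`Unknown alert state: ${state}`);
  return {
    title: `${style.emoji} ${style.label}: ${serverName}`,
    color: style.color,
  };
}
```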
Can I use Discord Scheduled Events to announce planned maintenance windows?
Yes, and it is an underused feature for MCP server ops. Create a Discord Scheduled Event for the maintenance window — it appears in the server's event list and members can RSVP to get notified when it starts. In the event description, include the MCP servers that will be unavailable, the expected duration, and a link to the AliveMCP status page for live updates. When the maintenance window starts, AliveMCP suppresses alerts (if configured), the scheduled event starts, and members who RSVPed get a push notification. This is particularly useful for community-facing MCP servers where your users are also on the Discord server — they see the maintenance announcement in the events list before it happens rather than being surprised by the downtime alert.
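Scheduled events can also be created through the API, which needs a bot with permission to manage events. The sketch below builds the request body for an "external" event (one not tied to a voice or stage channel), which requires an end time and a location. The function names, the event title, and the use of the status page URL as the location are assumptions.

```javascript
// Builds the body for POST /guilds/{guild_id}/scheduled-events.
function buildMaintenanceEvent({ servers, start, end, statusUrl }) {
  if (end <= start) throw new Error('Maintenance window must end after it starts');
  return {
    name: 'MCP server maintenance',
    description: `Affected servers: ${servers.join(', ')}\nLive status: ${statusUrl}`,
    scheduled_start_time: start.toISOString(),
    scheduled_end_time: end.toISOString(),
    privacy_level: 2, // GUILD_ONLY
    entity_type: 3, // EXTERNAL
    entity_metadata: { location: statusUrl },
  };
}

// Sends the event to Discord. Not called here; needs a real guild ID and bot token.
async function createMaintenanceEvent(guildId, botToken, body) {
  const resp = await fetch(`https://discord.com/api/v10/guilds/${guildId}/scheduled-events`, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json', Authorization: `Bot ${botToken}` },
    body: JSON.stringify(body),
  });
  if (!resp.ok) throw new Error(`Scheduled event creation failed with HTTP ${resp.status}`);
  return resp.json();
}
```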
How many Discord channels should I dedicate to MCP server alerts?
For a solo author with 1–5 MCP servers, a single #mcp-alerts channel is sufficient. For a team with 5–20 servers across different projects, separate channels by severity: #ops-critical (production servers, pinged on any event), #ops-dependency (third-party MCP servers, no pings), and #ops-recovered (all recovery events, no pings; useful for daily review). More than three channels is usually noise overhead; resist the temptation to create per-server channels until you have more than 20 servers actively monitored. The message-edit deduplication strategy keeps a single channel clean even with high server counts, because each server has at most one active alert message at a time.
Further reading
- PagerDuty for MCP Servers — guaranteed on-call escalation
- OpsGenie for MCP Servers — team-based routing and scheduling
- MCP Server Alert Routing Architecture — multi-channel and deduplication design
- MCP Server Slack Alerts — channel routing and message formatting
- MCP Server Webhook Alerts — securing and verifying outbound webhooks
- MCP Server Incident Runbook — response playbook for common failures