MCP server CI/CD
An MCP server CI/CD pipeline needs three gates that don't exist for ordinary HTTP services: a protocol compliance test (verifying initialize response shape), a schema snapshot gate (failing if tools/list changes unexpectedly), and a post-deploy probe check (running the same initialize → tools/list sequence that your monitoring runs continuously). Skip any of these and you're shipping blind — protocol regressions reach production silently, schema changes break AI clients that cached the tool list, and deploy-phase failures go unnoticed until a user reports them.
TL;DR
Wire three gates into your CI/CD pipeline: (1) run protocol compliance and schema snapshot tests on every push; (2) block merge if the schema hash changed without a new baseline file; (3) after every production deploy, wait for AliveMCP (or a manual probe) to confirm the initialize → tools/list sequence passes before marking the deploy successful. If the post-deploy probe fails within five minutes, trigger an automatic rollback. The schema snapshot file and the monitoring probe are both derived from the same tools/list call — keep them in sync.
Pipeline structure
A complete MCP server CI/CD pipeline has four stages that run sequentially: build, test, deploy, and verify. The test stage is where MCP-specific gates live. The verify stage is new compared with traditional HTTP services — it connects to the freshly deployed server and confirms the MCP session lifecycle completes before the deploy is declared successful.
name: MCP server CI/CD
on:
  push:
    branches: [main]
  pull_request:
jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with: { node-version: '22' }
      - run: npm ci
      - run: npm run build
  test:
    needs: build
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with: { node-version: '22' }
      - run: npm ci && npm run build
      - name: Start server
        run: node dist/index.js &
        env:
          PORT: 3001
          NODE_ENV: test
      - name: Wait for initialize probe
        run: |
          # Streamable HTTP servers may reject requests whose Accept header
          # does not list both content types, so send it explicitly.
          for i in $(seq 1 20); do
            curl -sf -X POST http://localhost:3001/mcp \
              -H 'Content-Type: application/json' \
              -H 'Accept: application/json, text/event-stream' \
              -d '{"jsonrpc":"2.0","id":1,"method":"initialize","params":{"protocolVersion":"2024-11-05","capabilities":{},"clientInfo":{"name":"ci","version":"1"}}}' \
              | grep -q protocolVersion && exit 0
            sleep 1
          done
          # Fail the step instead of letting the tests run against a dead server.
          echo "Server did not answer initialize after 20 attempts"
          exit 1
      - run: npm test  # runs compliance + snapshot + integration
  deploy:
    needs: test
    if: github.ref == 'refs/heads/main'
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: superfly/flyctl-actions/setup-flyctl@master  # installs flyctl on the runner
      - run: flyctl deploy --remote-only
        env:
          FLY_API_TOKEN: ${{ secrets.FLY_API_TOKEN }}
  verify:
    needs: deploy
    runs-on: ubuntu-latest
    steps:
      - uses: superfly/flyctl-actions/setup-flyctl@master  # needed for the rollback command
      - name: Post-deploy probe
        run: |
          for i in $(seq 1 30); do
            curl -sf -X POST https://your-app.fly.dev/mcp \
              -H 'Content-Type: application/json' \
              -H 'Accept: application/json, text/event-stream' \
              -d '{"jsonrpc":"2.0","id":1,"method":"initialize","params":{"protocolVersion":"2024-11-05","capabilities":{},"clientInfo":{"name":"deploy-verify","version":"1"}}}' \
              | grep -q protocolVersion && echo "Deploy verified" && exit 0
            echo "Attempt $i failed, retrying..."
            sleep 10
          done
          echo "Post-deploy verification failed, rolling back"
          flyctl releases rollback --app your-app
          exit 1
        env:
          FLY_API_TOKEN: ${{ secrets.FLY_API_TOKEN }}
The verify job re-runs the initialize probe against the production URL with a 5-minute window (30 attempts × 10-second sleep). If the probe doesn't pass within that window, it triggers an automatic rollback using flyctl releases rollback. This catches the most common deploy failure: the server starts but can't complete the MCP handshake due to a missing environment variable, a failed database connection, or a broken dependency load. See MCP server deployment for the full post-deploy verification checklist.
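The curl probes above only grep for protocolVersion, which confirms the server answered but not that the response has the right shape. A compliance test can check more of it. The following is a minimal sketch, assuming the initialize result carries protocolVersion, capabilities, and serverInfo with name and version; checkInitializeResponse is a hypothetical helper name, not part of any SDK.

```javascript
// Returns a list of human-readable problems. An empty list means the
// message looks like a well-formed initialize response.
function checkInitializeResponse(message, expectedId = 1) {
  const problems = [];
  if (message?.jsonrpc !== '2.0') problems.push('jsonrpc must be "2.0"');
  if (message?.id !== expectedId) problems.push(`id must echo the request id (${expectedId})`);
  if (message?.error) problems.push(`server returned error: ${message.error.message}`);

  const result = message?.result;
  if (typeof result !== 'object' || result === null) {
    problems.push('missing result object');
    return problems;
  }
  if (typeof result.protocolVersion !== 'string') {
    problems.push('result.protocolVersion must be a string');
  }
  if (typeof result.capabilities !== 'object' || result.capabilities === null) {
    problems.push('result.capabilities must be an object');
  }
  if (typeof result.serverInfo?.name !== 'string') {
    problems.push('result.serverInfo.name must be a string');
  }
  if (typeof result.serverInfo?.version !== 'string') {
    problems.push('result.serverInfo.version must be a string');
  }
  return problems;
}
```

In a test, assert that the returned list is empty and print its contents on failure, so the CI log names the field that regressed.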
Schema snapshot gate
The schema snapshot gate prevents unreviewed tools/list changes from reaching production. AI clients that call your server often cache the tool schema after the first tools/list call. If you add, remove, or rename a tool without the client knowing, the client's cached schema is stale — it may call a tool that no longer exists or miss a new tool entirely. See schema drift in MCP tool definitions for why this causes silent breakage.
The gate works by committing a baseline hash file alongside the code. CI computes the current hash and compares it. If it differs, the test fails with a message prompting the developer to review and commit a new baseline:
// test/schema-snapshot.test.js
// Assumes a test runner that provides a global `it` (Jest, Mocha, or Vitest with globals enabled).
import { createHash } from 'node:crypto';
import { readFileSync, writeFileSync, existsSync } from 'node:fs';
import { getToolsViaSession } from './helpers.js';

const SNAPSHOT_PATH = './test/schema-snapshot.json';

it('tools/list matches committed snapshot', async () => {
  const tools = await getToolsViaSession('http://localhost:3001/mcp');
  // Copy before sorting so the helper's array is not mutated, and keep only
  // the fields that clients depend on.
  const sorted = [...tools]
    .sort((a, b) => a.name.localeCompare(b.name))
    .map(t => ({ name: t.name, description: t.description, inputSchema: t.inputSchema }));
  const hash = createHash('sha256').update(JSON.stringify(sorted)).digest('hex');

  if (!existsSync(SNAPSHOT_PATH)) {
    // In CI a missing baseline must fail; otherwise the gate passes vacuously.
    if (process.env.CI) {
      throw new Error('test/schema-snapshot.json is missing. Generate it locally and commit it.');
    }
    writeFileSync(SNAPSHOT_PATH, JSON.stringify({ hash, tools: sorted }, null, 2));
    console.log('Snapshot created. Review and commit test/schema-snapshot.json.');
    return;
  }

  const { hash: baseline } = JSON.parse(readFileSync(SNAPSHOT_PATH, 'utf8'));
  if (hash !== baseline) {
    throw new Error(
      'Tool schema changed. If intentional, delete test/schema-snapshot.json, ' +
      're-run tests to create a new snapshot, review the diff, then commit.'
    );
  }
});
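The test imports getToolsViaSession from ./helpers.js without showing it. One possible shape for that helper is sketched below, assuming the server speaks JSON-RPC over HTTP POST, may return either a JSON body or a single SSE event, and may issue an Mcp-Session-Id header on initialize. The fetchImpl parameter is an addition for testability and is not part of the original signature.

```javascript
// Possible contents of test/helpers.js (a sketch, not the original helper).
const HEADERS = {
  'Content-Type': 'application/json',
  Accept: 'application/json, text/event-stream',
};

// Extract the JSON-RPC message from a JSON body or from a single SSE event.
function parseBody(contentType, text) {
  if (!contentType.includes('text/event-stream')) return JSON.parse(text);
  const dataLine = text.split('\n').find(line => line.startsWith('data:'));
  if (!dataLine) throw new Error('SSE response contained no data line');
  return JSON.parse(dataLine.slice('data:'.length));
}

async function post(fetchImpl, url, body, sessionId) {
  const headers = sessionId ? { ...HEADERS, 'Mcp-Session-Id': sessionId } : HEADERS;
  const res = await fetchImpl(url, { method: 'POST', headers, body: JSON.stringify(body) });
  if (!res.ok) throw new Error(`${body.method} failed with HTTP ${res.status}`);
  const text = await res.text();
  return {
    // Keep the session id from initialize for the follow-up requests.
    sessionId: res.headers.get('mcp-session-id') ?? sessionId,
    // Notifications are acknowledged with an empty body.
    message: text ? parseBody(res.headers.get('content-type') ?? '', text) : null,
  };
}

// Runs initialize, then notifications/initialized, then tools/list.
async function getToolsViaSession(url, fetchImpl = fetch) {
  const init = await post(fetchImpl, url, {
    jsonrpc: '2.0',
    id: 1,
    method: 'initialize',
    params: { protocolVersion: '2024-11-05', capabilities: {}, clientInfo: { name: 'ci', version: '1' } },
  });
  if (!init.message?.result) throw new Error('initialize returned no result');

  await post(fetchImpl, url, { jsonrpc: '2.0', method: 'notifications/initialized' }, init.sessionId);

  const list = await post(fetchImpl, url, { jsonrpc: '2.0', id: 2, method: 'tools/list' }, init.sessionId);
  if (!Array.isArray(list.message?.result?.tools)) throw new Error('tools/list returned no tools array');
  return list.message.result.tools;
}
```

To use it from the snapshot test, export getToolsViaSession from test/helpers.js. The sketch reads each response to completion, so it assumes an SSE response closes after delivering its single message.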
Commit test/schema-snapshot.json to your repository. The snapshot file serves two purposes: CI gate (automated) and change log (human). Every time the schema changes intentionally, there's a commit showing exactly what changed and who approved it. This is the same information your monitoring dashboard shows when it detects live schema drift — the CI snapshot is the pre-production layer, AliveMCP is the post-deployment continuous layer.
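A bare hash mismatch says that something changed but not what. Because the snapshot file also stores the sorted tool list, the failure message can name the affected tools. A minimal sketch, assuming both lists contain objects with a unique name field; diffToolSchemas is a hypothetical helper name.

```javascript
// Compare two tool lists by name and report which tools were added,
// removed, or changed. Comparison uses JSON.stringify, so, like the
// snapshot hash, it is sensitive to property order inside each tool.
function diffToolSchemas(baselineTools, currentTools) {
  const byName = tools => new Map(tools.map(t => [t.name, JSON.stringify(t)]));
  const before = byName(baselineTools);
  const after = byName(currentTools);

  const added = [...after.keys()].filter(name => !before.has(name)).sort();
  const removed = [...before.keys()].filter(name => !after.has(name)).sort();
  const changed = [...after.keys()]
    .filter(name => before.has(name) && before.get(name) !== after.get(name))
    .sort();

  return { added, removed, changed };
}
```

Including the result in the thrown error gives reviewers the list of affected tools directly in the CI log, before they open the snapshot diff.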
Environment variable injection in CI/CD
MCP servers typically need secrets (API keys, database URLs, signing keys) that must not appear in source code or CI logs. The right pattern is to inject secrets via the CI platform's secret store and map them to the deployment platform's secret store — never passing them through the pipeline as plain-text values.
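# GitHub Actions: inject secrets into deployment
- name: Set secrets on Fly.io
  run: |
    # Pipe NAME=VALUE pairs on stdin so the values never appear as arguments
    # to an external command (printf is a shell builtin).
    printf '%s\n' \
      "OPENAI_API_KEY=$OPENAI_API_KEY" \
      "DATABASE_URL=$DATABASE_URL" \
      "MCP_SIGNING_KEY=$MCP_SIGNING_KEY" \
      | flyctl secrets import --app your-app
  env:
    FLY_API_TOKEN: ${{ secrets.FLY_API_TOKEN }}
    OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}
    DATABASE_URL: ${{ secrets.DATABASE_URL }}
    MCP_SIGNING_KEY: ${{ secrets.MCP_SIGNING_KEY }}
# The deploy step then picks up the secrets from Fly.io's secret store.
# They're injected at container start, not baked into the image.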
Three principles for CI/CD secret hygiene: (1) never pass secrets as CLI arguments, which appear in ps aux output and can end up in CI logs (pipe them on stdin or use environment variables instead); (2) the CI runner should only have permission to write secrets to the deployment platform, not read them back; (3) rotate secrets on a schedule, not only when exposure is suspected. See MCP server environment variables for a full breakdown of configuration patterns by platform.
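A missing secret is one of the deploy failures the post-deploy probe is meant to catch, and the server can make that failure explicit by checking its configuration at startup. A minimal sketch, assuming the variable names from the Fly.io example; requireEnv is a hypothetical helper name.

```javascript
// Fail fast at startup when a required variable is absent or empty.
// The error lists variable names only, so no secret value reaches the logs.
function requireEnv(names, env = process.env) {
  const missing = names.filter(name => !env[name]);
  if (missing.length > 0) {
    throw new Error(`Missing required environment variables: ${missing.join(', ')}`);
  }
  return Object.fromEntries(names.map(name => [name, env[name]]));
}
```

Calling requireEnv(['OPENAI_API_KEY', 'DATABASE_URL', 'MCP_SIGNING_KEY']) before the server starts listening turns a misconfigured deploy into an immediate crash with a named cause, rather than a handshake that fails later.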
Branch strategy and deploy gates
A simple but effective branch strategy for MCP server development:
- Feature branches: run build + test (protocol compliance + schema snapshot + integration). Deploy to a preview environment if you have one.
- main: run build + test + deploy to staging + probe staging. If the staging probe passes, deploy to production + probe production.
- Schema changes: require a new committed snapshot file before the PR can merge. Make the snapshot test a required status check in your branch protection rule, so a PR whose tools/list hash differs from the committed baseline cannot merge until the updated snapshot file is part of the PR.
If your platform supports deploy previews (Vercel, Railway, Render), wire a protocol compliance probe against every preview URL in your PR check. This catches transport misconfigurations early — a preview deploy that returns connect ECONNREFUSED on the MCP endpoint indicates a port binding issue that will also fail in production.
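When a preview probe fails, the error code usually points at the misconfigured layer. The sketch below maps a few common codes to likely causes. It assumes a Node-based probe, where fetch reports the underlying socket error on error.cause; classifyProbeFailure is a hypothetical helper name and the list of codes is illustrative, not exhaustive.

```javascript
// Translate a failed probe's error into a hint about the likely cause.
function classifyProbeFailure(error) {
  // Node's fetch wraps the socket error in `cause`; raw socket errors
  // carry `code` directly.
  const code = error?.cause?.code ?? error?.code;
  switch (code) {
    case 'ECONNREFUSED':
      return 'nothing is listening on the target port (check port binding)';
    case 'ENOTFOUND':
      return 'hostname did not resolve (check the preview URL)';
    case 'ETIMEDOUT':
    case 'UND_ERR_CONNECT_TIMEOUT':
      return 'connection timed out (check firewall rules or cold start)';
    default:
      return `unclassified failure: ${error?.message ?? String(error)}`;
  }
}
```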
Rollback strategy
The verify job's rollback command (flyctl releases rollback) reverts to the last successful image. This is the right default for most teams. A more sophisticated rollback strategy uses the monitoring probe state directly:
// Rollback trigger: if AliveMCP probe fails for 3+ consecutive checks after deploy.
// This is a webhook handler called by AliveMCP on status change.
// verifySignature, isWithinDeployWindow, and triggerRollback are application
// helpers defined elsewhere; verifySignature is assumed to throw on a bad signature.
app.post('/webhooks/alivemcp', (req, res) => {
  try {
    verifySignature(req); // always verify the HMAC-SHA256 signature before trusting the body
  } catch {
    return res.sendStatus(401);
  }
  const { event, dedup_key, server_slug, failure_layer } = req.body;
  if (event === 'down' && isWithinDeployWindow(dedup_key)) {
    // A failure within 10 minutes of a deploy is likely deploy-related
    triggerRollback({ reason: `MCP probe failed on ${failure_layer} after deploy` });
  }
  res.sendStatus(200);
});
This ties the rollback decision to the external monitoring signal rather than just the CI post-deploy probe. The CI probe runs once at deploy time; AliveMCP runs continuously. If a deploy initially passes the CI probe but degrades within the first 10 minutes under real traffic, the webhook-triggered rollback catches it. See MCP server webhook alerts for the full webhook payload schema and HMAC verification pattern.
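The handler relies on verifySignature and isWithinDeployWindow without defining them. The sketch below shows one way they could look. Several details are assumptions and should be checked against the AliveMCP webhook documentation: the header name x-alivemcp-signature, the hex encoding of the signature, the raw body being available as req.rawBody, the WEBHOOK_SECRET variable, and the in-memory map of deploy times.

```javascript
import { createHmac, timingSafeEqual } from 'node:crypto';

const SIGNATURE_HEADER = 'x-alivemcp-signature'; // hypothetical header name
const DEPLOY_WINDOW_MS = 10 * 60 * 1000; // the 10-minute window from the handler

// Throws unless the header equals hex(HMAC-SHA256(secret, raw body)).
// Assumes the body parser kept the raw bytes, for example with
// express.json({ verify: (req, _res, buf) => { req.rawBody = buf; } }).
function verifySignature(req, secret = process.env.WEBHOOK_SECRET) {
  const received = Buffer.from(String(req.headers[SIGNATURE_HEADER] ?? ''), 'hex');
  const expected = createHmac('sha256', secret).update(req.rawBody).digest();
  // timingSafeEqual throws on length mismatch, so compare lengths first.
  if (received.length !== expected.length || !timingSafeEqual(received, expected)) {
    throw new Error('Invalid webhook signature');
  }
}

// Deploy timestamps keyed by whatever identifier the deploy step records.
// In-memory only; a real setup would persist this across restarts.
const lastDeployAt = new Map();

function recordDeploy(key, at = Date.now()) {
  lastDeployAt.set(key, at);
}

function isWithinDeployWindow(key, now = Date.now()) {
  const at = lastDeployAt.get(key);
  return at !== undefined && now >= at && now - at <= DEPLOY_WINDOW_MS;
}
```

The comparison uses timingSafeEqual rather than === so that response time does not reveal how many leading bytes of a forged signature were correct.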
Related questions
Should CI tests run against a mock server or the real server?
Always the real server started locally in CI — never a mock. A mock cannot verify protocol compliance or schema stability because those properties are defined by the actual server implementation. Start your server on a test port, run the protocol probe to confirm it's ready, then run all three test layers. The CI overhead is 10–30 seconds for most Node.js MCP servers, which is acceptable for a gate that prevents protocol regressions from reaching production.
How do I handle secrets in PR preview deploys?
Use a separate set of test secrets with minimal permissions for preview environments. Never share production secrets with preview deploys. For tools that call external APIs, use a sandbox API key or a fixture server that returns realistic but fake data. The goal of a preview deploy is to verify the MCP session lifecycle, not to run production-scale tool calls.
What's the difference between the CI schema snapshot and AliveMCP's schema monitoring?
The CI snapshot is a pre-deployment gate — it fails the build if the schema changed without a committed baseline. AliveMCP's schema monitoring is a post-deployment continuous check — it alerts you if the live server's tools/list hash drifts from the baseline registered at deploy time. You need both: CI catches intentional but unreviewed changes before they ship; AliveMCP catches unintended drift after they ship (e.g., a runtime dependency that mutates the tool list based on configuration).
How long should the post-deploy verify window be?
Five minutes (30 attempts at 10-second intervals) is right for always-on servers. For serverless or container-on-demand platforms (Fly.io machines that scale to zero, Railway, Render), budget 8–10 minutes to account for cold-start time. If your server consistently takes more than 2 minutes to cold-start, add a health endpoint that returns 200 before the MCP layer is ready — the CI probe targets the health endpoint first, then runs the MCP probe once the health gate passes. See MCP server cold start for cold-start suppression patterns.
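The two-phase wait described above (health endpoint first, MCP probe second) can be expressed as a small retry helper. This is a sketch under stated assumptions: waitFor and verifyDeploy are hypothetical names, the caller supplies healthCheck and mcpProbe as functions that return true when ready, and the 8-minute and 2-minute budgets are illustrative values chosen to stay within the 8 to 10 minute range.

```javascript
// Retry `check` until it returns a truthy value or the attempts run out.
// Resolves with the attempt number that succeeded; rejects on exhaustion.
async function waitFor(check, { attempts, delayMs, sleep = ms => new Promise(r => setTimeout(r, ms)) }) {
  for (let attempt = 1; attempt <= attempts; attempt++) {
    try {
      if (await check()) return attempt;
    } catch {
      // A thrown error (for example connection refused) means "not ready yet".
    }
    if (attempt < attempts) await sleep(delayMs);
  }
  throw new Error(`check did not pass within ${attempts} attempts`);
}

// Phase 1 waits for the process to boot; phase 2 waits for the MCP handshake.
async function verifyDeploy({ healthCheck, mcpProbe, sleep }) {
  // Up to 8 minutes for a cold start (48 attempts x 10 s).
  await waitFor(healthCheck, { attempts: 48, delayMs: 10_000, sleep });
  // Up to 2 more minutes for initialize to succeed (12 attempts x 10 s).
  await waitFor(mcpProbe, { attempts: 12, delayMs: 10_000, sleep });
}
```

Splitting the budget this way keeps a slow cold start from consuming the time reserved for the MCP probe, and the failure tells you which phase never became ready.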
Further reading
- MCP server testing — protocol compliance, schema snapshots, and session integration
- MCP server deployment — post-deploy verification checklist
- MCP server environment variables — secrets injection and runtime config
- MCP server Docker — containerization and health checks
- MCP server Kubernetes — probes, PDB, and rolling deploys
- MCP server webhook alerts — payload schema and HMAC verification
- Schema drift in MCP tool definitions — detection and rollback
- AliveMCP — production monitoring that runs your probe sequence every 60 seconds