Guide · Data Persistence
MCP server database migrations
Database migrations in a conventional REST API are run once per deploy, typically by a CI/CD step before traffic shifts to the new version. An MCP server with long-lived SSE sessions adds two complications. First, migration timing relative to the ready signal: if migrations run after the process manager marks the service as ready, tool handlers execute against an outdated schema before migration completes. The migration must complete before process.send('ready') (PM2) or sd_notify(READY=1) (systemd). Second, multi-replica migration races: on a rolling update with two replicas, both new replicas start simultaneously and both attempt to run migrations — concurrent migrations on the same table produce deadlocks or duplicate migration execution. An advisory lock or a migration-runner-as-init-container pattern solves the race. This guide covers the full migration lifecycle for both single-server and multi-replica MCP deployments.
TL;DR
Run migrations synchronously before signalling ready. For multi-replica deployments, run the migration as a separate init step (Kubernetes init container, Fly.io release_command) rather than in each replica's startup sequence. Keep every migration backward-compatible — the old code version must work against the new schema during a rolling update window.
Migration tools: comparison
| Tool | Schema source | Migration file format | Best for |
|---|---|---|---|
| Prisma Migrate | schema.prisma | Auto-generated SQL | Teams using Prisma ORM |
| Drizzle Kit | TypeScript schema files | Auto-generated SQL | Teams using Drizzle ORM |
| node-pg-migrate | JavaScript migration files | JS/TS with up/down | PostgreSQL, raw SQL control |
| db-migrate | JS migration files | JS with up/down | Multi-database support |
| Flyway / Liquibase | SQL/XML files | Versioned SQL files | Enterprise, Java ecosystem |
| Raw SQL + version table | SQL files in migrations/ | Plain SQL | Zero-dependency, SQLite |
Single-server migration: before the ready signal
// src/server.ts — single-server startup sequence
import { execSync } from 'child_process';
import { runMigrations } from './migrations.js';
async function start() {
console.log('[startup] Running database migrations...');
try {
// For Prisma: prisma migrate deploy applies all pending migrations
// For Drizzle: drizzle-kit push or drizzle migrate
// For custom: see the raw SQL runner below
execSync('npx prisma migrate deploy', {
stdio: 'inherit',
timeout: 60_000, // 60-second timeout — fail fast on stuck migrations
});
console.log('[startup] Migrations complete');
} catch (err) {
// Migration failure is fatal — do not signal ready
// The process manager (PM2, systemd) will restart with the current deploy
// If the migration itself is broken, fix it and redeploy
console.error('[startup] Migration failed — aborting:', err);
process.exit(1);
}
// Proceed only after migrations succeed
const httpServer = await startMcpServer();
// Signal ready — process manager allows traffic now
if (process.send) process.send('ready'); // PM2 wait_ready
// For systemd: notify('READY=1')
}
The process.exit(1) on migration failure is intentional. PM2 and systemd interpret a non-zero exit during startup as a failed start attempt and apply the restart policy. After StartLimitBurst failures, the service is marked as failed and alerts fire — which is correct, because a broken migration requires a human (or a new deploy with a fixed migration) to resolve.
Multi-replica migration: advisory lock pattern
When two replicas start simultaneously (Kubernetes rolling update with maxSurge:1, blue-green deploy), both attempt migrations at the same moment. For PostgreSQL, use an advisory lock to serialise the migration across replicas:
// src/migrate.ts — PostgreSQL advisory lock for multi-replica safety
import pg from 'pg';
const MIGRATION_LOCK_ID = 1234567890; // Arbitrary integer, unique per application
export async function runMigrationsWithLock(databaseUrl: string): Promise {
const client = new pg.Client({ connectionString: databaseUrl });
await client.connect();
try {
// pg_try_advisory_lock returns true if lock acquired, false if already held
// The lock is automatically released when this client disconnects
const { rows } = await client.query(
'SELECT pg_try_advisory_lock($1) AS acquired',
[MIGRATION_LOCK_ID]
);
if (!rows[0].acquired) {
// Another replica holds the lock and is running migrations — wait for it
console.log('[migrations] Waiting for peer to complete migrations...');
await client.query(
'SELECT pg_advisory_lock($1)', // Blocking wait
[MIGRATION_LOCK_ID]
);
console.log('[migrations] Peer completed migrations, proceeding');
// Lock acquired after peer finished — migrations already applied, proceed
return;
}
// Lock acquired — we are the migration runner
console.log('[migrations] Acquired migration lock, running migrations...');
// Run your migrations here
// execSync('npx prisma migrate deploy', { stdio: 'inherit' });
console.log('[migrations] Migrations complete');
} finally {
// Release lock (also released automatically on disconnect)
await client.query('SELECT pg_advisory_unlock($1)', [MIGRATION_LOCK_ID]);
await client.end();
}
}
For SQLite, concurrent multi-replica migrations are not possible — SQLite does not support multiple concurrent writers. SQLite deployments are inherently single-process, so the race condition does not arise.
Migrations as a separate init step
For Kubernetes, the cleanest approach is a migration init container that runs before the main container starts. The main containers start only after the init container exits successfully:
# kubernetes/deployment.yaml
spec:
initContainers:
- name: migrate
image: your-mcp-server:latest
command: ["npx", "prisma", "migrate", "deploy"]
env:
- name: DATABASE_URL
valueFrom:
secretKeyRef:
name: mcp-secrets
key: DATABASE_URL
containers:
- name: mcp-server
image: your-mcp-server:latest
# Main container starts only after migrate init container exits 0
# No migration logic needed in the application startup sequence
For Fly.io, the equivalent is release_command in fly.toml:
[deploy]
release_command = "npx prisma migrate deploy"
# Fly runs this command before routing traffic to the new machines
# If it exits non-zero, the deploy is aborted and traffic stays on the old machines
Backward-compatible migration patterns
During a rolling update, the old code version and new code version run simultaneously for a window of 10–60 seconds. Any migration that removes a column used by old code, or adds a NOT NULL column without a default, breaks old code during this window. The backward-compatible migration sequence for adding a column:
| Step | Deploy | Migration | Code change |
|---|---|---|---|
| 1 | Deploy N | Add column with DEFAULT (nullable or default value) | Old code ignores new column |
| 2 | Deploy N+1 | Backfill column values if needed | New code reads and writes new column |
| 3 | Deploy N+2 | Add NOT NULL constraint (optional) | Code depends on column being present |
The backward-compatible pattern for removing a column: remove code references first (Deploy N), then drop the column (Deploy N+1). Never remove a column in the same deploy that removes the last code reference — the rolling update window means old code instances hit the new schema for tens of seconds.
Raw SQL migration runner (zero dependencies)
// src/migrations.ts — minimal migration runner for SQLite (no ORM required)
import Database from 'better-sqlite3';
import fs from 'fs';
import path from 'path';
export function runMigrations(db: Database.Database, migrationsDir: string): void {
// Track applied migrations in a dedicated table
db.exec(`
CREATE TABLE IF NOT EXISTS _migrations (
id INTEGER PRIMARY KEY AUTOINCREMENT,
filename TEXT NOT NULL UNIQUE,
applied_at INTEGER NOT NULL DEFAULT (unixepoch())
)
`);
const applied = new Set(
(db.prepare('SELECT filename FROM _migrations').all() as { filename: string }[])
.map(r => r.filename)
);
const files = fs.readdirSync(migrationsDir)
.filter(f => f.endsWith('.sql'))
.sort(); // Alphabetical order: 001_initial.sql, 002_add_users.sql, ...
for (const file of files) {
if (applied.has(file)) continue;
const sql = fs.readFileSync(path.join(migrationsDir, file), 'utf8');
// Each migration runs in a transaction — partial migration is rolled back
const migrate = db.transaction(() => {
db.exec(sql);
db.prepare('INSERT INTO _migrations (filename) VALUES (?)').run(file);
});
console.log(`[migrations] Applying ${file}...`);
migrate();
console.log(`[migrations] Applied ${file}`);
}
}
AliveMCP and migration failures
A deploy where the migration succeeds but introduces a schema mismatch with a tool handler (e.g., a column was renamed) shows up to AliveMCP as tool call errors rather than server downtime. The MCP server is up and responding to initialize, but tools/call returns isError: true. AliveMCP detects this within the next probe cycle and alerts on tool failure rate, giving you a 60-second detection window — much faster than user reports. This pattern distinguishes a bad migration from a full deployment failure.
Related questions
Can I roll back a Prisma migration?
Prisma Migrate does not generate rollback SQL automatically. To roll back, write a new migration that reverses the change (drop the column you added, recreate the table you altered) and deploy it. This is intentional — rollback SQL is error-prone when written automatically and rarely needed if the backward-compatible migration pattern is followed. For emergencies, maintain a migrations/emergency-rollback.sql file manually for destructive changes (column drops, table renames).
What is the difference between prisma migrate dev and prisma migrate deploy?
prisma migrate dev creates new migration files from schema changes and applies them — it is for development. prisma migrate deploy applies existing migration files without creating new ones — it is for production. Never run prisma migrate dev in production; it can create unexpected migration files and modify the schema tracking table in ways that break production migration state.
How do I handle a migration that takes too long?
Long-running migrations (adding an index to a large table, backfilling a new column) block the startup sequence. For PostgreSQL, use CREATE INDEX CONCURRENTLY (which does not lock the table but cannot run inside a transaction — use a separate migration file marked with -- prisma-migrate: disable-transactions). For backfills, run the backfill as a background job after startup rather than in the migration file itself. See MCP server zero-downtime deployment for the full deploy strategy that accommodates long migrations.
Further reading
- MCP server SQLite — embedded database migrations with the raw SQL runner
- MCP server Prisma — Prisma Migrate integration with the startup sequence
- MCP server Drizzle ORM — Drizzle Kit migration generation and deployment
- MCP server zero-downtime deployment — backward-compatible migrations during rolling updates
- MCP server PM2 — wait_ready pattern that holds traffic until migrations complete
- MCP server systemd — Type=notify equivalent that delays traffic until migrations complete
- MCP server Fly.io — release_command for migrations before Fly routes traffic
- MCP server deployment — full deploy workflow including migration step
- AliveMCP — detects schema mismatch errors from bad migrations within 60 seconds