Guide · Data Persistence

MCP server database migrations

Database migrations in a conventional REST API are run once per deploy, typically by a CI/CD step before traffic shifts to the new version. An MCP server with long-lived SSE sessions adds two complications. First, migration timing relative to the ready signal: if migrations run after the process manager marks the service as ready, tool handlers execute against an outdated schema before migration completes. The migration must complete before process.send('ready') (PM2) or sd_notify(READY=1) (systemd). Second, multi-replica migration races: on a rolling update with two replicas, both new replicas start simultaneously and both attempt to run migrations — concurrent migrations on the same table produce deadlocks or duplicate migration execution. An advisory lock or a migration-runner-as-init-container pattern solves the race. This guide covers the full migration lifecycle for both single-server and multi-replica MCP deployments.

TL;DR

Run migrations synchronously before signalling ready. For multi-replica deployments, run the migration as a separate init step (Kubernetes init container, Fly.io release_command) rather than in each replica's startup sequence. Keep every migration backward-compatible — the old code version must work against the new schema during a rolling update window.

Migration tools: comparison

ToolSchema sourceMigration file formatBest for
Prisma Migrateschema.prismaAuto-generated SQLTeams using Prisma ORM
Drizzle KitTypeScript schema filesAuto-generated SQLTeams using Drizzle ORM
node-pg-migrateJavaScript migration filesJS/TS with up/downPostgreSQL, raw SQL control
db-migrateJS migration filesJS with up/downMulti-database support
Flyway / LiquibaseSQL/XML filesVersioned SQL filesEnterprise, Java ecosystem
Raw SQL + version tableSQL files in migrations/Plain SQLZero-dependency, SQLite

Single-server migration: before the ready signal

// src/server.ts — single-server startup sequence
import { execSync } from 'child_process';
import { runMigrations } from './migrations.js';

async function start() {
  console.log('[startup] Running database migrations...');

  try {
    // For Prisma: prisma migrate deploy applies all pending migrations
    // For Drizzle: drizzle-kit push or drizzle migrate
    // For custom: see the raw SQL runner below
    execSync('npx prisma migrate deploy', {
      stdio: 'inherit',
      timeout: 60_000,  // 60-second timeout — fail fast on stuck migrations
    });
    console.log('[startup] Migrations complete');
  } catch (err) {
    // Migration failure is fatal — do not signal ready
    // The process manager (PM2, systemd) will restart with the current deploy
    // If the migration itself is broken, fix it and redeploy
    console.error('[startup] Migration failed — aborting:', err);
    process.exit(1);
  }

  // Proceed only after migrations succeed
  const httpServer = await startMcpServer();

  // Signal ready — process manager allows traffic now
  if (process.send) process.send('ready');  // PM2 wait_ready
  // For systemd: notify('READY=1')
}

The process.exit(1) on migration failure is intentional. PM2 and systemd interpret a non-zero exit during startup as a failed start attempt and apply the restart policy. After StartLimitBurst failures, the service is marked as failed and alerts fire — which is correct, because a broken migration requires a human (or a new deploy with a fixed migration) to resolve.

Multi-replica migration: advisory lock pattern

When two replicas start simultaneously (Kubernetes rolling update with maxSurge:1, blue-green deploy), both attempt migrations at the same moment. For PostgreSQL, use an advisory lock to serialise the migration across replicas:

// src/migrate.ts — PostgreSQL advisory lock for multi-replica safety
import pg from 'pg';

const MIGRATION_LOCK_ID = 1234567890;  // Arbitrary integer, unique per application

export async function runMigrationsWithLock(databaseUrl: string): Promise {
  const client = new pg.Client({ connectionString: databaseUrl });
  await client.connect();

  try {
    // pg_try_advisory_lock returns true if lock acquired, false if already held
    // The lock is automatically released when this client disconnects
    const { rows } = await client.query(
      'SELECT pg_try_advisory_lock($1) AS acquired',
      [MIGRATION_LOCK_ID]
    );

    if (!rows[0].acquired) {
      // Another replica holds the lock and is running migrations — wait for it
      console.log('[migrations] Waiting for peer to complete migrations...');
      await client.query(
        'SELECT pg_advisory_lock($1)',  // Blocking wait
        [MIGRATION_LOCK_ID]
      );
      console.log('[migrations] Peer completed migrations, proceeding');
      // Lock acquired after peer finished — migrations already applied, proceed
      return;
    }

    // Lock acquired — we are the migration runner
    console.log('[migrations] Acquired migration lock, running migrations...');
    // Run your migrations here
    // execSync('npx prisma migrate deploy', { stdio: 'inherit' });
    console.log('[migrations] Migrations complete');

  } finally {
    // Release lock (also released automatically on disconnect)
    await client.query('SELECT pg_advisory_unlock($1)', [MIGRATION_LOCK_ID]);
    await client.end();
  }
}

For SQLite, concurrent multi-replica migrations are not possible — SQLite does not support multiple concurrent writers. SQLite deployments are inherently single-process, so the race condition does not arise.

Migrations as a separate init step

For Kubernetes, the cleanest approach is a migration init container that runs before the main container starts. The main containers start only after the init container exits successfully:

# kubernetes/deployment.yaml
spec:
  initContainers:
    - name: migrate
      image: your-mcp-server:latest
      command: ["npx", "prisma", "migrate", "deploy"]
      env:
        - name: DATABASE_URL
          valueFrom:
            secretKeyRef:
              name: mcp-secrets
              key: DATABASE_URL
  containers:
    - name: mcp-server
      image: your-mcp-server:latest
      # Main container starts only after migrate init container exits 0
      # No migration logic needed in the application startup sequence

For Fly.io, the equivalent is release_command in fly.toml:

[deploy]
  release_command = "npx prisma migrate deploy"
  # Fly runs this command before routing traffic to the new machines
  # If it exits non-zero, the deploy is aborted and traffic stays on the old machines

Backward-compatible migration patterns

During a rolling update, the old code version and new code version run simultaneously for a window of 10–60 seconds. Any migration that removes a column used by old code, or adds a NOT NULL column without a default, breaks old code during this window. The backward-compatible migration sequence for adding a column:

StepDeployMigrationCode change
1Deploy NAdd column with DEFAULT (nullable or default value)Old code ignores new column
2Deploy N+1Backfill column values if neededNew code reads and writes new column
3Deploy N+2Add NOT NULL constraint (optional)Code depends on column being present

The backward-compatible pattern for removing a column: remove code references first (Deploy N), then drop the column (Deploy N+1). Never remove a column in the same deploy that removes the last code reference — the rolling update window means old code instances hit the new schema for tens of seconds.

Raw SQL migration runner (zero dependencies)

// src/migrations.ts — minimal migration runner for SQLite (no ORM required)
import Database from 'better-sqlite3';
import fs from 'fs';
import path from 'path';

export function runMigrations(db: Database.Database, migrationsDir: string): void {
  // Track applied migrations in a dedicated table
  db.exec(`
    CREATE TABLE IF NOT EXISTS _migrations (
      id         INTEGER PRIMARY KEY AUTOINCREMENT,
      filename   TEXT    NOT NULL UNIQUE,
      applied_at INTEGER NOT NULL DEFAULT (unixepoch())
    )
  `);

  const applied = new Set(
    (db.prepare('SELECT filename FROM _migrations').all() as { filename: string }[])
      .map(r => r.filename)
  );

  const files = fs.readdirSync(migrationsDir)
    .filter(f => f.endsWith('.sql'))
    .sort();  // Alphabetical order: 001_initial.sql, 002_add_users.sql, ...

  for (const file of files) {
    if (applied.has(file)) continue;

    const sql = fs.readFileSync(path.join(migrationsDir, file), 'utf8');

    // Each migration runs in a transaction — partial migration is rolled back
    const migrate = db.transaction(() => {
      db.exec(sql);
      db.prepare('INSERT INTO _migrations (filename) VALUES (?)').run(file);
    });

    console.log(`[migrations] Applying ${file}...`);
    migrate();
    console.log(`[migrations] Applied ${file}`);
  }
}

AliveMCP and migration failures

A deploy where the migration succeeds but introduces a schema mismatch with a tool handler (e.g., a column was renamed) shows up to AliveMCP as tool call errors rather than server downtime. The MCP server is up and responding to initialize, but tools/call returns isError: true. AliveMCP detects this within the next probe cycle and alerts on tool failure rate, giving you a 60-second detection window — much faster than user reports. This pattern distinguishes a bad migration from a full deployment failure.

Related questions

Can I roll back a Prisma migration?

Prisma Migrate does not generate rollback SQL automatically. To roll back, write a new migration that reverses the change (drop the column you added, recreate the table you altered) and deploy it. This is intentional — rollback SQL is error-prone when written automatically and rarely needed if the backward-compatible migration pattern is followed. For emergencies, maintain a migrations/emergency-rollback.sql file manually for destructive changes (column drops, table renames).

What is the difference between prisma migrate dev and prisma migrate deploy?

prisma migrate dev creates new migration files from schema changes and applies them — it is for development. prisma migrate deploy applies existing migration files without creating new ones — it is for production. Never run prisma migrate dev in production; it can create unexpected migration files and modify the schema tracking table in ways that break production migration state.

How do I handle a migration that takes too long?

Long-running migrations (adding an index to a large table, backfilling a new column) block the startup sequence. For PostgreSQL, use CREATE INDEX CONCURRENTLY (which does not lock the table but cannot run inside a transaction — use a separate migration file marked with -- prisma-migrate: disable-transactions). For backfills, run the backfill as a background job after startup rather than in the migration file itself. See MCP server zero-downtime deployment for the full deploy strategy that accommodates long migrations.

Further reading