MCP server feature flags

Feature flags for MCP servers solve a different problem than feature flags for web applications. A web page renders once per request, so a UI feature gated behind a flag reaches only the requests that evaluate the flag as on. An MCP server exposes a tool surface that clients cache and depend on for the lifetime of a session, so changing which tools are registered, or changing a tool's schema, mid-session requires care. Flags that control tool registration belong at session initialisation time. Flags that control tool behaviour can be evaluated per call. Understanding which is which prevents the most disruptive category of MCP flag-related bugs: a client that cached one tool list and then calls tools that no longer exist.

TL;DR

Evaluate tool-registration flags at initialize time, per session, so each session gets a consistent tool surface for its lifetime. Evaluate per-call behaviour flags inside the tool handler on each invocation. For simple deployments, use a comma-separated environment variable (ENABLED_FEATURES=pdf_export,v2_search) parsed at startup. For runtime flag changes without restart, use a Redis-backed flag store with a pub/sub invalidation channel. Per-tenant flags, where enterprise tenants have access to more tools, belong in the session context map, evaluated from the tenant's database row at initialize time. AliveMCP probes detect when a flag change silently breaks tool registration: if the server starts but tools/list returns an unexpected tool count, the probe result shows the change.

Three categories of flags in MCP servers

The distinction that matters most for MCP servers is when the flag is evaluated:

Flag category	When evaluated	What it controls	Example
Tool-registration flags	At `initialize` (once per session)	Which tools the session can call	`enable_pdf_export`, `v2_search_tool`
Behaviour flags	Per tool call	How a registered tool operates	`use_semantic_search`, `verbose_output`
Infrastructure flags	At process start	Which adapters and connections to open	`use_redis_cache`, `enable_queue`

Infrastructure flags must be evaluated at startup because they determine what connections createDeps() opens. Changing them requires a restart. Tool-registration flags should be evaluated at initialize time — not at startup — so that different sessions (or different tenants sharing the same server) can have different tool surfaces without a restart. Behaviour flags can be evaluated on every call because they do not affect the schema that clients cache.
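
As a minimal sketch of the per-call category, the handler below reads a behaviour flag on every invocation. `FlagReader`, `makeSearchHandler`, and the two strategy names are hypothetical and exist only for illustration; they are not part of any MCP SDK.

```typescript
// Hypothetical flag reader: any function that returns the flag's current value.
type FlagReader = (flag: string) => boolean;

type SearchResult = { strategy: 'semantic' | 'keyword'; query: string };

// The tool's input (a single query string) is the same on both branches,
// so flipping the flag never invalidates the schema a client has cached.
function makeSearchHandler(isEnabled: FlagReader) {
  return function handleSearch(query: string): SearchResult {
    // Evaluated on every call, not captured at registration time.
    const strategy = isEnabled('use_semantic_search') ? 'semantic' : 'keyword';
    return { strategy, query };
  };
}

// Usage: the flag flips between two calls within the same session.
const currentFlags = new Set<string>();
const search = makeSearchHandler((flag) => currentFlags.has(flag));

const before = search('quarterly report');
currentFlags.add('use_semantic_search');
const after = search('quarterly report');
console.log(before.strategy, after.strategy); // keyword semantic
```

Passing the reader in as a function, rather than a snapshot of values, is what keeps the evaluation per call.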

Simple environment-variable flags for single-tenant deployments

For servers where all sessions share the same flag state, parse flags from an environment variable at startup and evaluate them at initialize time:

// flags.ts — parse once at startup, evaluate at session time
const ENABLED_FEATURES = new Set(
  (process.env.ENABLED_FEATURES ?? '')
    .split(',')
    .map(s => s.trim())
    .filter(Boolean)
);

export function isEnabled(flag: string): boolean {
  return ENABLED_FEATURES.has(flag);
}

// In createDeps() — infrastructure flags evaluated here:
export async function createDeps(): Promise<Deps> {
  const config = parseConfig();
  const useCache = isEnabled('redis_cache'); // infrastructure flag
  const cache = useCache ? new Redis(config.REDIS_URL!) : null;
  // ...
}

// registerTools.ts: tool-registration flags evaluated per session
export function registerToolsForSession(server: McpServer, deps: Deps, flags: Set<string>) {
  // Base tools registered for all sessions
  registerSearchTools(server, deps);
  registerReadTools(server, deps);

  // Flagged tools: only registered when the flag is on
  if (flags.has('v2_search')) {
    registerSearchV2Tools(server, deps);
  }
  if (flags.has('pdf_export')) {
    registerPdfExportTools(server, deps);
  }
}

// At initialize time, resolve the session's flag set and register tools.
// `deps` is the object returned by createDeps() at startup.
session.on('initialize', async () => {
  const sessionFlags = new Set(ENABLED_FEATURES); // copy; could add per-session overrides here
  const sessionServer = new McpServer({ name: 'myserver', version: '1.0.0' });
  registerToolsForSession(sessionServer, deps, sessionFlags);
});

Runtime flag changes without restart: Redis-backed flags

Environment-variable flags require a restart to change. For flags that need to flip in production without a deployment, store them in Redis and subscribe to a pub/sub channel for invalidation:

// flag-store.ts — Redis-backed flags with in-memory cache and pub/sub invalidation
import Redis from 'ioredis';

const FLAG_KEY = 'feature-flags'; // Redis hash key
const FLAG_CHANNEL = 'flag-updates';

let cachedFlags: Record<string, boolean> = {};

export async function initFlagStore(redis: Redis): Promise<void> {
  // Load current flags from Redis hash
  const raw = await redis.hgetall(FLAG_KEY);
  cachedFlags = Object.fromEntries(
    Object.entries(raw).map(([k, v]) => [k, v === 'true'])
  );

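  // Subscribe to the invalidation channel on a dedicated connection.
  // ioredis delivers messages through the 'message' event; the callback
  // accepted by subscribe() only reports the subscription result.
  const subscriber = redis.duplicate();
  subscriber.on('message', (channel: string, message: string) => {
    if (channel !== FLAG_CHANNEL) return;
    try {
      const patch = JSON.parse(message) as Record<string, boolean>;
      cachedFlags = { ...cachedFlags, ...patch };
      console.info({ event: 'flags_updated', patch });
    } catch {
      console.error({ event: 'flag_update_parse_error', message });
    }
  });
  await subscriber.subscribe(FLAG_CHANNEL);
}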

export function isFlagEnabled(flag: string, defaultValue = false): boolean {
  return cachedFlags[flag] ?? defaultValue;
}

// Flip a flag from any admin tool or CLI:
// redis-cli hset feature-flags pdf_export true
// redis-cli publish flag-updates '{"pdf_export": true}'

The pub/sub channel propagates the change to all connected server instances, typically within milliseconds. Pub/sub delivery is fire-and-forget, so an instance that is disconnected when the message is published misses that update; reloading the hash on reconnect closes the gap. The in-memory cache avoids a Redis round-trip on every tool call. The important constraint: changing a tool-registration flag via this mechanism does not change the tool surface of sessions that have already initialised. Existing sessions keep their original tool list for their lifetime, and only new sessions pick up the new flag state. This is the intended behaviour, because it is safer than ejecting active sessions.
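
The constraint above can be sketched without Redis: each session copies the enabled flags when it initialises, so later updates to the shared cache only reach sessions created afterwards. `liveFlags`, `applyPatch`, and `snapshotForSession` are hypothetical names, and the plain object stands in for the `cachedFlags` variable in flag-store.ts.

```typescript
// Stand-in for the process-wide cache that pub/sub messages update.
let liveFlags: Record<string, boolean> = { v2_search: true, pdf_export: false };

// Mirrors the merge the subscriber performs when a patch arrives.
function applyPatch(patch: Record<string, boolean>): void {
  liveFlags = { ...liveFlags, ...patch };
}

// Called once per session at initialize: copy the names of the enabled flags.
function snapshotForSession(): ReadonlySet<string> {
  const enabled = Object.entries(liveFlags)
    .filter(([, on]) => on)
    .map(([name]) => name);
  return new Set(enabled);
}

const sessionA = snapshotForSession(); // initialised before the flip
applyPatch({ pdf_export: true });      // flag flipped in production
const sessionB = snapshotForSession(); // initialised after the flip

console.log(sessionA.has('pdf_export')); // false: the existing session is unchanged
console.log(sessionB.has('pdf_export')); // true: the new session sees the new flag
```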

Per-tenant feature flags

Enterprise tiers often include tools that basic tiers do not. Per-tenant flags live in the tenant's database row and are resolved at initialize time into a Set<string> that drives registerToolsForSession:

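// Per-tenant flag resolution at initialize.
// extractTenantId(), globalFlags() and buildSessionServer() are application helpers defined elsewhere.
const tenantContexts = new Map<string, { tenantId: string; flags: Set<string>; server: McpServer }>();

session.on('initialize', async (params) => {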
  const tenantId = extractTenantId(params); // from JWT, API key, or metadata

  // Load tenant's enabled features from DB — one query per session, not per call
  const { rows } = await deps.db.query<{ feature: string }>(
    'SELECT feature FROM tenant_features WHERE tenant_id = $1 AND enabled = true',
    [tenantId]
  );
  const tenantFlags = new Set(rows.map(r => r.feature));

  // Merge global flags with tenant flags
  const effectiveFlags = new Set([...globalFlags(), ...tenantFlags]);

  // Register tools for this session only
  const sessionServer = buildSessionServer(effectiveFlags);

  tenantContexts.set(session.id, { tenantId, flags: effectiveFlags, server: sessionServer });
});

session.on('close', () => {
  tenantContexts.delete(session.id);
});

Loading flags from the database on every initialize adds one query per new session. For servers with high session churn (many short sessions), cache the tenant flag set in Redis with a TTL of a few minutes — the slight staleness is acceptable because session tool-registration flags are intentionally per-session, and a tenant who just had a plan upgrade will get the new tools on their next session.
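
One way to sketch that cache, using an in-process Map instead of Redis so the example stays self-contained: cache the in-flight promise per tenant and reload once the entry is older than the TTL. `TtlCache`, `loadTenantFlags`, and the injected clock are hypothetical. A Redis version would instead store the flag list under a per-tenant key with an expiry.

```typescript
type Loader<T> = (key: string) => Promise<T>;

class TtlCache<T> {
  private entries = new Map<string, { value: Promise<T>; expiresAt: number }>();

  constructor(
    private readonly load: Loader<T>,
    private readonly ttlMs: number,
    private readonly now: () => number = Date.now, // injectable for tests
  ) {}

  get(key: string): Promise<T> {
    const hit = this.entries.get(key);
    if (hit && hit.expiresAt > this.now()) return hit.value;

    // Cache the promise itself so concurrent initialises share one query.
    const value = this.load(key);
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });
    // Drop failed loads so the next session retries instead of reusing the error.
    value.catch(() => {
      if (this.entries.get(key)?.value === value) this.entries.delete(key);
    });
    return value;
  }
}

// Usage with a fake loader and a fake clock.
let fakeNow = 0;
let dbQueries = 0;
const loadTenantFlags: Loader<Set<string>> = async (tenantId) => {
  dbQueries += 1; // stands in for the tenant_features query
  return new Set<string>(tenantId === 'enterprise-1' ? ['pdf_export'] : []);
};

const fiveMinutes = 5 * 60 * 1000;
const tenantFlagCache = new TtlCache(loadTenantFlags, fiveMinutes, () => fakeNow);
void tenantFlagCache.get('enterprise-1'); // miss: runs the loader
void tenantFlagCache.get('enterprise-1'); // hit: reuses the cached promise
fakeNow += fiveMinutes;                   // TTL elapses
void tenantFlagCache.get('enterprise-1'); // expired: runs the loader again
console.log(dbQueries); // 2
```

Caching the promise rather than the resolved set means a burst of sessions for the same tenant shares a single database query.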

Gradual rollout with percentage-based flags

A percentage-based rollout gates a new tool for a fixed fraction of sessions or tenants. Derive the bucket from a stable hash of the session ID or tenant ID rather than a random draw, so the same entity consistently sees the same flag state:

// Stable percentage rollout — same entity always gets the same bucket
import { createHash } from 'node:crypto';

function inRollout(entityId: string, flagName: string, percent: number): boolean {
  const hash = createHash('sha256')
    .update(`${flagName}:${entityId}`)
    .digest('hex');
  // Convert the first 4 hex chars to a number in [0, 65535], then reduce it to a bucket in [0, 99]
  const bucket = parseInt(hash.slice(0, 4), 16) % 100;
  return bucket < percent;
}

// Usage: gate the v2_search tool for 10% of tenants
if (inRollout(tenantId, 'v2_search', 10)) {
  registerSearchV2Tools(sessionServer, deps);
}

The hash approach gives stable bucketing. Each entity's bucket is fixed and the check is bucket < percent, so increasing the percentage from 10% to 20% adds a further 10% of entities to the enabled group without flipping any of the original 10%. Entities that had the flag enabled continue to have it enabled. This is important for MCP sessions where clients build context using the current tool surface: you do not want a session to lose a tool mid-conversation because a random bucket assignment flipped.
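
The stability claim can be checked directly. The sketch below repeats the `inRollout` function from above and confirms, over a set of made-up tenant IDs, that every tenant enabled at 10% is still enabled at 20%.

```typescript
import { createHash } from 'node:crypto';

// Same bucketing as above: a deterministic bucket in [0, 99] per (flag, entity).
function inRollout(entityId: string, flagName: string, percent: number): boolean {
  const hash = createHash('sha256').update(`${flagName}:${entityId}`).digest('hex');
  const bucket = parseInt(hash.slice(0, 4), 16) % 100;
  return bucket < percent;
}

// Hypothetical tenant IDs, for illustration only.
const tenantIds = Array.from({ length: 2000 }, (_, i) => `tenant-${i}`);

const at10 = tenantIds.filter((id) => inRollout(id, 'v2_search', 10));
const at20 = new Set(tenantIds.filter((id) => inRollout(id, 'v2_search', 20)));

// Raising the threshold only adds tenants: bucket < 10 implies bucket < 20.
const lostOnIncrease = at10.filter((id) => !at20.has(id));
console.log(lostOnIncrease.length); // 0
console.log(at10.length, at20.size); // roughly 10% and 20% of the 2000 IDs
```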

Monitoring tool-surface changes with AliveMCP

AliveMCP's probe calls tools/list after initialize. If you change a tool-registration flag globally (e.g., enabling pdf_export for all sessions), the next probe will return a different tool count. Configure an AliveMCP alert on tools/list tool count changes to detect unintended tool-surface changes — a flag deployment that accidentally disabled a tool will fire the alert within one probe cycle. This is complementary to a schema snapshot CI gate: the CI gate catches changes before deployment, and AliveMCP catches unexpected changes in production.
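
A snapshot check of the kind mentioned above can be as small as a set difference over tool names. This is a generic sketch, not AliveMCP's API: `diffToolSurface` and the sample tool names are hypothetical, and the `reportedTools` list stands in for the names returned by a tools/list call.

```typescript
type ToolSurfaceDiff = { missing: string[]; unexpected: string[] };

// Compare the tool names a deployment is expected to expose with what it reports.
function diffToolSurface(expected: string[], actual: string[]): ToolSurfaceDiff {
  const expectedSet = new Set(expected);
  const actualSet = new Set(actual);
  return {
    missing: expected.filter((name) => !actualSet.has(name)).sort(),
    unexpected: actual.filter((name) => !expectedSet.has(name)).sort(),
  };
}

// Snapshot kept alongside the flag configuration (hypothetical names).
const expectedTools = ['search', 'read', 'export_pdf'];
// What the server reported after a flag change accidentally disabled PDF export.
const reportedTools = ['search', 'read', 'search_v2'];

const diff = diffToolSurface(expectedTools, reportedTools);
console.log(diff.missing, diff.unexpected); // [ 'export_pdf' ] [ 'search_v2' ]
```

Comparing names rather than counts also catches the case where one tool disappears and another appears in the same change, which leaves the count unchanged.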