
MCP server configuration management

Configuration management for MCP servers is deceptively easy to get wrong. The most common failure mode is not bad configuration — it is undetected misconfiguration: a missing environment variable that silently causes a feature to be disabled, or a secret that gets logged because someone called console.log(process.env) during debugging and never removed it. The goal of a good configuration system is to make misconfiguration impossible to miss: validate everything at startup, fail loudly before accepting connections, and never let a partially configured server serve requests to clients.

TL;DR

Define a Zod schema for all environment variables. Call parseConfig() inside createDeps() before opening any connections: it throws on any missing or malformed value, so the process exits before app.listen. Store the parsed config in the Deps object so every tool handler has typed access to config values without reading process.env directly. Never log the config object; log a redacted summary that omits secret values. AliveMCP's probe measures startup latency: a server that exits on config validation failure shows up immediately as a probe failure, which is exactly the right signal.

The fail-fast config schema pattern

Read and validate all configuration at process start, before opening any connections or registering any tools. Use Zod for the schema — it produces clear error messages that name the missing variable and explain the expected type:

// config.ts — parse and validate all environment variables at startup
import { z } from 'zod';

const configSchema = z.object({
  // Server
  PORT: z.coerce.number().int().min(1).max(65535).default(3000),
  NODE_ENV: z.enum(['development', 'test', 'production']).default('development'),

  // Database — required, no default
  DATABASE_URL: z.string().url('DATABASE_URL must be a valid connection string'),

  // Redis: optional; leaving it unset (undefined) disables caching and rate limiting
  REDIS_URL: z.string().url().optional(),

  // Secrets
  API_SECRET: z.string().min(32, 'API_SECRET must be at least 32 characters'),

  // External API
  OPENAI_API_KEY: z.string().startsWith('sk-').optional(),

  // Feature behaviour
  MAX_TOOL_CONCURRENCY: z.coerce.number().int().min(1).max(100).default(10),
  PROBE_API_KEY: z.string().optional(), // AliveMCP probe API key for unauthenticated health checks
});

export type AppConfig = z.infer<typeof configSchema>;

export function parseConfig(): AppConfig {
  const result = configSchema.safeParse(process.env);
  if (!result.success) {
    // Format errors as a list of "VAR_NAME: message" lines
    const errors = result.error.issues
      .map(i => `  ${i.path.join('.')}: ${i.message}`)
      .join('\n');
    throw new Error(`Configuration error — fix before starting:\n${errors}`);
  }
  return result.data;
}

Call parseConfig() at the start of createDeps(). If any required variable is missing or malformed, the error message names the variable and states what was expected, so there is no hunting through stack traces for an undefined value at runtime:

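// deps.ts
import { Pool } from 'pg';   // assumed: node-postgres
import Redis from 'ioredis'; // assumed: ioredis (matches the maxRetriesPerRequest option)
import { parseConfig, type AppConfig } from './config';
import { buildLogger, type Logger } from './logger'; // hypothetical local module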

export interface Deps {
  config: AppConfig;
  db: Pool;
  cache: Redis | null;
  logger: Logger;
}

export async function createDeps(): Promise<Deps> {
  const config = parseConfig(); // throws on bad config; left uncaught, the error exits the process

  const db = new Pool({ connectionString: config.DATABASE_URL });
  await db.query('SELECT 1'); // connectivity check

  const cache = config.REDIS_URL
    ? new Redis(config.REDIS_URL, { maxRetriesPerRequest: 3 })
    : null;

  const logger = buildLogger(config.NODE_ENV);

  return { config, db, cache, logger };
}
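
The ordering matters more than the details: validate, connect, then listen. Below is a minimal sketch of an entry point that enforces that order. The names `startServer` and `StartupSteps` are hypothetical, and the two steps are passed in as parameters so the sketch does not assume any particular HTTP framework or MCP SDK:

```typescript
// Startup sequencing sketch. `startServer` and `StartupSteps` are
// hypothetical names; plug in your own createDeps() and listen logic.
interface StartupSteps<D> {
  createDeps: () => Promise<D>;        // parses config, opens connections
  listen: (deps: D) => Promise<void>;  // binds the port, accepts clients
}

// Returns an exit code instead of calling process.exit() directly,
// which keeps the function easy to test.
async function startServer<D>(steps: StartupSteps<D>): Promise<number> {
  try {
    const deps = await steps.createDeps();
    await steps.listen(deps); // only reached when every dependency is ready
    return 0;
  } catch (err) {
    // Config, connectivity, or bind failure: report it and stop.
    console.error('startup failed:', err instanceof Error ? err.message : err);
    return 1;
  }
}
```

In a real entry point, the returned code would be passed to `process.exit()` when it is non-zero, so a misconfigured server never reaches the point of accepting connections.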

What to log vs. what to redact

Secrets in logs are a common incident root cause — a connection string or API key that ends up in a log aggregator is often the first thing found in a post-mortem. Log a redacted startup summary instead of the raw config object:

// Log after parseConfig(), before app.listen()
function logConfigSummary(config: AppConfig) {
  console.info({
    event: 'config_loaded',
    port: config.PORT,
    node_env: config.NODE_ENV,
    database_url: config.DATABASE_URL.replace(/:\/\/[^@]+@/, '://***@'), // redact credentials
    redis: config.REDIS_URL ? 'configured' : 'disabled',
    api_secret: `[${config.API_SECRET.length} chars]`,
    openai: config.OPENAI_API_KEY ? 'configured' : 'not set',
  });
}

The redacted summary confirms which config paths were resolved without exposing the values. DATABASE_URL.replace(/:\/\/[^@]+@/, '://***@') removes the user:password@ segment from PostgreSQL connection strings while keeping the host and database name visible for debugging.
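
To make the regex's behavior concrete, here is a small sketch that wraps it in a helper. The `redactConnectionString` name and the sample URLs are illustrative only:

```typescript
// Hypothetical helper wrapping the same regex used in logConfigSummary().
function redactConnectionString(url: string): string {
  // Replace everything between "://" and the first "@" with "***".
  return url.replace(/:\/\/[^@]+@/, '://***@');
}

const withCreds = redactConnectionString('postgres://app:s3cret@db.internal:5432/orders');
const withoutCreds = redactConnectionString('postgres://db.internal:5432/orders');
// withCreds    -> 'postgres://***@db.internal:5432/orders'
// withoutCreds -> 'postgres://db.internal:5432/orders' (no "@", so nothing changes)
```

One caveat: the pattern stops at the first "@", so a password containing a raw, unencoded "@" would be only partially redacted. Percent-encode such characters in the connection string.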

Secrets management: beyond environment variables

Environment variables are convenient but have limitations: they are visible to all processes running as the same user, they end up in shell history if set inline, and they cannot be rotated without a restart. For production deployments, use a secrets manager and inject values at startup:

Secret source	Rotation support	How to integrate
Environment variable	Requires restart	`process.env.MY_SECRET` — simple, works for most cases
AWS Secrets Manager	Yes, via `secretsmanager.getSecretValue()`	Fetch in `createDeps()`, store in config, re-fetch on rotation event
HashiCorp Vault	Yes, via dynamic secrets and lease renewal	Vault agent sidecar writes to a file; `fs.readFileSync` in `createDeps()`
Kubernetes Secret	Restart or projected volume auto-update	Mount as file, read with `fs.readFileSync` — avoids env var exposure in `kubectl describe pod`

Regardless of the source, inject secrets through the Deps object rather than accessing them inside tool handlers. A tool handler that calls getSecretValue() directly adds latency to every call and creates a dependency on the secrets manager that bypasses the pool and error-handling patterns you've established in createDeps().
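
For the file-based sources in the table (Vault agent, Kubernetes Secret), the following is a minimal sketch of a loader that createDeps() could call once at startup. The `readSecret` name, the directory layout (one file per secret, named after the variable), and the environment-variable fallback are all assumptions made for illustration:

```typescript
import { readFileSync } from 'node:fs';
import { join } from 'node:path';

type Env = Record<string, string | undefined>;

// Hypothetical helper: prefer a mounted secret file, fall back to the
// environment, and fail loudly if neither source has a value.
function readSecret(name: string, secretsDir: string, env: Env = process.env): string {
  try {
    // Mounted secrets often end with a trailing newline; trim it.
    const fromFile = readFileSync(join(secretsDir, name), 'utf8').trim();
    if (fromFile.length > 0) return fromFile;
  } catch {
    // File missing or unreadable: fall through to the environment.
  }
  const fromEnv = env[name];
  if (fromEnv !== undefined && fromEnv.length > 0) return fromEnv;
  throw new Error(`${name}: not found in ${secretsDir} or environment`);
}
```

The resolved value would then be validated and stored on the config in Deps, so tool handlers never touch the file system or the secrets manager themselves.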

Dynamic configuration reload without restart

Some configuration changes should not require a restart: feature flag values, rate limit thresholds, log verbosity. Reload-without-restart has two safe patterns:

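// Pattern 1: File watcher (for config files, not secrets)
import { readFileSync, watch } from 'node:fs';
import { z } from 'zod';

// Assumed schema shape; adjust the fields and bounds to your own dynamic values
const dynamicConfigSchema = z.object({
  log_level: z.enum(['debug', 'info', 'warn', 'error']),
  rate_limit_rpm: z.number().int().positive(),
});

type DynamicConfig = z.infer<typeof dynamicConfigSchema>;

let dynamicConfig: DynamicConfig = loadDynamicConfig();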

function loadDynamicConfig(): DynamicConfig {
  try {
    const raw = JSON.parse(readFileSync('/etc/myserver/dynamic.json', 'utf8'));
    return dynamicConfigSchema.parse(raw); // Zod validates here too
  } catch {
    return { log_level: 'info', rate_limit_rpm: 100 }; // safe default on parse failure
  }
}

watch('/etc/myserver/dynamic.json', () => {
  const next = loadDynamicConfig();
  console.info({ event: 'dynamic_config_reloaded', ...next });
  dynamicConfig = next;
});

// Tool handlers read dynamicConfig (a plain object reference, not a Deps field)
// The reference swap is atomic in V8's single-threaded model

// Pattern 2: Redis pub/sub config channel (for distributed deployments)
// Publisher: redis-cli publish config-updates '{"rate_limit_rpm": 200}'
// Assumes ioredis and a non-null cache (REDIS_URL is configured)
const subscriber = cache.duplicate(); // dedicated connection for pub/sub
await subscriber.subscribe('config-updates');
subscriber.on('message', (_channel, message) => {
  try {
    const patch = JSON.parse(message);
    dynamicConfig = { ...dynamicConfig, ...dynamicConfigSchema.partial().parse(patch) };
    console.info({ event: 'config_patched_via_redis', patch });
  } catch (err) {
    console.error({ event: 'config_patch_rejected', err });
  }
});

Static configuration (database URLs, secrets, server port) should never be reloaded at runtime — these require re-establishing connections and are better served by a rolling restart. Reserve dynamic reload for leaf configuration values that affect behavior within an already-established connection.
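
One practical consequence is that handlers must read dynamic values at call time. A value copied into a closure at registration time never sees a reload. Here is a minimal sketch of the difference, using hypothetical `swapDynamicConfig`, `makeStaleCheck`, and `isWithinLimit` helpers:

```typescript
interface DynamicConfig {
  log_level: 'debug' | 'info' | 'warn' | 'error';
  rate_limit_rpm: number;
}

let dynamicConfig: Readonly<DynamicConfig> = { log_level: 'info', rate_limit_rpm: 100 };

// Replace the whole object in one assignment; never mutate it in place.
function swapDynamicConfig(next: DynamicConfig): void {
  dynamicConfig = Object.freeze({ ...next });
}

// Anti-pattern: the limit is captured once and goes stale after a reload.
function makeStaleCheck(): (requestsThisMinute: number) => boolean {
  const limit = dynamicConfig.rate_limit_rpm;
  return (requestsThisMinute) => requestsThisMinute <= limit;
}

// Preferred: dereference dynamicConfig on every call.
function isWithinLimit(requestsThisMinute: number): boolean {
  return requestsThisMinute <= dynamicConfig.rate_limit_rpm;
}

const staleCheck = makeStaleCheck();
swapDynamicConfig({ log_level: 'info', rate_limit_rpm: 200 });
// staleCheck(150) is false (still comparing against 100);
// isWithinLimit(150) is true (sees the reloaded limit of 200).
```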

Per-tenant configuration in multi-tenant MCP servers

In multi-tenant deployments, some configuration values differ per tenant. The key rule is the same as for any per-tenant state: never store per-tenant config in module scope. Load it at initialize time and attach it to the tenant context:

// Multi-tenant config pattern
interface TenantConfig {
  tenant_id: string;
  max_results_per_call: number;   // different limits per pricing tier
  allowed_tools: Set<string>;     // tool access control
  external_api_endpoint: string;  // tenant-specific endpoint
}

// In the initialize handler, load tenant config from DB and store in session map
const tenantContexts = new Map<string, TenantConfig>();

mcpTransport.on('session', (session) => {
  session.on('initialize', async (params) => {
    const tenantId = extractTenantId(params); // from JWT or API key
    const tenantConfig = await loadTenantConfig(deps.db, tenantId);
    tenantContexts.set(session.id, tenantConfig);
  });
  session.on('close', () => {
    tenantContexts.delete(session.id); // always clean up to prevent map growth
  });
});

Store the tenant config in the session context map, not in module scope. The module-scope mistake — setting a global currentTenant variable in the initialize handler — causes a silent race: two concurrent sessions set currentTenant to different values, and one session's tool calls run with another tenant's config. The session context map (keyed by session ID) is the correct isolation boundary.
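
On the tool-call side, a handler looks up the config by session ID and refuses to run without it. The sketch below assumes the transport passes each handler its session ID; the `requireTenant` and `assertToolAllowed` helpers are hypothetical:

```typescript
interface TenantConfig {
  tenant_id: string;
  max_results_per_call: number;
  allowed_tools: Set<string>;
  external_api_endpoint: string;
}

const tenantContexts = new Map<string, TenantConfig>();

// Look up the tenant for this session. A missing entry means the session
// never completed initialize, or has already closed.
function requireTenant(sessionId: string): TenantConfig {
  const tenant = tenantContexts.get(sessionId);
  if (!tenant) throw new Error(`no tenant context for session ${sessionId}`);
  return tenant;
}

// Enforce per-tenant tool access before running the tool body.
function assertToolAllowed(sessionId: string, toolName: string): TenantConfig {
  const tenant = requireTenant(sessionId);
  if (!tenant.allowed_tools.has(toolName)) {
    throw new Error(`tool ${toolName} is not enabled for tenant ${tenant.tenant_id}`);
  }
  return tenant;
}
```

Because every lookup goes through the session ID, two concurrent sessions can never observe each other's limits or endpoints.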

AliveMCP probe and configuration failures

AliveMCP probes your server with an initialize + tools/list request every minute. Configuration validation failures manifest as specific probe signals:

Server exits before listening: Process exits during createDeps() because parseConfig() threw. AliveMCP sees connection refused — probe marks the server DOWN immediately.
Process starts but DB is unreachable: createDeps()'s db.query('SELECT 1') hangs until timeout, then throws. The process exits before it ever listens. Probe sees DOWN.
Dynamic config reload fails: The reload falls back to safe defaults (as shown above). Server stays UP, but behavior changes. AliveMCP cannot see this — add a health_check tool that returns the active config values (redacted) so you can inspect config state from any MCP client.
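
A minimal sketch of the payload such a health_check tool could return is shown below. The `buildHealthReport` function and its field names are assumptions; registering it as a tool depends on your MCP SDK and is omitted here:

```typescript
interface HealthInput {
  nodeEnv: string;
  redisConfigured: boolean;
  dynamic: { log_level: string; rate_limit_rpm: number };
  usingFallbackDefaults: boolean; // true when the last reload failed to parse
}

// Build a report that is safe to return to any client: flags and
// non-secret leaf values only, never connection strings or keys.
function buildHealthReport(input: HealthInput): Record<string, string | number | boolean> {
  return {
    node_env: input.nodeEnv,
    redis: input.redisConfigured ? 'configured' : 'disabled',
    log_level: input.dynamic.log_level,
    rate_limit_rpm: input.dynamic.rate_limit_rpm,
    dynamic_config_source: input.usingFallbackDefaults ? 'fallback_defaults' : 'file',
  };
}
```

Tracking whether the active values came from the file or from the fallback makes a failed reload visible, even though the server itself stays UP.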

The fail-fast pattern — validate config, verify connectivity, then listen — means AliveMCP gets a clean binary signal: the server is either fully configured and ready, or it never starts. This is more useful than a server that starts with bad config and fails per-call.