MCP server configuration management

Configuration management for MCP servers is deceptively easy to get wrong. The most common failure mode is not bad configuration — it is undetected misconfiguration: a missing environment variable that silently causes a feature to be disabled, or a secret that gets logged because someone called console.log(process.env) during debugging and never removed it. The goal of a good configuration system is to make misconfiguration impossible to miss: validate everything at startup, fail loudly before accepting connections, and never let a partially configured server serve requests to clients.

TL;DR

Define a Zod schema for all environment variables. Call parseConfig() at the top of createDeps(), before opening any connections: it throws on any missing or malformed value, so the process exits before app.listen() is reached. Store the parsed config in the Deps object so every tool handler has typed access to config values without reading process.env directly. Never log the config object; log a redacted summary that omits secret values. AliveMCP's probe measures startup latency: a server that exits on config validation failure shows up immediately as a probe failure, which is exactly the right signal.

The fail-fast config schema pattern

Read and validate all configuration at process start, before opening any connections or registering any tools. Use Zod for the schema — it produces clear error messages that name the missing variable and explain the expected type:

// config.ts — parse and validate all environment variables at startup
import { z } from 'zod';

const configSchema = z.object({
  // Server
  PORT: z.coerce.number().int().min(1).max(65535).default(3000),
  NODE_ENV: z.enum(['development', 'test', 'production']).default('development'),

  // Database — required, no default
  DATABASE_URL: z.string().url('DATABASE_URL must be a valid connection string'),

  // Redis: optional; leaving it unset disables caching and rate limiting
  REDIS_URL: z.string().url().optional(),

  // Secrets
  API_SECRET: z.string().min(32, 'API_SECRET must be at least 32 characters'),

  // External API
  OPENAI_API_KEY: z.string().startsWith('sk-').optional(),

  // Feature behaviour
  MAX_TOOL_CONCURRENCY: z.coerce.number().int().min(1).max(100).default(10),
  PROBE_API_KEY: z.string().optional(), // AliveMCP probe API key for unauthenticated health checks
});

export type AppConfig = z.infer<typeof configSchema>;

export function parseConfig(): AppConfig {
  const result = configSchema.safeParse(process.env);
  if (!result.success) {
    // Format errors as a list of "VAR_NAME: message" lines
    const errors = result.error.issues
      .map(i => `  ${i.path.join('.')}: ${i.message}`)
      .join('\n');
    throw new Error(`Configuration error — fix before starting:\n${errors}`);
  }
  return result.data;
}

Call parseConfig() at the start of createDeps(). If any required variable is missing, the error message names exactly which variable is missing and why — no hunting through undefined stack traces at runtime:

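// deps.ts
import { Pool } from 'pg';       // assumed driver, matching new Pool({ connectionString })
import Redis from 'ioredis';     // assumed client, matching new Redis(url, options)
import { parseConfig, type AppConfig } from './config';
import { buildLogger, type Logger } from './logger'; // assumed local module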

export interface Deps {
  config: AppConfig;
  db: Pool;
  cache: Redis | null;
  logger: Logger;
}

export async function createDeps(): Promise<Deps> {
  const config = parseConfig(); // throws and exits on bad config

  const db = new Pool({ connectionString: config.DATABASE_URL });
  await db.query('SELECT 1'); // connectivity check

  const cache = config.REDIS_URL
    ? new Redis(config.REDIS_URL, { maxRetriesPerRequest: 3 })
    : null;

  const logger = buildLogger(config.NODE_ENV);

  return { config, db, cache, logger };
}
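
The ordering described above (validate, connect, then listen) can also be enforced in the entry point. The following is a minimal sketch, not the guide's own code: `Bootstrap`, `start`, and the injected `listen` function are hypothetical names, and the function returns an exit code rather than calling process.exit() so that it stays easy to test.

```typescript
// main.ts (sketch): build dependencies first, listen only if that succeeds.
// `Bootstrap` and `start` are illustrative names, not part of any library.
export interface Bootstrap<D> {
  createDeps: () => Promise<D>;        // e.g. the createDeps() shown above
  listen: (deps: D) => Promise<void>;  // e.g. a wrapper around app.listen()
}

export async function start<D>(boot: Bootstrap<D>): Promise<number> {
  try {
    const deps = await boot.createDeps(); // throws on bad config or failed connectivity check
    await boot.listen(deps);              // only reached with fully built deps
    return 0;
  } catch (err) {
    // Print the message rather than the whole error object.
    console.error(err instanceof Error ? err.message : String(err));
    return 1; // non-zero exit code for the caller to pass on
  }
}
```

In a real entry point the result would typically be handed to the process, for example `start({ createDeps, listen }).then((code) => { if (code !== 0) process.exit(code); })`.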

What to log vs. what to redact

Secrets in logs are a common incident root cause — a connection string or API key that ends up in a log aggregator is often the first thing found in a post-mortem. Log a redacted startup summary instead of the raw config object:

// Log after parseConfig(), before app.listen()
function logConfigSummary(config: AppConfig) {
  console.info({
    event: 'config_loaded',
    port: config.PORT,
    node_env: config.NODE_ENV,
    database_url: config.DATABASE_URL.replace(/:\/\/[^@]+@/, '://***@'), // redact credentials
    redis: config.REDIS_URL ? 'configured' : 'disabled',
    api_secret: `[${config.API_SECRET.length} chars]`,
    openai: config.OPENAI_API_KEY ? 'configured' : 'not set',
  });
}

The redacted summary confirms which config paths were resolved without exposing the values. DATABASE_URL.replace(/:\/\/[^@]+@/, '://***@') removes the user:password@ segment from PostgreSQL connection strings while keeping the host and database name visible for debugging.
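
To make the effect of that regex concrete, here is a small sketch that wraps it in a helper and runs it on two sample connection strings. The function names `redactCredentials` and `describeSecret` and the sample URLs are made up for illustration.

```typescript
// Redaction helpers (sketch). Function names are illustrative.

// Same regex as in the summary above: replaces everything between "://" and the first "@".
export function redactCredentials(url: string): string {
  return url.replace(/:\/\/[^@]+@/, '://***@');
}

// Report only whether a secret is present and how long it is, never its value.
export function describeSecret(value: string | undefined): string {
  return value ? `[${value.length} chars]` : 'not set';
}

console.info(redactCredentials('postgres://app:s3cret@db.internal:5432/orders'));
// → postgres://***@db.internal:5432/orders
console.info(redactCredentials('postgres://db.internal:5432/orders'));
// → postgres://db.internal:5432/orders (no credentials, nothing to redact)
```

One caveat: the pattern stops at the first "@", so a password that contains a raw "@" (not percent-encoded as %40) would be only partly masked. Percent-encoding credentials in the connection string avoids that.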

Secrets management: beyond environment variables

Environment variables are convenient but have limitations: they are visible to all processes running as the same user, they end up in shell history if set inline, and they cannot be rotated without a restart. For production deployments, use a secrets manager and inject values at startup:

| Secret source | Rotation support | How to integrate |
| --- | --- | --- |
| Environment variable | Requires restart | process.env.MY_SECRET; simple, works for most cases |
| AWS Secrets Manager | Yes, via secretsmanager.getSecretValue() | Fetch in createDeps(), store in config, re-fetch on rotation event |
| HashiCorp Vault | Yes, via dynamic secrets and lease renewal | Vault agent sidecar writes to a file; fs.readFileSync in createDeps() |
| Kubernetes Secret | Restart or projected volume auto-update | Mount as file, read with fs.readFileSync; avoids env var exposure in kubectl describe pod |

Regardless of the source, inject secrets through the Deps object rather than accessing them inside tool handlers. A tool handler that calls getSecretValue() directly adds latency to every call and creates a dependency on the secrets manager that bypasses the pool and error-handling patterns you've established in createDeps().
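
As a sketch of resolving a secret once at startup, the helper below reads either a mounted file or a plain environment variable. The `NAME_FILE` naming convention, the `resolveSecret` function, and the injectable `readFile` parameter are assumptions made for this example, not something the guide prescribes.

```typescript
import { readFileSync } from 'node:fs';

type Env = Record<string, string | undefined>;
type ReadFile = (path: string) => string;

// Resolve a secret once, during startup. If NAME_FILE is set (an assumed
// convention for mounted Kubernetes Secrets or files written by a Vault agent),
// read that file; otherwise fall back to the plain NAME environment variable.
export function resolveSecret(
  name: string,
  env: Env = process.env,
  readFile: ReadFile = (path) => readFileSync(path, 'utf8'),
): string | undefined {
  const filePath = env[`${name}_FILE`];
  if (filePath) {
    return readFile(filePath).trim(); // mounted files often end with a newline
  }
  return env[name];
}
```

The resolved value would then be merged into the object passed to the config schema, so that it is still validated (for example against the 32-character minimum on API_SECRET) before it reaches Deps.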

Dynamic configuration reload without restart

Some configuration changes should not require a restart: feature flag values, rate limit thresholds, log verbosity. Reload-without-restart has two safe patterns:

// Pattern 1: File watcher (for config files, not secrets)
import { readFileSync, watch } from 'node:fs';
import { z } from 'zod';

const dynamicConfigSchema = z.object({
  log_level: z.enum(['debug', 'info', 'warn', 'error']),
  rate_limit_rpm: z.number().int().positive(),
});
type DynamicConfig = z.infer<typeof dynamicConfigSchema>;

const DEFAULTS: DynamicConfig = { log_level: 'info', rate_limit_rpm: 100 };

// On a read or parse failure, return the fallback instead of throwing.
// At startup the fallback is DEFAULTS; on reload it is the last good config,
// so a half-written file does not silently reset live values.
function loadDynamicConfig(fallback: DynamicConfig): DynamicConfig {
  try {
    const raw = JSON.parse(readFileSync('/etc/myserver/dynamic.json', 'utf8'));
    return dynamicConfigSchema.parse(raw); // Zod validates here too
  } catch {
    return fallback;
  }
}

let dynamicConfig: DynamicConfig = loadDynamicConfig(DEFAULTS);

watch('/etc/myserver/dynamic.json', () => {
  const next = loadDynamicConfig(dynamicConfig);
  console.info({ event: 'dynamic_config_reloaded', ...next });
  dynamicConfig = next;
});

// Tool handlers read dynamicConfig (a plain object reference, not a Deps field)
// The reference swap is atomic in V8's single-threaded model

// Pattern 2: Redis pub/sub config channel (for distributed deployments)
// Publisher: redis-cli publish config-updates '{"rate_limit_rpm": 200}'
// Assumes ioredis, matching the new Redis(url, options) call in createDeps():
// subscribe() only registers the channel; messages arrive on the 'message' event.
const subscriber = cache.duplicate(); // dedicated connection for pub/sub
await subscriber.subscribe('config-updates');
subscriber.on('message', (_channel, message) => {
  try {
    const patch = JSON.parse(message);
    dynamicConfig = { ...dynamicConfig, ...dynamicConfigSchema.partial().parse(patch) };
    console.info({ event: 'config_patched_via_redis', patch });
  } catch (err) {
    console.error({ event: 'config_patch_rejected', err });
  }
});

Static configuration (database URLs, secrets, server port) should never be reloaded at runtime — these require re-establishing connections and are better served by a rolling restart. Reserve dynamic reload for leaf configuration values that affect behavior within an already-established connection.
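
The same idea (validate a patch, then swap one reference) can be wrapped in a small holder so that handlers never touch a mutable module variable directly. This is a sketch under assumptions: `createDynamicConfigStore` is a hypothetical name, and the validation is hand-written only to keep the example free of dependencies; a Zod schema would do the same job.

```typescript
// Sketch of a dynamic-config holder with validated partial updates.
type LogLevel = 'debug' | 'info' | 'warn' | 'error';

interface DynamicConfig {
  log_level: LogLevel;
  rate_limit_rpm: number;
}

const LOG_LEVELS: readonly string[] = ['debug', 'info', 'warn', 'error'];

export function createDynamicConfigStore(initial: DynamicConfig) {
  let current = initial;
  return {
    get: (): DynamicConfig => current,
    // Apply a partial update. Returns false and keeps the old values if any field is invalid.
    patch(raw: unknown): boolean {
      if (typeof raw !== 'object' || raw === null) return false;
      const fields = raw as Record<string, unknown>;
      const next: DynamicConfig = { ...current };
      if ('log_level' in fields) {
        const level = fields['log_level'];
        if (typeof level !== 'string' || !LOG_LEVELS.includes(level)) return false;
        next.log_level = level as LogLevel;
      }
      if ('rate_limit_rpm' in fields) {
        const rpm = fields['rate_limit_rpm'];
        if (typeof rpm !== 'number' || !Number.isInteger(rpm) || rpm <= 0) return false;
        next.rate_limit_rpm = rpm;
      }
      current = next; // single reference swap: readers see the old or the new object, never a mix
      return true;
    },
  };
}
```

A file watcher or a pub/sub listener would call `patch()` with the parsed JSON, and tool handlers would call `get()` at the point of use rather than caching the result.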

Per-tenant configuration in multi-tenant MCP servers

In multi-tenant deployments, some configuration values differ per tenant. The key rule is the same as for any per-tenant state: never store per-tenant config in module scope. Load it at initialize time and attach it to the tenant context:

// Multi-tenant config pattern
interface TenantConfig {
  tenant_id: string;
  max_results_per_call: number;   // different limits per pricing tier
  allowed_tools: Set<string>;     // tool access control
  external_api_endpoint: string;  // tenant-specific endpoint
}

// In the initialize handler, load tenant config from DB and store in session map
const tenantContexts = new Map<string, TenantConfig>();

mcpTransport.on('session', (session) => {
  session.on('initialize', async (params) => {
    const tenantId = extractTenantId(params); // from JWT or API key
    const tenantConfig = await loadTenantConfig(deps.db, tenantId);
    tenantContexts.set(session.id, tenantConfig);
  });
  session.on('close', () => {
    tenantContexts.delete(session.id); // always clean up to prevent map growth
  });
});

Store the tenant config in the session context map, not in module scope. The module-scope mistake — setting a global currentTenant variable in the initialize handler — causes a silent race: two concurrent sessions set currentTenant to different values, and one session's tool calls run with another tenant's config. The session context map (keyed by session ID) is the correct isolation boundary.
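
One way to make that isolation boundary hard to bypass is to hide the map behind a small class whose every method takes a session ID. The sketch below is illustrative: the `TenantContexts` class and its method names are hypothetical, and the `TenantConfig` shape is a trimmed copy of the interface above.

```typescript
interface TenantConfig {
  tenant_id: string;
  max_results_per_call: number;
  allowed_tools: Set<string>;
}

// Session-scoped registry: every lookup needs a session ID, so there is no
// "current tenant" that concurrent sessions could overwrite.
export class TenantContexts {
  private readonly bySession = new Map<string, TenantConfig>();

  attach(sessionId: string, config: TenantConfig): void {
    this.bySession.set(sessionId, config);
  }

  release(sessionId: string): void {
    this.bySession.delete(sessionId); // call on session close to prevent map growth
  }

  // Throws instead of falling back to a default, so a tool call on an
  // uninitialized or closed session fails loudly.
  configFor(sessionId: string): TenantConfig {
    const config = this.bySession.get(sessionId);
    if (!config) throw new Error(`No tenant context for session ${sessionId}`);
    return config;
  }

  assertToolAllowed(sessionId: string, tool: string): void {
    if (!this.configFor(sessionId).allowed_tools.has(tool)) {
      throw new Error(`Tool ${tool} is not enabled for this tenant`);
    }
  }
}
```

A tool handler would call `assertToolAllowed(session.id, toolName)` first and then read limits such as `max_results_per_call` from `configFor(session.id)`.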

AliveMCP probe and configuration failures

AliveMCP probes your server with an initialize + tools/list request every minute. A server that exits during config validation never answers that request, so the failure shows up as a failed probe.

The fail-fast pattern — validate config, verify connectivity, then listen — means AliveMCP gets a clean binary signal: the server is either fully configured and ready, or it never starts. This is more useful than a server that starts with bad config and fails per-call.

Related questions

Should I use dotenv in production?

Use dotenv only in development and test environments. In production, inject environment variables through your deployment platform (Docker --env-file, Kubernetes ConfigMap/Secret, systemd EnvironmentFile, or a secrets manager). Checking process.env.NODE_ENV !== 'production' before calling dotenv.config() is the conventional guard. In production, the deployment platform is responsible for populating the environment. By default dotenv does not overwrite variables that are already set, but a stray .env file can still fill in variables the platform failed to provide, which masks exactly the misconfiguration the fail-fast schema is meant to catch; with the override option enabled it would also replace values the platform did set.

How do I handle configuration in tests?

Create a createTestConfig() function that returns a valid AppConfig object with test-appropriate values (an in-memory SQLite URL for DATABASE_URL, no REDIS_URL, a dummy 32-char string for API_SECRET). Pass it to createTestDeps(config). Never call parseConfig() in tests — tests run without a populated environment and will fail on the Zod validation for required production secrets. The config schema's optional() and default() modifiers should reflect whether a value is genuinely optional in production, not whether tests happen to not set it.
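
A minimal sketch of such a factory follows. The `AppConfig` interface is redeclared locally so the example stands alone (in the real codebase it would be imported from config.ts), and the concrete values, including the SQLite URL, are placeholders chosen for illustration.

```typescript
// Mirrors the fields of the guide's AppConfig; declared here so the sketch is self-contained.
interface AppConfig {
  PORT: number;
  NODE_ENV: 'development' | 'test' | 'production';
  DATABASE_URL: string;
  REDIS_URL?: string;
  API_SECRET: string;
  OPENAI_API_KEY?: string;
  MAX_TOOL_CONCURRENCY: number;
  PROBE_API_KEY?: string;
}

// Returns a complete, valid config without reading process.env.
// Individual tests override only the fields they care about.
export function createTestConfig(overrides: Partial<AppConfig> = {}): AppConfig {
  return {
    PORT: 3000,
    NODE_ENV: 'test',
    DATABASE_URL: 'sqlite::memory:',  // placeholder in-memory database URL
    API_SECRET: 'x'.repeat(32),       // dummy value that satisfies the 32-char minimum
    MAX_TOOL_CONCURRENCY: 2,
    ...overrides,                     // REDIS_URL and OPENAI_API_KEY stay unset unless overridden
  };
}
```

A test that exercises concurrency limits might call `createTestConfig({ MAX_TOOL_CONCURRENCY: 1 })` and pass the result to createTestDeps().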

What is the right way to version configuration schemas?

Add new optional fields with .optional() or .default() to maintain backward compatibility — servers deployed without the new variable still start. Remove fields only after all running instances have been updated. When changing the type or validation of an existing field (e.g., making a string stricter), use a migration step: add a new field with the stricter type and a different name, deprecate the old field, remove it in a later version. This is the same schema-migration discipline you apply to database schemas.
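
The deprecation step of such a migration could look like the sketch below, which prefers the new variable name and warns when only the old one is set. The `readRenamed` helper and the variable names in the usage note are hypothetical.

```typescript
type Env = Record<string, string | undefined>;

// Read a renamed variable: prefer the new name, fall back to the deprecated
// one, and report the fallback so operators can migrate before it is removed.
export function readRenamed(
  env: Env,
  newName: string,
  oldName: string,
  warn: (message: string) => void = console.warn,
): string | undefined {
  const current = env[newName];
  if (current !== undefined) return current;

  const legacy = env[oldName];
  if (legacy !== undefined) {
    warn(`${oldName} is deprecated; set ${newName} instead`);
  }
  return legacy;
}
```

For example, `readRenamed(process.env, 'TOOL_CONCURRENCY_LIMIT', 'MAX_TOOL_CONCURRENCY')` would keep older deployments starting while the new name rolls out; the result is then fed into the schema under the new field name.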

Can I store configuration in the database instead of environment variables?

Yes, for mutable operational config that changes without a deployment (rate limits, feature flags, log verbosity). Store static bootstrapping config — the database URL itself, the server port, secrets — in environment variables because the database connection is not yet available when you need those values. A two-tier approach is common: environment variables for bootstrapping config, database or Redis for runtime operational config. Use the dynamic reload patterns above for the runtime tier.
