
MCP server Redis

Redis adds value to an MCP server in three distinct ways. Tool response caching: LLMs frequently call the same tool with the same arguments across sessions — "get_user(id=123)" called by session A and session B five seconds apart should not make two external API requests. A Redis cache with a short TTL collapses repeated calls. Per-session rate limiting: an LLM in a reasoning loop can call a tool dozens of times in a minute; a sliding-window counter in Redis enforces a per-session cap that protects downstream APIs without blocking unrelated sessions. Distributed locks: when multiple connection-pooled workers handle tool calls concurrently, a Redis lock prevents duplicate singleton operations (email send, payment charge, file rename) from racing. This guide covers ioredis vs. node-redis, the cache-aside pattern, rate limiting, distributed locks, and graceful shutdown ordering.

TL;DR

Use ioredis for its automatic reconnect and cluster support. Implement cache-aside in a withCache() wrapper to keep tool handlers free of caching logic. Rate-limit with an atomic Lua script that trims, counts, and records calls in a single round trip. On SIGTERM, call redis.quit() after sessions drain; redis.disconnect() drops the connection immediately and may leave in-flight commands unacknowledged.

Client choice: ioredis vs. node-redis

Feature	ioredis	node-redis (v4+)
Automatic reconnect	Built-in; default delay grows with each attempt, capped at 2 s (`retryStrategy`)	Built-in; tunable via `socket.reconnectStrategy`
Cluster support	First-class (`new Redis.Cluster()`)	Supported (`createCluster()`) but less ergonomic
Promises API	All commands return promises	All commands return promises
Lua scripting	`redis.defineCommand()`	`defineScript()` with the `scripts` client option
Streams	Full XREAD/XADD/consumer group support	Full support
Bundle size	Larger	Smaller (modular)

For most MCP servers on a single Redis instance, both work equally well. This guide uses ioredis because its reconnect behaviour is on by default and is configured through a single retryStrategy option, so transient Redis restarts (patching, failover) need no application code changes. Whichever client you choose, confirm that reconnection is enabled: without it, every tool call after a Redis restart hits a dead connection, and any cache call that is not wrapped in error handling throws instead of falling through to the underlying data source.

npm install ioredis
npm install --save-dev @types/ioredis  # if using older ioredis v4; v5+ ships its own types

Redis client singleton

// src/redis.ts
import Redis from 'ioredis';

const redis = new Redis({
  host: process.env.REDIS_HOST ?? 'localhost',
  port: Number(process.env.REDIS_PORT ?? 6379),
  password: process.env.REDIS_PASSWORD,
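  // Reconnect on failure: the delay grows by 50ms per attempt, capped at 2000ms.
  // This strategy retries indefinitely; return a non-number to stop reconnecting.
  retryStrategy: (times) => Math.min(times * 50, 2000),
  maxRetriesPerRequest: 3,
  // lazyConnect: false opens the connection as soon as the client is constructed.
  // Set it to true to defer connecting until the first command.
  lazyConnect: false,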
  // Key prefix: isolates this MCP server's keys from others sharing the Redis instance
  keyPrefix: process.env.REDIS_KEY_PREFIX ?? 'mcp:',
  enableOfflineQueue: true,
});

redis.on('error', (err) => {
  // Log but do not crash — the MCP server continues without cache on Redis failure
  console.error('Redis error:', err.message);
});

redis.on('reconnecting', () => {
  console.log('Redis reconnecting...');
});

export default redis;

The enableOfflineQueue: true setting buffers commands issued while Redis is reconnecting and replays them after reconnection. For cache operations, this is acceptable. For rate-limit counters, it means commands queued during a Redis outage are processed in a burst after reconnection — decide whether to disable the queue for rate-limit commands specifically (use a separate Redis client instance with enableOfflineQueue: false).
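The separate rate-limit client mentioned above could be configured along these lines. This is a minimal sketch: enableOfflineQueue and keyPrefix are real ioredis options, but the rateLimitClientOptions helper and the local ClientOptions type are hypothetical names introduced here for illustration.

```typescript
// Local stand-in for the subset of ioredis options used in this guide.
// (Assumption: declared here only so the sketch is self-contained.)
interface ClientOptions {
  host: string;
  port: number;
  password?: string;
  keyPrefix?: string;
  enableOfflineQueue?: boolean;
}

// Hypothetical helper: derives options for a rate-limit-only client
// from the options of the main client.
export function rateLimitClientOptions(base: ClientOptions): ClientOptions {
  return {
    ...base,
    // Reject commands immediately while disconnected instead of buffering
    // them, so counters are not replayed in a burst after reconnection.
    enableOfflineQueue: false,
  };
}

// Usage (assumes ioredis is installed):
// import Redis from 'ioredis';
// const rateLimitRedis = new Redis(
//   rateLimitClientOptions({ host: 'localhost', port: 6379, keyPrefix: 'mcp:' })
// );
```

With the queue disabled, commands issued during an outage reject straight away, so the caller has to decide whether a failed rate-limit check allows or denies the tool call.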

Tool response cache — cache-aside pattern

// src/cache.ts
import redis from './redis.js';

/**
 * Generic cache-aside wrapper for MCP tool handlers.
 * On cache hit: returns cached value without calling fn.
 * On cache miss: calls fn, stores the result, returns it.
 * On Redis error: calls fn directly (cache is best-effort).
 */
export async function withCache<T>(
  key: string,
  ttlSeconds: number,
  fn: () => Promise<T>
): Promise<T> {
  try {
    const cached = await redis.get(key);
    if (cached !== null) {
      return JSON.parse(cached) as T;
    }
  } catch {
    // Redis unavailable — fall through to the real data source
  }

  const result = await fn();

  try {
    await redis.setex(key, ttlSeconds, JSON.stringify(result));
  } catch {
    // Cache write failure is non-fatal
  }

  return result;
}

// Usage in a tool handler:
// const user = await withCache(
//   `user:${userId}`,
//   300,  // 5-minute TTL
//   () => prisma.user.findUniqueOrThrow({ where: { id: userId } })
// );

The wrapper never throws from Redis failures — caching is a performance optimisation, not a correctness requirement. An MCP server that degrades to uncached operation during a Redis outage is far better than one that returns isError: true for every tool call because the cache is unavailable.
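For the cache to collapse repeated calls, the same tool called with the same arguments must always map to the same key, regardless of property order in the arguments object. One way to build such a key is sketched below; toolCacheKey, canonicalize, and the "tool:" prefix are hypothetical names, not part of any library.

```typescript
import { createHash } from 'node:crypto';

// Recursively sort object keys so that { a: 1, b: 2 } and { b: 2, a: 1 }
// serialize to the same JSON string.
function canonicalize(value: unknown): unknown {
  if (Array.isArray(value)) return value.map(canonicalize);
  if (value !== null && typeof value === 'object') {
    const obj = value as Record<string, unknown>;
    const sorted: Record<string, unknown> = {};
    for (const k of Object.keys(obj).sort()) sorted[k] = canonicalize(obj[k]);
    return sorted;
  }
  return value;
}

// Hypothetical helper: builds "tool:<name>:<sha256 of canonical arguments>".
export function toolCacheKey(
  toolName: string,
  args: Record<string, unknown>
): string {
  const digest = createHash('sha256')
    .update(JSON.stringify(canonicalize(args)))
    .digest('hex');
  return `tool:${toolName}:${digest}`;
}

// Usage with the wrapper above:
// const user = await withCache(toolCacheKey('get_user', { id: 123 }), 300, fetchUser);
```

Hashing keeps keys short and free of characters from user-supplied arguments. The trade-off is that keys are no longer human-readable when inspecting Redis.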

Per-session rate limiting

LLMs in autonomous reasoning loops can call the same tool dozens of times per minute. A sliding-window rate limiter in Redis enforces a per-session cap. The Lua script below is atomic — it reads and writes the counter in a single Redis roundtrip, preventing race conditions between concurrent tool calls within the same session.

// src/rate-limit.ts
import redis from './redis.js';

// Lua script: atomic sliding window using a sorted set
// KEYS[1]: the rate limit key (e.g., "ratelimit:session_abc:fetch_user")
// ARGV[1]: current timestamp in milliseconds
// ARGV[2]: window size in milliseconds
// ARGV[3]: max requests per window
// ARGV[4]: TTL for the key in seconds
// Returns: 1 if allowed, 0 if rate limited
const RATE_LIMIT_SCRIPT = `
  local key = KEYS[1]
  local now = tonumber(ARGV[1])
  local window = tonumber(ARGV[2])
  local limit = tonumber(ARGV[3])
  local ttl = tonumber(ARGV[4])
  local window_start = now - window

  -- Remove entries outside the current window
  redis.call('ZREMRANGEBYSCORE', key, '-inf', window_start)

  -- Count entries within the window
  local count = redis.call('ZCARD', key)

  if count >= limit then
    return 0
  end

  -- Add current request with timestamp as score
  redis.call('ZADD', key, now, now .. ':' .. math.random(1000000))
  redis.call('EXPIRE', key, ttl)
  return 1
`;

export async function checkRateLimit(
  sessionId: string,
  toolName: string,
  maxPerMinute = 30
): Promise<boolean> {
  const key = `ratelimit:${sessionId}:${toolName}`;
  const now = Date.now();
  const windowMs = 60 * 1000;
  const ttlSeconds = 120;

  const result = await redis.eval(
    RATE_LIMIT_SCRIPT, 1, key,
    now, windowMs, maxPerMinute, ttlSeconds
  ) as number;

  return result === 1;
}

// In a tool handler:
// const allowed = await checkRateLimit(session.id, 'send_email', 5);
// if (!allowed) return { isError: true, content: [{ type: 'text', text: 'Rate limit exceeded: max 5 send_email calls per minute' }] };
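To unit-test tool handlers without a Redis instance, the same sliding-window logic can be modelled in memory. The class below is a sketch that mirrors the steps of the Lua script (trim, count, reject or record); InMemorySlidingWindow is a hypothetical name, and because its state is per-process it is not a substitute for the Redis version in production.

```typescript
// In-memory model of the sorted-set sliding window, intended for tests only.
export class InMemorySlidingWindow {
  private readonly hits = new Map<string, number[]>();
  private readonly windowMs: number;
  private readonly limit: number;

  constructor(windowMs: number, limit: number) {
    this.windowMs = windowMs;
    this.limit = limit;
  }

  // Returns true if the call is allowed, false if the key is over its limit.
  check(key: string, now: number = Date.now()): boolean {
    const windowStart = now - this.windowMs;
    // Equivalent of ZREMRANGEBYSCORE key -inf window_start (bound is inclusive)
    const recent = (this.hits.get(key) ?? []).filter((t) => t > windowStart);

    // Equivalent of ZCARD followed by the limit check
    if (recent.length >= this.limit) {
      this.hits.set(key, recent);
      return false; // rejected calls are not recorded, as in the script
    }

    // Equivalent of ZADD with the timestamp as score
    recent.push(now);
    this.hits.set(key, recent);
    return true;
  }
}
```

Passing the timestamp explicitly makes the window deterministic in tests: with a 1000 ms window and a limit of 2, calls at t=0 and t=10 pass, a call at t=20 is rejected, and a call at t=1001 passes again because the t=0 entry has expired.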

Distributed locks for idempotent operations

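A distributed lock prevents two concurrent tool calls from performing the same irreversible operation. The code below uses the single-instance SET NX PX pattern; Redlock is the multi-node extension of this idea and is only needed when the lock must survive the loss of a Redis node. The lock acquires an exclusive key in Redis with a TTL. If the lock is already held, withLock() returns null immediately, and the caller can either retry or return a "try again" error to the LLM. Choose a TTL longer than the slowest expected run of the operation: if the TTL expires while fn is still running, a second caller can acquire the lock.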

// src/lock.ts
import redis from './redis.js';
import crypto from 'crypto';

export async function withLock<T>(
  resource: string,
  ttlMs: number,
  fn: () => Promise<T>
): Promise<T | null> {
  const lockKey = `lock:${resource}`;
  const lockValue = crypto.randomBytes(16).toString('hex');

  // SET key value NX PX ttl — atomic acquire; returns OK or null
  const acquired = await redis.set(lockKey, lockValue, 'PX', ttlMs, 'NX');
  if (!acquired) return null;  // Lock held by another caller

  try {
    return await fn();
  } finally {
    // Lua: only release the lock if we still own it (lockValue matches)
    const releaseScript = `
      if redis.call('GET', KEYS[1]) == ARGV[1] then
        return redis.call('DEL', KEYS[1])
      else
        return 0
      end
    `;
    await redis.eval(releaseScript, 1, lockKey, lockValue);
  }
}

// Usage:
// const result = await withLock('send-email:user-123', 30_000, () => sendEmail(userId));
// if (result === null) return { isError: true, content: [{ type: 'text', text: 'Operation in progress — try again in a moment' }] };
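If the second caller should wait rather than fail, the null return value can drive a bounded retry loop. The helper below is a sketch under assumptions: retryWhileNull is a hypothetical name, and the attempt count and base delay are illustrative defaults. It treats null as "lock busy", so it only suits operations whose own result is never null.

```typescript
// Hypothetical helper: retries an attempt that signals "busy" by resolving to null.
export async function retryWhileNull<T>(
  attempt: () => Promise<T | null>,
  maxAttempts = 5,
  baseDelayMs = 100
): Promise<T | null> {
  for (let i = 0; i < maxAttempts; i++) {
    const result = await attempt();
    if (result !== null) return result;

    if (i < maxAttempts - 1) {
      // Exponential backoff with full jitter: random delay in [0, base * 2^i)
      const delay = Math.random() * baseDelayMs * 2 ** i;
      await new Promise((resolve) => setTimeout(resolve, delay));
    }
  }
  return null; // still busy after maxAttempts tries
}

// Usage with the lock above:
// const result = await retryWhileNull(() =>
//   withLock('send-email:user-123', 30_000, () => sendEmail(userId))
// );
```

Jitter spreads out retries from callers that collided at the same moment, which reduces the chance that they collide again on the next attempt.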

Graceful shutdown — quit vs. disconnect

// serverState, httpServer, activeSessions, and DRAIN_TIMEOUT_MS are assumed
// to be defined elsewhere in the server module.
process.on('SIGTERM', async () => {
  serverState = 'draining';
  httpServer.close();

  // Wait for active sessions to finish
  const drainStart = Date.now();
  while (activeSessions.size > 0 && Date.now() - drainStart < DRAIN_TIMEOUT_MS) {
    await new Promise(resolve => setTimeout(resolve, 100));
  }

  // redis.quit() sends QUIT and waits for the reply, so commands already
  // sent on the connection complete before it closes.
  // redis.disconnect() closes the socket immediately; in-flight commands are lost.
  try {
    await redis.quit();
  } catch {
    // quit() rejects if the connection is already closed; force the socket shut
    redis.disconnect();
  }

  process.exit(0);
});
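The drain loop in the handler can be pulled out into a small function so the timeout behaviour is testable on its own. This is a sketch: waitForDrain and its parameters are hypothetical names, and the 100 ms poll interval simply matches the handler above.

```typescript
// Hypothetical helper: polls until activeCount() reaches zero or the timeout passes.
// Resolves to true if everything drained, false if the timeout was hit first.
export async function waitForDrain(
  activeCount: () => number,
  timeoutMs: number,
  pollMs = 100
): Promise<boolean> {
  const deadline = Date.now() + timeoutMs;
  while (activeCount() > 0) {
    if (Date.now() >= deadline) return false;
    await new Promise((resolve) => setTimeout(resolve, pollMs));
  }
  return true;
}

// Usage inside the SIGTERM handler:
// const drained = await waitForDrain(() => activeSessions.size, DRAIN_TIMEOUT_MS);
// if (!drained) console.warn('Drain timed out; closing Redis with sessions still active');
```

Returning a boolean lets the handler log when it shuts down with sessions still open, which is useful when tuning the drain timeout.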

AliveMCP and Redis health

A Redis failure that causes cache misses does not affect MCP protocol correctness if the cache-aside wrapper falls through correctly. A Redis failure that stalls tool calls is different: when the connection is dead and enableOfflineQueue is buffering commands, each awaited command stays pending until ioredis reconnects or gives up on it (after maxRetriesPerRequest reconnection attempts). The event loop itself is not blocked, but tool call latency rises, which AliveMCP detects as elevated probe response times before timeouts. The MCP server health check endpoint should include a Redis PING check alongside the database check; AliveMCP probes /health to distinguish Redis degradation from full server failure.
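A PING check for the health endpoint needs its own timeout, otherwise a stalled Redis connection stalls /health as well. The sketch below takes the ping function as a parameter so it can be tested without Redis; checkRedisHealth, the RedisHealth shape, and the 500 ms default are assumptions made for illustration.

```typescript
type RedisHealth =
  | { status: 'ok'; latencyMs: number }
  | { status: 'down'; error: string };

// Hypothetical helper: runs a PING with an upper bound on how long it may take.
export async function checkRedisHealth(
  ping: () => Promise<string>,
  timeoutMs = 500
): Promise<RedisHealth> {
  const started = Date.now();
  let timer: ReturnType<typeof setTimeout> | undefined;

  // Rejects after timeoutMs so a pending PING cannot hold the health check open
  const timeout = new Promise<never>((_, reject) => {
    timer = setTimeout(
      () => reject(new Error(`PING timed out after ${timeoutMs}ms`)),
      timeoutMs
    );
  });

  try {
    const reply = await Promise.race([ping(), timeout]);
    if (reply !== 'PONG') {
      return { status: 'down', error: `unexpected reply: ${reply}` };
    }
    return { status: 'ok', latencyMs: Date.now() - started };
  } catch (err) {
    return { status: 'down', error: err instanceof Error ? err.message : String(err) };
  } finally {
    if (timer) clearTimeout(timer);
  }
}

// Usage in a /health handler (assumes the ioredis client from src/redis.ts):
// const redisHealth = await checkRedisHealth(() => redis.ping());
```

Reporting Redis as its own field in the /health response, rather than failing the whole endpoint, keeps a cache outage distinguishable from a full server failure.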