Guide · MCP Database Integration

MCP Server MongoDB — connecting and exposing MongoDB data as MCP tools

MongoDB is a common backing store for MCP servers — especially when tools need to query flexible, schema-less documents, embed unstructured data, or aggregate across large collections. This guide covers connecting to MongoDB from a TypeScript MCP server, designing CRUD tools around MongoDB operations, exposing collection entries as MCP resources, running aggregation pipelines as tools, and instrumenting a /health/mongodb endpoint so AliveMCP can detect connection failures before users do.

TL;DR

Use the native MongoDB driver (mongodb npm package), not Mongoose — Mongoose schema validation adds latency and memory overhead that doesn't benefit an MCP server's tool layer. Initialize a singleton MongoClient at startup with ServerApiVersion.v1 strict mode. Wrap each MongoDB operation in a Zod-validated tool handler, and never pass caller-controlled filter objects directly to MongoDB without explicit allow-listing of fields. For resources, serialize ObjectId using .toHexString(), not toJSON(). Wire a /health/mongodb endpoint that pings the database and reports connection pool stats so monitoring tools detect silent pool exhaustion before callers time out.

MongoDB driver setup: native driver vs Mongoose

MCP servers have different requirements than web application backends. There's no Mongoose-style ODM benefit here — you don't need schema enforcement at the ORM layer because Zod handles that at the tool parameter boundary. Mongoose's overhead (schema validation, middleware hooks, model compilation) adds 5–20ms per operation and increases cold-start memory. Use the native mongodb driver instead.

import { MongoClient, ServerApiVersion, Db } from 'mongodb';

const MONGO_URI = process.env.MONGODB_URI!;

// Singleton client — created once, reused across all tool calls
let _client: MongoClient | null = null;
let _db: Db | null = null;

export async function getDb(dbName = 'app'): Promise<Db> {
  if (_db) return _db;

  _client = new MongoClient(MONGO_URI, {
    serverApi: {
      version: ServerApiVersion.v1,
      strict: true,       // reject deprecated operators
      deprecationErrors: true
    },
    maxPoolSize: 10,      // tune to your Atlas tier's connection limit
    minPoolSize: 2,
    connectTimeoutMS: 5000,
    socketTimeoutMS: 30000,
    serverSelectionTimeoutMS: 5000
  });

  await _client.connect();
  _db = _client.db(dbName);
  return _db;
}

// TypeScript document interface — mirrors your collection's shape
export interface ProductDoc {
  _id?: import('mongodb').ObjectId;
  sku: string;
  name: string;
  price: number;
  status: 'active' | 'discontinued' | 'draft';
  tags: string[];
  createdAt: Date;
  updatedAt: Date;
}

export async function closeDb(): Promise<void> {
  await _client?.close();
  _client = null;
  _db = null;
}

Call getDb() in each tool handler rather than at module load time. This defers connection establishment until the first tool call, which keeps startup fast and lets the MCP server respond to initialize before the database connection is established.

Option	Native driver	Mongoose
Cold start overhead	~5 MB, ~50ms	~25 MB, ~200ms
Schema validation	None (use Zod at tool boundary)	Runtime schema validation on every save
Aggregation support	Full, typed with generics	Full, but verbose
Connection pool	Built-in, configurable	Wraps native driver
Middleware hooks	None	pre/post hooks (overhead you don't need)

CRUD tool patterns

The key safety rule for MongoDB tool parameters: never accept a raw filter object from the MCP caller and pass it directly to MongoDB. A caller could inject { "$where": "sleep(10000)" } or use $regex with catastrophic backtracking. Instead, define explicit Zod schemas for each allowed field and build the MongoDB filter programmatically.

import { z } from 'zod';
import { ObjectId, Filter } from 'mongodb';
import { McpError, ErrorCode } from '@modelcontextprotocol/sdk/types.js';
import { getDb, ProductDoc } from './db.js';

// ---- find_documents ----
server.tool(
  'find_documents',
  {
    // Explicit allow-list — not z.record(z.unknown())
    sku: z.string().optional(),
    status: z.enum(['active', 'discontinued', 'draft']).optional(),
    tag: z.string().optional(),
    limit: z.number().int().min(1).max(100).default(20),
    skip: z.number().int().min(0).default(0),
    projection: z.array(z.string()).optional()
  },
  async ({ sku, status, tag, limit, skip, projection }) => {
    const db = await getDb();
    const filter: Filter<ProductDoc> = {};
    if (sku) filter.sku = sku;
    if (status) filter.status = status;
    if (tag) filter.tags = tag; // MongoDB matches if array contains value

    const proj = projection
      ? Object.fromEntries(projection.map(f => [f, 1]))
      : undefined;

    const docs = await db.collection<ProductDoc>('products')
      .find(filter, { projection: proj })
      .skip(skip)
      .limit(limit)
      .toArray();

    return {
      content: [{
        type: 'text',
        text: JSON.stringify(docs.map(serializeDoc), null, 2)
      }]
    };
  }
);

// ---- insert_document ----
server.tool(
  'insert_document',
  {
    sku: z.string().min(1),
    name: z.string().min(1),
    price: z.number().positive(),
    status: z.enum(['active', 'discontinued', 'draft']).default('draft'),
    tags: z.array(z.string()).default([])
  },
  async (params) => {
    const db = await getDb();
    const now = new Date();
    const result = await db.collection<ProductDoc>('products').insertOne({
      ...params,
      createdAt: now,
      updatedAt: now
    });
    return {
      content: [{
        type: 'text',
        text: JSON.stringify({ insertedId: result.insertedId.toHexString() })
      }]
    };
  }
);

// ---- update_document ----
server.tool(
  'update_document',
  {
    id: z.string().length(24, 'Must be a 24-character hex ObjectId'),
    name: z.string().optional(),
    price: z.number().positive().optional(),
    status: z.enum(['active', 'discontinued', 'draft']).optional(),
    tags: z.array(z.string()).optional()
  },
  async ({ id, ...fields }) => {
    const db = await getDb();
    const setFields: Partial<ProductDoc> = { updatedAt: new Date() };
    if (fields.name !== undefined) setFields.name = fields.name;
    if (fields.price !== undefined) setFields.price = fields.price;
    if (fields.status !== undefined) setFields.status = fields.status;
    if (fields.tags !== undefined) setFields.tags = fields.tags;

    const result = await db.collection<ProductDoc>('products').updateOne(
      { _id: new ObjectId(id) },
      { $set: setFields }
    );
    if (result.matchedCount === 0) {
      throw new McpError(ErrorCode.InvalidParams, `No document with id ${id}`);
    }
    return {
      content: [{ type: 'text', text: JSON.stringify({ modifiedCount: result.modifiedCount }) }]
    };
  }
);

// ---- delete_document ----
server.tool(
  'delete_document',
  { id: z.string().length(24) },
  async ({ id }) => {
    const db = await getDb();
    const result = await db.collection<ProductDoc>('products')
      .deleteOne({ _id: new ObjectId(id) });
    return {
      content: [{ type: 'text', text: JSON.stringify({ deletedCount: result.deletedCount }) }]
    };
  }
);

The helper serializeDoc converts ObjectId and Date fields to strings before JSON serialization — covered in the Resources section below.

Aggregation pipeline as a tool

A generic run_aggregation tool that accepts a raw pipeline from the caller is powerful but dangerous. Any caller with access to the tool can run arbitrary read operations across the entire collection, including $lookup joins to other collections, $out to write results to a new collection, or $graphLookup queries that can OOM the server. Restrict this tool to trusted admin contexts only.

The safer alternative for most use cases is named aggregate tools with hardcoded pipelines:

// SAFE: named aggregate tool with hardcoded pipeline
server.tool(
  'count_products_by_status',
  { /* no parameters — pipeline is fixed */ },
  async () => {
    const db = await getDb();
    const result = await db.collection<ProductDoc>('products').aggregate([
      { $group: { _id: '$status', count: { $sum: 1 } } },
      { $sort: { count: -1 } }
    ]).toArray();
    return {
      content: [{ type: 'text', text: JSON.stringify(result, null, 2) }]
    };
  }
);

server.tool(
  'revenue_by_tag',
  {
    start_date: z.string().datetime(),
    end_date: z.string().datetime()
  },
  async ({ start_date, end_date }) => {
    const db = await getDb();
    const result = await db.collection('orders').aggregate([
      {
        $match: {
          createdAt: {
            $gte: new Date(start_date),
            $lte: new Date(end_date)
          },
          status: 'completed'
        }
      },
      { $unwind: '$items' },
      {
        $lookup: {
          from: 'products',
          localField: 'items.sku',
          foreignField: 'sku',
          as: 'product'
        }
      },
      { $unwind: '$product' },
      { $unwind: '$product.tags' },
      {
        $group: {
          _id: '$product.tags',
          totalRevenue: { $sum: { $multiply: ['$items.price', '$items.qty'] } }
        }
      },
      { $sort: { totalRevenue: -1 } },
      { $limit: 20 }
    ]).toArray();

    return {
      content: [{ type: 'text', text: JSON.stringify(result, null, 2) }]
    };
  }
);

// DANGEROUS: only expose in admin-scoped MCP servers
server.tool(
  'run_aggregation',
  {
    collection: z.enum(['products', 'orders', 'customers']), // explicit allow-list
    pipeline: z.array(z.record(z.unknown())).max(10)         // cap pipeline depth
  },
  async ({ collection, pipeline }) => {
    // Reject dangerous stages
    const dangerousStages = ['$out', '$merge', '$indexStats', '$currentOp'];
    for (const stage of pipeline) {
      const key = Object.keys(stage)[0];
      if (dangerousStages.includes(key)) {
        throw new McpError(ErrorCode.InvalidParams, `Stage ${key} is not permitted`);
      }
    }
    const db = await getDb();
    const result = await db.collection(collection).aggregate(pipeline).toArray();
    return {
      content: [{ type: 'text', text: JSON.stringify(result, null, 2) }]
    };
  }
);

Exposing collections as MCP resources

MCP resources let clients browse and read data without calling a tool. For MongoDB, a natural resource hierarchy is: collections as the resource list, and individual documents as resources identified by a URI. The trickiest part is ObjectId serialization — ObjectId.toJSON() produces a plain string in some contexts and an object in others depending on the MongoDB driver version. Use .toHexString() explicitly.

import {
  ListResourcesRequestSchema,
  ReadResourceRequestSchema
} from '@modelcontextprotocol/sdk/types.js';
import { ObjectId } from 'mongodb';

// Serialize a MongoDB document for MCP content
function serializeDoc(doc: Record<string, unknown>): Record<string, unknown> {
  return JSON.parse(JSON.stringify(doc, (_key, value) => {
    if (value instanceof ObjectId) return value.toHexString();
    if (value instanceof Date) return value.toISOString();
    return value;
  }));
}

server.setRequestHandler(ListResourcesRequestSchema, async () => {
  const db = await getDb();
  const collections = await db.listCollections().toArray();

  const resources = await Promise.all(
    collections.map(async (col) => {
      const count = await db.collection(col.name).estimatedDocumentCount();
      return {
        uri: `mongodb://${col.name}`,
        name: col.name,
        description: `${count.toLocaleString()} documents`,
        mimeType: 'application/json'
      };
    })
  );

  return { resources };
});

server.setRequestHandler(ReadResourceRequestSchema, async ({ params }) => {
  // URI scheme: mongodb://{collection}/{objectId}
  const match = params.uri.match(/^mongodb:\/\/([^/]+)\/([a-f0-9]{24})$/);
  if (!match) {
    throw new McpError(ErrorCode.InvalidParams, `Unrecognised URI: ${params.uri}`);
  }
  const [, collection, hexId] = match;

  const db = await getDb();
  const doc = await db.collection(collection).findOne({ _id: new ObjectId(hexId) });
  if (!doc) {
    throw new McpError(ErrorCode.InvalidParams, `Document ${hexId} not found in ${collection}`);
  }

  return {
    contents: [{
      uri: params.uri,
      mimeType: 'application/json',
      text: JSON.stringify(serializeDoc(doc as Record<string, unknown>), null, 2)
    }]
  };
});

One important caveat: estimatedDocumentCount() uses collection metadata and is fast (constant time), but it can return stale counts after an unclean shutdown. Use countDocuments() if accuracy is critical, at the cost of a full collection scan for large collections.

Health endpoint: /health/mongodb

The /health/mongodb endpoint should verify that the connection is live, the server is reachable, and the connection pool is healthy. A ping command tests the round-trip; listCollections confirms database-level access.

import express from 'express';

const app = express();

app.get('/health/mongodb', async (_req, res) => {
  const start = Date.now();
  try {
    const db = await getDb();

    // 1. Ping — fastest liveness check
    await db.command({ ping: 1 });
    const pingMs = Date.now() - start;

    // 2. List collections — confirms DB-level access
    const collections = await db.listCollections().toArray();

    // 3. Connection pool stats (native driver exposes these)
    const client = db.client as import('mongodb').MongoClient;
    // @ts-ignore — internal property, present on native driver
    const pool = (client as unknown as { topology?: { s?: { pool?: {
      totalConnectionCount: number;
      availableConnectionCount: number;
      pendingConnectionCount: number;
    } } } }).topology?.s?.pool;

    res.json({
      status: 'ok',
      ping_ms: pingMs,
      collections: collections.length,
      pool: pool ? {
        total: pool.totalConnectionCount,
        available: pool.availableConnectionCount,
        pending: pool.pendingConnectionCount
      } : null
    });
  } catch (err) {
    res.status(503).json({
      status: 'error',
      error: (err as Error).message,
      elapsed_ms: Date.now() - start
    });
  }
});

Wire AliveMCP to this endpoint with a 30-second check interval. A 503 response triggers an immediate alert. A response where pool.available === 0 and pool.pending > 0 indicates pool exhaustion — all connections are busy and new operations are queuing. This is a precursor to timeout errors that may not yet surface as a 503.

Silent failure modes and AliveMCP monitoring

MongoDB has several failure modes that don't surface as HTTP errors on the MCP server process, meaning a naive HTTP check returns 200 while callers experience errors.

Failure mode	What happens	Does HTTP check catch it?	How to detect
Connection pool exhaustion	All 100 connections busy. New operations queue, then time out after `socketTimeoutMS`. Tool calls return timeout errors.	No — MCP server process is running fine	`/health/mongodb` reporting `pool.available === 0`
Replica set failover	Primary election takes 10–30 seconds. Writes fail with `NotPrimaryError`. Reads on secondary (if allowed) continue.	No — ping may succeed against secondary	Log `NotPrimaryError` occurrences; monitor write error rate
Cursor timeout	Long aggregations hit `serverCursorTimeoutMS` (default 10 min). Tool returns a timeout error, but connection is fine.	No — connection is healthy	Track tool-level error rate; log cursor timeout errors specifically
Atlas M0 connection limit	Atlas free tier limits to 500 total connections. If multiple MCP server instances are running, pool × instances can exceed this.	No — individual connections may succeed	Monitor Atlas metrics for connection count; set `maxPoolSize` low (5–10 on free tier)

AliveMCP monitors your /health/mongodb endpoint and alerts you the moment connection pool exhaustion or replica failover affects tool availability — before callers start filing bug reports.

Frequently asked questions

How do I accept an ObjectId in a tool parameter?

Accept it as a 24-character hex string and convert with new ObjectId(id) inside the handler. Validate with z.string().length(24).regex(/^[a-f0-9]{24}$/i) in the Zod schema. Do not ask callers to pass raw ObjectId objects — MCP tool parameters are JSON, so they're always strings at the transport layer. Catch BSONError from new ObjectId() and re-throw as an McpError with ErrorCode.InvalidParams so the caller gets a useful message instead of an unhandled exception.

Can I pass index hints in find_documents?

Yes, but treat hints as an advanced parameter with caution. Add an optional hint parameter (an index name string or key pattern object) that maps to the .hint() cursor method. Restrict it to an allow-list of known index names to prevent callers from specifying arbitrary hint objects that could force inefficient plans. Example: hint: z.enum(['sku_1', 'status_1_createdAt_-1']).optional(). Wrong hints cause full collection scans that look like slow queries, not errors, so they won't surface in error rate monitors — add a explain_ms field to your tool response when NODE_ENV === 'development'.

What's different between an Atlas and a self-hosted connection string?

Atlas connection strings use the mongodb+srv:// scheme, which performs a DNS SRV lookup to discover replica set members. Self-hosted connection strings typically use mongodb:// with explicit host:port pairs. The main MCP server implication is that mongodb+srv:// requires DNS resolution at connection time — if your MCP server is in a network where DNS is slow or restricted, the initial MongoClient.connect() can hang. Set serverSelectionTimeoutMS: 5000 to bound this. Also, Atlas enforces TLS; self-hosted may not — match tls=true in the URI to your deployment.

Can MCP tools run MongoDB transactions?

Yes. MongoDB supports multi-document ACID transactions on replica sets (Atlas clusters always qualify). Use a session and transaction: const session = client.startSession(); await session.withTransaction(async () => { await col.insertOne(doc, { session }); await col.updateOne(filter, update, { session }); }). Wrap the whole thing in a try/catch inside the tool handler and re-throw as McpError on failure. Keep transactions short — long-running transactions hold locks and increase replica set oplog pressure. If a transaction spans multiple tool calls (rather than a single tool call), you need external session state management, which is complex and usually not worth it.