Guide · MCP Tool Implementation
MCP server database tools
Database tools are among the highest-value MCP capabilities — they let LLMs query real data, explore schemas, and generate reports without building a custom UI for every question. They also carry serious risks: SQL injection, accidental destructive queries, row-set floods that overflow LLM context, and unindexed queries that table-scan production databases. This guide covers how to build query_database, list_tables, and describe_table tools with injection prevention, read-only connection pools, row limits, schema introspection resources, and query cost guards.
TL;DR
Never interpolate LLM-provided values directly into SQL strings. Always use parameterized queries (prepared statements) for any value coming from a tool argument. Connect with a read-only database user for query tools — even if the query is SELECT, a read-write connection means a prompt injection that smuggles a semicolon-terminated DROP TABLE can execute. Set a hard row limit on every SELECT and wrap in a statement timeout. Expose the schema via MCP resources so the LLM can understand table structure before generating queries.
SQL injection: the mandatory defense
SQL injection in an MCP tool is more likely than in a traditional web app — the LLM is the "user" providing query values, and it may generate values containing SQL syntax through hallucination, instruction following, or prompt injection from retrieved data. The defense is identical to web applications: never interpolate values into SQL strings:
// BAD — interpolating user values directly into SQL
const bad = `SELECT * FROM users WHERE email = '${email}'`;
// If email = "' OR '1'='1' --", this becomes a full table dump
// GOOD — parameterized query (PostgreSQL example with pg/postgres.js)
const good = await sql`SELECT * FROM users WHERE email = ${email}`;
// The driver sends the value as a separate parameter — the DB treats it as a literal, not SQL
Every popular Node.js database client supports parameterized queries. The syntax varies by library but the security guarantee is the same:
| Library | Parameterization syntax |
|---|---|
| postgres.js | sql`SELECT * FROM t WHERE id = ${id}` |
| pg (node-postgres) | client.query('SELECT * FROM t WHERE id = $1', [id]) |
| better-sqlite3 | db.prepare('SELECT * FROM t WHERE id = ?').get(id) |
| Prisma | prisma.t.findUnique({ where: { id } }) |
| Drizzle ORM | db.select().from(t).where(eq(t.id, id)) |
Building the query_database tool
For a general-purpose SQL query tool — where the LLM writes the full SQL statement — the safety model is: read-only database user, statement timeout, and hard row limit injected by the server regardless of what the LLM sends:
import { McpServer } from '@modelcontextprotocol/sdk/server/mcp.js';
import { z } from 'zod';
import postgres from 'postgres'; // npm install postgres
const server = new McpServer({ name: 'database-server', version: '1.0.0' });
// Read-only connection pool — even if the LLM sends a DROP TABLE, the DB user lacks permission
const readOnlySql = postgres(process.env.DATABASE_URL_READONLY!, {
max: 5,
idle_timeout: 30,
connect_timeout: 10,
});
const MAX_ROWS = 500;
const STATEMENT_TIMEOUT_MS = 10_000; // 10 seconds
// Allow only SELECT and WITH (CTEs) at the statement level
function isSelectStatement(sql: string): boolean {
const trimmed = sql.trim().toUpperCase();
return trimmed.startsWith('SELECT') || trimmed.startsWith('WITH') || trimmed.startsWith('EXPLAIN');
}
server.tool(
'query_database',
'Execute a read-only SQL query and return results as formatted text',
{
sql: z.string().max(10_000).describe('SQL SELECT query to execute'),
max_rows: z.number().int().min(1).max(MAX_ROWS).default(50).describe('Maximum rows to return'),
format: z.enum(['table', 'json', 'csv']).default('table'),
},
async ({ sql: rawSql, max_rows, format }) => {
if (!isSelectStatement(rawSql)) {
return { isError: true, content: [{ type: 'text', text: 'Only SELECT queries are allowed. Mutations are not permitted.' }] };
}
try {
// Wrap in a CTE that adds LIMIT — prevents full table scans for paginated results
const limitedSql = `
SET LOCAL statement_timeout = '${STATEMENT_TIMEOUT_MS}ms';
SELECT * FROM (${rawSql}) _q LIMIT ${max_rows + 1}
`;
const rows = await readOnlySql.unsafe(limitedSql);
const hasMore = rows.length > max_rows;
const displayRows = rows.slice(0, max_rows);
let text: string;
if (format === 'json') {
text = JSON.stringify(displayRows, null, 2);
} else if (format === 'csv') {
const headers = Object.keys(displayRows[0] ?? {});
text = [headers.join(','), ...displayRows.map(r => headers.map(h => JSON.stringify(r[h] ?? '')).join(','))].join('\n');
} else {
// ASCII table format
const headers = Object.keys(displayRows[0] ?? {});
const widths = headers.map(h => Math.max(h.length, ...displayRows.map(r => String(r[h] ?? '').length)));
const row2str = (r: Record<string, unknown>) => '| ' + headers.map((h, i) => String(r[h] ?? '').padEnd(widths[i])).join(' | ') + ' |';
const divider = '+' + widths.map(w => '-'.repeat(w + 2)).join('+') + '+';
const header = '| ' + headers.map((h, i) => h.padEnd(widths[i])).join(' | ') + ' |';
text = [divider, header, divider, ...displayRows.map(row2str), divider].join('\n');
}
const footer = hasMore ? `\n\n(showing first ${max_rows} of ${max_rows}+ rows — use OFFSET or WHERE to paginate)` : `\n\n(${displayRows.length} row${displayRows.length !== 1 ? 's' : ''})`;
return { content: [{ type: 'text', text: text + footer }] };
} catch (e) {
const msg = (e as Error).message;
if (msg.includes('statement_timeout')) return { isError: true, content: [{ type: 'text', text: `Query timed out after ${STATEMENT_TIMEOUT_MS / 1000}s. Add a WHERE clause or use LIMIT to reduce the result set.` }] };
return { isError: true, content: [{ type: 'text', text: `Query failed: ${msg}` }] };
}
}
);
The LIMIT max_rows + 1 trick fetches one extra row to detect whether more rows exist, without fetching them all. When the result set exceeds max_rows, show a pagination hint in the footer so the LLM knows to paginate rather than assuming the result is complete.
Schema introspection as MCP resources
An LLM that can query your database also needs to know the schema — table names, column names, types, and relationships. Expose these as MCP resources rather than requiring a tool call for each table:
import { ResourceTemplate } from '@modelcontextprotocol/sdk/server/mcp.js';
// Database-wide schema overview as a single resource
server.resource(
'db-schema',
'db://schema/overview',
{ name: 'Database Schema', description: 'All tables, columns, and types', mimeType: 'application/json' },
async (uri) => {
const tables = await readOnlySql`
SELECT table_name, column_name, data_type, is_nullable, column_default
FROM information_schema.columns
WHERE table_schema = 'public'
ORDER BY table_name, ordinal_position
`;
// Group by table
const schema: Record<string, unknown[]> = {};
for (const row of tables) {
if (!schema[row.table_name]) schema[row.table_name] = [];
schema[row.table_name].push({
column: row.column_name,
type: row.data_type,
nullable: row.is_nullable === 'YES',
default: row.column_default,
});
}
return { contents: [{ uri: uri.href, mimeType: 'application/json', text: JSON.stringify(schema, null, 2) }] };
}
);
// Per-table schema as a template resource
server.resource(
'table-schema',
new ResourceTemplate('db://schema/tables/{tableName}', { list: undefined }),
{ name: 'Table Schema', mimeType: 'application/json' },
async (uri, { tableName }) => {
const cols = await readOnlySql`
SELECT column_name, data_type, is_nullable, column_default
FROM information_schema.columns
WHERE table_schema = 'public' AND table_name = ${tableName}
ORDER BY ordinal_position
`;
if (cols.length === 0) throw new Error(`Table not found: ${tableName}`);
return { contents: [{ uri: uri.href, mimeType: 'application/json', text: JSON.stringify(cols, null, 2) }] };
}
);
List and describe tools for exploration
Expose simple discovery tools so the LLM can explore the database structure before writing queries:
server.tool(
'list_tables',
'List all available tables with row counts',
{},
async () => {
const tables = await readOnlySql`
SELECT
t.table_name,
pg_size_pretty(pg_total_relation_size(quote_ident(t.table_name)::regclass)) AS size,
(SELECT reltuples::bigint FROM pg_class WHERE relname = t.table_name) AS approx_rows
FROM information_schema.tables t
WHERE t.table_schema = 'public' AND t.table_type = 'BASE TABLE'
ORDER BY t.table_name
`;
const text = tables.map(t => `${t.table_name.padEnd(40)} ${String(t.approx_rows ?? 0).padStart(10)} rows ${t.size}`).join('\n');
return { content: [{ type: 'text', text: text || '(no tables found)' }] };
}
);
server.tool(
'describe_table',
'Show column definitions and sample data for a table',
{ table_name: z.string().describe('Table name to describe'), sample_rows: z.number().int().min(0).max(10).default(3) },
async ({ table_name, sample_rows }) => {
// Validate table name exists to prevent schema disclosure via error messages
const exists = await readOnlySql`
SELECT 1 FROM information_schema.tables
WHERE table_schema = 'public' AND table_name = ${table_name}
`;
if (exists.length === 0) return { isError: true, content: [{ type: 'text', text: `Table not found: ${table_name}` }] };
const cols = await readOnlySql`
SELECT column_name, data_type, is_nullable FROM information_schema.columns
WHERE table_schema = 'public' AND table_name = ${table_name} ORDER BY ordinal_position
`;
const colText = cols.map(c => ` ${c.column_name}: ${c.data_type}${c.is_nullable === 'NO' ? ' NOT NULL' : ''}`).join('\n');
let sampleText = '';
if (sample_rows > 0) {
// Use sql.unsafe with a hardened table name (verified to exist above)
const safe_table = table_name.replace(/[^a-zA-Z0-9_]/g, '');
const sample = await readOnlySql.unsafe(`SELECT * FROM "${safe_table}" LIMIT ${sample_rows}`);
sampleText = '\n\nSample rows:\n' + JSON.stringify(sample, null, 2);
}
return { content: [{ type: 'text', text: `Table: ${table_name}\nColumns:\n${colText}${sampleText}` }] };
}
);
In describe_table, the table name is validated against information_schema.tables before use in sql.unsafe. This prevents SQL injection via the table name argument while still allowing dynamic table name interpolation (which parameterized queries don't support — parameter binding is for values, not identifiers).
Preventing expensive queries: EXPLAIN guard
An LLM querying an indexed column is fast. An LLM querying an unindexed column on a 50M-row table is a production incident. Add an optional cost check using EXPLAIN before running expensive-looking queries:
async function estimateQueryCost(sql: string): Promise<number> {
const plan = await readOnlySql.unsafe(`EXPLAIN (FORMAT JSON) ${sql}`);
const totalCost = plan[0]?.['QUERY PLAN']?.[0]?.Plan?.['Total Cost'] ?? 0;
return Number(totalCost);
}
// In query_database handler, before executing:
const cost = await estimateQueryCost(rawSql);
const MAX_QUERY_COST = 100_000; // PostgreSQL planner cost units — tune for your schema
if (cost > MAX_QUERY_COST) {
return {
isError: true,
content: [{ type: 'text', text: `Query estimated cost ${cost.toFixed(0)} exceeds limit ${MAX_QUERY_COST}. Add an index, a WHERE clause, or a LIMIT to reduce cost.` }]
};
}
PostgreSQL planner cost units don't directly map to wall-clock time, but they scale linearly with row scans — a cost of 100,000 typically indicates a full scan of a large table without an index. Tune MAX_QUERY_COST against your schema by running EXPLAIN on representative queries.
Monitoring database MCP servers
Database tools fail in distinct modes that are invisible to transport-level health checks. A database connection pool exhaustion makes new queries queue indefinitely while tools/list still responds normally. A read-only user password rotation causes all queries to fail with FATAL: password authentication failed while initialize succeeds. A replication lag that causes read replica staleness returns stale data without any error signal.
Wire a health endpoint that runs a trivial query (SELECT 1) on the same read-only connection pool your tools use. Return the pool's idleCount and waitingCount as health metadata. AliveMCP probes your MCP endpoint using the full protocol handshake, catching transport-level and handler-level failures before users encounter broken data queries in their AI workflows.
Further reading
- MCP server SQLite — embedded database for single-file MCP stores
- MCP server Prisma — type-safe queries with ORM
- MCP server Redis — caching and fast key-value MCP tools
- MCP resources API — expose schema as readable context resources
- MCP server connection pooling — managing database connections
- MCP server error handling — database errors as isError responses
- MCP server authentication — protecting database access
- AliveMCP — uptime monitoring for HTTP-deployed MCP servers