
MCP server test coverage

Test coverage measures which lines, branches, and functions in your source code were executed during the test suite. It does not measure whether the tests are meaningful (100% coverage with useless assertions is possible), but it does expose gaps: a branch in a tool handler that no test has ever executed is behavior nobody has verified. For MCP servers, the most valuable coverage target is branch coverage on tool handler logic: every early return, every error path, every conditional transformation. Startup and shutdown sequences are harder to cover fully and warrant lower thresholds. This guide covers coverage configuration with @vitest/coverage-v8, how to read the output, where to set thresholds, and what coverage cannot tell you.

TL;DR

Install @vitest/coverage-v8. Add coverage.include: ['src/**/*.ts'] in vitest.config.ts to surface files with zero tests. As a starting point for MCP servers, set thresholds at 80% lines / 70% branches. Run vitest run --coverage in CI and upload the coverage/ directory (which contains lcov.info) as an artifact. Coverage above 90% on tool handler files is achievable and worth targeting.

Coverage providers: V8 vs. Istanbul

Vitest supports two coverage providers. Both report line, branch, function, and statement coverage, but differ in how they instrument code.

Provider	Package	How it works	Accuracy	Speed
V8 (C8)	`@vitest/coverage-v8`	Uses Node.js's built-in V8 coverage — no source transformation	Very accurate for TypeScript compiled to JS; may miss some branches in type-narrowing code	Fast — no transform step
Istanbul	`@vitest/coverage-istanbul`	Instruments the source with counters via a Babel transform	More accurate for complex conditional logic	Slower (transforms every file)

For MCP servers written in TypeScript, @vitest/coverage-v8 is the right choice. It requires no extra configuration, works with the same esbuild transform that Vitest uses for tests, and is consistently faster than Istanbul. Use Istanbul if you encounter V8 inaccuracies in complex conditional logic.

npm install --save-dev @vitest/coverage-v8

Configuration

// vitest.config.ts
import { defineConfig } from 'vitest/config';

export default defineConfig({
  test: {
    environment: 'node',
    coverage: {
      provider: 'v8',

      // Report formats: text (terminal), html (browser), lcov (CI upload),
      // json-summary (writes coverage/coverage-summary.json for PR comment actions)
      reporter: ['text', 'html', 'lcov', 'json-summary'],

      // Include ALL source files — without this, only files imported by tests
      // appear in the report; files with no tests at all are hidden
      include: ['src/**/*.ts'],

      // Exclude test files, type declaration files, and generated code
      exclude: [
        'src/**/*.test.ts',
        'src/**/*.spec.ts',
        'src/**/*.d.ts',
        'src/generated/**',
      ],

      // Fail the run (and therefore CI) if coverage drops below these thresholds.
      // Add perFile: true here to check each file individually instead of the aggregate.
      thresholds: {
        lines: 80,
        branches: 70,
        functions: 80,
        statements: 80,
        // Optionally enforce tighter coverage on the most important files:
        // 'src/tools/**': { branches: 90, lines: 90 },
      },

      // Include all matched files in the report even if no tests import them
      all: true,
    },
  },
});

The most important settings are the include patterns and all: true. Without them, a file that no test ever imports is left out of the report entirely, as if it did not exist. A file at 0% coverage is worse than a file at 50% coverage, and hiding it is worse than showing it. The handling of all has changed between Vitest major versions, and newer releases rely on include alone, so check the coverage documentation for the version you run.

Reading the coverage output

Running vitest run --coverage prints a table to the terminal:

----------|---------|----------|---------|---------|
File      | % Stmts | % Branch | % Funcs | % Lines |
----------|---------|----------|---------|---------|
src/      |   87.50 |    75.00 |   88.89 |   87.50 |
 server.ts|   91.30 |    83.33 |  100.00 |   91.30 |
 tools/   |         |          |         |         |
  weather |  100.00 |   100.00 |  100.00 |  100.00 |
  users.ts|   80.00 |    60.00 |   75.00 |   80.00 |
 db.ts    |   70.00 |    50.00 |   66.67 |   70.00 |
----------|---------|----------|---------|---------|

The columns to focus on for MCP servers:

% Branch — most important for tool handlers. A branch is any conditional: if/else, ternary, ??, optional chaining. Uncovered branches are untested behavior that can fail silently in production.
% Lines — a proxy for overall coverage. Low line coverage in a file usually means an entire code path is untested.
% Funcs — if a function is at 0%, no test ever calls it. May indicate dead code or a critical path with no test.

The HTML report (coverage/index.html) shows which specific lines and branches are uncovered. Open it in a browser to find the untested branches behind the 60% figure for users.ts.
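To make the branch column concrete, here is a minimal sketch of a tool-handler helper with three conditionals. The function name, record shape, and result type are hypothetical and not part of any MCP SDK; they only illustrate what a coverage tool counts as a branch.

```typescript
// Hypothetical helper used by a tool handler; names and shapes are illustrative only.
interface UserRecord {
  name: string;
  email?: string | null;
}

type ToolResult = { isError: boolean; text: string };

function describeUser(id: string, users: Map<string, UserRecord>): ToolResult {
  // Branch 1: early return on invalid input
  if (id.trim() === '') {
    return { isError: true, text: 'id must not be empty' };
  }

  const user = users.get(id);

  // Branch 2: error path when the lookup misses
  if (!user) {
    return { isError: true, text: `no user with id ${id}` };
  }

  // Branch 3: the ?? fallback only runs when email is null or undefined
  const email = user.email ?? 'no email on file';
  return { isError: false, text: `${user.name} <${email}>` };
}

// One call per path. A suite that only made the third call would still
// execute most lines while leaving several branches uncovered.
const users = new Map<string, UserRecord>([
  ['1', { name: 'Ada', email: 'ada@example.com' }],
  ['2', { name: 'Grace' }],
]);

console.log(describeUser('', users));  // early return
console.log(describeUser('9', users)); // lookup miss
console.log(describeUser('1', users)); // email present
console.log(describeUser('2', users)); // ?? fallback
```

This is why line coverage alone can look healthy for a handler whose error paths have never run: the happy path touches most lines, but each conditional has a side that was never taken.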

Coverage targets by file type

Not all MCP server code is equally testable. Setting a single global threshold at 90% causes frustration when startup and shutdown code — which requires stopping real servers and simulating signals — drags down the aggregate. Differentiate thresholds by file type.

Code area	Recommended branch coverage target	Why
Tool handler logic (`src/tools/`)	90%+	Every conditional in a tool handler is a user-facing behavior path; all should be tested
Input validation (`src/validation/`)	90%+	Validation branches define what errors users see; cover all error cases
Database helpers (`src/db/`)	70–80%	Some paths only trigger on DB errors that require real infrastructure to reproduce
Server setup (`src/server.ts`)	60–70%	Startup errors and shutdown drain are hard to test in a unit context
Entry point (`src/index.ts`)	20–40%	The top-level `main()` function that binds ports and starts the server is integration-tested, not unit-tested

Vitest supports per-file or per-directory thresholds using glob patterns in vitest.config.ts. Use this to enforce higher coverage on your tool handler directory without requiring the same from startup boilerplate.
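A sketch of how the targets in the table might translate into configuration, assuming the src/tools/ and src/validation/ layout used above. Only the threshold section is shown; the numbers are the starting points from this guide, not values Vitest prescribes.

```typescript
// vitest.config.ts (threshold section only; other coverage options omitted)
import { defineConfig } from 'vitest/config';

export default defineConfig({
  test: {
    coverage: {
      provider: 'v8',
      include: ['src/**/*.ts'],
      thresholds: {
        // Global floor, kept modest so startup code does not fail the build
        lines: 80,
        branches: 70,
        functions: 80,
        statements: 80,

        // Stricter requirements for handler and validation code
        'src/tools/**': { branches: 90, lines: 90 },
        'src/validation/**': { branches: 90 },
      },
    },
  },
});
```

As the Vitest documentation describes it, files matched by a glob still count toward the global numbers, so the glob entries tighten requirements for specific directories rather than exempting anything. Keep the global floor low enough to absorb files such as src/index.ts.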

Schema snapshot testing

Coverage metrics don't catch a different category of regression: unintentional schema changes. If you rename a tool or add a required parameter, existing LLM integrations break, yet no test fails and coverage stays the same. Snapshot testing fills this gap.

// src/tools.snapshot.test.ts
import { describe, it, expect, beforeEach, afterEach } from 'vitest';
import { Client } from '@modelcontextprotocol/sdk/client/index.js';
import { InMemoryTransport } from '@modelcontextprotocol/sdk/inMemory.js';
import { createServer } from './server.js';

// Placeholder test doubles for whatever createServer expects (database, HTTP clients, ...).
// Replace with real fakes for your project.
const fakeDeps = {} as Parameters<typeof createServer>[0];

describe('tool schema snapshot', () => {
  let client: Client;

  beforeEach(async () => {
    // Connect server and client through an in-process transport pair
    const [serverTransport, clientTransport] = InMemoryTransport.createLinkedPair();
    await createServer(fakeDeps).connect(serverTransport);
    client = new Client({ name: 'snapshot-client', version: '1.0.0' }, { capabilities: {} });
    await client.connect(clientTransport);
  });

  afterEach(() => client.close());

  it('tool schemas match the committed snapshot', async () => {
    const { tools } = await client.listTools();
    // Copy before sorting (sort mutates in place); sorting by name keeps the
    // snapshot stable regardless of tool registration order
    const sorted = [...tools].sort((a, b) => a.name.localeCompare(b.name));
    expect(sorted).toMatchSnapshot();
  });
});

The first run creates a snapshot file. Subsequent runs compare against it. When you intentionally change a schema, run vitest run --update (or -u) and commit the updated snapshot. An unintentional change fails the test.
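One optional refinement, sketched here under assumptions: reduce each tool to the fields that make up its contract before snapshotting, so the snapshot only changes when the name, description, or input schema changes. ToolLike is a simplified stand-in for the SDK's tool type, and toContract is a hypothetical helper, not an SDK function.

```typescript
// Simplified stand-in for the tool objects returned by listTools().
interface ToolLike {
  name: string;
  description?: string;
  inputSchema: unknown;
  [extra: string]: unknown;
}

// The fields that callers depend on.
interface ToolContract {
  name: string;
  description: string | undefined;
  inputSchema: unknown;
}

function toContract(tools: readonly ToolLike[]): ToolContract[] {
  return tools
    // map() builds a new array, so the input is never mutated
    .map(({ name, description, inputSchema }) => ({ name, description, inputSchema }))
    // Plain string comparison avoids locale-dependent ordering
    .sort((a, b) => (a.name < b.name ? -1 : a.name > b.name ? 1 : 0));
}
```

In the test above this would be used as expect(toContract(tools)).toMatchSnapshot(). Whether to snapshot the full tool object or only the contract fields is a judgment call: the narrower snapshot is quieter, but it will not flag changes to fields it leaves out.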

Coverage in CI

# .github/workflows/ci.yml
- name: Run tests with coverage
  run: npx vitest run --coverage

- name: Upload HTML coverage report
  uses: actions/upload-artifact@v4
  if: always()
  with:
    name: coverage-${{ github.sha }}
    path: coverage/
    retention-days: 30

# Optional: post a coverage summary comment on the PR.
# Needs 'json-summary' in coverage.reporter and pull-requests: write permission for the job.
- name: Coverage comment on PR
  uses: davelosert/vitest-coverage-report-action@v2
  if: github.event_name == 'pull_request'
  with:
    json-summary-path: coverage/coverage-summary.json

The vitest-coverage-report-action posts a coverage diff comment on pull requests, showing which files gained or lost coverage. This is more actionable than a single threshold gate — a PR that drops coverage from 85% to 84% may be acceptable, while one that drops a single file from 100% to 60% warrants review.

What coverage cannot tell you

High coverage does not mean the server works in production. Specific failure modes that tests with 100% coverage miss:

Database migration failures — tests use an in-memory database with the current schema; a migration script that fails only with real PostgreSQL is invisible to coverage
Network failures — InMemoryTransport never loses a message; a real SSE connection can disconnect mid-response
Environment configuration errors — missing environment variables that would crash the server in production are not exercised by tests that pass fake deps
Protocol-level health — coverage says your code ran; AliveMCP says your deployed server actually responds to the MCP initialize request from the network
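The environment configuration gap can be narrowed, though not closed, by moving environment parsing into a function that takes the environment as an argument and fails loudly. The sketch below is one possible shape; the variable names DATABASE_URL and PORT and the ServerConfig type are assumptions for illustration.

```typescript
// Hypothetical configuration shape; the variable names are assumptions.
interface ServerConfig {
  databaseUrl: string;
  port: number;
}

function loadConfig(env: Record<string, string | undefined>): ServerConfig {
  // Collect every missing variable so the error names all of them at once
  const missing = ['DATABASE_URL', 'PORT'].filter((key) => !env[key]);
  if (missing.length > 0) {
    throw new Error(`Missing required environment variables: ${missing.join(', ')}`);
  }

  const port = Number(env['PORT']);
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new Error(`PORT must be an integer between 1 and 65535, got "${env['PORT']}"`);
  }

  return { databaseUrl: env['DATABASE_URL'] as string, port };
}
```

Because the environment is passed in, unit tests can call loadConfig with plain objects to cover the missing-variable and invalid-port branches, while the entry point calls loadConfig(process.env) once at startup. This verifies the parsing logic only; it cannot confirm that the deployed environment actually sets the variables.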

Coverage is a necessary but not sufficient condition for a reliable MCP server. Combine it with integration tests that use real infrastructure and AliveMCP for continuous production monitoring.