
MCP Server DSPy — typed signatures, ChainOfThought, and MCP tool adapters

DSPy treats LLM programs as typed modules — signatures define what goes in and what comes out, and optimizers tune prompts using examples rather than hand-written instructions. This creates two integration points with MCP: DSPy programs can call MCP tools as external capabilities (via a tool adapter), and MCP servers can use DSPy modules internally to implement intelligent tool handlers. This guide covers both patterns, plus the key monitoring concern: DSPy programs that depend on MCP tools have a distributed failure surface that requires health monitoring at the tool-call layer, not just the LLM layer.

TL;DR

To call MCP tools from a DSPy program, wrap each MCP tool call in a plain Python function (or a dspy.Tool object) and pass those functions as tools to a ReAct module. To use DSPy inside an MCP server (TypeScript), call the DSPy Python service via HTTP from within tool handlers, because DSPy has no TypeScript SDK. Monitor both the MCP server's health endpoint (for server availability) and the DSPy service's response latency (for LLM backend issues). AliveMCP handles the MCP server layer; instrument your DSPy service separately.

DSPy architecture and the two MCP integration patterns

DSPy is a Python framework for programming (not prompting) language models. Instead of writing prompt templates, you define typed Signature classes that declare inputs, outputs, and field descriptions. DSPy modules (ChainOfThought, ReAct, Predict) implement reasoning strategies using those signatures. Optimizers (BootstrapFewShot, MIPRO) tune the program against training data: they run it on the examples, keep the traces that pass your metric as few-shot demonstrations, and in some cases also rewrite the prompt instructions.

Two distinct patterns arise when combining DSPy with MCP:

Pattern	Who calls whom	Use case
DSPy → MCP	A DSPy ReAct agent uses MCP tools as its action space	Building agents in DSPy that need real-world capabilities (search, database, compute)
MCP → DSPy	An MCP server calls a DSPy program to implement intelligent tool logic	Adding reasoning and RAG to MCP tool responses; optimizable tool logic

The key difference: Pattern 1 is agent-building (DSPy orchestrates, MCP provides capabilities), Pattern 2 is capability-building (MCP exposes a tool, DSPy is the reasoning engine behind it).

Pattern 1: DSPy ReAct agent calling MCP tools

DSPy's ReAct module implements a Reason + Act loop — it generates thoughts, chooses actions (tool calls), observes results, and iterates. You provide tools as Python functions; DSPy generates the tool-calling prompts automatically.

import asyncio
import json
import os

import dspy
from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client

# Configure LLM backend
lm = dspy.LM('anthropic/claude-sonnet-4-6', api_key=os.environ['ANTHROPIC_API_KEY'])
dspy.configure(lm=lm)

# MCP tool adapter: wraps a single MCP tool call for DSPy
async def call_mcp_tool(session: ClientSession, tool_name: str, **kwargs) -> str:
    """Call an MCP tool and return its text result as a string."""
    result = await session.call_tool(tool_name, arguments=kwargs)
    # MCP returns a list of content blocks; join the text blocks
    texts = [block.text for block in result.content if hasattr(block, 'text')]
    return '\n'.join(texts)

# Create DSPy-compatible tool functions from MCP tools
def make_mcp_tool(session: ClientSession, tool_name: str, description: str):
    """Factory that creates a DSPy tool function wrapping an MCP tool.

    Must be called from inside the event loop that owns the MCP session.
    """
    loop = asyncio.get_running_loop()

    def tool_fn(**kwargs) -> str:
        # DSPy calls tools synchronously (here: from a worker thread).
        # Schedule the async MCP call on the session's loop and block for the result.
        future = asyncio.run_coroutine_threadsafe(
            call_mcp_tool(session, tool_name, **kwargs), loop
        )
        return future.result(timeout=60)  # illustrative per-call timeout

    tool_fn.__name__ = tool_name
    tool_fn.__doc__ = description
    return tool_fn

# DSPy signature for a research task
class ResearchTask(dspy.Signature):
    """Research a topic by searching documentation and retrieving relevant records."""
    question: str = dspy.InputField(desc="The research question to answer")
    context: str = dspy.InputField(desc="Background context about the domain", default="")
    answer: str = dspy.OutputField(desc="A comprehensive answer with specific examples")
    sources: list[str] = dspy.OutputField(desc="List of sources used to answer the question")

# Build and run the agent
async def run_dspy_mcp_agent(question: str):
    server_params = StdioServerParameters(
        command='node',
        args=['path/to/your-mcp-server/build/index.js']
    )

    async with stdio_client(server_params) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()

            # Discover available tools from the MCP server
            tools_response = await session.list_tools()

            # Convert MCP tools to DSPy tool functions. tool_fn takes **kwargs,
            # so the expected arguments are spelled out in the description.
            dspy_tools = [
                make_mcp_tool(
                    session,
                    tool.name,
                    f"{tool.description or tool.name}\n"
                    f"Arguments (JSON Schema): {json.dumps(tool.inputSchema)}",
                )
                for tool in tools_response.tools
            ]

            # ReAct agent with MCP tools as its action space
            agent = dspy.ReAct(ResearchTask, tools=dspy_tools, max_iters=5)

            # Run the synchronous agent in a worker thread so the event loop
            # stays free to service the MCP calls scheduled by the tools
            result = await asyncio.to_thread(agent, question=question)
            return {
                'answer': result.answer,
                'sources': result.sources
            }

The thread hand-off is necessary because DSPy calls tools synchronously, while the MCP Python SDK is async and its session is bound to the event loop that created it. Calling loop.run_until_complete() from a tool while that loop is already running raises RuntimeError, and a fresh event loop per call cannot drive a session owned by another loop. The adapter therefore runs the agent in a worker thread (asyncio.to_thread) and schedules each tool call back onto the session's loop with asyncio.run_coroutine_threadsafe. If the rest of your application is synchronous, the alternative is to run the async MCP session in a dedicated thread with a persistent event loop and bridge tool calls via concurrent.futures. Recent DSPy releases also document native async tool support (dspy.Tool.from_mcp_tool together with the agent's acall method); check the documentation for your installed version before relying on it.
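The dedicated-thread approach mentioned above can be sketched with the standard library alone. This is a minimal illustration, not the MCP SDK's own API: AsyncBridge and fake_mcp_call are hypothetical names, and fake_mcp_call stands in for an awaited session.call_tool().

```python
import asyncio
import threading


class AsyncBridge:
    """Owns one event loop on a background thread and runs coroutines on it."""

    def __init__(self) -> None:
        self._loop = asyncio.new_event_loop()
        self._thread = threading.Thread(target=self._loop.run_forever, daemon=True)
        self._thread.start()

    def call(self, coro, timeout: float = 30.0):
        """Submit a coroutine from synchronous code and block for its result."""
        # run_coroutine_threadsafe returns a concurrent.futures.Future
        future = asyncio.run_coroutine_threadsafe(coro, self._loop)
        return future.result(timeout=timeout)

    def close(self) -> None:
        """Stop the loop, wait for the thread to exit, then release the loop."""
        self._loop.call_soon_threadsafe(self._loop.stop)
        self._thread.join()
        self._loop.close()


async def fake_mcp_call(tool_name: str, **kwargs) -> str:
    """Hypothetical stand-in for awaiting session.call_tool()."""
    await asyncio.sleep(0.01)
    return f"{tool_name}({sorted(kwargs.items())})"


bridge = AsyncBridge()
first = bridge.call(fake_mcp_call("search_docs", query="uptime"))
second = bridge.call(fake_mcp_call("get_record", id=7))
bridge.close()
```

In a real adapter the MCP session would also have to be opened on the bridge's loop (for example inside a long-lived coroutine submitted once at startup), since the session can only be driven by the loop that created it.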

Pattern 2: DSPy modules inside MCP tool handlers

When an MCP server is written in TypeScript but needs complex reasoning (RAG, multi-step analysis, optimized classification), call a separate DSPy Python service over HTTP from within the MCP tool handler.

# dspy_service.py — FastAPI app wrapping a DSPy module
from fastapi import FastAPI, HTTPException
from pydantic import BaseModel
import dspy
import os

app = FastAPI()

# Configure once at startup
lm = dspy.LM('anthropic/claude-sonnet-4-6', api_key=os.environ['ANTHROPIC_API_KEY'])
dspy.configure(lm=lm)

# DSPy signature for document Q&A with citations
class DocumentQA(dspy.Signature):
    """Answer a question using retrieved document passages. Cite specific passages."""
    question: str = dspy.InputField()
    passages: list[str] = dspy.InputField(desc="Retrieved document passages to use as context")
    answer: str = dspy.OutputField(desc="Answer grounded in the provided passages")
    confidence: float = dspy.OutputField(desc="Confidence score 0.0-1.0")
    citations: list[int] = dspy.OutputField(desc="Indices of passages used (0-indexed)")

# Use ChainOfThought for step-by-step reasoning
qa_module = dspy.ChainOfThought(DocumentQA)

class QARequest(BaseModel):
    question: str
    passages: list[str]

class QAResponse(BaseModel):
    answer: str
    confidence: float
    citations: list[int]
    reasoning: str | None = None

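@app.post('/qa', response_model=QAResponse)
# Plain def: FastAPI runs it in a threadpool, so the blocking LLM call
# does not stall the event loop (and /health stays responsive)
def answer_question(req: QARequest):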
    if not req.passages:
        raise HTTPException(status_code=400, detail='No passages provided')

    result = qa_module(question=req.question, passages=req.passages)

    return QAResponse(
        answer=result.answer,
        confidence=float(result.confidence) if hasattr(result, 'confidence') else 0.8,
        citations=result.citations if isinstance(result.citations, list) else [],
        reasoning=result.get('reasoning', None)  # ChainOfThought adds a reasoning field
    )

@app.get('/health')
async def health():
    return {'status': 'ok', 'lm': lm.model}

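// TypeScript MCP server calling the DSPy service
// `server` (an McpServer instance) and `vectorStore` are assumed to be set up elsewhere
import { z } from 'zod';
import { McpError, ErrorCode } from '@modelcontextprotocol/sdk/types.js';
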
server.tool(
  'answer_from_documents',
  {
    question: z.string().min(1),
    document_ids: z.array(z.string()).min(1).max(10)
  },
  async ({ question, document_ids }) => {
    // 1. Retrieve passages from vector store
    const passages = await Promise.all(
      document_ids.map(id => vectorStore.getDocumentText(id))
    );

    // 2. Call DSPy service for reasoning
    const response = await fetch(`${process.env.DSPY_SERVICE_URL}/qa`, {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({ question, passages }),
      signal: AbortSignal.timeout(30000)  // 30s timeout for LLM calls
    });

    if (!response.ok) {
      throw new McpError(
        ErrorCode.InternalError,
        `DSPy service error: ${response.status}`
      );
    }

    const result = await response.json() as {
      answer: string;
      confidence: number;
      citations: number[];
      reasoning?: string;
    };

    return {
      content: [{
        type: 'text',
        text: JSON.stringify({
          answer: result.answer,
          confidence: result.confidence,
          cited_documents: result.citations.map(i => document_ids[i]),
          reasoning: result.reasoning
        }, null, 2)
      }]
    };
  }
);

DSPy optimizers and MCP tools: what gets optimized

DSPy's optimizers tune prompt instructions and few-shot examples — not the tool implementations themselves. When a ReAct agent uses MCP tools, the optimizer learns:

Which tools to call in which order for different query types
What arguments to pass to tools to get useful responses
How to interpret tool responses and synthesize final answers

import dspy
from dspy.teleprompt import BootstrapFewShot

# Training examples for the optimizer
trainset = [
    dspy.Example(
        question="What's the current status of MCP server prod-01?",
        answer="Server prod-01 is online with 99.9% uptime over the last 30 days."
    ).with_inputs("question"),
    dspy.Example(
        question="Which MCP servers had downtime this week?",
        answer="Servers dev-02 and staging-01 had brief outages on Tuesday."
    ).with_inputs("question"),
    # ... more examples
]

# Metric function: a placeholder that only checks the answer is non-trivial.
# It does not verify tool usage or correctness; replace it with a task-specific check.
def validate_response(example, prediction, trace=None):
    # Simple metric: answer is non-empty and longer than 10 characters
    return len(prediction.answer.strip()) > 10

# Bootstrap optimizer: generates few-shot examples by running the agent on trainset
optimizer = BootstrapFewShot(metric=validate_response, max_bootstrapped_demos=4)

# Run optimization (this makes real LLM calls — it costs money)
compiled_agent = optimizer.compile(agent, trainset=trainset)

# Save the optimized program
compiled_agent.save('optimized_mcp_agent.json')

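# Load in production: construct the same ReAct agent (same signature and tools) first,
# then load the saved state into it
agent.load('optimized_mcp_agent.json')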

The compiled program contains the optimized few-shot examples embedded in its configuration. Loading it with agent.load() applies those examples to every subsequent call without re-running optimization.
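The length-only metric in the example accepts almost any answer, so the optimizer has little signal about which traces are worth keeping as demonstrations. A slightly stricter metric might compare the prediction with the reference answer. The sketch below is one hedged possibility: keyword_overlap_metric, its threshold, and the SimpleNamespace stand-ins for dspy.Example and dspy.Prediction are all illustrative assumptions, not part of DSPy.

```python
from types import SimpleNamespace


def keyword_overlap_metric(example, prediction, trace=None, threshold: float = 0.5) -> bool:
    """Pass when the predicted answer shares enough words with the reference answer."""
    answer = (getattr(prediction, "answer", "") or "").strip().lower()
    if len(answer) <= 10:
        return False
    reference_words = {w.strip(".,?!") for w in example.answer.lower().split()}
    reference_words = {w for w in reference_words if len(w) > 3}  # skip short filler words
    if not reference_words:
        return True
    answer_words = {w.strip(".,?!") for w in answer.split()}
    overlap = len(reference_words & answer_words) / len(reference_words)
    return overlap >= threshold


# Stand-ins for dspy.Example and dspy.Prediction, used only to exercise the metric
example = SimpleNamespace(
    answer="Server prod-01 is online with 99.9% uptime over the last 30 days."
)
good = SimpleNamespace(answer="prod-01 is online, uptime 99.9% over the last 30 days")
bad = SimpleNamespace(answer="I could not find any information about that.")

good_passes = keyword_overlap_metric(example, good)
bad_passes = keyword_overlap_metric(example, bad)
```

The function keeps the (example, prediction, trace=None) shape used by validate_response, so it could be passed as the metric argument in the same way. Word overlap is still a crude proxy; an exact-match or model-graded check may suit your data better.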

Monitoring DSPy + MCP server combinations

When a DSPy program calls MCP tools, there are three failure surfaces to monitor:

Layer	What fails	How to detect
MCP server	Server process crashes, health endpoint unreachable	AliveMCP monitors `/health` endpoint — alerts on 5xx or timeout
DSPy service	Python service crashes, LLM backend unreachable, timeout	Monitor `/health` endpoint on the DSPy FastAPI service
LLM API	OpenAI/Anthropic API outage, rate limits, token budget exceeded	DSPy surfaces these as exceptions — log and alert on error rate

Add a health endpoint to your DSPy FastAPI service (the /health route in the example above) and monitor it with AliveMCP alongside the MCP server's own health endpoint. A DSPy service outage will cause all tool handlers that depend on it to return InternalError — without monitoring the DSPy service directly, these errors look identical to MCP server errors from the caller's perspective.

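# Enhanced /health endpoint for the DSPy service (replaces the basic /health route)
import time
from fastapi.responses import JSONResponse

# Separate LM handle with caching disabled, so every probe reaches the backend
# instead of being answered from DSPy's response cache
probe_lm = dspy.LM(
    'anthropic/claude-sonnet-4-6',
    api_key=os.environ['ANTHROPIC_API_KEY'],
    cache=False,
)

@app.get('/health')
def health():
    # Test the LLM backend with a minimal call (each probe is a billed request)
    start = time.time()
    lm_error = None
    lm_latency_ms = -1
    try:
        with dspy.context(lm=probe_lm):
            test_result = dspy.Predict('question -> answer')(question='What is 1+1?')
        lm_latency_ms = int((time.time() - start) * 1000)
        lm_ok = bool(test_result.answer)
        if not lm_ok:
            lm_error = 'empty response from LM'
    except Exception as e:
        lm_ok = False
        lm_error = str(e)

    if not lm_ok:
        return JSONResponse(
            status_code=503,
            content={'status': 'error', 'lm': lm.model, 'lm_error': lm_error}
        )

    return {'status': 'ok', 'lm': lm.model, 'lm_latency_ms': lm_latency_ms}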
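Because a failure in any of the three layers can surface to the caller as the same InternalError, it helps to combine the two health probes into a single verdict about which layer is most likely at fault. The following is a minimal sketch of that triage logic: ProbeResult and triage are hypothetical names, and the sketch assumes the DSPy service reports LLM problems as a 503 with an lm_error field, as in the endpoint above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ProbeResult:
    reachable: bool                    # did the endpoint answer at all?
    status_code: Optional[int] = None  # HTTP status, if it answered
    body: Optional[dict] = None        # parsed JSON body, if any


def triage(mcp: ProbeResult, dspy_svc: ProbeResult) -> str:
    """Name the most likely failing layer from two /health probe results."""
    if not mcp.reachable or (mcp.status_code or 0) >= 500:
        return "mcp_server"
    if not dspy_svc.reachable:
        return "dspy_service"
    if dspy_svc.status_code == 503 and (dspy_svc.body or {}).get("lm_error"):
        return "llm_api"  # the service is up but its LLM backend is failing
    if (dspy_svc.status_code or 0) >= 500:
        return "dspy_service"
    return "healthy"


ok = ProbeResult(reachable=True, status_code=200, body={"status": "ok"})
lm_down = ProbeResult(
    reachable=True, status_code=503, body={"status": "error", "lm_error": "rate limited"}
)
unreachable = ProbeResult(reachable=False)

verdicts = [triage(ok, ok), triage(ok, lm_down), triage(ok, unreachable), triage(unreachable, ok)]
```

How the probe results are collected (an external monitor, a cron job, or an internal poller) is left open here; the point is only that the two endpoints together disambiguate errors that look identical from the MCP client's side.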

Frequently asked questions

Does DSPy have a TypeScript SDK?

No — DSPy is Python-only as of mid-2026. For TypeScript MCP servers, the integration point is HTTP: run DSPy as a FastAPI service and call it from TypeScript tool handlers via fetch(). This is actually a clean architectural boundary — the DSPy service can be scaled and deployed independently from the MCP server. The trade-off is an extra network hop per tool call that requires the DSPy service; set realistic timeouts (15–30 seconds for LLM calls) and handle service unavailability with a fallback or clear error.

How does DSPy compare to LangChain for MCP tool calling?

LangChain's @tool decorator and create_react_agent serve a similar purpose to DSPy's ReAct — both create agents that call tools. The key difference: LangChain focuses on composing chains and tools; DSPy focuses on optimizing prompts. DSPy's optimizer can automatically improve how the agent decides which tools to call and how to interpret their results, without you writing few-shot examples by hand. For simple tool-calling agents, LangChain is easier to get started with. For agents where you have training data and want systematic prompt improvement, DSPy's optimization is a genuine advantage. See the MCP Server LangChain guide for the LangChain integration pattern.

Can DSPy's ChainOfThought be used to generate MCP tool parameters?

Yes, and this is one of the most powerful DSPy + MCP patterns. Instead of relying on ReAct's built-in tool selection, define a signature like class GenerateToolCall(dspy.Signature): user_request: str = dspy.InputField(); tool_name: str = dspy.OutputField(); tool_parameters: dict = dspy.OutputField(). Use dspy.ChainOfThought(GenerateToolCall) to generate the tool call with step-by-step reasoning, then validate the output against the tool's input schema (the JSON Schema the MCP server advertises, which a TypeScript server typically derives from its Zod definitions) before executing. The optimizer can learn which tools and parameter patterns work for different request types from examples.
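The validation step can be illustrated without any LLM call. The sketch below checks generated parameters against a small subset of JSON Schema (required keys, unknown keys, primitive types) using only the standard library. SEARCH_SCHEMA and validate_tool_call are hypothetical; a full validator such as the jsonschema package would cover far more of the specification.

```python
from typing import Any

# Hypothetical input schema, shaped like the inputSchema an MCP server advertises
SEARCH_SCHEMA = {
    "type": "object",
    "properties": {
        "query": {"type": "string"},
        "limit": {"type": "integer"},
    },
    "required": ["query"],
}

_JSON_TYPES = {
    "string": str, "integer": int, "number": (int, float),
    "boolean": bool, "object": dict, "array": list,
}


def validate_tool_call(params: dict[str, Any], schema: dict[str, Any]) -> list[str]:
    """Return a list of problems; an empty list means the call may be executed."""
    problems = []
    properties = schema.get("properties", {})
    for name in schema.get("required", []):
        if name not in params:
            problems.append(f"missing required parameter: {name}")
    for name, value in params.items():
        if name not in properties:
            problems.append(f"unknown parameter: {name}")
            continue
        expected = _JSON_TYPES.get(properties[name].get("type"))
        # bool is a subclass of int in Python, so reject it for non-boolean fields
        is_stray_bool = isinstance(value, bool) and expected is not bool
        if expected and (not isinstance(value, expected) or is_stray_bool):
            problems.append(f"wrong type for {name}: expected {properties[name]['type']}")
    return problems


valid = validate_tool_call({"query": "uptime", "limit": 5}, SEARCH_SCHEMA)
invalid = validate_tool_call({"limit": "5"}, SEARCH_SCHEMA)
```

When validation fails, one option is to feed the problem list back to the module as an extra input and retry once, rather than executing a malformed call.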

How do I handle DSPy's LLM calls timing out in an MCP tool handler?

Set an explicit timeout on the HTTP call from your TypeScript MCP handler to the DSPy service: fetch(url, { signal: AbortSignal.timeout(30000) }). When the timeout fires, fetch rejects with an error whose name is TimeoutError; catch it and throw an McpError(ErrorCode.InternalError, 'Reasoning service timed out, please retry'). In the DSPy FastAPI service, set a shorter timeout on the LLM call. dspy.LM forwards extra keyword arguments to the underlying completion request, so a timeout can usually be set at construction: dspy.configure(lm=dspy.LM('anthropic/...', timeout=25)). Confirm the parameter against your installed DSPy version. Having the DSPy service time out at 25 seconds and the MCP handler at 30 seconds gives the DSPy service time to fail cleanly before the MCP handler gives up.
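The reason for keeping the inner timeout shorter than the outer one can be seen in a small simulation. This sketch uses asyncio.wait_for with scaled-down values in place of the 25 s and 30 s budgets; slow_llm_call, dspy_service, and mcp_handler are hypothetical stand-ins for the real layers, not actual service code.

```python
import asyncio


async def slow_llm_call() -> str:
    """Stand-in for a DSPy module call that hangs on a slow LLM backend."""
    await asyncio.sleep(1.0)
    return "answer"


async def dspy_service(inner_timeout: float) -> dict:
    """Inner layer: returns a structured error when its own timeout fires."""
    try:
        answer = await asyncio.wait_for(slow_llm_call(), timeout=inner_timeout)
        return {"status": 200, "answer": answer}
    except asyncio.TimeoutError:
        return {"status": 504, "error": "LLM call timed out"}


async def mcp_handler(inner_timeout: float, outer_timeout: float) -> str:
    """Outer layer: gives up only if the inner layer has not answered in time."""
    try:
        response = await asyncio.wait_for(dspy_service(inner_timeout), timeout=outer_timeout)
    except asyncio.TimeoutError:
        return "handler gave up: no response from reasoning service"
    if response["status"] != 200:
        return f"reasoning service reported: {response['error']}"
    return response["answer"]


# Inner timeout shorter than outer: the caller receives a specific error
inner_first = asyncio.run(mcp_handler(inner_timeout=0.05, outer_timeout=0.2))
# Inner timeout longer than outer: the caller only learns that nothing came back
outer_first = asyncio.run(mcp_handler(inner_timeout=0.5, outer_timeout=0.05))
```

With the budgets ordered correctly, the handler can report what actually went wrong; with them reversed, the specific cause is lost and the inner call is simply cancelled.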