Guide · Multi-Tenant SaaS
Billing Integration for MCP Servers — Stripe, usage-based pricing, and quota webhooks
An MCP server that provides valuable tools to AI agents has two commercial models: flat-rate subscription (pay per seat or per server) or usage-based billing (pay per tool call or per compute unit). Usage-based billing aligns cost with value delivered — agents that use more tools generate more revenue — but requires tight integration between the MCP server's metering layer and the billing system. This guide covers Stripe Billing integration for metered MCP services: creating products and prices for per-tool-call billing, reporting usage from the MCP server, managing subscription lifecycle via webhooks, enforcing plan limits at the tool handler level, and giving tenants access to their own billing portal.
TL;DR
Create a Stripe metered price (billing scheme per_unit, aggregate usage sum), store the subscriptionItem.id per tenant, and call stripe.subscriptionItems.createUsageRecord in batches from a background job. Handle customer.subscription.updated and customer.subscription.deleted webhooks to sync plan changes to your database immediately. Enforce plan limits at tool call time using a Redis counter — never rely on Stripe's API for quota enforcement (it's too slow for the hot path).
Stripe product and price setup
Stripe Billing models metered usage as a subscription to a metered price. The price defines the unit cost; you report quantities; Stripe invoices the customer at the end of the billing period.
// setup-billing.ts: run once to create your Stripe products and prices
import Stripe from 'stripe';

const stripe = new Stripe(process.env.STRIPE_SECRET_KEY!);

// Create the product (your MCP SaaS offering)
const product = await stripe.products.create({
  name: 'MCP Server Pro',
  description: 'Managed MCP server with uptime monitoring and per-tool metering',
});

// Free tier: $0 flat rate, no metering; the call limit lives in metadata
const freePlan = await stripe.prices.create({
  product: product.id,
  currency: 'usd',
  unit_amount: 0,
  recurring: { interval: 'month' },
  nickname: 'Free',
  metadata: { plan: 'free', tool_call_limit: '100' },
});

// Pro tier: flat monthly base + metered overage
const proPlanBase = await stripe.prices.create({
  product: product.id,
  currency: 'usd',
  unit_amount: 2900, // $29.00/month base
  recurring: { interval: 'month' },
  nickname: 'Pro Base',
  metadata: { plan: 'pro' },
});

// Metered overage price: $0.001 per tool call.
// Stripe bills every unit reported against this price, so the usage reporter
// must report only the calls above the included 10,000.
const proPlanOverage = await stripe.prices.create({
  product: product.id,
  currency: 'usd',
  billing_scheme: 'per_unit',
  unit_amount_decimal: '0.1', // $0.001 = 0.1 cents per unit
  recurring: {
    interval: 'month',
    usage_type: 'metered',
    aggregate_usage: 'sum',
  },
  nickname: 'Pro Overage (per tool call)',
  metadata: { plan: 'pro', type: 'overage' },
});
Store the price IDs in your application config or environment variables. When a tenant subscribes, create a Stripe Subscription with both prices (base + overage) and store the resulting subscriptionItem.id for the metered price — this is what you reference when reporting usage.
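To make the pricing concrete, here is a small worked calculation of what a Pro invoice should come to under the prices defined above. The helper name `estimateProInvoiceCents` and the constants are illustrative assumptions, not part of Stripe's API; Stripe performs the real calculation (including rounding) when it generates the invoice.

```typescript
// Illustrative constants taken from the price setup in this guide.
const BASE_CENTS = 2900;          // $29.00 monthly base
const INCLUDED_CALLS = 10_000;    // calls covered by the base fee
const OVERAGE_TENTHS_OF_CENT = 1; // $0.001 = 0.1 cent per call

// Hypothetical helper: estimate a Pro invoice (in cents) for one month's calls.
// The result can be fractional; Stripe applies its own rounding on the invoice.
function estimateProInvoiceCents(totalCalls: number): number {
  const overageCalls = Math.max(0, totalCalls - INCLUDED_CALLS);
  // Work in tenths of a cent so the multiplication stays in integers.
  const overageTenths = overageCalls * OVERAGE_TENTHS_OF_CENT;
  return BASE_CENTS + overageTenths / 10;
}

console.log(estimateProInvoiceCents(8_000));  // 2900 (base only)
console.log(estimateProInvoiceCents(25_000)); // 4400 (base + 15,000 overage calls)
```

A tenant making 25,000 calls pays the $29.00 base plus 15,000 × $0.001 = $15.00 in overage, for $44.00 in total.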
Subscription creation and tenant linking
// When a tenant completes checkout, create their subscription.
// `db` is assumed to be your application's PostgreSQL client (for example a pg Pool).
async function createTenantSubscription(
  tenantId: string,
  stripeCustomerId: string,
  planName: 'free' | 'pro',
): Promise<void> {
  const subscription = await stripe.subscriptions.create({
    customer: stripeCustomerId,
    items: planName === 'pro'
      ? [
          { price: process.env.STRIPE_PRO_BASE_PRICE_ID! },
          { price: process.env.STRIPE_PRO_OVERAGE_PRICE_ID! },
        ]
      : [{ price: process.env.STRIPE_FREE_PRICE_ID! }],
    metadata: { tenant_id: tenantId },
    trial_period_days: 14,
  });

  // Find the metered subscription item ID (the overage price).
  // Free subscriptions have no metered item, so this is undefined for them.
  const meteredItem = subscription.items.data.find(
    item => item.price.id === process.env.STRIPE_PRO_OVERAGE_PRICE_ID
  );

  // Store in database for usage reporting
  await db.query(`
    UPDATE tenants SET
      stripe_customer_id = $2,
      stripe_subscription_id = $3,
      stripe_subscription_item_id = $4,
      plan = $5,
      plan_updated_at = NOW()
    WHERE id = $1
  `, [
    tenantId,
    stripeCustomerId,
    subscription.id,
    meteredItem?.id ?? null,
    planName,
  ]);
}
Reporting usage to Stripe
Usage records must be reported before Stripe generates the invoice at the end of the billing period. Report in batches from a background job rather than on every tool call:
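// usage-reporter.ts: runs as a background job every 5 minutes
import Stripe from 'stripe';

const stripe = new Stripe(process.env.STRIPE_SECRET_KEY!);

export async function reportPendingUsage(): Promise<void> {
  // Fix the batch boundary on the database clock. Events inserted while the
  // job runs fall after the cutoff and are picked up by the next run instead
  // of being marked as reported without ever being counted.
  const cutoffResult = await db.query<{ cutoff: Date }>('SELECT NOW() AS cutoff');
  const cutoff = cutoffResult.rows[0].cutoff;

  // Get all usage events not yet reported to Stripe
  const events = await db.query<{
    tenant_id: string;
    stripe_subscription_item_id: string;
    total_calls: number;
    period_start: number;
  }>(`
    SELECT
      ue.tenant_id,
      t.stripe_subscription_item_id,
      COUNT(*)::INTEGER AS total_calls, -- cast: COUNT returns bigint, which pg hands back as a string
      EXTRACT(EPOCH FROM MIN(ue.created_at))::INTEGER AS period_start
    FROM usage_events ue
    JOIN tenants t ON t.id = ue.tenant_id
    WHERE ue.reported_to_stripe_at IS NULL
      AND ue.created_at <= $1
      AND t.stripe_subscription_item_id IS NOT NULL
      AND t.plan != 'free'
    GROUP BY ue.tenant_id, t.stripe_subscription_item_id
  `, [cutoff]);

  for (const event of events.rows) {
    try {
      await stripe.subscriptionItems.createUsageRecord(
        event.stripe_subscription_item_id,
        {
          quantity: event.total_calls,
          timestamp: event.period_start,
          action: 'increment',
        }
      );

      // Mark exactly the events counted above as reported
      await db.query(`
        UPDATE usage_events
        SET reported_to_stripe_at = NOW()
        WHERE tenant_id = $1
          AND reported_to_stripe_at IS NULL
          AND created_at <= $2
      `, [event.tenant_id, cutoff]);

      // Record the success so the /health check can see the reporter is alive
      await redis.set('billing:last_stripe_success', Date.now().toString());
    } catch (err) {
      console.error(`Failed to report usage for tenant ${event.tenant_id}`, err);
      // Leave unreported; the next job run will retry.
      // Caveat: if Stripe accepted the record but the UPDATE failed, the retry
      // reports these events a second time, so alert on errors at this point.
    }
  }
}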
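The overage price is described as applying only above the included 10,000 calls, but Stripe bills every unit reported against a metered price, so the reporter has to subtract the included allowance itself. A minimal sketch of that calculation follows, assuming you track how many calls the tenant had already made in the current period before the batch; `billableOverage` is a hypothetical helper, not a Stripe function.

```typescript
// Hypothetical helper: how many calls in this batch are billable overage,
// given the tenant's call count for the period before the batch started.
function billableOverage(
  callsBeforeBatch: number,
  batchCalls: number,
  includedCalls: number,
): number {
  const overageBefore = Math.max(0, callsBeforeBatch - includedCalls);
  const overageAfter = Math.max(0, callsBeforeBatch + batchCalls - includedCalls);
  // The difference is the part of this batch that crossed or sits above the allowance.
  return overageAfter - overageBefore;
}

console.log(billableOverage(0, 500, 10_000));      // 0   (still inside the allowance)
console.log(billableOverage(9_900, 300, 10_000));  // 200 (batch straddles the limit)
console.log(billableOverage(12_000, 300, 10_000)); // 300 (entirely overage)
```

The returned value would be passed as `quantity` in place of the raw batch count, and batches that come out at 0 can be marked as reported without calling Stripe.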
Subscription lifecycle webhooks
Stripe sends webhooks when subscription state changes. Your MCP server must handle these to keep plan limits in sync — a tenant who cancels their Pro subscription should have their quota downgraded within seconds, not at the next page load:
// webhook.ts: subscription lifecycle handlers
const STRIPE_EVENTS_TO_HANDLE = new Set([
  'customer.subscription.created',
  'customer.subscription.updated',
  'customer.subscription.deleted',
  'invoice.payment_failed',
  'invoice.payment_succeeded',
]);

async function handleSubscriptionUpdated(subscription: Stripe.Subscription): Promise<void> {
  const tenantId = subscription.metadata.tenant_id;
  if (!tenantId) return;

  const plan = getPlanFromSubscription(subscription);
  const status = subscription.status; // active, past_due, canceled, paused, etc.

  await db.query(`
    UPDATE tenants SET
      plan = $2,
      subscription_status = $3,
      plan_updated_at = NOW()
    WHERE id = $1
  `, [tenantId, plan, status]);

  // Invalidate plan cache so the metering layer picks up new limits immediately
  await redis.del(`tenant_plan:${tenantId}`);
}

async function handlePaymentFailed(invoice: Stripe.Invoice): Promise<void> {
  // invoice.customer is an ID string unless the customer object was expanded
  const customerId = typeof invoice.customer === 'string'
    ? invoice.customer
    : invoice.customer?.id;
  if (!customerId) return;

  // Downgrade to free tier or set status = 'past_due' to soft-block premium features
  const result = await db.query<{ id: string }>(`
    UPDATE tenants SET subscription_status = 'past_due'
    WHERE stripe_customer_id = $1
    RETURNING id
  `, [customerId]);

  // Invalidate the cached plan here too, so the soft block takes effect immediately
  for (const row of result.rows) {
    await redis.del(`tenant_plan:${row.id}`);
  }
}

function getPlanFromSubscription(subscription: Stripe.Subscription): string {
  // A canceled or expired subscription still lists its old prices,
  // so check the status before reading price metadata
  if (subscription.status === 'canceled' || subscription.status === 'incomplete_expired') {
    return 'free';
  }
  // Determine plan from active price metadata
  for (const item of subscription.items.data) {
    const plan = item.price.metadata?.plan;
    if (plan) return plan;
  }
  return 'free';
}
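The handlers above still need an entry point that routes each incoming event and tolerates redelivery, since Stripe can deliver the same event more than once. The sketch below shows one way to do that. It assumes the event has already been verified (in a real endpoint, with `stripe.webhooks.constructEvent` on the raw request body), uses a simplified `BillingEvent` shape instead of `Stripe.Event`, and keeps processed IDs in a hypothetical in-memory set; production code would store them in a database table with a unique constraint and would await async handlers.

```typescript
// Simplified event shape for this sketch; the real type is Stripe.Event.
interface BillingEvent {
  id: string;
  type: string;
  data: { object: unknown };
}

type Handler = (object: unknown) => void;
type DispatchResult = 'handled' | 'duplicate' | 'ignored';

// Hypothetical in-memory dedupe store (lost on restart; use a DB table in production).
const processedEventIds = new Set<string>();

function dispatchBillingEvent(
  event: BillingEvent,
  handlers: Map<string, Handler>,
): DispatchResult {
  // Redelivered event: acknowledge it without repeating the side effects.
  if (processedEventIds.has(event.id)) return 'duplicate';

  // Event type we did not subscribe to handling: acknowledge and move on.
  const handler = handlers.get(event.type);
  if (!handler) return 'ignored';

  handler(event.data.object);
  // Mark as processed only after the handler finished without throwing,
  // so a failed attempt is retried on the next delivery.
  processedEventIds.add(event.id);
  return 'handled';
}

// Usage: count how often the subscription handler actually runs.
let runs = 0;
const handlers = new Map<string, Handler>([
  ['customer.subscription.updated', () => { runs += 1; }],
]);
const evt: BillingEvent = { id: 'evt_1', type: 'customer.subscription.updated', data: { object: {} } };
console.log(dispatchBillingEvent(evt, handlers)); // handled
console.log(dispatchBillingEvent(evt, handlers)); // duplicate
```

Returning a 2xx response for duplicate and ignored events keeps Stripe from retrying them, while letting a handler error propagate (and responding with a 5xx) asks Stripe to redeliver.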
Customer billing portal
Tenants need self-service access to their invoices, payment methods, and subscription management. Stripe's Customer Portal handles all of this — you redirect the tenant to a Stripe-hosted URL, and Stripe handles the UI:
// billing-portal.ts: API endpoint to redirect tenant to Stripe portal
app.post('/api/billing/portal', requireAuth, async (req, res) => {
  const tenant = req.tenant;
  if (!tenant.stripe_customer_id) {
    return res.status(400).json({ error: 'No billing account found' });
  }

  const session = await stripe.billingPortal.sessions.create({
    customer: tenant.stripe_customer_id,
    return_url: `${process.env.APP_URL}/settings/billing`,
  });

  res.json({ url: session.url });
});

// Expose current usage to the tenant dashboard
app.get('/api/billing/usage', requireAuth, async (req, res) => {
  const tenant = req.tenant;

  // Current period usage from local database (authoritative for real-time display).
  // The period bounds come from NOW() rather than from the events, so they are
  // still populated when the tenant has made no calls this month. This assumes
  // calendar-month periods; a Stripe billing period can start on any day.
  const usage = await db.query(`
    SELECT
      COUNT(*) AS total_calls,
      DATE_TRUNC('month', NOW()) AS period_start,
      DATE_TRUNC('month', NOW()) + INTERVAL '1 month' AS period_end
    FROM usage_events
    WHERE tenant_id = $1
      AND created_at >= DATE_TRUNC('month', NOW())
  `, [tenant.id]);

  const plan = await getTenantPlan(tenant.id);
  const limit = PLAN_LIMITS[plan] ?? 100;

  res.json({
    plan,
    period_start: usage.rows[0].period_start,
    period_end: usage.rows[0].period_end,
    calls_used: parseInt(usage.rows[0].total_calls, 10),
    calls_limit: limit === Infinity ? null : limit,
  });
});
Monitoring the billing integration
Billing integration failures are financially significant but often silent. Three failure modes to monitor:
| Failure | Symptom | Detection |
|---|---|---|
| Stripe webhook delivery failure | Plan changes not synced; canceled tenants keep Pro access | Track last webhook received timestamp in /health |
| Usage report backlog | Tenants under-billed; usage events pile up in DB | Alert when unreported events older than 2× report interval |
| Stripe API outage | Usage reporting fails; local metering still works | Track last successful Stripe API call in /health |
// /health additions for billing infrastructure
async function getBillingHealth() {
  const [webhookAge, unreportedCount, lastStripeSuccess] = await Promise.all([
    // Age of last received webhook
    db.query('SELECT MAX(received_at) AS last FROM stripe_webhook_log').then(r => {
      const last = r.rows[0]?.last;
      return last ? Date.now() - new Date(last).getTime() : Infinity;
    }),
    // Unreported events older than 10 minutes
    db.query(`
      SELECT COUNT(*) AS count FROM usage_events
      WHERE reported_to_stripe_at IS NULL
        AND created_at < NOW() - INTERVAL '10 minutes'
    `).then(r => parseInt(r.rows[0].count)),
    // Last successful Stripe API call (updated by usage reporter)
    redis.get('billing:last_stripe_success').then(v => v ? Date.now() - parseInt(v) : Infinity),
  ]);

  return {
    webhook_last_received_ms_ago: webhookAge,
    unreported_events_over_10m: unreportedCount,
    stripe_api_last_success_ms_ago: lastStripeSuccess,
    status: (
      webhookAge > 60 * 60 * 1000 ||       // no webhook in 1 hour
      unreportedCount > 500 ||             // backlog of 500+ events
      lastStripeSuccess > 30 * 60 * 1000   // no Stripe success in 30 min
    ) ? 'degraded' : 'ok',
  };
}
Add this check to your /health endpoint and point AliveMCP at it. Billing degradation that goes undetected for hours can mean under-billing that's hard to reconcile retroactively, or over-billing that causes customer disputes.
Frequently asked questions
Should I use Stripe's metered billing or a separate metering database?
Both. Stripe is the billing source of truth (invoices, payments, customer records), but it is not suitable as a real-time quota enforcement store: each Stripe API call is a network round trip that adds roughly 50–300 ms, which is too slow to run inside every tool handler. Use a local Redis counter for real-time quota enforcement (fast, cheap, purpose-built for counters) and a local PostgreSQL usage_events table as a durable audit log. Report from the audit log to Stripe in batches every 5 minutes. With this architecture, quota enforcement stays fast (Redis), the billing record survives Stripe outages and reporter failures (the PostgreSQL audit log), and Stripe receives the usage data it needs for invoicing.
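The Redis side of that split can be sketched as follows. This is a minimal illustration, not a complete metering layer: `CounterStore` is a hypothetical interface covering the single Redis command the sketch needs (INCR), `MemoryCounterStore` is an in-memory stand-in so the example runs without a Redis server, and the `PLAN_LIMITS` values are assumptions (100 comes from the Free price metadata; Pro is treated as uncapped because its overage is billed instead of blocked).

```typescript
// The one Redis command this sketch relies on; real clients expose an equivalent.
interface CounterStore {
  incr(key: string): Promise<number>;
}

// Hypothetical in-memory stand-in for Redis, for illustration and tests only.
class MemoryCounterStore implements CounterStore {
  private counts = new Map<string, number>();
  async incr(key: string): Promise<number> {
    const next = (this.counts.get(key) ?? 0) + 1;
    this.counts.set(key, next);
    return next;
  }
}

// Assumed limits per plan; Infinity means "no hard cap".
const PLAN_LIMITS: Record<string, number> = { free: 100, pro: Infinity };

// Counter key scoped to tenant and UTC calendar month, e.g. usage:t1:2025-01
function usageKey(tenantId: string, now: Date): string {
  return `usage:${tenantId}:${now.toISOString().slice(0, 7)}`;
}

// Returns true if the call is within quota. The increment happens first, so
// the check is a single atomic round trip; rejected calls still bump the counter.
async function tryConsumeCall(
  store: CounterStore,
  tenantId: string,
  plan: string,
  now: Date = new Date(),
): Promise<boolean> {
  const limit = PLAN_LIMITS[plan] ?? 100;
  const used = await store.incr(usageKey(tenantId, now));
  return used <= limit;
}
```

A tool handler would call `tryConsumeCall` before doing any work and return a quota error when it resolves to false. With a real Redis client you would also set an expiry on the key so old monthly counters are cleaned up.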
What happens if Stripe is down during a billing period end?
Stripe invoicing happens on Stripe's servers, not yours. If Stripe is down when a billing period ends, Stripe generates the invoice once it recovers, using the usage records you have reported. As long as your usage events are in the local audit log, you can report them as soon as the API is reachable again. createUsageRecord accepts past timestamps, but the timestamp must fall within the subscription item's current billing period, so usage that misses the period close has to be reconciled separately (for example, as a one-off invoice item on the next invoice). The key is that you never lose usage events from your local store: never delete from usage_events until reported_to_stripe_at IS NOT NULL. Stripe outages at exactly the period boundary are rare, but the local audit log ensures you can always reconcile.
How do I handle trial periods in the quota enforcement layer?
During a Stripe trial (subscription.status === 'trialing'), the tenant is on the subscribed plan with no charge. Your quota enforcement should use that plan's limits: treat a trialing Pro tenant like an active Pro tenant. Store trial_end from the Stripe subscription and include it in your tenant plan lookup. When the trial ends, Stripe sends customer.subscription.updated with the status changing from trialing to active if the first payment succeeds, or to past_due if it fails; continued failures eventually move the subscription to canceled or unpaid, depending on your retry settings. Handle these webhooks to either continue Pro access or downgrade to free.
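One way to keep this logic in a single place is a pure function that maps the subscribed plan and the Stripe subscription status to the plan your quota layer should enforce. The sketch below uses Stripe's subscription status values; the function name `effectivePlan` and the choice to keep Pro limits during `past_due` (a grace period while Stripe retries payment) are assumptions you may want to change.

```typescript
// Stripe subscription statuses.
type SubscriptionStatus =
  | 'incomplete'
  | 'incomplete_expired'
  | 'trialing'
  | 'active'
  | 'past_due'
  | 'canceled'
  | 'unpaid'
  | 'paused';

// Hypothetical helper: which plan's limits should the quota layer enforce?
function effectivePlan(subscribedPlan: string, status: SubscriptionStatus): string {
  switch (status) {
    case 'trialing': // trial: full access to the subscribed plan, no charge yet
    case 'active':
      return subscribedPlan;
    case 'past_due':
      // Assumption: grace period while Stripe retries the payment.
      // Return 'free' here instead if you prefer to block immediately.
      return subscribedPlan;
    default:
      // incomplete, incomplete_expired, canceled, unpaid, paused
      return 'free';
  }
}

console.log(effectivePlan('pro', 'trialing')); // pro
console.log(effectivePlan('pro', 'canceled')); // free
```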
Can I use Stripe's usage-based billing for per-token LLM costs passed through to tenants?
Yes. Create a separate metered price for "LLM tokens" with your pass-through margin applied. When a tool call invokes an LLM internally, track the token count (from the LLM API response), enqueue a llm_token usage event with the token count, and report to the LLM tokens subscription item separately from tool call counts. This creates two line items on the tenant's invoice: "tool calls" and "LLM tokens." Stripe supports multiple metered subscription items on one subscription. The operational complexity increases — two metered items means two separate reporting streams — but it gives tenants transparent billing for pass-through costs.
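The two reporting streams can share one event table if each event carries a kind and a quantity. A minimal sketch of the aggregation step, for a single tenant's unreported events: the `llm_token` kind comes from the answer above, while the `tool_call` kind, the `UsageEvent` shape, and `totalsByKind` are illustrative assumptions.

```typescript
type UsageKind = 'tool_call' | 'llm_token';

interface UsageEvent {
  kind: UsageKind;
  quantity: number; // 1 per tool call; the token count for llm_token events
}

// Sum one tenant's pending events into one quantity per metered item.
function totalsByKind(events: UsageEvent[]): Record<UsageKind, number> {
  const totals: Record<UsageKind, number> = { tool_call: 0, llm_token: 0 };
  for (const event of events) {
    totals[event.kind] += event.quantity;
  }
  return totals;
}

const pending: UsageEvent[] = [
  { kind: 'tool_call', quantity: 1 },
  { kind: 'llm_token', quantity: 1200 },
  { kind: 'tool_call', quantity: 1 },
  { kind: 'llm_token', quantity: 800 },
];
console.log(totalsByKind(pending)); // { tool_call: 2, llm_token: 2000 }
```

Each total is then reported to its own subscription item, so the tenant needs one stored subscription item ID per metered price.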
How do I offer volume discounts for high-usage tenants?
Use Stripe's tiered pricing. Instead of a flat per_unit billing scheme, configure a tiered scheme with graduated or volume tiers. For example: $0.01/call for calls 1–1,000, $0.005/call for calls 1,001–10,000, $0.001/call for calls 10,001+. Stripe calculates the discount automatically based on reported usage — you don't need to implement tiering logic in your MCP server. For enterprise customers with custom pricing, use manually-created Stripe prices with the negotiated rates, applied to that customer's subscription only. Stripe's per-customer pricing capability means you can have a standard price list and custom prices coexisting in the same Stripe account.
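The difference between graduated and volume tiers is easiest to see with numbers. The sketch below applies the example tiers from this answer to 12,000 calls. It is an illustration of the arithmetic only; Stripe performs the real calculation, and the `Tier` type and function names are hypothetical. Amounts are in mills (1 mill = $0.001) to keep the math in integers.

```typescript
interface Tier {
  upTo: number | null; // last unit covered by this tier; null = no upper bound
  unitMills: number;   // price per call in mills (1 mill = $0.001)
}

// Example tiers from the text: $0.01, $0.005, $0.001 per call.
const tiers: Tier[] = [
  { upTo: 1_000, unitMills: 10 },
  { upTo: 10_000, unitMills: 5 },
  { upTo: null, unitMills: 1 },
];

// Graduated: each unit is priced by the tier it falls into.
function graduatedTotalMills(quantity: number, tierList: Tier[]): number {
  let total = 0;
  let previousCap = 0;
  for (const tier of tierList) {
    const cap = tier.upTo ?? Infinity;
    const unitsInTier = Math.max(0, Math.min(quantity, cap) - previousCap);
    total += unitsInTier * tier.unitMills;
    previousCap = cap;
    if (quantity <= cap) break;
  }
  return total;
}

// Volume: every unit is priced at the rate of the tier the total quantity reaches.
function volumeTotalMills(quantity: number, tierList: Tier[]): number {
  const tier = tierList.find(t => quantity <= (t.upTo ?? Infinity));
  return tier ? quantity * tier.unitMills : 0;
}

console.log(graduatedTotalMills(12_000, tiers)); // 57000 mills = $57.00
console.log(volumeTotalMills(12_000, tiers));    // 12000 mills = $12.00
```

Graduated pricing charges 1,000 × $0.01 + 9,000 × $0.005 + 2,000 × $0.001 = $57.00, while volume pricing charges all 12,000 calls at $0.001 for $12.00, so the choice of tier mode changes the invoice substantially.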
Further reading
- Usage Metering for MCP Servers — Redis counters, quota enforcement, and event pipelines
- Rate Limiting for MCP Servers — token bucket and per-tenant fair queuing
- Multi-Tenant MCP Server Architecture — tenant context, routing, and isolation
- API Key Management for MCP Servers — per-tenant key scoping and rotation
- MCP Server Health Checks — monitoring billing infrastructure in readiness probes