Building Model Context Protocol servers on Vercel
The Model Context Protocol (MCP) has emerged as the foundational standard for connecting AI systems to external tools and data sources, with vercel/mcp-handler providing a production-ready implementation for the Vercel platform. This comprehensive guide explores MCP server development from concept through deployment, incorporating the latest updates and best practices for 2025.
Understanding MCP from a developer perspective
Think of MCP servers as specialized API servers for AI applications. Just like a REST API exposes endpoints for web clients, an MCP server exposes tools, resources, and prompts for AI models to use. The key difference: instead of returning JSON data for rendering, you're returning structured information that AI models can understand and act upon.
Here's the practical breakdown:
MCP Server = Your API Server
You write functions that AI can call. These work just like API endpoints, but instead of HTTP routes like /api/users, you define tools like get_user_data or process_payment.
Tools = API Endpoints
Each tool is essentially a function with input validation (using Zod schemas) and a return value. When Claude or Gemini needs to perform an action, they call your tool just like a frontend would call your API.
Resources = Static Data Endpoints
Think of these as GET endpoints that return contextual information. Resources can be static (like configuration) or dynamic (like user profiles). They're read-only data sources the AI can reference.
Prompts = Request Templates
These are pre-configured templates for common AI operations - similar to GraphQL fragments or saved Postman requests. They help standardize how AI models interact with your server.
The vercel/mcp-handler package handles all the protocol complexity - transport negotiation (Streamable HTTP and legacy SSE), message routing, error handling - so you can focus on writing business logic. It's like Express.js for MCP: you define your handlers, and the framework manages the infrastructure.
// This is all you need to understand:
// 1. AI calls your tool
// 2. You process the request
// 3. You return a response
// Everything else is handled by the framework
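Under the hood, that call-and-respond cycle is a JSON-RPC 2.0 round trip. The shapes below are illustrative sketches of the tools/call exchange; the actual envelope and transport framing are managed by the framework:

```typescript
// Illustrative JSON-RPC 2.0 shapes for one tool call (sketch only).
// The request asks the server to invoke a named tool with arguments;
// the response carries the tool's content back to the AI client.
const toolCallRequest = {
  jsonrpc: "2.0",
  id: 1,
  method: "tools/call",
  params: {
    name: "roll_dice",
    arguments: { sides: 6 },
  },
};

const toolCallResponse = {
  jsonrpc: "2.0",
  id: 1, // matches the request id
  result: {
    content: [{ type: "text", text: "🎲 You rolled a 4!" }],
  },
};
```

You never construct these envelopes by hand when using mcp-handler; they are shown only to demystify what travels over the wire.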
Setting up vercel/mcp-handler with comprehensive examples
Installation begins with the core dependencies that form the foundation of any MCP server:
npm install mcp-handler @modelcontextprotocol/sdk zod@^3
The basic server structure demonstrates the elegance of the handler pattern. This Next.js implementation creates a simple dice-rolling tool that showcases the essential components:
// app/api/[transport]/route.ts
import { createMcpHandler } from "mcp-handler";
import { z } from "zod";
const handler = createMcpHandler(
(server) => {
server.tool(
"roll_dice",
"Rolls an N-sided die",
{ sides: z.number().int().min(2).max(100) },
async ({ sides }) => {
const value = 1 + Math.floor(Math.random() * sides);
return {
content: [{ type: "text", text: `🎲 You rolled a ${value}!` }],
};
}
);
},
{
capabilities: {
tools: { roll_dice: { description: "Rolls an N-sided die" } },
},
},
{
redisUrl: process.env.REDIS_URL,
basePath: "/api",
maxDuration: 60,
verboseLogs: process.env.NODE_ENV === "development",
}
);
export { handler as GET, handler as POST, handler as DELETE };
The configuration object accepts several critical parameters. redisUrl enables SSE transport state management for maintaining connection persistence across requests. basePath must match your route structure, determining where MCP endpoints are accessible. maxDuration sets connection timeouts, while verboseLogs assists with development debugging.
For production environments, the vercel.json configuration becomes essential:
{
"$schema": "https://openapi.vercel.sh/vercel.json",
"functions": {
"app/api/[transport]/route.ts": {
"maxDuration": 800,
"memory": 1024
}
},
"env": {
"REDIS_URL": "@redis-url",
"MCP_API_KEY": "@mcp-api-key"
}
}
Implementing tools, resources, and prompts
Tools represent the primary interaction mechanism in MCP, enabling AI models to execute actions and retrieve information. A comprehensive tool implementation demonstrates validation, error handling, and response formatting:
server.tool(
"process_data",
"Analyzes dataset with statistical operations",
{
data: z.array(z.object({
id: z.string().uuid(),
value: z.number(),
timestamp: z.string().datetime(),
metadata: z.record(z.any()).optional(),
})).min(1, "Data array cannot be empty"),
operation: z.enum(["sum", "average", "max", "min", "stddev"]),
groupBy: z.string().optional(),
},
async ({ data, operation, groupBy }) => {
try {
const values = data.map(item => item.value);
let result: number;
switch (operation) {
case "sum":
result = values.reduce((acc, val) => acc + val, 0);
break;
case "average":
result = values.reduce((acc, val) => acc + val, 0) / values.length;
break;
case "stddev":
const mean = values.reduce((a, b) => a + b, 0) / values.length;
const variance = values.reduce((acc, val) =>
acc + Math.pow(val - mean, 2), 0) / values.length;
result = Math.sqrt(variance);
break;
case "max":
result = Math.max(...values);
break;
case "min":
result = Math.min(...values);
break;
}
return {
content: [{
type: "text",
text: `${operation.toUpperCase()} of ${data.length} items: ${result.toFixed(2)}`
}],
};
} catch (error) {
  // error is typed unknown in strict TypeScript, so narrow before reading .message
  const message = error instanceof Error ? error.message : String(error);
  return {
    content: [{
      type: "text",
      text: `Processing error: ${message}`
    }],
    isError: true,
  };
}
}
);
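The statistical logic above can also be factored into a pure helper so it can be unit-tested outside the MCP layer. This is a sketch of that refactoring, not part of the handler API:

```typescript
// Pure version of the tool's statistics switch (hypothetical helper).
type Operation = "sum" | "average" | "max" | "min" | "stddev";

function aggregate(values: number[], operation: Operation): number {
  const sum = values.reduce((acc, v) => acc + v, 0);
  switch (operation) {
    case "sum":
      return sum;
    case "average":
      return sum / values.length;
    case "max":
      return Math.max(...values);
    case "min":
      return Math.min(...values);
    case "stddev": {
      // population standard deviation, matching the tool implementation
      const mean = sum / values.length;
      const variance =
        values.reduce((acc, v) => acc + (v - mean) ** 2, 0) / values.length;
      return Math.sqrt(variance);
    }
  }
}
```

The tool handler then shrinks to validation plus a call to aggregate, which keeps the MCP-specific code thin.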
Resources provide contextual data to AI models, supporting both static and dynamic content. Dynamic resource templates enable parameterized data access. Note that the low-level setRequestHandler API lives on the underlying protocol server, exposed as server.server on the high-level McpServer instance that mcp-handler passes to your callback:
import {
ListResourceTemplatesRequestSchema,
ReadResourceRequestSchema,
} from "@modelcontextprotocol/sdk/types.js";
server.server.setRequestHandler(ListResourceTemplatesRequestSchema, () => ({
resourceTemplates: [
{
uriTemplate: "user://{userId}/profile",
name: "User Profile",
description: "Complete user profile with preferences",
mimeType: "application/json",
},
{
uriTemplate: "analytics://{metric}/{timeRange}",
name: "Analytics Data",
description: "Time-series analytics for specified metrics",
mimeType: "application/json",
},
],
}));
server.server.setRequestHandler(ReadResourceRequestSchema, async (request) => {
const { uri } = request.params;
const userMatch = uri.match(/^user:\/\/([^\/]+)\/profile$/);
if (userMatch) {
const userId = decodeURIComponent(userMatch[1]);
const userData = await fetchUserFromDatabase(userId);
return {
contents: [{
uri,
text: JSON.stringify(userData, null, 2),
mimeType: "application/json",
}],
};
}
const analyticsMatch = uri.match(/^analytics:\/\/([^\/]+)\/(.+)$/);
if (analyticsMatch) {
const [, metric, timeRange] = analyticsMatch;
const data = await fetchAnalytics(metric, timeRange);
return {
contents: [{
uri,
text: JSON.stringify(data, null, 2),
mimeType: "application/json",
}],
};
}
throw new Error(`Resource not found: ${uri}`);
});
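The handler above references fetchUserFromDatabase and fetchAnalytics without defining them. One minimal, hypothetical shape for those helpers - an in-memory stand-in where a real server would query a database or metrics store:

```typescript
// Hypothetical helper implementations (assumed names from the handler above).
// A production server would back these with a real database / metrics store.
const users = new Map<string, { name: string; preferences: Record<string, string> }>([
  ["u1", { name: "Ada", preferences: { theme: "dark" } }],
]);

async function fetchUserFromDatabase(userId: string) {
  const user = users.get(userId);
  if (!user) throw new Error(`User not found: ${userId}`);
  return { id: userId, ...user };
}

async function fetchAnalytics(metric: string, timeRange: string) {
  // Placeholder series; shape only, no real data source behind it.
  return { metric, timeRange, points: [] as Array<{ t: string; v: number }> };
}
```

Whatever the backing store, keeping these helpers async and throwing on missing data lets the resource handler surface clean "not found" errors to the client.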
Prompts provide reusable templates for common AI interactions, particularly valuable for complex multi-step operations:
import { GetPromptRequestSchema } from "@modelcontextprotocol/sdk/types.js";
server.server.setRequestHandler(GetPromptRequestSchema, (request) => {
const { name, arguments: args } = request.params;
if (name === "code-review") {
const { language, code, focus } = args as {
language: string;
code: string;
focus?: string;
};
return {
messages: [
{
role: "user",
content: {
type: "text",
text: `Review this ${language} code${focus ? ` focusing on ${focus}` : ''}:
\`\`\`${language}
${code}
\`\`\`
Analyze for:
- Potential bugs and edge cases
- Performance optimizations
- Security vulnerabilities
- Code style and best practices
- Suggested refactoring improvements`,
},
},
],
};
}
throw new Error(`Prompt not found: ${name}`);
});
Authentication patterns and security implementation
Security represents a critical consideration for production MCP servers. The OAuth 2.1 implementation follows the latest standards including Resource Indicators (RFC 8707) to prevent token mis-redemption:
import { createMcpHandler, withMcpAuth } from "mcp-handler";
import { AuthInfo } from "@modelcontextprotocol/sdk/server/auth/types.js";
const baseHandler = createMcpHandler((server) => {
server.tool(
"protected_operation",
"Performs sensitive data operations",
{
operation: z.enum(["read", "write", "delete"]),
resource: z.string(),
},
async ({ operation, resource }, extra) => {
const authInfo = extra.authInfo;
// Scope-based authorization
if (operation === "delete" && !authInfo?.scopes?.includes("admin:delete")) {
return {
content: [{ type: "text", text: "Insufficient permissions for delete" }],
isError: true,
};
}
// Execute operation with user context
const result = await performOperation(operation, resource, authInfo?.clientId);
return {
content: [{
type: "text",
text: `${operation} completed for ${authInfo?.clientId}: ${result}`
}],
};
}
);
});
const verifyToken = async (
req: Request,
bearerToken?: string
): Promise<AuthInfo | undefined> => {
if (!bearerToken) return undefined;
try {
// Validate token with authorization server
const response = await fetch(process.env.AUTH_SERVER_URL + "/introspect", {
method: "POST",
headers: {
"Content-Type": "application/x-www-form-urlencoded",
"Authorization": `Basic ${Buffer.from(
`${process.env.CLIENT_ID}:${process.env.CLIENT_SECRET}`
).toString("base64")}`,
},
body: new URLSearchParams({
token: bearerToken,
token_type_hint: "access_token",
}),
});
if (!response.ok) return undefined;
const introspection = await response.json();
if (!introspection.active) return undefined;
return {
token: bearerToken,
scopes: introspection.scope?.split(" ") || [],
clientId: introspection.client_id,
extra: {
userId: introspection.sub,
organizationId: introspection.org_id,
permissions: introspection.permissions,
},
};
} catch (error) {
console.error("Token verification failed:", error);
return undefined;
}
};
const authHandler = withMcpAuth(baseHandler, verifyToken, {
required: true,
requiredScopes: ["mcp:read", "mcp:execute"],
resourceMetadataPath: "/.well-known/oauth-protected-resource",
});
export { authHandler as GET, authHandler as POST };
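The inline scope check inside protected_operation generalizes into a small pure helper - an assumed pattern for your own code, not something mcp-handler exports:

```typescript
// Reusable scope check (hypothetical helper, mirrors the inline check above).
// Returns true only when every required scope was granted on the token.
function hasScopes(granted: string[] | undefined, required: string[]): boolean {
  if (!granted) return false;
  return required.every((scope) => granted.includes(scope));
}
```

Tools can then guard themselves with a one-liner like `if (!hasScopes(extra.authInfo?.scopes, ["admin:delete"])) ...`, keeping authorization decisions consistent across tools.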
The OAuth protected resource metadata endpoint provides discovery information:
// app/.well-known/oauth-protected-resource/route.ts
import { protectedResourceHandler } from "mcp-handler";
const handler = protectedResourceHandler({
authServerUrls: [process.env.AUTH_SERVER_URL!],
});
export { handler as GET };
Deployment considerations specific to Vercel
Vercel's platform offers two function types with distinct trade-offs for MCP servers. Serverless Functions provide the recommended approach due to full Node.js runtime compatibility, longer execution limits up to 900 seconds on Enterprise plans, and comprehensive npm package support. While Edge Functions offer 40% faster cold starts and 15x lower costs, their V8-only runtime and 4MB size limit make them unsuitable for complex MCP implementations.
The deployment architecture leverages Vercel's Fluid Compute for 90% cost savings compared to traditional serverless, particularly beneficial for AI workloads with irregular usage patterns. Cold start mitigation strategies become essential for maintaining responsiveness:
// api/keepalive.js - Prevent cold starts
export default async function handler(req, res) {
// Simple health check to keep function warm
const timestamp = new Date().toISOString();
res.status(200).json({ status: "warm", timestamp });
}
// Configure cron job in vercel.json
{
"crons": [{
"path": "/api/keepalive",
"schedule": "*/5 * * * *"
}]
}
CORS configuration ensures proper client connectivity across different domains. Next.js route handlers do not accept a cors key on the config export, so declare the headers in vercel.json (use middleware instead when you need per-origin logic, since Access-Control-Allow-Origin takes a single value):
{
  "headers": [
    {
      "source": "/api/(.*)",
      "headers": [
        { "key": "Access-Control-Allow-Origin", "value": "https://claude.ai" },
        { "key": "Access-Control-Allow-Methods", "value": "GET, POST, OPTIONS" },
        { "key": "Access-Control-Allow-Headers", "value": "Content-Type, Authorization, X-API-Key" }
      ]
    }
  ]
}
Debugging, testing, and monitoring strategies
Development begins with local testing using the MCP Inspector, providing immediate feedback on server functionality:
# Start development server
vercel dev
# Test with MCP Inspector
npx @modelcontextprotocol/inspector http://localhost:3000/api/mcp
Comprehensive logging facilitates debugging in production environments:
import { createLogger } from "./utils/logger";
const logger = createLogger({
service: "mcp-server",
environment: process.env.VERCEL_ENV,
});
server.tool("complex_operation", "Performs complex processing", schema,
async (args) => {
const requestId = crypto.randomUUID();
const startTime = Date.now();
logger.info("Tool execution started", {
requestId,
tool: "complex_operation",
args: JSON.stringify(args),
});
try {
const result = await performComplexOperation(args);
logger.info("Tool execution completed", {
requestId,
duration: Date.now() - startTime,
resultSize: JSON.stringify(result).length,
});
return result;
} catch (error) {
  const message = error instanceof Error ? error.message : String(error);
  logger.error("Tool execution failed", {
    requestId,
    error: message,
    stack: error instanceof Error ? error.stack : undefined,
  });
throw error;
}
}
);
Integration testing validates end-to-end functionality:
import { describe, it, expect, beforeAll } from "vitest";
import { Client } from "@modelcontextprotocol/sdk/client/index.js";
import { SSEClientTransport } from "@modelcontextprotocol/sdk/client/sse.js";
describe("MCP Server Integration", () => {
  let client: Client;
  beforeAll(async () => {
    client = new Client({ name: "integration-tests", version: "1.0.0" });
    const transport = new SSEClientTransport(
      new URL("http://localhost:3000/api/sse")
    );
    await client.connect(transport);
  });
  it("should execute tools with proper validation", async () => {
    const result = await client.callTool({
      name: "process_data",
      arguments: {
        data: [
          { id: "123e4567-e89b-12d3-a456-426614174000", value: 42, timestamp: new Date().toISOString() },
          { id: "987f6543-e21b-12d3-a456-426614174111", value: 84, timestamp: new Date().toISOString() },
        ],
        operation: "average",
      },
    });
    expect(result.content[0].text).toContain("AVERAGE of 2 items: 63.00");
  });
});
Production monitoring leverages Vercel's built-in analytics alongside custom metrics:
const metrics = {
toolExecutions: new Map(),
errors: [],
responseTimings: [],
};
const withMetrics = (toolName: string, toolFunction: Function) => {
return async (...args: any[]) => {
const startTime = Date.now();
const executionId = crypto.randomUUID();
metrics.toolExecutions.set(executionId, {
tool: toolName,
startTime,
status: "running",
});
try {
const result = await toolFunction(...args);
const duration = Date.now() - startTime;
metrics.toolExecutions.set(executionId, {
...metrics.toolExecutions.get(executionId),
status: "completed",
duration,
});
metrics.responseTimings.push(duration);
return result;
} catch (error) {
metrics.errors.push({
  tool: toolName,
  error: error instanceof Error ? error.message : String(error),
  timestamp: new Date().toISOString(),
});
throw error;
}
};
};
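The raw responseTimings array becomes useful once reduced to summary statistics. A nearest-rank percentile sketch (percentile definitions vary; this is one simple, common choice):

```typescript
// Nearest-rank percentile over recorded timings (sketch; definitions vary).
function percentile(timings: number[], p: number): number {
  if (timings.length === 0) return 0;
  const sorted = [...timings].sort((a, b) => a - b);
  const rank = Math.ceil((p / 100) * sorted.length);
  return sorted[Math.max(0, rank - 1)];
}

// Condenses the responseTimings array into the numbers worth dashboarding.
function summarize(timings: number[]) {
  const total = timings.reduce((acc, t) => acc + t, 0);
  return {
    count: timings.length,
    mean: timings.length ? total / timings.length : 0,
    p50: percentile(timings, 50),
    p95: percentile(timings, 95),
  };
}
```

Reporting p95 alongside the mean matters for MCP servers: a handful of slow tool executions can be invisible in the average yet dominate the AI client's perceived latency.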
Real-world use cases and architectural patterns
Production deployments demonstrate diverse MCP applications across industries. Block (Square) operates 60+ MCP servers for internal financial tools, implementing domain-driven design with separate servers for payments, inventory, and analytics. Netflix leverages MCP for content management workflows, using multi-server patterns to handle metadata, transcoding, and recommendation systems independently.
The multi-server architecture pattern separates concerns effectively:
// Core business logic server
const coreServer = createMcpHandler((server) => {
server.tool("create_order", "Creates new order", orderSchema,
async (orderData) => {
const order = await createOrder(orderData);
await publishEvent("order.created", order);
return { content: [{ type: "text", text: `Order ${order.id} created` }] };
}
);
});
// Analytics server with read-only operations
const analyticsServer = createMcpHandler((server) => {
server.tool("generate_report", "Generates analytics report", reportSchema,
async (params) => {
const report = await generateReport(params);
return { content: [{ type: "text", text: JSON.stringify(report) }] };
}
);
});
// Admin server with elevated permissions
const adminServer = createMcpHandler((server) => {
server.tool("manage_users", "User management operations", userManagementSchema,
  async (args, extra) => {
    if (!extra.authInfo?.scopes?.includes("admin:users")) {
      throw new Error("Admin permissions required");
    }
    const result = await performUserManagement(args);
    return { content: [{ type: "text", text: result }] };
  }
);
});
Circuit breaker patterns ensure resilience when integrating external services:
class CircuitBreaker {
private failureCount = 0;
private lastFailureTime?: number;
private state: "CLOSED" | "OPEN" | "HALF_OPEN" = "CLOSED";
constructor(
private threshold = 5,
private timeout = 60000,
private resetTimeout = 120000
) {}
async execute<T>(operation: () => Promise<T>): Promise<T> {
if (this.state === "OPEN") {
if (Date.now() - this.lastFailureTime! > this.resetTimeout) {
this.state = "HALF_OPEN";
this.failureCount = 0;
} else {
throw new Error("Circuit breaker is OPEN - service unavailable");
}
}
try {
const result = await Promise.race([
operation(),
new Promise<never>((_, reject) =>
setTimeout(() => reject(new Error("Operation timeout")), this.timeout)
),
]);
if (this.state === "HALF_OPEN") {
this.state = "CLOSED";
this.failureCount = 0;
}
return result;
} catch (error) {
this.failureCount++;
this.lastFailureTime = Date.now();
if (this.failureCount >= this.threshold) {
this.state = "OPEN";
}
throw error;
}
}
}
const breaker = new CircuitBreaker();
server.tool("external_api_call", "Calls external API with circuit breaker", schema,
async (params) => {
try {
const result = await breaker.execute(async () => {
const response = await fetch("https://api.external.com/data", {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify(params),
});
if (!response.ok) {
  // Treat HTTP errors as failures so the breaker counts them
  throw new Error(`External API returned ${response.status}`);
}
return response.json();
});
return { content: [{ type: "text", text: JSON.stringify(result) }] };
} catch (error) {
return {
content: [{
type: "text",
text: "External service temporarily unavailable - please retry later"
}],
isError: true,
};
}
}
);
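On the calling side of a tripped breaker, "please retry later" usually pairs with exponential backoff. The delay computation below is kept pure so it can be unit-tested; the retry loop itself is a sketch under the assumption of an idempotent operation:

```typescript
// Exponential backoff delay: doubles per attempt, capped (sketch).
function backoffDelay(attempt: number, baseMs = 100, capMs = 10_000): number {
  // attempt is 0-based: 100ms, 200ms, 400ms, ... up to capMs
  return Math.min(capMs, baseMs * 2 ** attempt);
}

// Retries an async operation with backoff between failures.
async function retryWithBackoff<T>(
  operation: () => Promise<T>,
  maxAttempts = 3
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      return await operation();
    } catch (error) {
      lastError = error;
      await new Promise((resolve) => setTimeout(resolve, backoffDelay(attempt)));
    }
  }
  throw lastError;
}
```

Adding random jitter to the delay is a common refinement to avoid synchronized retry storms when many clients back off at once.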
Recent updates and ecosystem evolution
The MCP ecosystem has experienced explosive growth since its November 2024 release. Major platform adoptions in 2025 include Google's comprehensive integration across Gemini desktop and Agents SDK in March, Microsoft's native Copilot Studio support with one-click server connections in May, and AWS's official MCP servers for Lambda, ECS, and other services in July.
Security improvements have addressed early vulnerabilities through OAuth 2.1 implementation with Resource Indicators (RFC 8707), enhanced tool annotations for behavior declaration, and comprehensive audit logging capabilities. The protocol now supports audio content alongside text and images, JSON-RPC batching for improved efficiency, and Streamable HTTP transport replacing the deprecated SSE mechanism.
The community has created over 1000 MCP servers by February 2025, spanning diverse use cases from financial services (Stripe, Block) to developer tools (GitHub, VS Code) to data platforms (Databricks, Supabase). Enterprise deployments demonstrate MCP's production readiness, with companies like Netflix and Block operating dozens of servers in production environments.
Market projections suggest the MCP ecosystem will reach $4.5 billion by end of 2025, with 90% of organizations expected to adopt the protocol. Technical evolution continues with Google's Agent2Agent protocol addressing multi-agent coordination, expanded cloud platform integrations, and ongoing security framework maturation.
Best practices and architectural recommendations
Successful MCP server implementation requires careful attention to security, performance, and maintainability. Security-first design implements OAuth 2.1 from the start, validates all inputs using Zod schemas, implements rate limiting and circuit breakers, and maintains comprehensive audit logs. Container isolation using Docker or Kubernetes provides additional security boundaries for production deployments.
Performance optimization leverages Redis for connection state management, implements intelligent caching strategies, uses connection pooling for database access, and monitors metrics continuously. The multi-server pattern separates read and write operations, isolates heavy processing workloads, and enables independent scaling of different capabilities.
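The caching point can be made concrete with a small TTL cache. Production deployments would typically back this with Redis; the injectable clock here is only to keep the sketch deterministic and testable:

```typescript
// Minimal TTL cache (sketch). Entries expire ttlMs after being set;
// the clock is injectable so tests can control time.
class TtlCache<V> {
  private store = new Map<string, { value: V; expiresAt: number }>();

  constructor(private ttlMs: number, private now: () => number = Date.now) {}

  get(key: string): V | undefined {
    const entry = this.store.get(key);
    if (!entry) return undefined;
    if (this.now() > entry.expiresAt) {
      // Lazily evict expired entries on read
      this.store.delete(key);
      return undefined;
    }
    return entry.value;
  }

  set(key: string, value: V): void {
    this.store.set(key, { value, expiresAt: this.now() + this.ttlMs });
  }
}
```

For MCP resources such as user profiles, wrapping the database fetch in a cache like this trades bounded staleness for far fewer round trips per conversation.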
Development workflow best practices include starting with Streamable HTTP transport for new projects, using official SDKs for consistent implementation, implementing comprehensive error handling with fallbacks, and maintaining high test coverage including integration tests. Documentation should cover API specifications, authentication requirements, example requests and responses, and troubleshooting guides.
Deployment strategies prioritize gradual rollout with feature flags, comprehensive monitoring from day one, regular security audits and updates, and disaster recovery planning. Organizations should start with low-risk pilot programs, validate patterns with small-scale deployments, and expand gradually based on proven success.
The Model Context Protocol represents a fundamental shift in how AI systems interact with external tools and data sources. The combination of MCP's thoughtful protocol design and vercel/mcp-handler's production-ready implementation creates a powerful foundation for building sophisticated AI integrations. As the ecosystem continues maturing with enhanced security features, broader platform adoption, and proven enterprise deployments, MCP establishes itself as the standard protocol for the next generation of AI-powered applications.