If you ship Next.js apps for a living, the Model Context Protocol stopped being a curiosity sometime around late 2025 and quietly became part of the job. Next.js 16 ships with a built-in MCP endpoint at /_next/mcp, the Vercel MCP Adapter is now stable, and the conversations on my team have shifted from "should we expose tools to Claude?" to "why is the session sticky on one pod when the agent reconnects to another?"
That second question is the whole article. Building an MCP server that an agent can hit on localhost is a weekend project. Building one that survives a load balancer, an OAuth flow, and a horizontally scaled Vercel deployment is where most teams trip. Here's the field guide I wish I'd had when I started wiring MCP into our production Next.js stack.
The promise is simple: instead of building a custom plugin for every AI client (Claude, Cursor, internal copilots), you expose your app's capabilities once over MCP and any compliant agent can use them. For a typical Next.js SaaS, that means your existing route handlers — the ones already doing auth, validation, and DB writes — become tools an AI can call on the user's behalf.
The practical shift in 2026 is that MCP is no longer just stdio-based local processes. Streamable HTTP is the default transport for anything you'd actually deploy. Your MCP server is now just another Next.js route handler with a slightly different protocol on top. That's the part that makes it interesting for full-stack folks: there's no new runtime, no new deployment story. It's the stack you already run.
Let's build something concrete. Say you run a project management tool and you want Claude to be able to list a user's open tasks and create new ones from a conversation. With the Vercel MCP Adapter, the route handler looks like this:
```typescript
// app/api/mcp/route.ts
import { createMcpHandler } from '@vercel/mcp-adapter';
import { z } from 'zod';
import { getTasksForUser, createTask } from '@/lib/tasks';
import { requireUser } from '@/lib/auth';

const handler = createMcpHandler(
  (server) => {
    server.tool(
      'list_open_tasks',
      "Returns the current user's open tasks",
      { limit: z.number().min(1).max(50).default(20) },
      async ({ limit }, { request }) => {
        const user = await requireUser(request);
        const tasks = await getTasksForUser(user.id, { status: 'open', limit });
        return {
          content: [{ type: 'text', text: JSON.stringify(tasks, null, 2) }],
        };
      },
    );

    server.tool(
      'create_task',
      "Creates a task in the user's default project",
      {
        title: z.string().min(1).max(200),
        due_date: z.string().datetime().optional(),
      },
      async ({ title, due_date }, { request }) => {
        const user = await requireUser(request);
        const task = await createTask({ userId: user.id, title, due_date });
        return { content: [{ type: 'text', text: `Created task ${task.id}` }] };
      },
    );
  },
  {},
  { basePath: '/api/mcp' },
);

export { handler as GET, handler as POST };
```
That's it. Two tools, full Zod validation, your existing auth helpers reused verbatim. Point Claude Desktop or Cursor at https://yourapp.com/api/mcp and the agent can now manage tasks for the authenticated user.
The first time you watch an agent call create_task with a natural-language sentence the user typed, something clicks. You're not building a chatbot. You're not parsing intent. You wrote a Zod schema and an async function — the protocol handles the rest.
Now the part the tutorials skip. Three problems show up the moment you deploy this behind a load balancer.
MCP over Streamable HTTP keeps a session ID per agent connection. The default in-memory session store works fine on a single pod but falls apart the moment a second instance comes up — the agent reconnects, hits a different pod, and the server has no memory of it.
The fix is the same fix you've been using for sessions for fifteen years: externalize the store. With the Vercel adapter you can plug in Redis (Upstash has worked well for us, with acceptable latency from India to the edge):
```typescript
import { createMcpHandler } from '@vercel/mcp-adapter';

const handler = createMcpHandler(
  (server) => { /* tools here */ },
  {},
  {
    basePath: '/api/mcp',
    redisUrl: process.env.UPSTASH_REDIS_REST_URL,
  },
);
```
Now any pod can pick up any session. This single change is the difference between an MCP server that works in your laptop demo and one that survives an actual user base.
Your existing Next.js auth probably assumes a browser — cookies, CSRF tokens, redirect-based OAuth. An agent has none of that. The accepted pattern in 2026 is OAuth 2.1 with the Dynamic Client Registration extension that the MCP spec now mandates for remote servers. The agent registers itself, walks the user through consent in a browser window, and gets back an access token that it includes as a Bearer header on every MCP call.
If you're already on Auth.js (NextAuth) or Clerk, both shipped MCP-compatible OAuth flows in Q1 2026. The unlock is that requireUser(request) in the handler above just works — same auth helpers, different token source. Don't roll your own. The number of subtle ways an MCP auth implementation can leak tokens is genuinely scary, and the libraries have been audited.
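To make the dual token source concrete, here's a minimal sketch of a `requireUser` that accepts either a browser cookie session or an agent's Bearer token. The helper names, the stubbed verifiers, and the hard-coded token are assumptions for illustration, not Auth.js or Clerk APIs; in a real app you'd delegate both paths to your auth library.

```typescript
// Hypothetical sketch: requireUser accepting a session cookie (browser)
// or an OAuth access token in the Authorization header (MCP agent).
type User = { id: string };

async function verifyAccessToken(token: string): Promise<User | null> {
  // Placeholder: a real implementation would validate the JWT your
  // OAuth provider issued (signature, expiry, audience).
  return token === 'valid-token' ? { id: 'user_123' } : null;
}

async function verifySessionCookie(cookie: string | null): Promise<User | null> {
  // Placeholder: your existing cookie-based session check, unchanged.
  return null;
}

export async function requireUser(request: Request): Promise<User> {
  const authHeader = request.headers.get('authorization') ?? '';
  if (authHeader.toLowerCase().startsWith('bearer ')) {
    const user = await verifyAccessToken(authHeader.slice(7));
    if (user) return user;
  }
  const user = await verifySessionCookie(request.headers.get('cookie'));
  if (user) return user;
  throw new Error('Unauthorized: no valid session or bearer token');
}
```

The point of the sketch is the shape: one helper, two credential sources, so every tool handler stays agnostic about whether a human or an agent is calling.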
This one is non-obvious. When an agent decides whether to call your tool, it reads the tool's description. "Returns the current user's open tasks" is fine for a human but ambiguous for an agent — does it return all tasks, or just today's? Does it include completed ones? Does pagination matter?
Treat tool descriptions as part of your API contract, not as docstrings. Spell out edge cases. State the shape of what comes back. After we tightened our descriptions, the rate of agents calling the wrong tool dropped by roughly 60%. It's the cheapest reliability win you'll get all quarter.
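As a sketch of what "tightening" means in practice, here's a before/after for the `list_open_tasks` description. The exact wording is illustrative, not copied from a spec or from our codebase:

```typescript
// Before: fine for a human, ambiguous for an agent.
const before = "Returns the current user's open tasks";

// After: edge cases and return shape spelled out explicitly.
const after = [
  "Returns up to `limit` (default 20, max 50) of the current user's tasks",
  "with status 'open', newest first. Completed and archived tasks are",
  "never included. Returns a JSON array of task objects; an empty array",
  "means no open tasks, not an error. Does not paginate beyond `limit`.",
].join(' ');
```

The second version answers, in the description itself, every question the agent would otherwise have to guess at: scope, ordering, exclusions, and what an empty result means.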
We have a Next.js 16 dashboard for an internal ops team. One recurring task: when a customer complaint comes in via email, an ops person has to look up the customer's recent orders in MongoDB, check the shipment status from a third-party Web API, and draft a reply. Average time per complaint: about 7 minutes, mostly tab-switching.
We exposed three MCP tools from the existing Next.js app — find_customer, get_recent_orders, check_shipment_status. No new services, just route handlers wrapping the same Mongoose queries and fetch calls the dashboard already used. The ops team uses Claude Desktop to triage. They paste the complaint email, Claude calls the three tools in sequence, summarizes what happened, and drafts a reply. Average time per complaint now: about 90 seconds.
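One of those tools, sketched under assumptions: the carrier URL, the response field names, and the injectable `fetchImpl` parameter are all illustrative, not the real third-party API from our stack. The shape is what matters — the tool handler is just the same `fetch` call the dashboard already made.

```typescript
// Hypothetical sketch of the logic behind a check_shipment_status tool.
type ShipmentStatus = { trackingId: string; status: string; lastUpdate: string };

async function checkShipmentStatus(
  trackingId: string,
  fetchImpl: typeof fetch = fetch, // injectable so the handler is testable
): Promise<ShipmentStatus> {
  const res = await fetchImpl(
    `https://carrier.example.com/v1/shipments/${encodeURIComponent(trackingId)}`,
  );
  if (!res.ok) throw new Error(`Carrier API returned ${res.status}`);
  const body = await res.json();
  // Normalize the third-party shape into what the tool returns.
  return { trackingId, status: body.status, lastUpdate: body.last_update };
}
```

Registering it with `server.tool` is then a Zod schema for `trackingId` plus a thin wrapper that serializes the result, exactly like the task tools earlier.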
The point isn't that AI is magic. The point is that the agent did exactly what the ops person used to do, using exactly the same backend code, because we exposed it as MCP instead of forcing the human to click through a UI. That's the leverage.
A few hard-won lessons from the last six months:
Start read-only. Ship the list_* and get_* tools first. Watch what agents actually call. The shape of the tools you really need usually doesn't match what you predicted. Adding a write tool prematurely means you'll have to either deprecate it or live with it.
Log every tool call, along with the agent's reasoning when available. As of 2026, the MCP spec includes optional reasoning metadata; capture it. When something goes wrong — and it will — you'll want to know why the agent called delete_project instead of archive_project.
Rate-limit by user, not by IP. One user with one agent open can generate hundreds of tool calls in a session. IP-based limits will either let abuse through or block legitimate use. Use the user ID from the access token.
Don't expose your raw database schema as tools. Tempting, especially with SQL Server or MongoDB sitting right there. Resist. Tools should be domain operations ("create invoice," "refund order"), not CRUD primitives. The agent will compose the wrong sequence of CRUD calls more often than you'd think.
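The logging and per-user rate-limit lessons compose naturally into one wrapper around every tool handler. This is a minimal in-memory sketch under assumptions — the window size, the call budget, and the `Map`-based counter are illustrative; in production you'd back the counter with the same Redis the sessions already use:

```typescript
// Sketch: wrap a tool handler with structured logging and a
// per-user fixed-window rate limit. Illustrative defaults below.
type ToolHandler<A> = (args: A, userId: string) => Promise<unknown>;

const WINDOW_MS = 60_000;          // assumed: 1-minute window
const MAX_CALLS_PER_WINDOW = 30;   // assumed: 30 calls per user per window
const counters = new Map<string, { count: number; windowStart: number }>();

function withGuards<A>(toolName: string, handler: ToolHandler<A>): ToolHandler<A> {
  return async (args, userId) => {
    const now = Date.now();
    const entry = counters.get(userId);
    if (!entry || now - entry.windowStart > WINDOW_MS) {
      counters.set(userId, { count: 1, windowStart: now });
    } else if (++entry.count > MAX_CALLS_PER_WINDOW) {
      throw new Error(`Rate limit exceeded for ${userId} on ${toolName}`);
    }
    // Structured log line; swap for your real logger.
    console.log(JSON.stringify({ toolName, userId, args, at: now }));
    return handler(args, userId);
  };
}
```

Note the key is the user ID from the access token, not the IP, so one agent hammering a session gets throttled without blocking the rest of that user's traffic from the same egress address.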
If you have a Next.js app in production and you've been waiting for the right moment to add MCP, that moment was about two months ago. Pick the most-clicked workflow in your app — the thing your users do every day that involves three or four steps in your UI — and expose it as one or two tools. Use the Vercel adapter. Plug in Redis for session storage. Use your existing auth library's MCP integration. You'll have a working remote MCP server in an afternoon.
The teams winning with agentic AI in 2026 aren't the ones with the fanciest models. They're the ones who took the boring step of making their existing systems addressable by agents. MCP is the standard that makes that boring step a one-day project instead of a one-quarter project. Next.js is currently the easiest way I know to host one. Go ship.