The happy-path demo always looks simple:
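Something like this, in the same illustrative pseudocode used later in this post (createMCPClient, runAgent, and friends are stand-ins, not exact SDK calls):

```ts
// illustrative pseudocode: connect, discover everything, hand it all to the model
const mcpClient = await createMCPClient({
  transport: { type: "http", url: process.env.MCP_SERVER_URL },
});

const result = await runAgent({
  prompt: userPrompt,
  tools: await mcpClient.tools(), // whatever the server advertises today
});
```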
That is a fine prototype. It is not yet a production integration.
What changes in production is not the number of lines of code. It is the number of decisions you now own:
The Vercel AI SDK team has made this category easier to adopt. The official AI SDK 6 release introduced stable MCP support in @ai-sdk/mcp, and the current docs describe MCP support for tools, resources, and prompts. Those are meaningful improvements because teams no longer have to invent a thin adapter layer from scratch. See the official AI SDK 6 announcement and the current AI SDK MCP documentation.
That progress is real. It does not remove the need for integration discipline.
For most teams, a clean AI SDK + MCP setup looks like this:
```
user request
  -> agent runtime in your app
  -> workflow-specific tool policy
  -> MCP client
  -> one or more MCP servers
  -> external systems or governed context
  -> result assembly
  -> logs, approvals, and traces
```
The line many teams skip is this one:
workflow-specific tool policy
Do not hand the full discovered tool catalog to every run and hope the model behaves. Discovery is a protocol feature. Exposure is an application decision.
If your AI SDK can enumerate twenty tools from an MCP server, that does not mean the current workflow should see twenty tools.
Treat discovery as the superset and your runtime allowlist as the actual execution boundary.
Good question:
"Which three tools does this customer support agent need for this request?"
Bad question:
"Why not let the model see everything and figure it out?"
The current AI SDK documentation recommends HTTP transport for production deployments and notes that stdio is meant for local servers, not for production. That sounds like a protocol detail, but it changes how you handle firewalls, retries, proxying, and operational ownership. See the official AI SDK MCP docs.
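In practice that often reduces to a small environment switch. A sketch, reusing the illustrative createMCPClient name from later in this post; the exact transport field names depend on your SDK version:

```ts
// Illustrative: HTTP-based transport for deployed environments,
// stdio only as a local development convenience.
const transport =
  process.env.NODE_ENV === "production"
    ? { type: "http" as const, url: process.env.MCP_SERVER_URL! }
    : { type: "stdio" as const, command: "node", args: ["./local-mcp-server.js"] };

const mcpClient = await createMCPClient({ transport });
```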
The boring rule is still the right one:
This is where "integration" turns into product design.
The model decides whether to call a tool based partly on the prompt and partly on the tool contract you expose. Vague descriptions produce vague tool use. If two tools overlap, the model burns tokens deciding between confusing options instead of solving the task.
Write tool descriptions like a cautious operator, not like a marketing site:
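For instance, a contract in this spirit, where the tool name, fields, and definition shape are hypothetical and the exact structure depends on your SDK version:

```ts
import { z } from "zod";

// Operator-style contract: narrow scope, explicit limits, no overlap with other tools.
const ticketRead = {
  description:
    "Read a single support ticket by ID. Read-only. " +
    "Returns subject, status, and the last three messages. " +
    "Not for searching; use faq_search for knowledge lookups.",
  inputSchema: z.object({
    ticketId: z.string().describe("Exact ticket ID from the current conversation"),
  }),
};
```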
The model should not be free to:
Budgets need to exist at runtime, not as polite hopes inside a system prompt.
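One hedged sketch of an enforced budget, assuming a hypothetical wrapper around each tool's execute function:

```ts
// Hypothetical guard: count tool calls per run and fail closed when the budget is gone.
function withCallBudget<TInput, TOutput>(
  execute: (input: TInput) => Promise<TOutput>,
  budget: { maxCalls: number },
  counter = { used: 0 }
) {
  return async (input: TInput): Promise<TOutput> => {
    if (counter.used >= budget.maxCalls) {
      throw new Error(`Tool call budget of ${budget.maxCalls} exhausted for this run`);
    }
    counter.used += 1;
    return execute(input);
  };
}
```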
If a tool call produces a bad answer or a bad action, you need more than "the model chose badly."
You need:
Without that, you do not have a production integration. You have a clever black box.
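A minimal record shape for those answers, assuming a hypothetical logToolEvent hook like the one used in the larger sketch below:

```ts
// Hypothetical event shape: enough to answer "which tool, with what input,
// on whose behalf, how long it took, and what came back".
interface ToolEvent {
  runId: string;
  workflow: string;
  toolName: string;
  input: unknown;
  outputSummary: string; // truncated or redacted, never the raw payload
  durationMs: number;
  outcome: "ok" | "error" | "denied";
}

function logToolEvent(event: ToolEvent): void {
  // Ship to whatever structured logging or tracing pipeline you already run.
  console.log(JSON.stringify({ type: "tool_event", ...event }));
}
```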
The AI SDK docs describe two useful ways to work with MCP tools: broad schema discovery and explicit schema definition. In practice, the choice maps to a production tradeoff:
| Choice | Why teams like it | Where it breaks |
|---|---|---|
| Load all discovered tools | Fast to prototype, stays in sync with server changes | Too broad for production, weaker type control, easier to overexpose capabilities |
| Explicitly define schemas and tool set | Better type safety, better control, easier review | Slightly more setup work, requires disciplined maintenance |
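A sketch of the explicit side of that tradeoff. It assumes a schemas option roughly like the one the AI SDK docs describe for MCP tools; option and field names vary by version, so treat it as illustrative:

```ts
import { z } from "zod";

// Illustrative: pin the tools and input shapes you expect,
// instead of trusting whatever the server advertises today.
const tools = await mcpClient.tools({
  schemas: {
    ticket_read: {
      inputSchema: z.object({ ticketId: z.string() }),
    },
    faq_search: {
      inputSchema: z.object({ query: z.string(), limit: z.number().optional() }),
    },
  },
});
```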
The right rule for most teams is simple:
The exact API surface depends on the SDK and provider you choose, so remember the pattern more than the syntax:
```ts
// illustrative pseudocode, not a copy-paste snippet
const mcpClient = await createMCPClient({
  transport: {
    type: "http",
    url: process.env.MCP_SERVER_URL,
    // scope the credential to this user and workflow, not a global service token
    headers: { Authorization: `Bearer ${getScopedToken(user, workflow)}` },
  },
});

// discovery: the superset of what the server offers
const discoveredTools = await mcpClient.tools();

// exposure: the runtime allowlist is the actual execution boundary
const allowedTools = narrowTools(discoveredTools, {
  workflow: "customer_support",
  permissions: ["ticket.read", "faq.search"],
  maxTools: 3,
});

const result = await runAgent({
  prompt: userPrompt,
  tools: allowedTools,
  // budgets and logging enforced at runtime, not requested in the prompt
  maxToolCalls: 3,
  timeoutMs: 12000,
  onToolCall: logToolEvent,
});

return reviewAndAssemble(result);
```
Three details matter here. The credential on the transport is scoped to the user and workflow, not borrowed from a global service account. Discovery feeds a deliberately narrow allowlist, so the model never sees the full catalog. And the budgets, timeout, and tool-call logging live in the runtime call, where they can actually be enforced.
The runtime exposes too many tools, so the model either makes poor choices or burns tokens choosing among near-duplicates.
Fix: build workflow-specific tool bundles.
The agent looks "helpful" in staging because it can update things. Then nobody remembers that the same tool bundle shipped to a broader environment.
Fix: separate read and write tools cleanly, and require approvals for destructive paths.
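One hedged shape for that separation, assuming a hypothetical requireApproval hook into your own review queue:

```ts
// Hypothetical approval gate for write/destructive tools.
declare function requireApproval(req: { tool: string; input: unknown }): Promise<boolean>;

const WRITE_TOOLS = new Set(["ticket_update", "refund_issue"]);

function withApproval(
  name: string,
  execute: (input: unknown) => Promise<unknown>
) {
  return async (input: unknown) => {
    if (WRITE_TOOLS.has(name)) {
      const approved = await requireApproval({ tool: name, input });
      if (!approved) {
        return { status: "denied", reason: "awaiting human approval" };
      }
    }
    return execute(input);
  };
}
```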
An MCP tool returns huge blobs because nobody thought about response shape. The model receives too much low-signal data and answer quality gets worse.
Fix: prefer compact, structured payloads over raw dumps.
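A small sketch of that clamp, applied before tool results reach the model's context:

```ts
// Illustrative: clamp raw MCP results to a compact shape before the model sees them.
function compactResult(raw: unknown, maxChars = 4000): string {
  const text = typeof raw === "string" ? raw : JSON.stringify(raw);
  if (text.length <= maxChars) return text;
  return text.slice(0, maxChars) + `\n[truncated ${text.length - maxChars} characters]`;
}
```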
The SDK retries, the transport retries, the app retries, and suddenly one flaky downstream system receives three times the traffic.
Fix: make one layer responsible for retries and keep the limits visible.
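A sketch of what a single retry owner can look like at the application layer, with SDK-level and transport-level retries turned down to zero so the budget stays visible in one place:

```ts
// Illustrative: the application layer is the single retry owner, with the budget visible.
async function callWithRetry<T>(
  fn: () => Promise<T>,
  attempts = 2,
  backoffMs = 500
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt <= attempts; attempt++) {
    try {
      return await fn();
    } catch (err) {
      lastError = err;
      if (attempt < attempts) {
        await new Promise((resolve) => setTimeout(resolve, backoffMs * (attempt + 1)));
      }
    }
  }
  throw lastError;
}
```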
If your agent system needs both governed context and tool calling, puppyone sits in a useful middle position.
The pattern looks like this:
That matters most in production agent systems where tool calling and context delivery are part of the same workflow. A support agent may need a governed policy lookup, a ticket reader, and a human approval step. A finance agent may need a narrower version of the same infrastructure. The integration becomes easier to reason about when the context layer is already scoped and versioned.
If you are doing AI SDK + MCP integration now, ship it in this order:
Then add:
The teams that get into trouble usually invert that order. They start with broad capability and hope policy can be added later.
Does MCP replace the tools my app already has?
No. MCP gives your SDK a standard way to discover and use external capabilities. You can still keep app-native tools. Many production systems end up using both.
Should the agent see every tool an MCP server exposes?
No. Discovery should feed a narrower runtime allowlist. Exposing the entire tool catalog to every request usually hurts reliability more than it helps flexibility.
What should we monitor first?
Start with tool selection quality and latency. If the model keeps choosing the wrong tool or waits too long on tool calls, user-facing quality drops quickly even when the model itself looks fine.