<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:content="http://purl.org/rss/1.0/modules/content/"><channel><title>Structured Output on RockB</title><link>https://baeseokjae.github.io/tags/structured-output/</link><description>Recent content in Structured Output on RockB</description><image><title>RockB</title><url>https://baeseokjae.github.io/images/og-default.png</url><link>https://baeseokjae.github.io/images/og-default.png</link></image><generator>Hugo</generator><language>en-us</language><lastBuildDate>Tue, 21 Apr 2026 00:00:00 +0000</lastBuildDate><atom:link href="https://baeseokjae.github.io/tags/structured-output/index.xml" rel="self" type="application/rss+xml"/><item><title>Vercel AI SDK Guide 2026: Build Production AI Apps with TypeScript</title><link>https://baeseokjae.github.io/posts/vercel-ai-sdk-guide-2026/</link><pubDate>Tue, 21 Apr 2026 00:00:00 +0000</pubDate><guid>https://baeseokjae.github.io/posts/vercel-ai-sdk-guide-2026/</guid><description>A complete guide to building production AI applications with the Vercel AI SDK in 2026. Covers text generation, streaming, structured output with Zod, chat UIs, tool calling, multi-step agents, Workflows, and deployment.</description><content:encoded><![CDATA[<p>The Vercel AI SDK has become the default way to build AI-powered applications in TypeScript. With 11.5 million weekly npm downloads, support for 100+ models across 16 providers, and a growing ecosystem of Workflows and Sandbox tooling, it handles the plumbing so you can focus on your product logic.</p>
<p>This guide covers everything you need to go from zero to a deployed, production-grade AI application. No hype — just the APIs, patterns, and tradeoffs.</p>
<h2 id="what-is-the-vercel-ai-sdk-why-it-matters-in-2026">What is the Vercel AI SDK? Why It Matters in 2026</h2>
<p>The AI SDK is a TypeScript toolkit for building AI applications with streaming, structured output, tool calling, and multi-step agent patterns. It is not a model provider — it is a unified interface that sits between your application code and any LLM provider.</p>
<h3 id="115m-weekly-downloads-and-growing">11.5M Weekly Downloads and Growing</h3>
<p>The npm download numbers tell the story. The AI SDK crossed 11.5 million weekly downloads in April 2026, up from roughly 4 million a year earlier. The SDK has 23.7K GitHub stars and 614+ contributors. This is not a niche library. It is the dominant TypeScript AI framework.</p>
<h3 id="the-unified-typescript-ai-layer-core--ui--rsc">The Unified TypeScript AI Layer: Core + UI + RSC</h3>
<p>The SDK is organized into three layers:</p>
<table>
  <thead>
      <tr>
          <th>Layer</th>
          <th>Package</th>
          <th>Purpose</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td><strong>AI SDK Core</strong></td>
          <td><code>ai</code></td>
          <td>Server-side generation: <code>generateText</code>, <code>streamText</code>, <code>generateObject</code>, <code>streamObject</code></td>
      </tr>
      <tr>
          <td><strong>AI SDK UI</strong></td>
          <td><code>ai/react</code> (and framework equivalents)</td>
          <td>Client-side hooks: <code>useChat</code>, <code>useCompletion</code>, <code>useObject</code></td>
      </tr>
      <tr>
          <td><strong>AI SDK RSC</strong></td>
          <td><code>ai/rsc</code></td>
          <td>React Server Components integration for streaming UI from the server</td>
      </tr>
  </tbody>
</table>
<p>This separation matters. Core runs anywhere — Node, Edge, Bun, Deno. UI hooks render in the browser. RSC bridges the two in Next.js App Router. You pick the layer you need and ignore the rest.</p>
<h3 id="how-it-fits-in-the-vercel-ecosystem">How It Fits in the Vercel Ecosystem</h3>
<p>The SDK is the foundation, but the ecosystem extends further in 2026:</p>
<ul>
<li><strong>AI Gateway</strong>: One API key to access 100+ models across 16+ providers with built-in load balancing, fallbacks, and rate limiting.</li>
<li><strong>Vercel Sandbox</strong>: Secure execution environment for agent-generated code. Useful when your AI agent writes and runs code on behalf of users.</li>
<li><strong>Vercel Workflows</strong>: Durable, long-running agent execution that can suspend, resume, and survive function timeouts. New in 2026.</li>
<li><strong>AI Elements</strong>: A component library for AI-native UIs (message threads, artifact views, tool-result renderers). New in 2026.</li>
</ul>
<p>You do not need any of these to use AI SDK Core. They are opt-in for specific use cases.</p>
<h2 id="getting-started-installing-and-configuring-ai-sdk">Getting Started: Installing and Configuring AI SDK</h2>
<h3 id="npm-install-ai--provider-packages">npm install ai + Provider Packages</h3>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-bash" data-lang="bash"><span style="display:flex;"><span>npm install ai @ai-sdk/openai @ai-sdk/anthropic @ai-sdk/google
</span></span></code></pre></div><p>The <code>ai</code> package contains the core functions. Provider packages (<code>@ai-sdk/openai</code>, <code>@ai-sdk/anthropic</code>, <code>@ai-sdk/google</code>) contain model definitions and provider-specific configuration. You install only the providers you use.</p>
<h3 id="setting-up-api-keys">Setting Up API Keys</h3>
<p>Store provider API keys as environment variables:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-bash" data-lang="bash"><span style="display:flex;"><span><span style="color:#75715e"># .env.local</span>
</span></span><span style="display:flex;"><span>OPENAI_API_KEY<span style="color:#f92672">=</span>sk-...
</span></span><span style="display:flex;"><span>ANTHROPIC_API_KEY<span style="color:#f92672">=</span>sk-ant-...
</span></span><span style="display:flex;"><span>GOOGLE_GENERATIVE_AI_API_KEY<span style="color:#f92672">=</span>AIza...
</span></span></code></pre></div><p>Never commit these. In production, use your hosting platform&rsquo;s secret management.</p>
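<p>Because a missing key typically surfaces only as a confusing 401 at request time, it can help to validate environment variables once at startup. A minimal sketch (the <code>requireEnv</code> helper is hypothetical, not part of the AI SDK):</p>

```typescript
// Sketch: fail fast at startup when a required API key is missing.
// requireEnv is an illustrative helper, not an AI SDK export.
function requireEnv(name: string): string {
  const value = process.env[name];
  if (!value) {
    throw new Error(`Missing required environment variable: ${name}`);
  }
  return value;
}

// Usage (commented out so the module loads without the key set):
// const openaiKey = requireEnv("OPENAI_API_KEY");
```

Calling this once in a shared module surfaces configuration problems at boot instead of on the first user request.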
<h3 id="project-structure-for-a-nextjs-ai-app">Project Structure for a Next.js AI App</h3>
<p>A minimal structure:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-text" data-lang="text">src/
├── app/
│   ├── api/
│   │   └── chat/
│   │       └── route.ts   # Server-side AI route
│   ├── page.tsx           # Chat UI
│   └── layout.tsx
├── lib/
│   └── ai.ts              # Provider setup, shared config
└── .env.local
</code></pre></div>
<p>The <code>lib/ai.ts</code> file centralizes provider instantiation:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-typescript" data-lang="typescript"><span style="display:flex;"><span><span style="color:#75715e">// lib/ai.ts
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span><span style="color:#66d9ef">import</span> { <span style="color:#a6e22e">createOpenAI</span> } <span style="color:#66d9ef">from</span> <span style="color:#e6db74">&#34;@ai-sdk/openai&#34;</span>;
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">import</span> { <span style="color:#a6e22e">createAnthropic</span> } <span style="color:#66d9ef">from</span> <span style="color:#e6db74">&#34;@ai-sdk/anthropic&#34;</span>;
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">export</span> <span style="color:#66d9ef">const</span> <span style="color:#a6e22e">openai</span> <span style="color:#f92672">=</span> <span style="color:#a6e22e">createOpenAI</span>({
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">apiKey</span>: <span style="color:#66d9ef">process.env.OPENAI_API_KEY</span>,
</span></span><span style="display:flex;"><span>});
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">export</span> <span style="color:#66d9ef">const</span> <span style="color:#a6e22e">anthropic</span> <span style="color:#f92672">=</span> <span style="color:#a6e22e">createAnthropic</span>({
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">apiKey</span>: <span style="color:#66d9ef">process.env.ANTHROPIC_API_KEY</span>,
</span></span><span style="display:flex;"><span>});
</span></span></code></pre></div><h3 id="ai-gateway-one-key-for-100-models">AI Gateway: One Key for 100+ Models</h3>
<p>If you use Vercel AI Gateway, you can replace individual provider keys with a single gateway key:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-typescript" data-lang="typescript"><span style="display:flex;"><span><span style="color:#66d9ef">import</span> { <span style="color:#a6e22e">createGateway</span> } <span style="color:#66d9ef">from</span> <span style="color:#e6db74">&#34;@ai-sdk/gateway&#34;</span>;
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">const</span> <span style="color:#a6e22e">gateway</span> <span style="color:#f92672">=</span> <span style="color:#a6e22e">createGateway</span>({
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">apiKey</span>: <span style="color:#66d9ef">process.env.AI_GATEWAY_KEY</span>,
</span></span><span style="display:flex;"><span>});
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#75715e">// Access any model through the gateway
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span><span style="color:#66d9ef">const</span> <span style="color:#a6e22e">model</span> <span style="color:#f92672">=</span> <span style="color:#a6e22e">gateway</span>(<span style="color:#e6db74">&#34;gpt-4o&#34;</span>);
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">const</span> <span style="color:#a6e22e">model2</span> <span style="color:#f92672">=</span> <span style="color:#a6e22e">gateway</span>(<span style="color:#e6db74">&#34;claude-sonnet-4-20250514&#34;</span>);
</span></span></code></pre></div><p>The Gateway handles routing, rate limiting, and fallbacks across providers. This is useful when you want model flexibility without managing multiple API keys.</p>
<h2 id="ai-sdk-core-text-generation-and-streaming">AI SDK Core: Text Generation and Streaming</h2>
<h3 id="generatetext-for-one-shot-generation">generateText() for One-Shot Generation</h3>
<p><code>generateText</code> sends a prompt and waits for the complete response. Use it when you need the full result before proceeding.</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-typescript" data-lang="typescript"><span style="display:flex;"><span><span style="color:#66d9ef">import</span> { <span style="color:#a6e22e">generateText</span> } <span style="color:#66d9ef">from</span> <span style="color:#e6db74">&#34;ai&#34;</span>;
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">import</span> { <span style="color:#a6e22e">openai</span> } <span style="color:#66d9ef">from</span> <span style="color:#e6db74">&#34;@/lib/ai&#34;</span>;
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">async</span> <span style="color:#66d9ef">function</span> <span style="color:#a6e22e">summarize</span>(<span style="color:#a6e22e">text</span>: <span style="color:#66d9ef">string</span>) {
</span></span><span style="display:flex;"><span>  <span style="color:#66d9ef">const</span> { <span style="color:#a6e22e">text</span>: <span style="color:#66d9ef">summary</span> } <span style="color:#f92672">=</span> <span style="color:#66d9ef">await</span> <span style="color:#a6e22e">generateText</span>({
</span></span><span style="display:flex;"><span>    <span style="color:#a6e22e">model</span>: <span style="color:#66d9ef">openai</span>(<span style="color:#e6db74">&#34;gpt-4o&#34;</span>),
</span></span><span style="display:flex;"><span>    <span style="color:#a6e22e">prompt</span><span style="color:#f92672">:</span> <span style="color:#e6db74">`Summarize the following text in 2-3 sentences:</span><span style="color:#960050;background-color:#1e0010">\</span><span style="color:#e6db74">n</span><span style="color:#960050;background-color:#1e0010">\</span><span style="color:#e6db74">n</span><span style="color:#e6db74">${</span><span style="color:#a6e22e">text</span><span style="color:#e6db74">}</span><span style="color:#e6db74">`</span>,
</span></span><span style="display:flex;"><span>  });
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>  <span style="color:#66d9ef">return</span> <span style="color:#a6e22e">summary</span>;
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div><p>The return object includes <code>text</code>, <code>usage</code> (token counts), <code>finishReason</code>, and <code>response</code> (raw provider response).</p>
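<p>The <code>usage</code> and <code>finishReason</code> fields are what you would typically log for cost tracking. A hedged sketch of formatting them (the field names <code>promptTokens</code>, <code>completionTokens</code>, and <code>totalTokens</code> are assumed here; verify the exact shape against the SDK version you have installed):</p>

```typescript
// Sketch: summarize token usage from a generateText-style result.
// The UsageInfo shape is an assumption for illustration; check the
// installed SDK's result type for the exact field names.
interface UsageInfo {
  promptTokens: number;
  completionTokens: number;
  totalTokens: number;
}

function describeUsage(usage: UsageInfo, finishReason: string): string {
  return `${usage.totalTokens} tokens (${usage.promptTokens} in / ` +
    `${usage.completionTokens} out), finished: ${finishReason}`;
}
```

Logging this per request makes it straightforward to attribute spend to features or users later.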
<h3 id="streamtext-for-real-time-streaming">streamText() for Real-Time Streaming</h3>
<p><code>streamText</code> returns a stream that yields tokens as they arrive. This is the foundation for chat interfaces.</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-typescript" data-lang="typescript"><span style="display:flex;"><span><span style="color:#66d9ef">import</span> { <span style="color:#a6e22e">streamText</span> } <span style="color:#66d9ef">from</span> <span style="color:#e6db74">&#34;ai&#34;</span>;
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">import</span> { <span style="color:#a6e22e">openai</span> } <span style="color:#66d9ef">from</span> <span style="color:#e6db74">&#34;@/lib/ai&#34;</span>;
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">async</span> <span style="color:#66d9ef">function</span> <span style="color:#a6e22e">streamResponse</span>(<span style="color:#a6e22e">prompt</span>: <span style="color:#66d9ef">string</span>) {
</span></span><span style="display:flex;"><span>  <span style="color:#66d9ef">const</span> <span style="color:#a6e22e">result</span> <span style="color:#f92672">=</span> <span style="color:#a6e22e">streamText</span>({
</span></span><span style="display:flex;"><span>    <span style="color:#a6e22e">model</span>: <span style="color:#66d9ef">openai</span>(<span style="color:#e6db74">&#34;gpt-4o&#34;</span>),
</span></span><span style="display:flex;"><span>    <span style="color:#a6e22e">prompt</span>,
</span></span><span style="display:flex;"><span>  });
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>  <span style="color:#75715e">// Consume as a text stream
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span>  <span style="color:#66d9ef">for</span> <span style="color:#66d9ef">await</span> (<span style="color:#66d9ef">const</span> <span style="color:#a6e22e">chunk</span> <span style="color:#66d9ef">of</span> <span style="color:#a6e22e">result</span>.<span style="color:#a6e22e">textStream</span>) {
</span></span><span style="display:flex;"><span>    <span style="color:#a6e22e">process</span>.<span style="color:#a6e22e">stdout</span>.<span style="color:#a6e22e">write</span>(<span style="color:#a6e22e">chunk</span>);
</span></span><span style="display:flex;"><span>  }
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div><p>In a Next.js route handler, you return the stream directly:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-typescript" data-lang="typescript"><span style="display:flex;"><span><span style="color:#75715e">// app/api/chat/route.ts
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span><span style="color:#66d9ef">import</span> { <span style="color:#a6e22e">streamText</span> } <span style="color:#66d9ef">from</span> <span style="color:#e6db74">&#34;ai&#34;</span>;
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">import</span> { <span style="color:#a6e22e">openai</span> } <span style="color:#66d9ef">from</span> <span style="color:#e6db74">&#34;@/lib/ai&#34;</span>;
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">export</span> <span style="color:#66d9ef">async</span> <span style="color:#66d9ef">function</span> <span style="color:#a6e22e">POST</span>(<span style="color:#a6e22e">req</span>: <span style="color:#66d9ef">Request</span>) {
</span></span><span style="display:flex;"><span>  <span style="color:#66d9ef">const</span> { <span style="color:#a6e22e">messages</span> } <span style="color:#f92672">=</span> <span style="color:#66d9ef">await</span> <span style="color:#a6e22e">req</span>.<span style="color:#a6e22e">json</span>();
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>  <span style="color:#66d9ef">const</span> <span style="color:#a6e22e">result</span> <span style="color:#f92672">=</span> <span style="color:#a6e22e">streamText</span>({
</span></span><span style="display:flex;"><span>    <span style="color:#a6e22e">model</span>: <span style="color:#66d9ef">openai</span>(<span style="color:#e6db74">&#34;gpt-4o&#34;</span>),
</span></span><span style="display:flex;"><span>    <span style="color:#a6e22e">messages</span>,
</span></span><span style="display:flex;"><span>  });
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>  <span style="color:#66d9ef">return</span> <span style="color:#a6e22e">result</span>.<span style="color:#a6e22e">toDataStreamResponse</span>();
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div><p><code>toDataStreamResponse()</code> converts the stream into the AI SDK data stream protocol — the format that client-side hooks expect.</p>
<h3 id="provider-switching-with-one-line-of-code">Provider Switching with One Line of Code</h3>
<p>Because models are just configuration objects, switching providers requires changing one argument:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-typescript" data-lang="typescript"><span style="display:flex;"><span><span style="color:#75715e">// OpenAI
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span><span style="color:#66d9ef">const</span> <span style="color:#a6e22e">result</span> <span style="color:#f92672">=</span> <span style="color:#66d9ef">await</span> <span style="color:#a6e22e">generateText</span>({
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">model</span>: <span style="color:#66d9ef">openai</span>(<span style="color:#e6db74">&#34;gpt-4o&#34;</span>),
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">prompt</span><span style="color:#f92672">:</span> <span style="color:#e6db74">&#34;Explain quantum computing&#34;</span>,
</span></span><span style="display:flex;"><span>});
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#75715e">// Switch to Anthropic — only the model line changes
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span><span style="color:#66d9ef">const</span> <span style="color:#a6e22e">result</span> <span style="color:#f92672">=</span> <span style="color:#66d9ef">await</span> <span style="color:#a6e22e">generateText</span>({
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">model</span>: <span style="color:#66d9ef">anthropic</span>(<span style="color:#e6db74">&#34;claude-sonnet-4-20250514&#34;</span>),
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">prompt</span><span style="color:#f92672">:</span> <span style="color:#e6db74">&#34;Explain quantum computing&#34;</span>,
</span></span><span style="display:flex;"><span>});
</span></span></code></pre></div><p>No other code changes. The prompt format, streaming mechanism, and response handling remain identical.</p>
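<p>Because the call site is identical either way, the model choice can be driven by configuration. A minimal sketch of that routing, with the provider factories stubbed out (the names <code>resolveModel</code> and <code>providers</code> are illustrative, not SDK APIs):</p>

```typescript
// Sketch: map a config string like "anthropic/claude-sonnet-4-20250514" to a
// model. Real factories (openai, anthropic from your provider setup) return
// model objects; they are stubbed as strings here so the routing stands alone.
type ModelFactory = (id: string) => string;

const providers: { [name: string]: { factory: ModelFactory; defaultId: string } } = {
  openai: { factory: (id) => `openai:${id}`, defaultId: "gpt-4o" },
  anthropic: { factory: (id) => `anthropic:${id}`, defaultId: "claude-sonnet-4-20250514" },
};

function resolveModel(spec: string): string {
  const [name, id] = spec.split("/"); // "anthropic" or "anthropic/some-model-id"
  const entry = providers[name];
  if (!entry) throw new Error(`Unknown provider: ${name}`);
  return entry.factory(id ?? entry.defaultId);
}
```

<p>In a real app the stubbed factories would be the imports from <code>@/lib/ai</code>, and <code>spec</code> might come from an environment variable.</p>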
<h3 id="built-in-fallbacks-and-retry-logic">Fallbacks and Retry Logic</h3>
<p>Retries are built in: failed calls are retried automatically (twice by default, configurable via <code>maxRetries</code>). Cross-provider fallback is not a single option on <code>generateText</code>; when a provider fails or rate-limits, a straightforward pattern is to catch the error and retry with an alternative model:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-typescript" data-lang="typescript"><span style="display:flex;"><span><span style="color:#66d9ef">import</span> { <span style="color:#a6e22e">generateText</span> } <span style="color:#66d9ef">from</span> <span style="color:#e6db74">&#34;ai&#34;</span>;
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">import</span> { <span style="color:#a6e22e">openai</span> } <span style="color:#66d9ef">from</span> <span style="color:#e6db74">&#34;@/lib/ai&#34;</span>;
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">import</span> { <span style="color:#a6e22e">anthropic</span> } <span style="color:#66d9ef">from</span> <span style="color:#e6db74">&#34;@/lib/ai&#34;</span>;
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">async</span> <span style="color:#66d9ef">function</span> <span style="color:#a6e22e">generateWithFallback</span>(<span style="color:#a6e22e">prompt</span>: <span style="color:#66d9ef">string</span>) {
</span></span><span style="display:flex;"><span>  <span style="color:#66d9ef">try</span> {
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">return</span> <span style="color:#66d9ef">await</span> <span style="color:#a6e22e">generateText</span>({ <span style="color:#a6e22e">model</span>: <span style="color:#66d9ef">openai</span>(<span style="color:#e6db74">&#34;gpt-4o&#34;</span>), <span style="color:#a6e22e">prompt</span> });
</span></span><span style="display:flex;"><span>  } <span style="color:#66d9ef">catch</span> {
</span></span><span style="display:flex;"><span>    <span style="color:#75715e">// Primary model failed or was rate-limited: retry with the fallback
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span>    <span style="color:#66d9ef">return</span> <span style="color:#66d9ef">await</span> <span style="color:#a6e22e">generateText</span>({ <span style="color:#a6e22e">model</span>: <span style="color:#66d9ef">anthropic</span>(<span style="color:#e6db74">&#34;claude-sonnet-4-20250514&#34;</span>), <span style="color:#a6e22e">prompt</span> });
</span></span><span style="display:flex;"><span>  }
</span></span><span style="display:flex;"><span>}
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">const</span> { <span style="color:#a6e22e">text</span> } <span style="color:#f92672">=</span> <span style="color:#66d9ef">await</span> <span style="color:#a6e22e">generateWithFallback</span>(<span style="color:#e6db74">&#34;Explain the halting problem&#34;</span>);
</span></span></code></pre></div><p>If the primary call throws, the request is retried once with the fallback model. Each call also accepts <code>maxRetries</code> and <code>abortSignal</code> for finer control over retry behavior and cancellation.</p>
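<p>The <code>abortSignal</code> option takes a standard <code>AbortSignal</code>, so a time budget can be expressed with <code>AbortSignal.timeout</code>. A small sketch (the helper name and the 10-second budget are illustrative):</p>

```typescript
// Sketch: a deadline for a generation call. AbortSignal.timeout (Node 17.3+
// and modern browsers) yields a signal that aborts after the given number of
// milliseconds; passing it as abortSignal cancels the in-flight request.
function deadlineSignal(ms: number): AbortSignal {
  return AbortSignal.timeout(ms);
}

// Usage with the SDK would look like:
//   await generateText({ model, prompt, abortSignal: deadlineSignal(10_000) });
```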
<h2 id="structured-output-with-zod-schemas">Structured Output with Zod Schemas</h2>
<p>Raw LLM text is fine for chat. For application logic, you need structured data. The AI SDK integrates with Zod to enforce type-safe schemas on model output.</p>
<h3 id="generateobject-for-type-safe-json-responses">generateObject() for Type-Safe JSON Responses</h3>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-typescript" data-lang="typescript"><span style="display:flex;"><span><span style="color:#66d9ef">import</span> { <span style="color:#a6e22e">generateObject</span> } <span style="color:#66d9ef">from</span> <span style="color:#e6db74">&#34;ai&#34;</span>;
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">import</span> { <span style="color:#a6e22e">openai</span> } <span style="color:#66d9ef">from</span> <span style="color:#e6db74">&#34;@/lib/ai&#34;</span>;
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">import</span> { <span style="color:#a6e22e">z</span> } <span style="color:#66d9ef">from</span> <span style="color:#e6db74">&#34;zod&#34;</span>;
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">const</span> <span style="color:#a6e22e">RecipeSchema</span> <span style="color:#f92672">=</span> <span style="color:#a6e22e">z</span>.<span style="color:#66d9ef">object</span>({
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">name</span>: <span style="color:#66d9ef">z.string</span>(),
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">ingredients</span>: <span style="color:#66d9ef">z.array</span>(<span style="color:#a6e22e">z</span>.<span style="color:#66d9ef">object</span>({
</span></span><span style="display:flex;"><span>    <span style="color:#a6e22e">item</span>: <span style="color:#66d9ef">z.string</span>(),
</span></span><span style="display:flex;"><span>    <span style="color:#a6e22e">amount</span>: <span style="color:#66d9ef">z.string</span>(),
</span></span><span style="display:flex;"><span>  })),
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">steps</span>: <span style="color:#66d9ef">z.array</span>(<span style="color:#a6e22e">z</span>.<span style="color:#66d9ef">string</span>()),
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">cookTimeMinutes</span>: <span style="color:#66d9ef">z.number</span>(),
</span></span><span style="display:flex;"><span>});
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">async</span> <span style="color:#66d9ef">function</span> <span style="color:#a6e22e">getRecipe</span>(<span style="color:#a6e22e">dish</span>: <span style="color:#66d9ef">string</span>) {
</span></span><span style="display:flex;"><span>  <span style="color:#66d9ef">const</span> { <span style="color:#66d9ef">object</span> } <span style="color:#f92672">=</span> <span style="color:#66d9ef">await</span> <span style="color:#a6e22e">generateObject</span>({
</span></span><span style="display:flex;"><span>    <span style="color:#a6e22e">model</span>: <span style="color:#66d9ef">openai</span>(<span style="color:#e6db74">&#34;gpt-4o&#34;</span>),
</span></span><span style="display:flex;"><span>    <span style="color:#a6e22e">schema</span>: <span style="color:#66d9ef">RecipeSchema</span>,
</span></span><span style="display:flex;"><span>    <span style="color:#a6e22e">prompt</span><span style="color:#f92672">:</span> <span style="color:#e6db74">`Generate a recipe for </span><span style="color:#e6db74">${</span><span style="color:#a6e22e">dish</span><span style="color:#e6db74">}</span><span style="color:#e6db74">`</span>,
</span></span><span style="display:flex;"><span>  });
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>  <span style="color:#75715e">// object is fully typed as z.infer&lt;typeof RecipeSchema&gt;
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span>  <span style="color:#66d9ef">return</span> <span style="color:#66d9ef">object</span>;
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div><p>The return value <code>object</code> has the TypeScript type inferred from the Zod schema. If the model produces output that does not conform, the SDK throws a <code>NoObjectGeneratedError</code>.</p>
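<p>When conformance failures are transient, a plain retry wrapper is often enough. A hedged sketch (this wrapper is generic application code, not an SDK feature; in practice you would inspect the thrown error before retrying):</p>

```typescript
// Sketch: retry an async generation call that throws when the model's output
// fails schema validation. Works with any async factory, e.g.
// () => generateObject({ ... }).
async function withRetries<T>(fn: () => Promise<T>, attempts = 3): Promise<T> {
  let lastError: unknown;
  for (let i = 0; i < attempts; i++) {
    try {
      return await fn();
    } catch (err) {
      lastError = err; // nonconforming output: try again
    }
  }
  throw lastError;
}
```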
<h3 id="streamobject-for-streaming-structured-data">streamObject() for Streaming Structured Data</h3>
<p>When generating large structured objects, <code>streamObject</code> returns partial results as they arrive:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-typescript" data-lang="typescript"><span style="display:flex;"><span><span style="color:#66d9ef">import</span> { <span style="color:#a6e22e">streamObject</span> } <span style="color:#66d9ef">from</span> <span style="color:#e6db74">&#34;ai&#34;</span>;
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">import</span> { <span style="color:#a6e22e">openai</span> } <span style="color:#66d9ef">from</span> <span style="color:#e6db74">&#34;@/lib/ai&#34;</span>;
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">import</span> { <span style="color:#a6e22e">z</span> } <span style="color:#66d9ef">from</span> <span style="color:#e6db74">&#34;zod&#34;</span>;
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">const</span> <span style="color:#a6e22e">result</span> <span style="color:#f92672">=</span> <span style="color:#a6e22e">streamObject</span>({
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">model</span>: <span style="color:#66d9ef">openai</span>(<span style="color:#e6db74">&#34;gpt-4o&#34;</span>),
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">schema</span>: <span style="color:#66d9ef">z.object</span>({
</span></span><span style="display:flex;"><span>    <span style="color:#a6e22e">analysis</span>: <span style="color:#66d9ef">z.string</span>(),
</span></span><span style="display:flex;"><span>    <span style="color:#a6e22e">score</span>: <span style="color:#66d9ef">z.number</span>(),
</span></span><span style="display:flex;"><span>    <span style="color:#a6e22e">recommendations</span>: <span style="color:#66d9ef">z.array</span>(<span style="color:#a6e22e">z</span>.<span style="color:#66d9ef">string</span>()),
</span></span><span style="display:flex;"><span>  }),
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">prompt</span><span style="color:#f92672">:</span> <span style="color:#e6db74">&#34;Analyze the following code for security vulnerabilities: ...&#34;</span>,
</span></span><span style="display:flex;"><span>});
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">for</span> <span style="color:#66d9ef">await</span> (<span style="color:#66d9ef">const</span> <span style="color:#a6e22e">partialObject</span> <span style="color:#66d9ef">of</span> <span style="color:#a6e22e">result</span>.<span style="color:#a6e22e">partialObjectStream</span>) {
</span></span><span style="display:flex;"><span>  <span style="color:#75715e">// partialObject contains the fields that have been generated so far
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span>  <span style="color:#a6e22e">console</span>.<span style="color:#a6e22e">log</span>(<span style="color:#a6e22e">partialObject</span>);
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div><p>Use <code>partialObjectStream</code> for real-time UI updates as the model fills in each field.</p>
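<p>To make the shape of those updates concrete, here is a dependency-free simulation: the async generator stands in for <code>partialObjectStream</code>, and each iteration replaces the previous snapshot the way a UI re-render would.</p>

```typescript
// Sketch: consuming progressively more complete snapshots, as
// partialObjectStream yields them. The generator below is a stand-in; a real
// stream comes from streamObject.
type Analysis = { analysis?: string; score?: number; recommendations?: string[] };

async function* fakePartialStream(): AsyncGenerator<Analysis> {
  yield { analysis: "Reviews input handling" };
  yield { analysis: "Reviews input handling", score: 72 };
  yield { analysis: "Reviews input handling", score: 72, recommendations: ["sanitize params"] };
}

async function latestSnapshot(): Promise<Analysis> {
  let latest: Analysis = {};
  for await (const partial of fakePartialStream()) {
    latest = partial; // a UI would render this snapshot
  }
  return latest;
}
```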
<h3 id="zod-schema-definitions-and-validation">Zod Schema Definitions and Validation</h3>
<p>The SDK supports most Zod types: strings, numbers, booleans, arrays, objects, enums, unions, and optional fields. It does not support transforms or refinements, since those have no JSON Schema representation to send to the model; apply them to the validated output afterward.</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-typescript" data-lang="typescript"><span style="display:flex;"><span><span style="color:#75715e">// Supported
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span><span style="color:#66d9ef">const</span> <span style="color:#a6e22e">schema</span> <span style="color:#f92672">=</span> <span style="color:#a6e22e">z</span>.<span style="color:#66d9ef">object</span>({
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">status</span>: <span style="color:#66d9ef">z.enum</span>([<span style="color:#e6db74">&#34;active&#34;</span>, <span style="color:#e6db74">&#34;inactive&#34;</span>, <span style="color:#e6db74">&#34;pending&#34;</span>]),
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">tags</span>: <span style="color:#66d9ef">z.array</span>(<span style="color:#a6e22e">z</span>.<span style="color:#66d9ef">string</span>()),
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">metadata</span>: <span style="color:#66d9ef">z.record</span>(<span style="color:#a6e22e">z</span>.<span style="color:#66d9ef">string</span>(), <span style="color:#a6e22e">z</span>.<span style="color:#66d9ef">any</span>()).<span style="color:#a6e22e">optional</span>(),
</span></span><span style="display:flex;"><span>});
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#75715e">// Not supported in schema (apply after)
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span><span style="color:#66d9ef">const</span> <span style="color:#a6e22e">schema</span> <span style="color:#f92672">=</span> <span style="color:#a6e22e">z</span>.<span style="color:#66d9ef">object</span>({
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">email</span>: <span style="color:#66d9ef">z.string</span>(), <span style="color:#75715e">// validate format afterward (e.g. with a stricter .refine() pass)
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span>  <span style="color:#a6e22e">score</span>: <span style="color:#66d9ef">z.number</span>().<span style="color:#a6e22e">min</span>(<span style="color:#ae81ff">0</span>).<span style="color:#a6e22e">max</span>(<span style="color:#ae81ff">100</span>), <span style="color:#75715e">// .min/.max work but are hints, not hard guards
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span>});
</span></span></code></pre></div><h3 id="practical-example-extracting-structured-data-from-unstructured-text">Practical Example: Extracting Structured Data from Unstructured Text</h3>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-typescript" data-lang="typescript"><span style="display:flex;"><span><span style="color:#66d9ef">import</span> { <span style="color:#a6e22e">generateObject</span> } <span style="color:#66d9ef">from</span> <span style="color:#e6db74">&#34;ai&#34;</span>;
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">import</span> { <span style="color:#a6e22e">openai</span> } <span style="color:#66d9ef">from</span> <span style="color:#e6db74">&#34;@/lib/ai&#34;</span>;
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">import</span> { <span style="color:#a6e22e">z</span> } <span style="color:#66d9ef">from</span> <span style="color:#e6db74">&#34;zod&#34;</span>;
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">const</span> <span style="color:#a6e22e">ContactSchema</span> <span style="color:#f92672">=</span> <span style="color:#a6e22e">z</span>.<span style="color:#66d9ef">object</span>({
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">name</span>: <span style="color:#66d9ef">z.string</span>(),
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">email</span>: <span style="color:#66d9ef">z.string</span>().<span style="color:#a6e22e">optional</span>(),
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">phone</span>: <span style="color:#66d9ef">z.string</span>().<span style="color:#a6e22e">optional</span>(),
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">company</span>: <span style="color:#66d9ef">z.string</span>().<span style="color:#a6e22e">optional</span>(),
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">title</span>: <span style="color:#66d9ef">z.string</span>().<span style="color:#a6e22e">optional</span>(),
</span></span><span style="display:flex;"><span>});
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">async</span> <span style="color:#66d9ef">function</span> <span style="color:#a6e22e">extractContacts</span>(<span style="color:#a6e22e">emailText</span>: <span style="color:#66d9ef">string</span>) {
</span></span><span style="display:flex;"><span>  <span style="color:#66d9ef">const</span> { <span style="color:#66d9ef">object</span><span style="color:#f92672">:</span> <span style="color:#a6e22e">contacts</span> } <span style="color:#f92672">=</span> <span style="color:#66d9ef">await</span> <span style="color:#a6e22e">generateObject</span>({
</span></span><span style="display:flex;"><span>    <span style="color:#a6e22e">model</span>: <span style="color:#66d9ef">openai</span>(<span style="color:#e6db74">&#34;gpt-4o&#34;</span>),
</span></span><span style="display:flex;"><span>    <span style="color:#a6e22e">output</span><span style="color:#f92672">:</span> <span style="color:#e6db74">&#34;array&#34;</span>,
</span></span><span style="display:flex;"><span>    <span style="color:#a6e22e">schema</span>: <span style="color:#66d9ef">ContactSchema</span>,
</span></span><span style="display:flex;"><span>    <span style="color:#a6e22e">prompt</span><span style="color:#f92672">:</span> <span style="color:#e6db74">`Extract all people and their contact information from this email:</span><span style="color:#960050;background-color:#1e0010">\</span><span style="color:#e6db74">n</span><span style="color:#960050;background-color:#1e0010">\</span><span style="color:#e6db74">n</span><span style="color:#e6db74">${</span><span style="color:#a6e22e">emailText</span><span style="color:#e6db74">}</span><span style="color:#e6db74">`</span>,
</span></span><span style="display:flex;"><span>  });
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>  <span style="color:#66d9ef">return</span> <span style="color:#a6e22e">contacts</span>;
</span></span><span style="display:flex;"><span>}
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#75715e">// Input: &#34;Hi, I&#39;m Sarah Chen (VP Eng at Acme, sarah@acme.co). 
</span></span></span><span style="display:flex;"><span><span style="color:#75715e">//         CC&#39;d: John Park (john.park@beta.dev)&#34;
</span></span></span><span style="display:flex;"><span><span style="color:#75715e">// Output: [
</span></span></span><span style="display:flex;"><span><span style="color:#75715e">//   { name: &#34;Sarah Chen&#34;, email: &#34;sarah@acme.co&#34;, company: &#34;Acme&#34;, title: &#34;VP Eng&#34; },
</span></span></span><span style="display:flex;"><span><span style="color:#75715e">//   { name: &#34;John Park&#34;, email: &#34;john.park@beta.dev&#34;, company: undefined, title: undefined }
</span></span></span><span style="display:flex;"><span><span style="color:#75715e">// ]
</span></span></span></code></pre></div><p>No regex. No parsing. The model extracts structured data from freeform text, and the Zod schema guarantees the output shape.</p>
<h2 id="building-chat-uis-with-ai-sdk-ui-hooks">Building Chat UIs with AI SDK UI Hooks</h2>
<h3 id="usechat-for-chat-interfaces">useChat() for Chat Interfaces</h3>
<p><code>useChat</code> manages conversation state, sends messages to your API route, and streams responses back to the UI:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-typescript" data-lang="typescript"><span style="display:flex;"><span><span style="color:#75715e">// app/page.tsx
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span><span style="color:#e6db74">&#34;use client&#34;</span>;
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">import</span> { <span style="color:#a6e22e">useChat</span> } <span style="color:#66d9ef">from</span> <span style="color:#e6db74">&#34;@ai-sdk/react&#34;</span>;
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">export</span> <span style="color:#66d9ef">default</span> <span style="color:#66d9ef">function</span> <span style="color:#a6e22e">ChatPage() {</span>
</span></span><span style="display:flex;"><span>  <span style="color:#66d9ef">const</span> { <span style="color:#a6e22e">messages</span>, <span style="color:#a6e22e">input</span>, <span style="color:#a6e22e">handleInputChange</span>, <span style="color:#a6e22e">handleSubmit</span>, <span style="color:#a6e22e">isLoading</span> } <span style="color:#f92672">=</span> <span style="color:#a6e22e">useChat</span>({
</span></span><span style="display:flex;"><span>    <span style="color:#a6e22e">api</span><span style="color:#f92672">:</span> <span style="color:#e6db74">&#34;/api/chat&#34;</span>,
</span></span><span style="display:flex;"><span>  });
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>  <span style="color:#66d9ef">return</span> (
</span></span><span style="display:flex;"><span>    &lt;<span style="color:#f92672">div</span> <span style="color:#a6e22e">className</span><span style="color:#f92672">=</span><span style="color:#e6db74">&#34;max-w-2xl mx-auto p-4&#34;</span>&gt;
</span></span><span style="display:flex;"><span>      &lt;<span style="color:#f92672">div</span> <span style="color:#a6e22e">className</span><span style="color:#f92672">=</span><span style="color:#e6db74">&#34;space-y-4 mb-4&#34;</span>&gt;
</span></span><span style="display:flex;"><span>        {<span style="color:#a6e22e">messages</span>.<span style="color:#a6e22e">map</span>((<span style="color:#a6e22e">m</span>) <span style="color:#f92672">=&gt;</span> (
</span></span><span style="display:flex;"><span>          &lt;<span style="color:#f92672">div</span> <span style="color:#a6e22e">key</span><span style="color:#f92672">=</span>{<span style="color:#a6e22e">m</span>.<span style="color:#a6e22e">id</span>} <span style="color:#a6e22e">className</span><span style="color:#f92672">=</span>{<span style="color:#a6e22e">m</span>.<span style="color:#a6e22e">role</span> <span style="color:#f92672">===</span> <span style="color:#e6db74">&#34;user&#34;</span> <span style="color:#f92672">?</span> <span style="color:#e6db74">&#34;text-right&#34;</span> <span style="color:#f92672">:</span> <span style="color:#e6db74">&#34;text-left&#34;</span>}&gt;
</span></span><span style="display:flex;"><span>            &lt;<span style="color:#f92672">div</span>
</span></span><span style="display:flex;"><span>              <span style="color:#a6e22e">className</span><span style="color:#f92672">=</span>{<span style="color:#e6db74">`inline-block rounded-lg px-4 py-2 </span><span style="color:#e6db74">${</span>
</span></span><span style="display:flex;"><span>                <span style="color:#a6e22e">m</span>.<span style="color:#a6e22e">role</span> <span style="color:#f92672">===</span> <span style="color:#e6db74">&#34;user&#34;</span> <span style="color:#f92672">?</span> <span style="color:#e6db74">&#34;bg-blue-600 text-white&#34;</span> <span style="color:#f92672">:</span> <span style="color:#e6db74">&#34;bg-gray-100 text-gray-900&#34;</span>
</span></span><span style="display:flex;"><span>              <span style="color:#e6db74">}</span><span style="color:#e6db74">`</span>}
</span></span><span style="display:flex;"><span>            &gt;
</span></span><span style="display:flex;"><span>              {<span style="color:#a6e22e">m</span>.<span style="color:#a6e22e">content</span>}
</span></span><span style="display:flex;"><span>            &lt;/<span style="color:#f92672">div</span>&gt;
</span></span><span style="display:flex;"><span>          &lt;/<span style="color:#f92672">div</span>&gt;
</span></span><span style="display:flex;"><span>        ))}
</span></span><span style="display:flex;"><span>      &lt;/<span style="color:#f92672">div</span>&gt;
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>      &lt;<span style="color:#f92672">form</span> <span style="color:#a6e22e">onSubmit</span><span style="color:#f92672">=</span>{<span style="color:#a6e22e">handleSubmit</span>} <span style="color:#a6e22e">className</span><span style="color:#f92672">=</span><span style="color:#e6db74">&#34;flex gap-2&#34;</span>&gt;
</span></span><span style="display:flex;"><span>        &lt;<span style="color:#f92672">input</span>
</span></span><span style="display:flex;"><span>          <span style="color:#a6e22e">value</span><span style="color:#f92672">=</span>{<span style="color:#a6e22e">input</span>}
</span></span><span style="display:flex;"><span>          <span style="color:#a6e22e">onChange</span><span style="color:#f92672">=</span>{<span style="color:#a6e22e">handleInputChange</span>}
</span></span><span style="display:flex;"><span>          <span style="color:#a6e22e">placeholder</span><span style="color:#f92672">=</span><span style="color:#e6db74">&#34;Ask something...&#34;</span>
</span></span><span style="display:flex;"><span>          <span style="color:#a6e22e">className</span><span style="color:#f92672">=</span><span style="color:#e6db74">&#34;flex-1 rounded border px-3 py-2&#34;</span>
</span></span><span style="display:flex;"><span>          <span style="color:#a6e22e">disabled</span><span style="color:#f92672">=</span>{<span style="color:#a6e22e">isLoading</span>}
</span></span><span style="display:flex;"><span>        /&gt;
</span></span><span style="display:flex;"><span>        &lt;<span style="color:#f92672">button</span> <span style="color:#a6e22e">type</span><span style="color:#f92672">=</span><span style="color:#e6db74">&#34;submit&#34;</span> <span style="color:#a6e22e">disabled</span><span style="color:#f92672">=</span>{<span style="color:#a6e22e">isLoading</span>} <span style="color:#a6e22e">className</span><span style="color:#f92672">=</span><span style="color:#e6db74">&#34;rounded bg-blue-600 px-4 py-2 text-white&#34;</span>&gt;
</span></span><span style="display:flex;"><span>          <span style="color:#a6e22e">Send</span>
</span></span><span style="display:flex;"><span>        &lt;/<span style="color:#f92672">button</span>&gt;
</span></span><span style="display:flex;"><span>      &lt;/<span style="color:#f92672">form</span>&gt;
</span></span><span style="display:flex;"><span>    &lt;/<span style="color:#f92672">div</span>&gt;
</span></span><span style="display:flex;"><span>  );
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div><p>The hook handles message history, loading state, and error state; rendering concerns such as scrolling stay in your component. The server route from the previous section (<code>/api/chat</code>) handles the model call and streaming.</p>
<h3 id="usecompletion-for-auto-complete">useCompletion() for Auto-Complete</h3>
<p><code>useCompletion</code> is for single-turn generation patterns — autocomplete, suggestions, summarization — where you do not need conversation history:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-typescript" data-lang="typescript"><span style="display:flex;"><span><span style="color:#e6db74">&#34;use client&#34;</span>;
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">import</span> { <span style="color:#a6e22e">useCompletion</span> } <span style="color:#66d9ef">from</span> <span style="color:#e6db74">&#34;@ai-sdk/react&#34;</span>;
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">export</span> <span style="color:#66d9ef">default</span> <span style="color:#66d9ef">function</span> <span style="color:#a6e22e">Summarizer() {</span>
</span></span><span style="display:flex;"><span>  <span style="color:#66d9ef">const</span> { <span style="color:#a6e22e">completion</span>, <span style="color:#a6e22e">input</span>, <span style="color:#a6e22e">handleInputChange</span>, <span style="color:#a6e22e">handleSubmit</span> } <span style="color:#f92672">=</span> <span style="color:#a6e22e">useCompletion</span>({
</span></span><span style="display:flex;"><span>    <span style="color:#a6e22e">api</span><span style="color:#f92672">:</span> <span style="color:#e6db74">&#34;/api/completion&#34;</span>,
</span></span><span style="display:flex;"><span>  });
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>  <span style="color:#66d9ef">return</span> (
</span></span><span style="display:flex;"><span>    &lt;<span style="color:#f92672">form</span> <span style="color:#a6e22e">onSubmit</span><span style="color:#f92672">=</span>{<span style="color:#a6e22e">handleSubmit</span>}&gt;
</span></span><span style="display:flex;"><span>      &lt;<span style="color:#f92672">textarea</span> <span style="color:#a6e22e">value</span><span style="color:#f92672">=</span>{<span style="color:#a6e22e">input</span>} <span style="color:#a6e22e">onChange</span><span style="color:#f92672">=</span>{<span style="color:#a6e22e">handleInputChange</span>} <span style="color:#a6e22e">placeholder</span><span style="color:#f92672">=</span><span style="color:#e6db74">&#34;Paste text to summarize&#34;</span> /&gt;
</span></span><span style="display:flex;"><span>      &lt;<span style="color:#f92672">button</span> <span style="color:#a6e22e">type</span><span style="color:#f92672">=</span><span style="color:#e6db74">&#34;submit&#34;</span>&gt;<span style="color:#a6e22e">Summarize</span>&lt;/<span style="color:#f92672">button</span>&gt;
</span></span><span style="display:flex;"><span>      {<span style="color:#a6e22e">completion</span> <span style="color:#f92672">&amp;&amp;</span> &lt;<span style="color:#f92672">div</span> <span style="color:#a6e22e">className</span><span style="color:#f92672">=</span><span style="color:#e6db74">&#34;mt-4&#34;</span>&gt;{<span style="color:#a6e22e">completion</span>}&lt;/<span style="color:#f92672">div</span>&gt;}
</span></span><span style="display:flex;"><span>    &lt;/<span style="color:#f92672">form</span>&gt;
</span></span><span style="display:flex;"><span>  );
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div><h3 id="useobject-for-streaming-structured-responses">useObject() for Streaming Structured Responses</h3>
<p>When your API route returns a streamed object (via <code>streamObject</code>), use <code>useObject</code> on the client:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-typescript" data-lang="typescript"><span style="display:flex;"><span><span style="color:#e6db74">&#34;use client&#34;</span>;
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">import</span> { <span style="color:#a6e22e">useObject</span> } <span style="color:#66d9ef">from</span> <span style="color:#e6db74">&#34;ai/react&#34;</span>;
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">import</span> { <span style="color:#a6e22e">z</span> } <span style="color:#66d9ef">from</span> <span style="color:#e6db74">&#34;zod&#34;</span>;
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">const</span> <span style="color:#a6e22e">AnalysisSchema</span> <span style="color:#f92672">=</span> <span style="color:#a6e22e">z</span>.<span style="color:#66d9ef">object</span>({
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">summary</span>: <span style="color:#66d9ef">z.string</span>(),
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">sentiment</span>: <span style="color:#66d9ef">z.enum</span>([<span style="color:#e6db74">&#34;positive&#34;</span>, <span style="color:#e6db74">&#34;negative&#34;</span>, <span style="color:#e6db74">&#34;neutral&#34;</span>]),
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">keywords</span>: <span style="color:#66d9ef">z.array</span>(<span style="color:#a6e22e">z</span>.<span style="color:#66d9ef">string</span>()),
</span></span><span style="display:flex;"><span>});
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">export</span> <span style="color:#66d9ef">default</span> <span style="color:#66d9ef">function</span> <span style="color:#a6e22e">Analyzer() {</span>
</span></span><span style="display:flex;"><span>  <span style="color:#66d9ef">const</span> { <span style="color:#66d9ef">object</span>, <span style="color:#a6e22e">submit</span>, <span style="color:#a6e22e">isLoading</span> } <span style="color:#f92672">=</span> <span style="color:#a6e22e">useObject</span>({
</span></span><span style="display:flex;"><span>    <span style="color:#a6e22e">api</span><span style="color:#f92672">:</span> <span style="color:#e6db74">&#34;/api/analyze&#34;</span>,
</span></span><span style="display:flex;"><span>    <span style="color:#a6e22e">schema</span>: <span style="color:#66d9ef">AnalysisSchema</span>,
</span></span><span style="display:flex;"><span>  });
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>  <span style="color:#66d9ef">return</span> (
</span></span><span style="display:flex;"><span>    &lt;<span style="color:#f92672">div</span>&gt;
</span></span><span style="display:flex;"><span>      &lt;<span style="color:#f92672">button</span> <span style="color:#a6e22e">onClick</span><span style="color:#f92672">=</span>{() <span style="color:#f92672">=&gt;</span> <span style="color:#a6e22e">submit</span>({ <span style="color:#a6e22e">text</span><span style="color:#f92672">:</span> <span style="color:#e6db74">&#34;Product is great but shipping was slow&#34;</span> })} <span style="color:#a6e22e">disabled</span><span style="color:#f92672">=</span>{<span style="color:#a6e22e">isLoading</span>}&gt;
</span></span><span style="display:flex;"><span>        <span style="color:#a6e22e">Analyze</span>
</span></span><span style="display:flex;"><span>      &lt;/<span style="color:#f92672">button</span>&gt;
</span></span><span style="display:flex;"><span>      {<span style="color:#66d9ef">object</span> <span style="color:#f92672">&amp;&amp;</span> (
</span></span><span style="display:flex;"><span>        &lt;<span style="color:#f92672">div</span> <span style="color:#a6e22e">className</span><span style="color:#f92672">=</span><span style="color:#e6db74">&#34;mt-4 space-y-2&#34;</span>&gt;
</span></span><span style="display:flex;"><span>          &lt;<span style="color:#f92672">p</span>&gt;&lt;<span style="color:#f92672">strong</span>&gt;<span style="color:#a6e22e">Summary</span><span style="color:#f92672">:</span>&lt;/<span style="color:#f92672">strong</span>&gt; {<span style="color:#66d9ef">object</span>.<span style="color:#a6e22e">summary</span>}&lt;/<span style="color:#f92672">p</span>&gt;
</span></span><span style="display:flex;"><span>          &lt;<span style="color:#f92672">p</span>&gt;&lt;<span style="color:#f92672">strong</span>&gt;<span style="color:#a6e22e">Sentiment</span><span style="color:#f92672">:</span>&lt;/<span style="color:#f92672">strong</span>&gt; {<span style="color:#66d9ef">object</span>.<span style="color:#a6e22e">sentiment</span>}&lt;/<span style="color:#f92672">p</span>&gt;
</span></span><span style="display:flex;"><span>          &lt;<span style="color:#f92672">p</span>&gt;&lt;<span style="color:#f92672">strong</span>&gt;<span style="color:#a6e22e">Keywords</span><span style="color:#f92672">:</span>&lt;/<span style="color:#f92672">strong</span>&gt; {<span style="color:#66d9ef">object</span>.<span style="color:#a6e22e">keywords</span><span style="color:#f92672">?</span>.<span style="color:#a6e22e">join</span>(<span style="color:#e6db74">&#34;, &#34;</span>)}&lt;/<span style="color:#f92672">p</span>&gt;
</span></span><span style="display:flex;"><span>        &lt;/<span style="color:#f92672">div</span>&gt;
</span></span><span style="display:flex;"><span>      )}
</span></span><span style="display:flex;"><span>    &lt;/<span style="color:#f92672">div</span>&gt;
</span></span><span style="display:flex;"><span>  );
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div><h3 id="framework-support">Framework Support</h3>
<p>The UI hooks are not React-only. The AI SDK ships framework-specific packages:</p>
<table>
  <thead>
      <tr>
          <th>Framework</th>
          <th>Package</th>
          <th>Hooks</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td>React</td>
          <td><code>ai/react</code></td>
          <td><code>useChat</code>, <code>useCompletion</code>, <code>useObject</code></td>
      </tr>
      <tr>
          <td>Vue</td>
          <td><code>ai/vue</code></td>
          <td><code>useChat</code>, <code>useCompletion</code>, <code>useObject</code></td>
      </tr>
      <tr>
          <td>Svelte</td>
          <td><code>ai/svelte</code></td>
          <td><code>useChat</code>, <code>useCompletion</code>, <code>useObject</code></td>
      </tr>
      <tr>
          <td>Solid</td>
          <td><code>ai/solid</code></td>
          <td><code>useChat</code>, <code>useCompletion</code>, <code>useObject</code></td>
      </tr>
  </tbody>
</table>
<p>The server-side <code>streamText</code> and <code>streamObject</code> functions are framework-agnostic. Only the client hooks differ.</p>
<h2 id="tool-calling-giving-your-ai-agent-superpowers">Tool Calling: Giving Your AI Agent Superpowers</h2>
<p>Tool calling lets the model invoke functions you define. Instead of just generating text, the model can trigger actions — fetch data, run calculations, query databases — then incorporate the results into its response.</p>
<h3 id="defining-tools-with-execute-functions">Defining Tools with Execute Functions</h3>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-typescript" data-lang="typescript"><span style="display:flex;"><span><span style="color:#66d9ef">import</span> { <span style="color:#a6e22e">generateText</span>, <span style="color:#a6e22e">tool</span> } <span style="color:#66d9ef">from</span> <span style="color:#e6db74">&#34;ai&#34;</span>;
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">import</span> { <span style="color:#a6e22e">openai</span> } <span style="color:#66d9ef">from</span> <span style="color:#e6db74">&#34;@/lib/ai&#34;</span>;
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">import</span> { <span style="color:#a6e22e">z</span> } <span style="color:#66d9ef">from</span> <span style="color:#e6db74">&#34;zod&#34;</span>;
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">const</span> <span style="color:#a6e22e">result</span> <span style="color:#f92672">=</span> <span style="color:#66d9ef">await</span> <span style="color:#a6e22e">generateText</span>({
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">model</span>: <span style="color:#66d9ef">openai</span>(<span style="color:#e6db74">&#34;gpt-4o&#34;</span>),
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">tools</span><span style="color:#f92672">:</span> {
</span></span><span style="display:flex;"><span>    <span style="color:#a6e22e">weather</span>: <span style="color:#66d9ef">tool</span>({
</span></span><span style="display:flex;"><span>      <span style="color:#a6e22e">description</span><span style="color:#f92672">:</span> <span style="color:#e6db74">&#34;Get current weather for a location&#34;</span>,
</span></span><span style="display:flex;"><span>      <span style="color:#a6e22e">parameters</span>: <span style="color:#66d9ef">z.object</span>({
</span></span><span style="display:flex;"><span>        <span style="color:#a6e22e">city</span>: <span style="color:#66d9ef">z.string</span>(),
</span></span><span style="display:flex;"><span>        <span style="color:#a6e22e">unit</span>: <span style="color:#66d9ef">z.enum</span>([<span style="color:#e6db74">&#34;celsius&#34;</span>, <span style="color:#e6db74">&#34;fahrenheit&#34;</span>]).<span style="color:#a6e22e">optional</span>(),
</span></span><span style="display:flex;"><span>      }),
</span></span><span style="display:flex;"><span>      <span style="color:#a6e22e">execute</span>: <span style="color:#66d9ef">async</span> ({ <span style="color:#a6e22e">city</span>, <span style="color:#a6e22e">unit</span> <span style="color:#f92672">=</span> <span style="color:#e6db74">&#34;celsius&#34;</span> }) <span style="color:#f92672">=&gt;</span> {
</span></span><span style="display:flex;"><span>        <span style="color:#66d9ef">const</span> <span style="color:#a6e22e">res</span> <span style="color:#f92672">=</span> <span style="color:#66d9ef">await</span> <span style="color:#a6e22e">fetch</span>(<span style="color:#e6db74">`https://api.weather.example/current?city=</span><span style="color:#e6db74">${</span><span style="color:#a6e22e">city</span><span style="color:#e6db74">}</span><span style="color:#e6db74">&amp;unit=</span><span style="color:#e6db74">${</span><span style="color:#a6e22e">unit</span><span style="color:#e6db74">}</span><span style="color:#e6db74">`</span>);
</span></span><span style="display:flex;"><span>        <span style="color:#66d9ef">return</span> <span style="color:#a6e22e">res</span>.<span style="color:#a6e22e">json</span>();
</span></span><span style="display:flex;"><span>      },
</span></span><span style="display:flex;"><span>    }),
</span></span><span style="display:flex;"><span>    <span style="color:#a6e22e">calculator</span>: <span style="color:#66d9ef">tool</span>({
</span></span><span style="display:flex;"><span>      <span style="color:#a6e22e">description</span><span style="color:#f92672">:</span> <span style="color:#e6db74">&#34;Evaluate a mathematical expression&#34;</span>,
</span></span><span style="display:flex;"><span>      <span style="color:#a6e22e">parameters</span>: <span style="color:#66d9ef">z.object</span>({
</span></span><span style="display:flex;"><span>        <span style="color:#a6e22e">expression</span>: <span style="color:#66d9ef">z.string</span>(),
</span></span><span style="display:flex;"><span>      }),
</span></span><span style="display:flex;"><span>      <span style="color:#a6e22e">execute</span>: <span style="color:#66d9ef">async</span> ({ <span style="color:#a6e22e">expression</span> }) <span style="color:#f92672">=&gt;</span> {
</span></span><span style="display:flex;"><span>        <span style="color:#75715e">// Simple eval for demo — use a safe math parser in production
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span>        <span style="color:#66d9ef">const</span> <span style="color:#a6e22e">result</span> <span style="color:#f92672">=</span> Function(<span style="color:#e6db74">`&#34;use strict&#34;; return (</span><span style="color:#e6db74">${</span><span style="color:#a6e22e">expression</span><span style="color:#e6db74">}</span><span style="color:#e6db74">)`</span>)();
</span></span><span style="display:flex;"><span>        <span style="color:#66d9ef">return</span> { <span style="color:#a6e22e">result</span> };
</span></span><span style="display:flex;"><span>      },
</span></span><span style="display:flex;"><span>    }),
</span></span><span style="display:flex;"><span>  },
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">prompt</span><span style="color:#f92672">:</span> <span style="color:#e6db74">&#34;What&#39;s the weather in Seoul, and what is 15% of 340?&#34;</span>,
</span></span><span style="display:flex;"><span>});
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#a6e22e">console</span>.<span style="color:#a6e22e">log</span>(<span style="color:#a6e22e">result</span>.<span style="color:#a6e22e">text</span>);
</span></span><span style="display:flex;"><span><span style="color:#75715e">// &#34;The current weather in Seoul is 12°C with partly cloudy skies.
</span></span></span><span style="display:flex;"><span><span style="color:#75715e">//  15% of 340 is 51.&#34;
</span></span></span></code></pre></div><p>The model decides which tools to call based on the user&rsquo;s prompt. You do not hardcode the invocation logic.</p>
<h3 id="multi-step-agent-loops-with-maxsteps">Multi-Step Agent Loops with maxSteps</h3>
<p>A single model call with tools often cannot finish the job on its own: the model may need to call a tool, read the result, and then decide whether another call is required. <code>maxSteps</code> enables this multi-step loop:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-typescript" data-lang="typescript"><span style="display:flex;"><span><span style="color:#66d9ef">import</span> { <span style="color:#a6e22e">generateText</span>, <span style="color:#a6e22e">tool</span> } <span style="color:#66d9ef">from</span> <span style="color:#e6db74">&#34;ai&#34;</span>;
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">import</span> { <span style="color:#a6e22e">openai</span> } <span style="color:#66d9ef">from</span> <span style="color:#e6db74">&#34;@/lib/ai&#34;</span>;
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">import</span> { <span style="color:#a6e22e">z</span> } <span style="color:#66d9ef">from</span> <span style="color:#e6db74">&#34;zod&#34;</span>;
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">const</span> <span style="color:#a6e22e">result</span> <span style="color:#f92672">=</span> <span style="color:#66d9ef">await</span> <span style="color:#a6e22e">generateText</span>({
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">model</span>: <span style="color:#66d9ef">openai</span>(<span style="color:#e6db74">&#34;gpt-4o&#34;</span>),
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">maxSteps</span>: <span style="color:#66d9ef">5</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">tools</span><span style="color:#f92672">:</span> {
</span></span><span style="display:flex;"><span>    <span style="color:#a6e22e">searchDatabase</span>: <span style="color:#66d9ef">tool</span>({
</span></span><span style="display:flex;"><span>      <span style="color:#a6e22e">description</span><span style="color:#f92672">:</span> <span style="color:#e6db74">&#34;Search the product database&#34;</span>,
</span></span><span style="display:flex;"><span>      <span style="color:#a6e22e">parameters</span>: <span style="color:#66d9ef">z.object</span>({
</span></span><span style="display:flex;"><span>        <span style="color:#a6e22e">query</span>: <span style="color:#66d9ef">z.string</span>(),
</span></span><span style="display:flex;"><span>        <span style="color:#a6e22e">category</span>: <span style="color:#66d9ef">z.string</span>().<span style="color:#a6e22e">optional</span>(),
</span></span><span style="display:flex;"><span>      }),
</span></span><span style="display:flex;"><span>      <span style="color:#a6e22e">execute</span>: <span style="color:#66d9ef">async</span> ({ <span style="color:#a6e22e">query</span>, <span style="color:#a6e22e">category</span> }) <span style="color:#f92672">=&gt;</span> {
</span></span><span style="display:flex;"><span>        <span style="color:#75715e">// Simulated database query
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span>        <span style="color:#66d9ef">return</span> <span style="color:#a6e22e">db</span>.<span style="color:#a6e22e">query</span>(<span style="color:#a6e22e">query</span>, { <span style="color:#a6e22e">category</span> });
</span></span><span style="display:flex;"><span>      },
</span></span><span style="display:flex;"><span>    }),
</span></span><span style="display:flex;"><span>    <span style="color:#a6e22e">checkInventory</span>: <span style="color:#66d9ef">tool</span>({
</span></span><span style="display:flex;"><span>      <span style="color:#a6e22e">description</span><span style="color:#f92672">:</span> <span style="color:#e6db74">&#34;Check inventory for a product ID&#34;</span>,
</span></span><span style="display:flex;"><span>      <span style="color:#a6e22e">parameters</span>: <span style="color:#66d9ef">z.object</span>({
</span></span><span style="display:flex;"><span>        <span style="color:#a6e22e">productId</span>: <span style="color:#66d9ef">z.string</span>(),
</span></span><span style="display:flex;"><span>      }),
</span></span><span style="display:flex;"><span>      <span style="color:#a6e22e">execute</span>: <span style="color:#66d9ef">async</span> ({ <span style="color:#a6e22e">productId</span> }) <span style="color:#f92672">=&gt;</span> {
</span></span><span style="display:flex;"><span>        <span style="color:#66d9ef">return</span> <span style="color:#a6e22e">inventoryAPI</span>.<span style="color:#a6e22e">check</span>(<span style="color:#a6e22e">productId</span>);
</span></span><span style="display:flex;"><span>      },
</span></span><span style="display:flex;"><span>    }),
</span></span><span style="display:flex;"><span>  },
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">prompt</span><span style="color:#f92672">:</span> <span style="color:#e6db74">&#34;Find wireless headphones under $100 and check if the top result is in stock&#34;</span>,
</span></span><span style="display:flex;"><span>});
</span></span></code></pre></div><p>With <code>maxSteps: 5</code>, the model can make up to 5 tool calls in sequence. After each tool result, it decides whether to call another tool or produce a final text response.</p>
<h3 id="tool-result-streaming-to-the-client">Tool Result Streaming to the Client</h3>
<p>When using <code>streamText</code> with tools, tool invocations and results are included in the data stream. On the client, <code>useChat</code> exposes tool state:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-typescript" data-lang="typescript"><span style="display:flex;"><span><span style="color:#e6db74">&#34;use client&#34;</span>;
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">import</span> { <span style="color:#a6e22e">useChat</span> } <span style="color:#66d9ef">from</span> <span style="color:#e6db74">&#34;ai/react&#34;</span>;
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">export</span> <span style="color:#66d9ef">default</span> <span style="color:#66d9ef">function</span> <span style="color:#a6e22e">ChatWithTools() {</span>
</span></span><span style="display:flex;"><span>  <span style="color:#66d9ef">const</span> { <span style="color:#a6e22e">messages</span>, <span style="color:#a6e22e">input</span>, <span style="color:#a6e22e">handleInputChange</span>, <span style="color:#a6e22e">handleSubmit</span> } <span style="color:#f92672">=</span> <span style="color:#a6e22e">useChat</span>({
</span></span><span style="display:flex;"><span>    <span style="color:#a6e22e">api</span><span style="color:#f92672">:</span> <span style="color:#e6db74">&#34;/api/chat&#34;</span>,
</span></span><span style="display:flex;"><span>    <span style="color:#a6e22e">maxSteps</span>: <span style="color:#66d9ef">3</span>,
</span></span><span style="display:flex;"><span>  });
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>  <span style="color:#66d9ef">return</span> (
</span></span><span style="display:flex;"><span>    &lt;<span style="color:#f92672">div</span>&gt;
</span></span><span style="display:flex;"><span>      {<span style="color:#a6e22e">messages</span>.<span style="color:#a6e22e">map</span>((<span style="color:#a6e22e">m</span>) <span style="color:#f92672">=&gt;</span> (
</span></span><span style="display:flex;"><span>        &lt;<span style="color:#f92672">div</span> <span style="color:#a6e22e">key</span><span style="color:#f92672">=</span>{<span style="color:#a6e22e">m</span>.<span style="color:#a6e22e">id</span>}&gt;
</span></span><span style="display:flex;"><span>          {<span style="color:#a6e22e">m</span>.<span style="color:#a6e22e">role</span> <span style="color:#f92672">===</span> <span style="color:#e6db74">&#34;assistant&#34;</span> <span style="color:#f92672">&amp;&amp;</span> <span style="color:#a6e22e">m</span>.<span style="color:#a6e22e">toolInvocations</span><span style="color:#f92672">?</span>.<span style="color:#a6e22e">map</span>((<span style="color:#a6e22e">invocation</span>) <span style="color:#f92672">=&gt;</span> (
</span></span><span style="display:flex;"><span>            &lt;<span style="color:#f92672">div</span> <span style="color:#a6e22e">key</span><span style="color:#f92672">=</span>{<span style="color:#a6e22e">invocation</span>.<span style="color:#a6e22e">toolCallId</span>} <span style="color:#a6e22e">className</span><span style="color:#f92672">=</span><span style="color:#e6db74">&#34;bg-yellow-50 p-2 rounded text-sm&#34;</span>&gt;
</span></span><span style="display:flex;"><span>              {<span style="color:#a6e22e">invocation</span>.<span style="color:#a6e22e">state</span> <span style="color:#f92672">===</span> <span style="color:#e6db74">&#34;call&#34;</span> <span style="color:#f92672">&amp;&amp;</span> &lt;<span style="color:#f92672">span</span>&gt;<span style="color:#a6e22e">Calling</span> {<span style="color:#a6e22e">invocation</span>.<span style="color:#a6e22e">toolName</span>}...&lt;/<span style="color:#f92672">span</span>&gt;}
</span></span><span style="display:flex;"><span>              {<span style="color:#a6e22e">invocation</span>.<span style="color:#a6e22e">state</span> <span style="color:#f92672">===</span> <span style="color:#e6db74">&#34;result&#34;</span> <span style="color:#f92672">&amp;&amp;</span> (
</span></span><span style="display:flex;"><span>                &lt;<span style="color:#f92672">span</span>&gt;<span style="color:#a6e22e">Result</span><span style="color:#f92672">:</span> {<span style="color:#a6e22e">JSON</span>.<span style="color:#a6e22e">stringify</span>(<span style="color:#a6e22e">invocation</span>.<span style="color:#a6e22e">result</span>)}&lt;/<span style="color:#f92672">span</span>&gt;
</span></span><span style="display:flex;"><span>              )}
</span></span><span style="display:flex;"><span>            &lt;/<span style="color:#f92672">div</span>&gt;
</span></span><span style="display:flex;"><span>          ))}
</span></span><span style="display:flex;"><span>          {<span style="color:#a6e22e">m</span>.<span style="color:#a6e22e">content</span> <span style="color:#f92672">&amp;&amp;</span> &lt;<span style="color:#f92672">div</span>&gt;{<span style="color:#a6e22e">m</span>.<span style="color:#a6e22e">content</span>}&lt;/<span style="color:#f92672">div</span>&gt;}
</span></span><span style="display:flex;"><span>        &lt;/<span style="color:#f92672">div</span>&gt;
</span></span><span style="display:flex;"><span>      ))}
</span></span><span style="display:flex;"><span>      &lt;<span style="color:#f92672">form</span> <span style="color:#a6e22e">onSubmit</span><span style="color:#f92672">=</span>{<span style="color:#a6e22e">handleSubmit</span>}&gt;
</span></span><span style="display:flex;"><span>        &lt;<span style="color:#f92672">input</span> <span style="color:#a6e22e">value</span><span style="color:#f92672">=</span>{<span style="color:#a6e22e">input</span>} <span style="color:#a6e22e">onChange</span><span style="color:#f92672">=</span>{<span style="color:#a6e22e">handleInputChange</span>} /&gt;
</span></span><span style="display:flex;"><span>        &lt;<span style="color:#f92672">button</span> <span style="color:#a6e22e">type</span><span style="color:#f92672">=</span><span style="color:#e6db74">&#34;submit&#34;</span>&gt;<span style="color:#a6e22e">Send</span>&lt;/<span style="color:#f92672">button</span>&gt;
</span></span><span style="display:flex;"><span>      &lt;/<span style="color:#f92672">form</span>&gt;
</span></span><span style="display:flex;"><span>    &lt;/<span style="color:#f92672">div</span>&gt;
</span></span><span style="display:flex;"><span>  );
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div><h2 id="building-ai-agents-with-multi-step-reasoning">Building AI Agents with Multi-Step Reasoning</h2>
<h3 id="the-agent-loop-pattern-with-tool-calling">The Agent Loop Pattern with Tool Calling</h3>
<p>An agent is a loop: the model reasons, calls tools, observes results, and repeats until the task is done. The AI SDK handles the loop via <code>maxSteps</code>. Your job is to define the tools and the system prompt.</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-typescript" data-lang="typescript"><span style="display:flex;"><span><span style="color:#66d9ef">import</span> { <span style="color:#a6e22e">generateText</span>, <span style="color:#a6e22e">tool</span> } <span style="color:#66d9ef">from</span> <span style="color:#e6db74">&#34;ai&#34;</span>;
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">import</span> { <span style="color:#a6e22e">openai</span> } <span style="color:#66d9ef">from</span> <span style="color:#e6db74">&#34;@/lib/ai&#34;</span>;
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">import</span> { <span style="color:#a6e22e">z</span> } <span style="color:#66d9ef">from</span> <span style="color:#e6db74">&#34;zod&#34;</span>;
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">const</span> <span style="color:#a6e22e">researchAgent</span> <span style="color:#f92672">=</span> <span style="color:#66d9ef">async</span> (<span style="color:#a6e22e">query</span>: <span style="color:#66d9ef">string</span>) <span style="color:#f92672">=&gt;</span> {
</span></span><span style="display:flex;"><span>  <span style="color:#66d9ef">const</span> <span style="color:#a6e22e">result</span> <span style="color:#f92672">=</span> <span style="color:#66d9ef">await</span> <span style="color:#a6e22e">generateText</span>({
</span></span><span style="display:flex;"><span>    <span style="color:#a6e22e">model</span>: <span style="color:#66d9ef">openai</span>(<span style="color:#e6db74">&#34;gpt-4o&#34;</span>),
</span></span><span style="display:flex;"><span>    <span style="color:#a6e22e">maxSteps</span>: <span style="color:#66d9ef">10</span>,
</span></span><span style="display:flex;"><span>    <span style="color:#a6e22e">system</span><span style="color:#f92672">:</span> <span style="color:#e6db74">`You are a research assistant. Use the available tools to find, verify, and summarize information.
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74">    Always cite your sources. If you cannot find a definitive answer, say so explicitly.`</span>,
</span></span><span style="display:flex;"><span>    <span style="color:#a6e22e">tools</span><span style="color:#f92672">:</span> {
</span></span><span style="display:flex;"><span>      <span style="color:#a6e22e">webSearch</span>: <span style="color:#66d9ef">tool</span>({
</span></span><span style="display:flex;"><span>        <span style="color:#a6e22e">description</span><span style="color:#f92672">:</span> <span style="color:#e6db74">&#34;Search the web for information&#34;</span>,
</span></span><span style="display:flex;"><span>        <span style="color:#a6e22e">parameters</span>: <span style="color:#66d9ef">z.object</span>({ <span style="color:#a6e22e">query</span>: <span style="color:#66d9ef">z.string</span>() }),
</span></span><span style="display:flex;"><span>        <span style="color:#a6e22e">execute</span>: <span style="color:#66d9ef">async</span> ({ <span style="color:#a6e22e">query</span> }) <span style="color:#f92672">=&gt;</span> {
</span></span><span style="display:flex;"><span>          <span style="color:#66d9ef">const</span> <span style="color:#a6e22e">results</span> <span style="color:#f92672">=</span> <span style="color:#66d9ef">await</span> <span style="color:#a6e22e">searchAPI</span>.<span style="color:#a6e22e">search</span>(<span style="color:#a6e22e">query</span>);
</span></span><span style="display:flex;"><span>          <span style="color:#66d9ef">return</span> <span style="color:#a6e22e">results</span>.<span style="color:#a6e22e">slice</span>(<span style="color:#ae81ff">0</span>, <span style="color:#ae81ff">5</span>);
</span></span><span style="display:flex;"><span>        },
</span></span><span style="display:flex;"><span>      }),
</span></span><span style="display:flex;"><span>      <span style="color:#a6e22e">getPageContent</span>: <span style="color:#66d9ef">tool</span>({
</span></span><span style="display:flex;"><span>        <span style="color:#a6e22e">description</span><span style="color:#f92672">:</span> <span style="color:#e6db74">&#34;Fetch the content of a web page&#34;</span>,
</span></span><span style="display:flex;"><span>        <span style="color:#a6e22e">parameters</span>: <span style="color:#66d9ef">z.object</span>({ <span style="color:#a6e22e">url</span>: <span style="color:#66d9ef">z.string</span>() }),
</span></span><span style="display:flex;"><span>        <span style="color:#a6e22e">execute</span>: <span style="color:#66d9ef">async</span> ({ <span style="color:#a6e22e">url</span> }) <span style="color:#f92672">=&gt;</span> {
</span></span><span style="display:flex;"><span>          <span style="color:#66d9ef">const</span> <span style="color:#a6e22e">resp</span> <span style="color:#f92672">=</span> <span style="color:#66d9ef">await</span> <span style="color:#a6e22e">fetch</span>(<span style="color:#a6e22e">url</span>);
</span></span><span style="display:flex;"><span>          <span style="color:#66d9ef">return</span> (<span style="color:#66d9ef">await</span> <span style="color:#a6e22e">resp</span>.<span style="color:#a6e22e">text</span>()).<span style="color:#a6e22e">slice</span>(<span style="color:#ae81ff">0</span>, <span style="color:#ae81ff">5000</span>); <span style="color:#75715e">// Truncate for context limits
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span>        },
</span></span><span style="display:flex;"><span>      }),
</span></span><span style="display:flex;"><span>    },
</span></span><span style="display:flex;"><span>    <span style="color:#a6e22e">prompt</span>: <span style="color:#66d9ef">query</span>,
</span></span><span style="display:flex;"><span>  });
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>  <span style="color:#66d9ef">return</span> <span style="color:#a6e22e">result</span>.<span style="color:#a6e22e">text</span>;
</span></span><span style="display:flex;"><span>};
</span></span></code></pre></div><h3 id="memory-and-conversation-history-management">Memory and Conversation History Management</h3>
<p>For multi-turn agents, you pass the full message history on each request:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-typescript" data-lang="typescript"><span style="display:flex;"><span><span style="color:#75715e">// app/api/agent/route.ts
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span><span style="color:#66d9ef">import</span> { <span style="color:#a6e22e">streamText</span> } <span style="color:#66d9ef">from</span> <span style="color:#e6db74">&#34;ai&#34;</span>;
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">import</span> { <span style="color:#a6e22e">openai</span> } <span style="color:#66d9ef">from</span> <span style="color:#e6db74">&#34;@/lib/ai&#34;</span>;
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">export</span> <span style="color:#66d9ef">async</span> <span style="color:#66d9ef">function</span> <span style="color:#a6e22e">POST</span>(<span style="color:#a6e22e">req</span>: <span style="color:#66d9ef">Request</span>) {
</span></span><span style="display:flex;"><span>  <span style="color:#66d9ef">const</span> { <span style="color:#a6e22e">messages</span> } <span style="color:#f92672">=</span> <span style="color:#66d9ef">await</span> <span style="color:#a6e22e">req</span>.<span style="color:#a6e22e">json</span>();
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>  <span style="color:#66d9ef">const</span> <span style="color:#a6e22e">result</span> <span style="color:#f92672">=</span> <span style="color:#a6e22e">streamText</span>({
</span></span><span style="display:flex;"><span>    <span style="color:#a6e22e">model</span>: <span style="color:#66d9ef">openai</span>(<span style="color:#e6db74">&#34;gpt-4o&#34;</span>),
</span></span><span style="display:flex;"><span>    <span style="color:#a6e22e">maxSteps</span>: <span style="color:#66d9ef">5</span>,
</span></span><span style="display:flex;"><span>    <span style="color:#a6e22e">system</span><span style="color:#f92672">:</span> <span style="color:#e6db74">&#34;You are a helpful coding assistant with access to a codebase search tool.&#34;</span>,
</span></span><span style="display:flex;"><span>    <span style="color:#a6e22e">messages</span>, <span style="color:#75715e">// Full conversation history
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span>    <span style="color:#a6e22e">tools</span><span style="color:#f92672">:</span> {
</span></span><span style="display:flex;"><span>      <span style="color:#a6e22e">searchCode</span>: <span style="color:#66d9ef">tool</span>({
</span></span><span style="display:flex;"><span>        <span style="color:#a6e22e">description</span><span style="color:#f92672">:</span> <span style="color:#e6db74">&#34;Search the codebase for code matching a query&#34;</span>,
</span></span><span style="display:flex;"><span>        <span style="color:#a6e22e">parameters</span>: <span style="color:#66d9ef">z.object</span>({ <span style="color:#a6e22e">query</span>: <span style="color:#66d9ef">z.string</span>() }),
</span></span><span style="display:flex;"><span>        <span style="color:#a6e22e">execute</span>: <span style="color:#66d9ef">async</span> ({ <span style="color:#a6e22e">query</span> }) <span style="color:#f92672">=&gt;</span> <span style="color:#a6e22e">codebaseIndex</span>.<span style="color:#a6e22e">search</span>(<span style="color:#a6e22e">query</span>),
</span></span><span style="display:flex;"><span>      }),
</span></span><span style="display:flex;"><span>    },
</span></span><span style="display:flex;"><span>  });
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>  <span style="color:#66d9ef">return</span> <span style="color:#a6e22e">result</span>.<span style="color:#a6e22e">toDataStreamResponse</span>();
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div><p>Manage history size to stay within context windows. A common pattern: keep the system prompt + the last N messages, or implement a summarization step for older messages.</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-typescript" data-lang="typescript"><span style="display:flex;"><span><span style="color:#66d9ef">import</span> <span style="color:#66d9ef">type</span> { <span style="color:#a6e22e">Message</span> } <span style="color:#66d9ef">from</span> <span style="color:#e6db74">&#34;ai&#34;</span>;
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">function</span> <span style="color:#a6e22e">truncateMessages</span>(<span style="color:#a6e22e">messages</span>: <span style="color:#66d9ef">Message</span>[], <span style="color:#a6e22e">maxMessages</span>: <span style="color:#66d9ef">number</span> <span style="color:#f92672">=</span> <span style="color:#ae81ff">20</span>)<span style="color:#f92672">:</span> <span style="color:#a6e22e">Message</span>[] {
</span></span><span style="display:flex;"><span>  <span style="color:#66d9ef">if</span> (<span style="color:#a6e22e">messages</span>.<span style="color:#a6e22e">length</span> <span style="color:#f92672">&lt;=</span> <span style="color:#a6e22e">maxMessages</span>) <span style="color:#66d9ef">return</span> <span style="color:#a6e22e">messages</span>;
</span></span><span style="display:flex;"><span>  <span style="color:#75715e">// Keep system message (if any) + last N messages
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span>  <span style="color:#66d9ef">const</span> <span style="color:#a6e22e">systemMessages</span> <span style="color:#f92672">=</span> <span style="color:#a6e22e">messages</span>.<span style="color:#a6e22e">filter</span>((<span style="color:#a6e22e">m</span>) <span style="color:#f92672">=&gt;</span> <span style="color:#a6e22e">m</span>.<span style="color:#a6e22e">role</span> <span style="color:#f92672">===</span> <span style="color:#e6db74">&#34;system&#34;</span>);
</span></span><span style="display:flex;"><span>  <span style="color:#66d9ef">const</span> <span style="color:#a6e22e">nonSystem</span> <span style="color:#f92672">=</span> <span style="color:#a6e22e">messages</span>.<span style="color:#a6e22e">filter</span>((<span style="color:#a6e22e">m</span>) <span style="color:#f92672">=&gt;</span> <span style="color:#a6e22e">m</span>.<span style="color:#a6e22e">role</span> <span style="color:#f92672">!==</span> <span style="color:#e6db74">&#34;system&#34;</span>);
</span></span><span style="display:flex;"><span>  <span style="color:#66d9ef">return</span> [...<span style="color:#a6e22e">systemMessages</span>, ...<span style="color:#a6e22e">nonSystem</span>.<span style="color:#a6e22e">slice</span>(<span style="color:#f92672">-</span><span style="color:#a6e22e">maxMessages</span>)];
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div><h3 id="rag-integration-pattern">RAG Integration Pattern</h3>
<p>Retrieval-Augmented Generation injects relevant context into the prompt before the model responds. With tool calling, the model can decide when to retrieve:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-typescript" data-lang="typescript"><span style="display:flex;"><span><span style="color:#66d9ef">import</span> { <span style="color:#a6e22e">generateText</span>, <span style="color:#a6e22e">tool</span> } <span style="color:#66d9ef">from</span> <span style="color:#e6db74">&#34;ai&#34;</span>;
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">import</span> { <span style="color:#a6e22e">openai</span> } <span style="color:#66d9ef">from</span> <span style="color:#e6db74">&#34;@/lib/ai&#34;</span>;
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">import</span> { <span style="color:#a6e22e">z</span> } <span style="color:#66d9ef">from</span> <span style="color:#e6db74">&#34;zod&#34;</span>;
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">const</span> <span style="color:#a6e22e">ragAgent</span> <span style="color:#f92672">=</span> <span style="color:#66d9ef">async</span> (<span style="color:#a6e22e">query</span>: <span style="color:#66d9ef">string</span>) <span style="color:#f92672">=&gt;</span> {
</span></span><span style="display:flex;"><span>  <span style="color:#66d9ef">const</span> <span style="color:#a6e22e">result</span> <span style="color:#f92672">=</span> <span style="color:#66d9ef">await</span> <span style="color:#a6e22e">generateText</span>({
</span></span><span style="display:flex;"><span>    <span style="color:#a6e22e">model</span>: <span style="color:#66d9ef">openai</span>(<span style="color:#e6db74">&#34;gpt-4o&#34;</span>),
</span></span><span style="display:flex;"><span>    <span style="color:#a6e22e">maxSteps</span>: <span style="color:#66d9ef">3</span>,
</span></span><span style="display:flex;"><span>    <span style="color:#a6e22e">system</span><span style="color:#f92672">:</span> <span style="color:#e6db74">&#34;Answer questions based on the documentation. Use the search tool to find relevant docs before answering.&#34;</span>,
</span></span><span style="display:flex;"><span>    <span style="color:#a6e22e">tools</span><span style="color:#f92672">:</span> {
</span></span><span style="display:flex;"><span>      <span style="color:#a6e22e">searchDocs</span>: <span style="color:#66d9ef">tool</span>({
</span></span><span style="display:flex;"><span>        <span style="color:#a6e22e">description</span><span style="color:#f92672">:</span> <span style="color:#e6db74">&#34;Search the product documentation&#34;</span>,
</span></span><span style="display:flex;"><span>        <span style="color:#a6e22e">parameters</span>: <span style="color:#66d9ef">z.object</span>({ <span style="color:#a6e22e">query</span>: <span style="color:#66d9ef">z.string</span>() }),
</span></span><span style="display:flex;"><span>        <span style="color:#a6e22e">execute</span>: <span style="color:#66d9ef">async</span> ({ <span style="color:#a6e22e">query</span> }) <span style="color:#f92672">=&gt;</span> {
</span></span><span style="display:flex;"><span>          <span style="color:#66d9ef">const</span> <span style="color:#a6e22e">results</span> <span style="color:#f92672">=</span> <span style="color:#66d9ef">await</span> <span style="color:#a6e22e">vectorStore</span>.<span style="color:#a6e22e">search</span>(<span style="color:#a6e22e">query</span>, { <span style="color:#a6e22e">topK</span>: <span style="color:#66d9ef">5</span> });
</span></span><span style="display:flex;"><span>          <span style="color:#66d9ef">return</span> <span style="color:#a6e22e">results</span>.<span style="color:#a6e22e">map</span>((<span style="color:#a6e22e">r</span>) <span style="color:#f92672">=&gt;</span> ({ <span style="color:#a6e22e">content</span>: <span style="color:#66d9ef">r.text</span>, <span style="color:#a6e22e">source</span>: <span style="color:#66d9ef">r.metadata.source</span> }));
</span></span><span style="display:flex;"><span>        },
</span></span><span style="display:flex;"><span>      }),
</span></span><span style="display:flex;"><span>    },
</span></span><span style="display:flex;"><span>    <span style="color:#a6e22e">prompt</span>: <span style="color:#66d9ef">query</span>,
</span></span><span style="display:flex;"><span>  });
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>  <span style="color:#66d9ef">return</span> <span style="color:#a6e22e">result</span>.<span style="color:#a6e22e">text</span>;
</span></span><span style="display:flex;"><span>};
</span></span></code></pre></div><p>The alternative is &ldquo;always-retrieve&rdquo; — fetch context on every request and inject it into the system prompt. The tool-calling approach is more efficient because the model retrieves only when needed.</p>
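<p>For comparison, a minimal sketch of the always-retrieve assembly step, assuming a hypothetical <code>buildAugmentedSystemPrompt</code> helper and <code>Snippet</code> shape (not AI SDK APIs); the retrieved text is concatenated into the system prompt before every model call:</p>

```typescript
// Sketch of the "always-retrieve" pattern: context is injected on every
// request instead of letting the model decide when to call a search tool.
// `Snippet` and `buildAugmentedSystemPrompt` are illustrative names.
interface Snippet {
  content: string;
  source: string;
}

function buildAugmentedSystemPrompt(base: string, snippets: Snippet[]): string {
  // Concatenate retrieved snippets under a clearly delimited context section
  const context = snippets
    .map((s, i) => `[${i + 1}] (${s.source}) ${s.content}`)
    .join("\n");
  return `${base}\n\nUse only the following context to answer:\n${context}`;
}

// The result would then be passed as `system` to generateText/streamText.
const prompt = buildAugmentedSystemPrompt("Answer questions about the product.", [
  { content: "Refunds are processed within 5 days.", source: "billing.md" },
]);
```

<p>The cost of this simplicity is that every request pays for a retrieval and the extra prompt tokens, even when the question needs no context.</p>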
<h2 id="vercel-workflows-long-running-agents-that-survive">Vercel Workflows: Long-Running Agents That Survive</h2>
<h3 id="what-are-vercel-workflows-and-when-to-use-them">What Are Vercel Workflows and When to Use Them</h3>
<p>Serverless functions have timeout limits — 10 seconds on Vercel Hobby, 300 seconds on Pro. An agent that needs to call multiple tools, wait for external APIs, or generate long content can easily exceed these limits.</p>
<p>Vercel Workflows solve this by providing durable execution. A workflow can:</p>
<ul>
<li><strong>Suspend</strong> execution and resume when an event occurs (user approval, external API callback)</li>
<li><strong>Survive</strong> function restarts — state is persisted, not lost</li>
<li><strong>Run</strong> for minutes or hours, not seconds</li>
</ul>
<p>Use Workflows when your agent needs to:</p>
<ul>
<li>Wait for human approval before acting</li>
<li>Chain many tool calls that exceed function timeout</li>
<li>Process long-running tasks (content generation pipelines, data analysis)</li>
</ul>
<h3 id="suspend-resume-and-survive-timeouts">Suspend, Resume, and Survive Timeouts</h3>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-typescript" data-lang="typescript"><span style="display:flex;"><span><span style="color:#66d9ef">import</span> { <span style="color:#a6e22e">workflow</span> } <span style="color:#66d9ef">from</span> <span style="color:#e6db74">&#34;@vercel/orchestration&#34;</span>;
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">import</span> { <span style="color:#a6e22e">generateObject</span>, <span style="color:#a6e22e">generateText</span> } <span style="color:#66d9ef">from</span> <span style="color:#e6db74">&#34;ai&#34;</span>;
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">import</span> { <span style="color:#a6e22e">openai</span> } <span style="color:#66d9ef">from</span> <span style="color:#e6db74">&#34;@/lib/ai&#34;</span>;
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">import</span> { <span style="color:#a6e22e">z</span> } <span style="color:#66d9ef">from</span> <span style="color:#e6db74">&#34;zod&#34;</span>;
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">const</span> <span style="color:#a6e22e">contentPipeline</span> <span style="color:#f92672">=</span> <span style="color:#a6e22e">workflow</span>.<span style="color:#a6e22e">define</span>({
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">id</span><span style="color:#f92672">:</span> <span style="color:#e6db74">&#34;content-generation&#34;</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">timeout</span><span style="color:#f92672">:</span> <span style="color:#e6db74">&#34;1h&#34;</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#66d9ef">async</span> <span style="color:#a6e22e">run</span>(<span style="color:#a6e22e">ctx</span>) {
</span></span><span style="display:flex;"><span>    <span style="color:#75715e">// Step 1: Generate outline
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span>    <span style="color:#66d9ef">const</span> { <span style="color:#66d9ef">object</span><span style="color:#f92672">:</span> <span style="color:#a6e22e">outline</span> } <span style="color:#f92672">=</span> <span style="color:#66d9ef">await</span> <span style="color:#a6e22e">generateObject</span>({
</span></span><span style="display:flex;"><span>      <span style="color:#a6e22e">model</span>: <span style="color:#66d9ef">openai</span>(<span style="color:#e6db74">&#34;gpt-4o&#34;</span>),
</span></span><span style="display:flex;"><span>      <span style="color:#a6e22e">schema</span>: <span style="color:#66d9ef">z.object</span>({
</span></span><span style="display:flex;"><span>        <span style="color:#a6e22e">title</span>: <span style="color:#66d9ef">z.string</span>(),
</span></span><span style="display:flex;"><span>        <span style="color:#a6e22e">sections</span>: <span style="color:#66d9ef">z.array</span>(<span style="color:#a6e22e">z</span>.<span style="color:#66d9ef">object</span>({
</span></span><span style="display:flex;"><span>          <span style="color:#a6e22e">heading</span>: <span style="color:#66d9ef">z.string</span>(),
</span></span><span style="display:flex;"><span>          <span style="color:#a6e22e">keyPoints</span>: <span style="color:#66d9ef">z.array</span>(<span style="color:#a6e22e">z</span>.<span style="color:#66d9ef">string</span>()),
</span></span><span style="display:flex;"><span>        })),
</span></span><span style="display:flex;"><span>      }),
</span></span><span style="display:flex;"><span>      <span style="color:#a6e22e">prompt</span>: <span style="color:#66d9ef">ctx.request.topic</span>,
</span></span><span style="display:flex;"><span>    });
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#75715e">// Step 2: Suspend for human review
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span>    <span style="color:#66d9ef">const</span> <span style="color:#a6e22e">approval</span> <span style="color:#f92672">=</span> <span style="color:#66d9ef">await</span> <span style="color:#a6e22e">ctx</span>.<span style="color:#a6e22e">waitForEvent</span>(<span style="color:#e6db74">&#34;approval&#34;</span>, {
</span></span><span style="display:flex;"><span>      <span style="color:#a6e22e">timeout</span><span style="color:#f92672">:</span> <span style="color:#e6db74">&#34;24h&#34;</span>,
</span></span><span style="display:flex;"><span>      <span style="color:#a6e22e">data</span><span style="color:#f92672">:</span> { <span style="color:#a6e22e">outline</span> },
</span></span><span style="display:flex;"><span>    });
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">if</span> (<span style="color:#f92672">!</span><span style="color:#a6e22e">approval</span>.<span style="color:#a6e22e">approved</span>) {
</span></span><span style="display:flex;"><span>      <span style="color:#66d9ef">return</span> { <span style="color:#a6e22e">status</span><span style="color:#f92672">:</span> <span style="color:#e6db74">&#34;rejected&#34;</span>, <span style="color:#a6e22e">reason</span>: <span style="color:#66d9ef">approval.reason</span> };
</span></span><span style="display:flex;"><span>    }
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#75715e">// Step 3: Generate full content from approved outline
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span>    <span style="color:#66d9ef">const</span> <span style="color:#a6e22e">sections</span> <span style="color:#f92672">=</span> [];
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">for</span> (<span style="color:#66d9ef">const</span> <span style="color:#a6e22e">section</span> <span style="color:#66d9ef">of</span> <span style="color:#a6e22e">outline</span>.<span style="color:#a6e22e">sections</span>) {
</span></span><span style="display:flex;"><span>      <span style="color:#66d9ef">const</span> { <span style="color:#a6e22e">text</span> } <span style="color:#f92672">=</span> <span style="color:#66d9ef">await</span> <span style="color:#a6e22e">generateText</span>({
</span></span><span style="display:flex;"><span>        <span style="color:#a6e22e">model</span>: <span style="color:#66d9ef">openai</span>(<span style="color:#e6db74">&#34;gpt-4o&#34;</span>),
</span></span><span style="display:flex;"><span>        <span style="color:#a6e22e">prompt</span><span style="color:#f92672">:</span> <span style="color:#e6db74">`Write a detailed section based on: </span><span style="color:#e6db74">${</span><span style="color:#a6e22e">JSON</span>.<span style="color:#a6e22e">stringify</span>(<span style="color:#a6e22e">section</span>)<span style="color:#e6db74">}</span><span style="color:#e6db74">`</span>,
</span></span><span style="display:flex;"><span>      });
</span></span><span style="display:flex;"><span>      <span style="color:#a6e22e">sections</span>.<span style="color:#a6e22e">push</span>({ <span style="color:#a6e22e">heading</span>: <span style="color:#66d9ef">section.heading</span>, <span style="color:#a6e22e">content</span>: <span style="color:#66d9ef">text</span> });
</span></span><span style="display:flex;"><span>    }
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">return</span> { <span style="color:#a6e22e">status</span><span style="color:#f92672">:</span> <span style="color:#e6db74">&#34;complete&#34;</span>, <span style="color:#a6e22e">content</span><span style="color:#f92672">:</span> { <span style="color:#a6e22e">title</span>: <span style="color:#66d9ef">outline.title</span>, <span style="color:#a6e22e">sections</span> } };
</span></span><span style="display:flex;"><span>  },
</span></span><span style="display:flex;"><span>});
</span></span></code></pre></div><h3 id="building-durable-ai-agents">Building Durable AI Agents</h3>
<p>The key difference between a standard <code>maxSteps</code> agent and a workflow agent is durability. A <code>maxSteps</code> agent runs in a single function invocation. If the function is interrupted, the agent loses all progress. A workflow agent persists state at each step. If the function restarts, the workflow picks up where it left off.</p>
<p>This makes Workflows appropriate for:</p>
<ul>
<li>Content generation pipelines with human review steps</li>
<li>Data analysis agents that run queries over minutes</li>
<li>Multi-agent systems where one agent delegates to another and waits for results</li>
<li>Any agent that interacts with asynchronous external systems</li>
</ul>
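<p>The replay mechanism behind durable execution can be sketched in plain TypeScript: each step's result is persisted under a stable key, so a restarted run skips straight past finished steps instead of re-executing them. This is an illustrative model of the idea, not the Workflows implementation; <code>step</code>, <code>pipeline</code>, and the in-memory store are hypothetical stand-ins for persisted state.</p>

```typescript
// Illustrative model of durable execution (not the real Workflows internals):
// finished steps are persisted under stable keys, so a re-run after a crash
// replays stored results instead of doing the work again.
const persisted: { [stepKey: string]: string } = {};
let executions = 0; // counts real step-body executions (not replays)

function step(key: string, fn: () => string): string {
  if (key in persisted) return persisted[key]; // replay path: skip the work
  const result = fn();     // first run: execute the step body
  persisted[key] = result; // persist the result before continuing
  return result;
}

function pipeline(): string {
  const outline = step("outline", () => { executions++; return "outline-v1"; });
  const draft = step("draft", () => { executions++; return outline + " / draft"; });
  return draft;
}

const firstRun = pipeline();  // executes both step bodies
const secondRun = pipeline(); // simulated restart: replays persisted results
```

<p>In a real workflow the store is durable infrastructure rather than a module-level object, which is what lets a run survive function restarts.</p>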
<h2 id="production-deployment-and-scaling">Production Deployment and Scaling</h2>
<h3 id="deploying-to-vercel-vs-self-hosting">Deploying to Vercel vs Self-Hosting</h3>
<table>
  <thead>
      <tr>
          <th>Aspect</th>
          <th>Vercel</th>
          <th>Self-Hosted (Node/Docker)</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td>Streaming</td>
          <td>Native (Edge + Node)</td>
          <td>Requires proper streaming server setup</td>
      </tr>
      <tr>
          <td>Cold starts</td>
          <td>Edge: ~50ms, Serverless: ~200ms</td>
          <td>Depends on your infrastructure</td>
      </tr>
      <tr>
          <td>Timeouts</td>
          <td>10s (Hobby), 300s (Pro), Workflows for longer</td>
          <td>No limit (configure yourself)</td>
      </tr>
      <tr>
          <td>AI Gateway</td>
          <td>Built-in</td>
          <td>Self-host or skip</td>
      </tr>
      <tr>
          <td>Sandbox</td>
          <td>Built-in</td>
          <td>Bring your own sandbox</td>
      </tr>
      <tr>
          <td>Scaling</td>
          <td>Automatic</td>
          <td>Manual or Kubernetes</td>
      </tr>
  </tbody>
</table>
<p>For many teams, Vercel is the fastest path to production. The Edge runtime handles streaming with minimal latency. Self-hosting gives you more control over timeouts and compute resources.</p>
<h3 id="edge-runtime-considerations">Edge Runtime Considerations</h3>
<p>The AI SDK runs on both the Edge and Node.js runtimes. On Edge, keep these constraints in mind:</p>
<ul>
<li>No Node.js-specific APIs (fs, child_process, native modules) in edge functions</li>
<li>Tool <code>execute</code> functions that depend on Node APIs must run on the Node runtime</li>
<li>You can mix: stream on Edge, execute tools on Node</li>
</ul>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-typescript" data-lang="typescript"><span style="display:flex;"><span><span style="color:#75715e">// If your tools need Node.js APIs, use the Node runtime
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span><span style="color:#66d9ef">export</span> <span style="color:#66d9ef">const</span> <span style="color:#a6e22e">runtime</span> <span style="color:#f92672">=</span> <span style="color:#e6db74">&#34;nodejs&#34;</span>; <span style="color:#75715e">// not &#34;edge&#34;
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span>
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">export</span> <span style="color:#66d9ef">async</span> <span style="color:#66d9ef">function</span> <span style="color:#a6e22e">POST</span>(<span style="color:#a6e22e">req</span>: <span style="color:#66d9ef">Request</span>) {
</span></span><span style="display:flex;"><span>  <span style="color:#75715e">// This function can use fs, child_process, etc.
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span>  <span style="color:#66d9ef">const</span> <span style="color:#a6e22e">result</span> <span style="color:#f92672">=</span> <span style="color:#a6e22e">streamText</span>({
</span></span><span style="display:flex;"><span>    <span style="color:#a6e22e">model</span>: <span style="color:#66d9ef">openai</span>(<span style="color:#e6db74">&#34;gpt-4o&#34;</span>),
</span></span><span style="display:flex;"><span>    <span style="color:#a6e22e">messages</span>: <span style="color:#66d9ef">await</span> <span style="color:#a6e22e">req</span>.<span style="color:#a6e22e">json</span>(),
</span></span><span style="display:flex;"><span>    <span style="color:#a6e22e">tools</span><span style="color:#f92672">:</span> { <span style="color:#75715e">/* node-dependent tools */</span> },
</span></span><span style="display:flex;"><span>  });
</span></span><span style="display:flex;"><span>  <span style="color:#66d9ef">return</span> <span style="color:#a6e22e">result</span>.<span style="color:#a6e22e">toDataStreamResponse</span>();
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div><h3 id="rate-limiting-and-cost-management">Rate Limiting and Cost Management</h3>
<p>LLM API calls cost money. Without limits, a single user can rack up significant charges. Implement rate limiting at the API route level:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-typescript" data-lang="typescript"><span style="display:flex;"><span><span style="color:#66d9ef">import</span> { <span style="color:#a6e22e">Ratelimit</span> } <span style="color:#66d9ef">from</span> <span style="color:#e6db74">&#34;@upstash/ratelimit&#34;</span>;
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">import</span> { <span style="color:#a6e22e">Redis</span> } <span style="color:#66d9ef">from</span> <span style="color:#e6db74">&#34;@upstash/redis&#34;</span>;
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">const</span> <span style="color:#a6e22e">ratelimit</span> <span style="color:#f92672">=</span> <span style="color:#66d9ef">new</span> <span style="color:#a6e22e">Ratelimit</span>({
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">redis</span>: <span style="color:#66d9ef">Redis.fromEnv</span>(),
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">limiter</span>: <span style="color:#66d9ef">Ratelimit.slidingWindow</span>(<span style="color:#ae81ff">10</span>, <span style="color:#e6db74">&#34;1m&#34;</span>), <span style="color:#75715e">// 10 requests per minute
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span>});
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">export</span> <span style="color:#66d9ef">async</span> <span style="color:#66d9ef">function</span> <span style="color:#a6e22e">POST</span>(<span style="color:#a6e22e">req</span>: <span style="color:#66d9ef">Request</span>) {
</span></span><span style="display:flex;"><span>  <span style="color:#66d9ef">const</span> <span style="color:#a6e22e">ip</span> <span style="color:#f92672">=</span> <span style="color:#a6e22e">req</span>.<span style="color:#a6e22e">headers</span>.<span style="color:#66d9ef">get</span>(<span style="color:#e6db74">&#34;x-forwarded-for&#34;</span>) <span style="color:#f92672">??</span> <span style="color:#e6db74">&#34;anonymous&#34;</span>;
</span></span><span style="display:flex;"><span>  <span style="color:#66d9ef">const</span> { <span style="color:#a6e22e">success</span> } <span style="color:#f92672">=</span> <span style="color:#66d9ef">await</span> <span style="color:#a6e22e">ratelimit</span>.<span style="color:#a6e22e">limit</span>(<span style="color:#a6e22e">ip</span>);
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>  <span style="color:#66d9ef">if</span> (<span style="color:#f92672">!</span><span style="color:#a6e22e">success</span>) {
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">return</span> <span style="color:#66d9ef">new</span> <span style="color:#a6e22e">Response</span>(<span style="color:#e6db74">&#34;Rate limit exceeded&#34;</span>, { <span style="color:#a6e22e">status</span>: <span style="color:#66d9ef">429</span> });
</span></span><span style="display:flex;"><span>  }
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>  <span style="color:#75715e">// ... AI SDK call
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span>}
</span></span></code></pre></div><p>Track token usage from the result object to monitor costs:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-typescript" data-lang="typescript"><span style="display:flex;"><span><span style="color:#66d9ef">const</span> <span style="color:#a6e22e">result</span> <span style="color:#f92672">=</span> <span style="color:#66d9ef">await</span> <span style="color:#a6e22e">generateText</span>({
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">model</span>: <span style="color:#66d9ef">openai</span>(<span style="color:#e6db74">&#34;gpt-4o&#34;</span>),
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">prompt</span><span style="color:#f92672">:</span> <span style="color:#e6db74">&#34;...&#34;</span>,
</span></span><span style="display:flex;"><span>});
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#a6e22e">console</span>.<span style="color:#a6e22e">log</span>(<span style="color:#a6e22e">result</span>.<span style="color:#a6e22e">usage</span>);
</span></span><span style="display:flex;"><span><span style="color:#75715e">// { promptTokens: 124, completionTokens: 89, totalTokens: 213 }
</span></span></span></code></pre></div><p>Log this to your observability system. Vercel logs capture request metadata automatically. For custom tracking, send <code>result.usage</code> to your analytics endpoint.</p>
<h3 id="monitoring-with-vercel-logs-and-observability">Monitoring with Vercel Logs and Observability</h3>
<p>The AI SDK supports OpenTelemetry for distributed tracing. On Vercel, request-level logs are available in the dashboard. For deeper observability:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-typescript" data-lang="typescript"><span style="display:flex;"><span><span style="color:#66d9ef">import</span> { <span style="color:#a6e22e">generateText</span> } <span style="color:#66d9ef">from</span> <span style="color:#e6db74">&#34;ai&#34;</span>;
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">import</span> { <span style="color:#a6e22e">openai</span> } <span style="color:#66d9ef">from</span> <span style="color:#e6db74">&#34;@/lib/ai&#34;</span>;
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">const</span> <span style="color:#a6e22e">result</span> <span style="color:#f92672">=</span> <span style="color:#66d9ef">await</span> <span style="color:#a6e22e">generateText</span>({
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">model</span>: <span style="color:#66d9ef">openai</span>(<span style="color:#e6db74">&#34;gpt-4o&#34;</span>),
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">prompt</span><span style="color:#f92672">:</span> <span style="color:#e6db74">&#34;...&#34;</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">experimental_telemetry</span><span style="color:#f92672">:</span> {
</span></span><span style="display:flex;"><span>    <span style="color:#a6e22e">isEnabled</span>: <span style="color:#66d9ef">true</span>,
</span></span><span style="display:flex;"><span>    <span style="color:#a6e22e">functionId</span><span style="color:#f92672">:</span> <span style="color:#e6db74">&#34;chat-completion&#34;</span>,
</span></span><span style="display:flex;"><span>    <span style="color:#a6e22e">metadata</span><span style="color:#f92672">:</span> { <span style="color:#a6e22e">userId</span><span style="color:#f92672">:</span> <span style="color:#e6db74">&#34;user-123&#34;</span> },
</span></span><span style="display:flex;"><span>  },
</span></span><span style="display:flex;"><span>});
</span></span></code></pre></div><p>This exports spans and metrics to your OpenTelemetry collector. You can trace the full lifecycle: request received, model called, tokens streamed, response completed.</p>
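<p>To actually collect those spans on a Vercel deployment, one minimal setup (a sketch, assuming the <code>@vercel/otel</code> package and a Next.js <code>instrumentation.ts</code> file; <code>serviceName</code> is an arbitrary label you choose) looks like this:</p>

```typescript
// instrumentation.ts — picked up automatically by Next.js.
// Assumes the @vercel/otel package is installed; "my-ai-app" is an
// illustrative service name, not a required value.
import { registerOTel } from "@vercel/otel";

export function register() {
  registerOTel({ serviceName: "my-ai-app" });
}
```

<p>With this registered, the <code>experimental_telemetry</code> spans emitted by the AI SDK flow into whichever OpenTelemetry backend your project is connected to.</p>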
<h2 id="ai-sdk-vs-langchain-vs-mastra-framework-comparison">AI SDK vs LangChain vs Mastra: Framework Comparison</h2>
<p>Three frameworks dominate TypeScript AI development in 2026. Here is how they compare.</p>
<h3 id="feature-matrix">Feature Matrix</h3>
<table>
  <thead>
      <tr>
          <th>Feature</th>
          <th>AI SDK</th>
          <th>LangChain.ts</th>
          <th>Mastra</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td>Streaming</td>
          <td>First-class, built-in</td>
          <td>Supported, requires config</td>
          <td>Built-in</td>
      </tr>
      <tr>
          <td>Tool calling</td>
          <td>Native, Zod-based</td>
          <td>Native, Zod-based</td>
          <td>Native, Zod-based</td>
      </tr>
      <tr>
          <td>Structured output</td>
          <td>generateObject/streamObject</td>
          <td>StructuredOutputParser</td>
          <td>generateObject</td>
      </tr>
      <tr>
          <td>Multi-step agents</td>
          <td>maxSteps</td>
          <td>AgentExecutor</td>
          <td>Agent loops</td>
      </tr>
      <tr>
          <td>UI hooks</td>
          <td>Built-in (React, Vue, Svelte, Solid)</td>
          <td>None (bring your own)</td>
          <td>Built-in (React)</td>
      </tr>
      <tr>
          <td>Provider count</td>
          <td>16+ / 100+ models</td>
          <td>30+ providers</td>
          <td>10+ providers</td>
      </tr>
      <tr>
          <td>Durable workflows</td>
          <td>Vercel Workflows integration</td>
          <td>LangGraph for state machines</td>
          <td>Built-in workflow engine</td>
      </tr>
      <tr>
          <td>Bundle size</td>
          <td>~15KB core</td>
          <td>~200KB+</td>
          <td>~25KB core</td>
      </tr>
      <tr>
          <td>Edge runtime</td>
          <td>Yes</td>
          <td>Partial</td>
          <td>Yes</td>
      </tr>
      <tr>
          <td>RSC support</td>
          <td>Built-in</td>
          <td>None</td>
          <td>Partial</td>
      </tr>
  </tbody>
</table>
<h3 id="performance-bundle-size-and-latency">Performance: Bundle Size and Latency</h3>
<p>AI SDK is significantly lighter than LangChain. The core <code>ai</code> package is roughly 15KB minified. LangChain.ts with its dependency chain can exceed 200KB. This matters for client-side imports and cold start times.</p>
<p>For latency, the overhead added by each framework is negligible compared to model inference time (hundreds of milliseconds to seconds). The real latency difference comes from streaming architecture. AI SDK streams are optimized for Edge runtime with minimal buffering, which translates to lower time-to-first-token on Vercel.</p>
<h3 id="when-to-choose-each-framework">When to Choose Each Framework</h3>
<table>
  <thead>
      <tr>
          <th>Choose AI SDK when</th>
          <th>Choose LangChain when</th>
          <th>Choose Mastra when</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td>Building on Next.js/Vercel</td>
          <td>You need LangGraph&rsquo;s state machine model</td>
          <td>You want an all-in-one agent framework</td>
      </tr>
      <tr>
          <td>You want minimal bundle size</td>
          <td>You&rsquo;re porting from LangChain Python</td>
          <td>You need built-in RAG pipelines</td>
      </tr>
      <tr>
          <td>Streaming is a priority</td>
          <td>You need 30+ niche providers</td>
          <td>You prefer convention over configuration</td>
      </tr>
      <tr>
          <td>You want UI hooks out of the box</td>
          <td>Your team already uses LangChain</td>
          <td>You want built-in MCP server support</td>
      </tr>
      <tr>
          <td>Edge/runtime compatibility matters</td>
          <td>You need LangSmith for observability</td>
          <td>You want a hosted agent dashboard</td>
      </tr>
  </tbody>
</table>
<p>The honest answer: for most TypeScript AI apps in 2026, AI SDK is the practical default. LangChain is the right choice for teams that need its larger ecosystem or are migrating from Python. Mastra is worth evaluating if you want a more opinionated framework with built-in infrastructure.</p>
<h2 id="conclusion-and-resources">Conclusion and Resources</h2>
<h3 id="key-takeaways">Key Takeaways</h3>
<ol>
<li><strong>AI SDK Core</strong> handles text generation, streaming, structured output, and tool calling with a provider-agnostic API. Switch models by changing one argument.</li>
<li><strong>Structured output</strong> with <code>generateObject</code> and Zod schemas gives you type-safe JSON from LLM responses — no more parsing raw text.</li>
<li><strong>UI hooks</strong> (<code>useChat</code>, <code>useCompletion</code>, <code>useObject</code>) handle client-side state, streaming, and rendering across React, Vue, Svelte, and Solid.</li>
<li><strong>Tool calling + maxSteps</strong> turns a text generator into a multi-step agent. The model decides when and which tools to call.</li>
<li><strong>Vercel Workflows</strong> add durability for long-running agents that exceed function timeouts or need human-in-the-loop patterns.</li>
<li><strong>Production readiness</strong> means rate limiting, token usage tracking, and runtime selection (Edge vs Node). The SDK gives you the hooks; you add the guardrails.</li>
</ol>
<h3 id="official-docs-templates-and-cookbooks">Official Docs, Templates, and Cookbooks</h3>
<ul>
<li><a href="https://ai-sdk.dev/docs">AI SDK Documentation</a> — the canonical reference for all APIs</li>
<li><a href="https://github.com/vercel/ai">AI SDK GitHub</a> — source, issues, and releases</li>
<li><a href="https://vercel.com/templates/next.js/nextjs-ai-chatbot">Next.js AI Chatbot Template</a> — production-ready starter with auth, persistence, and multi-model support</li>
<li><a href="https://ai-sdk.dev/docs/cookbook">AI SDK Cookbooks</a> — patterns for RAG, agents, and structured output</li>
</ul>
<h3 id="community">Community</h3>
<ul>
<li><a href="https://github.com/vercel/ai/discussions">GitHub Discussions</a> — questions, feature requests, and architecture discussions</li>
<li><a href="https://vercel.com/discord">Vercel Discord</a> — real-time help from the community and maintainers</li>
</ul>
<h3 id="whats-next">What&rsquo;s Next</h3>
<p>Two developments to watch in the second half of 2026:</p>
<ul>
<li><strong>AI Elements</strong>: The new component library for AI-native UIs (message threads, tool result renderers, artifact views). Currently in beta, expected to reach stable by Q3.</li>
<li><strong>Workflows evolution</strong>: Vercel Workflows are new. Expect more patterns, better debugging, and tighter AI SDK integration as the API matures.</li>
</ul>
<p>The AI SDK has reached the point where the core APIs are stable and production-proven. The ecosystem around it — Workflows, Sandbox, AI Elements — is still evolving rapidly. Start with Core, add pieces as your use case demands them.</p>
]]></content:encoded></item></channel></rss>