Vercel AI SDK integration
Add long-term agent memory to Vercel AI SDK applications
The @getzep/zep-vercel-ai package adds long-term memory to the Vercel AI SDK (v6), backed by Zep’s temporal knowledge graph. It exposes Zep through three layers so you can pick the integration point that fits your call: middleware, helpers, and tools.
Core benefits
- Automatic context injection: Middleware prepends Zep’s context block as a system message on each new user turn
- One write per turn: An onFinish callback persists the full turn once, even across a multi-step tool loop
- On-demand tools: Let the model search and persist memory explicitly inside a tool loop
- Works with generateText and streamText: The same inject-and-persist pattern applies to both
- Graceful degradation: A Zep outage degrades to “no memory” and never crashes the host call
How it works
Inject the context block via middleware and persist the whole turn via onFinish: the tool loop calls the model once per step, so persisting from a per-step hook would fragment one turn across many writes.
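The package exposes three layers:
- Middleware: createZepMiddleware injects the context block as a system message on each new user turn
- Helpers: getZepContext and persistZepTurn fetch the context block and write a turn directly, with no framework coupling
- Tools: createZepTools gives the model zepSearch, zepRemember, and zepContext to call inside a tool loop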
createZepOnFinish fires exactly once per turn with the final assistant text, so persistence lives there.
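The difference between a per-step hook and a per-turn callback can be illustrated with a small simulation. This is only a sketch: runTurn, Step, and the hook names are hypothetical stand-ins for a tool loop, and nothing here calls Zep or the AI SDK.

```typescript
// Illustrative simulation only: no Zep or AI SDK calls are made here.
type Step = { text: string; toolCalls: number };

type Hooks = {
  onStep?: (step: Step) => void; // fires once per model call
  onFinish?: (finalText: string) => void; // fires once per turn
};

// Hypothetical stand-in for a multi-step tool loop.
function runTurn(steps: Step[], hooks: Hooks): void {
  for (const step of steps) {
    hooks.onStep?.(step);
  }
  const last = steps[steps.length - 1];
  if (last !== undefined) {
    hooks.onFinish?.(last.text);
  }
}

const steps: Step[] = [
  { text: "Let me look that up.", toolCalls: 1 },
  { text: "Checking one more source.", toolCalls: 1 },
  { text: "Your order ships on Friday.", toolCalls: 0 },
];

const perStepWrites: string[] = [];
const perTurnWrites: string[] = [];

runTurn(steps, {
  onStep: (step) => perStepWrites.push(step.text),
  onFinish: (finalText) => perTurnWrites.push(finalText),
});

console.log(perStepWrites.length); // 3: one write per model call
console.log(perTurnWrites.length); // 1: a single write with the final text
```

Persisting from the per-step hook would have stored the two intermediate preambles as if they were separate assistant replies.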
Installation
Requires Node.js 20+, ai>=6 (the Vercel AI SDK v6; not compatible with v5), zod 3 or 4, @getzep/zep-cloud>=3.23.0, and a Zep Cloud API key. You’ll also want a model provider such as @ai-sdk/openai. Get your API key from app.getzep.com.
Set up your environment variables:
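One way to fail fast on missing keys is a small startup check, sketched below. The variable names ZEP_API_KEY and OPENAI_API_KEY are assumptions based on common convention, not names confirmed by this page, and loadConfig is a hypothetical helper rather than part of the package.

```typescript
// Assumed variable names; confirm them against the package README.
const REQUIRED_VARS = ["ZEP_API_KEY", "OPENAI_API_KEY"] as const;

type RequiredVar = (typeof REQUIRED_VARS)[number];
type Env = Record<string, string | undefined>;

// Returns the required values, or throws one error listing everything missing.
function loadConfig(env: Env): Record<RequiredVar, string> {
  const missing = REQUIRED_VARS.filter((name) => !env[name]);
  if (missing.length > 0) {
    throw new Error(`Missing environment variables: ${missing.join(", ")}`);
  }
  return {
    ZEP_API_KEY: env["ZEP_API_KEY"] as string,
    OPENAI_API_KEY: env["OPENAI_API_KEY"] as string,
  };
}

// In an application you would pass process.env:
//   const config = loadConfig(process.env);
```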
Usage with generateText
Provision the Zep user and thread, wrap the model to inject context, optionally add tools, and persist the turn via onFinish:
If your OpenAI organization enforces Zero Data Retention, use openai.chat('gpt-4o-mini') (Chat Completions API) instead of openai('gpt-4o-mini'). The Responses API references server-persisted item IDs across a multi-step tool loop, which ZDR organizations reject. This is an OpenAI account constraint, not a Zep issue.
Usage with streamText
The same pattern works unchanged for streaming — inject via middleware, persist via onFinish:
To set the system prompt yourself instead of using the middleware, fetch the block with getZepContext and persist with persistZepTurn (or createZepOnFinish) directly.
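When composing the system prompt by hand, the main decision is what to do when no context comes back. The sketch below assumes the block has already been fetched as a string and that an empty string means no memory was available. buildSystemPrompt and the sample block text are hypothetical.

```typescript
// Assumption: an empty (or whitespace-only) block means no context was available.
function buildSystemPrompt(basePrompt: string, contextBlock: string): string {
  const block = contextBlock.trim();
  if (block === "") return basePrompt; // degrade to "no memory"
  return `${basePrompt}\n\n${block}`;
}

const basePrompt = "You are a helpful support agent.";

// Placeholder text; a real context block comes from Zep.
const withMemory = buildSystemPrompt(basePrompt, "FACTS: the user prefers email over phone.");
const withoutMemory = buildSystemPrompt(basePrompt, "");
```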
The layers in detail
createZepMiddleware
Returns a Vercel AI SDK LanguageModelMiddleware for wrapLanguageModel. Injection only — it does not persist. Its transformParams fetches the context block and prepends it as a system message, but only on a genuine new user turn (detected by the last prompt message being a user message). On tool-loop continuation steps it injects nothing, so the block is fetched at most once per turn. Options include formatContext, templateId, and logger.
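The new-turn check described above can be sketched in a few lines. This is an illustration of the rule, not the package's implementation: PromptMessage is a simplified shape (the AI SDK's real prompt type is richer), and isNewUserTurn and injectContext are hypothetical names.

```typescript
// Simplified prompt message shape for illustration.
type PromptMessage = {
  role: "system" | "user" | "assistant" | "tool";
  content: string;
};

// A new user turn is detected by the last prompt message being a user message.
function isNewUserTurn(prompt: PromptMessage[]): boolean {
  const last = prompt[prompt.length - 1];
  return last !== undefined && last.role === "user";
}

// Prepend the block as a system message on a new user turn;
// return the prompt untouched on tool-loop continuation steps.
function injectContext(prompt: PromptMessage[], contextBlock: string): PromptMessage[] {
  if (!isNewUserTurn(prompt)) return prompt;
  return [{ role: "system", content: contextBlock }, ...prompt];
}

const firstStep: PromptMessage[] = [{ role: "user", content: "Where is my order?" }];
const continuation: PromptMessage[] = [
  ...firstStep,
  { role: "assistant", content: "Calling the order lookup tool." },
  { role: "tool", content: '{"status":"shipped"}' },
];

const block = "FACTS: the user ordered a blue kettle."; // placeholder text
const injected = injectContext(firstStep, block); // system message + user message
const untouched = injectContext(continuation, block); // same array, nothing added
```

Because the continuation prompt ends with a tool message, the second call skips injection, which is why the block is needed at most once per turn.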
createZepOnFinish
Returns an onFinish callback that persists the whole turn once — the user’s input plus the final assistant text — via thread.addMessages. Because onFinish fires exactly once per turn for both generateText and streamText, it records exactly one user message and one assistant message and never writes intermediate tool-call preamble. Supply the user side via user (a string or a (event) => string resolver); the assistant side is taken from event.text.
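The two ways of supplying the user side can be sketched as follows. FinishEvent is reduced to the single text field mentioned above, and buildTurn is a hypothetical helper that shows the shape of what gets recorded. It does not call thread.addMessages.

```typescript
// Minimal event shape for illustration; the real onFinish event carries more fields.
type FinishEvent = { text: string };
type UserInput = string | ((event: FinishEvent) => string);
type TurnMessage = { role: "user" | "assistant"; content: string };

// Resolve the user side, then take the assistant side from event.text.
function buildTurn(user: UserInput, event: FinishEvent): TurnMessage[] {
  const userText = typeof user === "function" ? user(event) : user;
  return [
    { role: "user", content: userText },
    { role: "assistant", content: event.text },
  ];
}

const event: FinishEvent = { text: "Your order ships on Friday." };

// Fixed string form.
const fromString = buildTurn("Where is my order?", event);

// Resolver form: compute the user text when the turn finishes.
const fromResolver = buildTurn(() => "Where is my order?", event);
```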
getZepContext and persistZepTurn
Plain async functions with no framework coupling. getZepContext returns the prompt-ready context block string. persistZepTurn writes a { user?, assistant? } turn; pass { returnContext: true } to fold persist and retrieval into one round-trip.
createZepTools
Returns { zepSearch, zepRemember, zepContext } built with the AI SDK’s tool() and Zod schemas. Spread them into a generateText / streamText tools record so the model decides when to retrieve or persist. Each tool is also exported as a standalone factory (createZepSearchTool, createZepRememberTool, createZepContextTool).
The tools return typed results: zepSearch → { facts: string[], found: boolean }, zepRemember → { stored: boolean, message: string }, and zepContext → { context: string, found: boolean }. zepSearch defaults to the edges scope (facts/relationships) — the most useful scope for an agent recalling discrete claims — and its facts are extracted strings tailored to the bound scope (edge facts, "name: summary" for entities, episode content, and so on).
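For code that consumes these results outside the model loop, the three shapes can be written down as types. The field names come from the description above. The type names and the summarize helper are hypothetical and may not match what the package exports.

```typescript
// Result shapes as described above; the type names are illustrative.
type ZepSearchResult = { facts: string[]; found: boolean };
type ZepRememberResult = { stored: boolean; message: string };
type ZepContextResult = { context: string; found: boolean };

type ZepToolResult = ZepSearchResult | ZepRememberResult | ZepContextResult;

// Hypothetical helper: render any tool result as a short string for logging.
function summarize(result: ZepToolResult): string {
  if ("facts" in result) {
    return result.found ? result.facts.join("\n") : "No matching facts.";
  }
  if ("stored" in result) {
    return result.stored ? "Stored." : `Not stored: ${result.message}`;
  }
  return result.found ? result.context : "No context available.";
}

const searchSummary = summarize({
  facts: ["Alice works at Acme", "Alice prefers email"],
  found: true,
});
```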
Binding: user graph vs standalone graph
createZepTools is bound to a graph via a ZepBinding:
- userId targets a user graph, the home for personalized agent memory. zepContext and the middleware also need a threadId (the thread scopes relevance; retrieval still spans the whole user graph).
- graphId targets a standalone graph: shared or domain knowledge such as a product knowledge base or runbooks.
If both are set, userId wins. If neither is set, tools return a graceful “not configured” result instead of throwing.
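The precedence rule can be sketched as a small resolver. The ZepBinding fields follow the names used above, while Target and resolveTarget are hypothetical and only illustrate the rule. They are not the package's internals.

```typescript
// Binding fields as named above; all are optional in this sketch.
type ZepBinding = { userId?: string; threadId?: string; graphId?: string };

type Target =
  | { kind: "user"; userId: string; threadId: string | undefined }
  | { kind: "graph"; graphId: string }
  | { kind: "unconfigured" };

// userId takes precedence over graphId; with neither set, report
// "unconfigured" instead of throwing.
function resolveTarget(binding: ZepBinding): Target {
  if (binding.userId) {
    return { kind: "user", userId: binding.userId, threadId: binding.threadId };
  }
  if (binding.graphId) {
    return { kind: "graph", graphId: binding.graphId };
  }
  return { kind: "unconfigured" };
}

const both = resolveTarget({ userId: "user-1", graphId: "kb-1" }); // user graph wins
const shared = resolveTarget({ graphId: "kb-1" }); // standalone graph
const none = resolveTarget({}); // unconfigured, no exception
```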
Best practices
- Inject via middleware, persist via onFinish: this records exactly one user and one assistant message per turn
- Call ensureZepUserAndThread once before the first turn, then reuse a single ZepClient
- Use AI SDK v6: this package is not compatible with v5
- Don’t read-after-write within a turn: Zep builds the graph asynchronously, so a just-stored fact is not instantly retrievable
Next steps
- Explore customizing graph structure for advanced knowledge organization
- Learn about searching the graph and how to tune search
- See code examples for additional patterns