Zep is a context engineering platform that delivers the right contextual information to your AI agents at the right time. Through a temporal knowledge graph, Zep assembles relevant context from chat history, business data, and user behavior—enabling agents to make better decisions with accurate, up-to-date information. With a simple three-line API and sub-200ms retrieval latency, Zep helps you build personalized, reliable AI agents without extensive context pipeline engineering.
Get started with the example in the video using:
This guide shows you how to integrate Zep into your AI application to provide personalized context for every user interaction. You’ll learn how to ingest user messages and business data, then retrieve assembled context that includes user preferences, traits, and relevant facts—all optimized for your LLM’s context window.
Looking for a more in-depth understanding? Check out our Key Concepts page.
Migrating from Mem0? Check out our Mem0 Migration guide.
Set up your Python project, ideally with a virtual environment, and then:
After creating a Zep account, obtaining an API key, and setting the API key as an environment variable, initialize the client once at application startup and reuse it throughout your application.
Whenever users are created in your application, you need to trigger the creation of a Zep user. Make sure to include at least their first name, and ideally also their last name and email to ensure correct identification of the user in future messages. We recommend setting the Zep user ID equal to your internal user ID.
Backfilling existing users: For existing users, you will need to run a one-time migration to create a user for each of the existing users (simply loop through and call user.add for each).
Provide at least the first name and ideally the last name when calling user.add to ensure Zep correctly associates the user with references in your data. If needed, add this information later using the update user method.
Whenever a user starts a new conversation with your agent, you need to trigger the creation of a Zep thread. Learn more about adding messages.
Backfilling prior conversations: For prior conversations, you will need to run a one-time migration to create Zep threads for those conversations and add the prior messages to the respective Zep threads. For larger backfills, use the Batch API to ingest historical messages efficiently.
When a new user message comes in, add the user message to Zep, providing the user’s name in the message if possible.
It is important to provide the name of the user in the name field if possible, to help with graph construction.
Include the created_at timestamp (RFC3339 format) representing when the message was originally sent. This ensures accurate temporal understanding in the knowledge graph. See Setting message timestamps for more details.
Beyond chat messages, you can provide Zep with additional context about your users by sending business data directly to their knowledge graphs. This includes user interactions with your application, transactions, support tickets, emails, transcripts—essentially any information that gives context about the user and can be represented as text.
Use the graph.add method to send structured, semi-structured, or unstructured text data to Zep. Include a reference to the user—their full name, user ID, or both—so Zep can correctly associate the data with the user in their knowledge graph. Read more about adding business data.
Any text can be sent to Zep—structured JSON, semi-structured logs, or plain text descriptions. The example below shows a JSON event, but you could also send "User Jane Smith listened to 'Bohemian Rhapsody' by Queen" as plain text. See Adding business data for more data type options.
After adding the user message to the thread and before generating the AI response, retrieve the Zep context block, which will contain the most relevant information to the user’s message from the user’s knowledge graph.
Zep’s default Context Block is an optimized, automatically assembled string that combines semantic search, full text search, and breadth first search to return context that is highly relevant to the user’s current conversation slice, utilizing the past two messages.
The Context Block provides low latency (P95 < 200ms) while preserving detailed information from the user’s graph.
The Context Block includes the user summary along with the most relevant context types from the user’s graph (facts, entities, and episodes by default). The example below shows a Context Block with a user summary and facts:
Using custom context templates, you can easily design your own custom context block type and retrieve that from the thread.get_user_context() method instead.
Create your custom context template for your Zep project and save the template ID. See the Context Templates guide for more information on template syntax and variables.
Retrieve your custom context block using the thread.get_user_context() method, passing in your template ID.
As outlined in our retrieval philosophy, Zep optimizes for high recall over precision, meaning we err on the side of including more results even if some are less relevant. Most agents will automatically reference only the most relevant information when responding to the user message.
Once you’ve retrieved the Context Block, you can include this string in your agent’s context window.
You can append the context block directly to your system prompt. Note that this means the system prompt dynamically updates on every chat turn.
Dynamically updating the system prompt on every chat turn has the downside of preventing prompt caching with LLM providers. In order to reap the benefits of prompt caching while still adding a new Zep context block in every chat, you can append the context block as a “context message” (technically a tool message) just after the user message in the chat history. On each new chat turn, remove the prior context message and replace it with the new one. This allows everything before the context message to be cached.
After generating the assistant response, add it to Zep to continue building the user’s knowledge graph.
Now that you’ve integrated Zep into your application, you can explore additional features: