Quick Start Guide | Zep Documentation

Zep is a context engineering platform that delivers the right contextual information to your AI agents at the right time. Through a temporal knowledge graph, Zep assembles relevant context from chat history, business data, and user behavior—enabling agents to make better decisions with accurate, up-to-date information. With a simple three-line API and sub-200ms retrieval latency, Zep helps you build personalized, reliable AI agents without extensive context pipeline engineering.

Get started with the example in the video using:

$ git clone https://github.com/getzep/zep.git
$ cd zep/examples/python/agent-memory-full-example

This guide shows you how to integrate Zep into your AI application to provide personalized context for every user interaction. You’ll learn how to ingest user messages and business data, then retrieve assembled context that includes user preferences, traits, and relevant facts—all optimized for your LLM’s context window.

Looking for a more in-depth understanding? Check out our Key Concepts page.

Migrating from Mem0? Check out our Mem0 Migration guide.

Install the Zep SDK

Python

TypeScript

Go

Set up your Python project, ideally with a virtual environment, and then:

$ pip install zep-cloud

Initialize the Zep client

After creating a Zep account, obtaining an API key, and setting the API key as an environment variable, initialize the client once at application startup and reuse it throughout your application.

Initialize Zep client

.env

1 import os
2 from zep_cloud.client import Zep
3 
4 API_KEY = os.environ.get('ZEP_API_KEY')
5 
6 client = Zep(
7     api_key=API_KEY,
8 )

Create a Zep user for each of your users

Whenever users are created in your application, you need to trigger the creation of a Zep user. Make sure to include at least their first name, and ideally also their last name and email to ensure correct identification of the user in future messages. We recommend setting the Zep user ID equal to your internal user ID.

Backfilling existing users: For existing users, you will need to run a one-time migration to create a user for each of the existing users (simply loop through and call user.add for each).

Provide at least the first name and ideally the last name when calling user.add to ensure Zep correctly associates the user with references in your data. If needed, add this information later using the update user method.

1 from zep_cloud.client import Zep
2 
3 client = Zep(api_key=API_KEY)
4 
5 # You can choose any user ID, but we recommend using your internal user ID
6 user_id = "your_internal_user_id"
7 
8 new_user = client.user.add(
9     user_id=user_id,
10     email="[email protected]",
11     first_name="Jane",
12     last_name="Smith",
13 )

Create a Zep thread for each of your threads

Whenever a user starts a new conversation with your agent, you need to trigger the creation of a Zep thread. Learn more about adding messages.

Backfilling prior conversations: For prior conversations, you will need to run a one-time migration to create Zep threads for those conversations and add the prior messages to the respective Zep threads. You can loop through messages and call thread.add_messages for each, or use our batch processing method for faster concurrent processing.

1 client = Zep(
2     api_key=API_KEY,
3 )
4 thread_id = uuid.uuid4().hex # A new thread identifier
5 
6 client.thread.create(
7     thread_id=thread_id,
8     user_id=user_id,
9 )

Add incoming user messages to Zep

When a new user message comes in, add the user message to Zep, providing the user’s name in the message if possible.

It is important to provide the name of the user in the name field if possible, to help with graph construction.

Include the created_at timestamp (RFC3339 format) representing when the message was originally sent. This ensures accurate temporal understanding in the knowledge graph. See Setting message timestamps for more details.

1 from zep_cloud.client import Zep
2 from zep_cloud.types import Message
3 from datetime import datetime, timezone
4 
5 zep_client = Zep(
6     api_key=API_KEY,
7 )
8 
9 messages = [
10     Message(
11         created_at=datetime.now(timezone.utc).isoformat(),
12         name="Jane Smith",
13         role="user",
14         content="Who was Octavia Butler?",
15     )
16 ]
17 
18 response = zep_client.thread.add_messages(thread_id, messages=messages)

Add streaming business data to Zep

Beyond chat messages, you can provide Zep with additional context about your users by sending business data directly to their knowledge graphs. This includes user interactions with your application, transactions, support tickets, emails, transcripts—essentially any information that gives context about the user and can be represented as text.

Use the graph.add method to send structured, semi-structured, or unstructured text data to Zep. Include a reference to the user—their full name, user ID, or both—so Zep can correctly associate the data with the user in their knowledge graph. Read more about adding business data.

Any text can be sent to Zep—structured JSON, semi-structured logs, or plain text descriptions. The example below shows a JSON event, but you could also send "User Jane Smith listened to 'Bohemian Rhapsody' by Queen" as plain text. See Adding business data for more data type options.

1 from zep_cloud.client import Zep
2 import json
3 
4 client = Zep(api_key=API_KEY)
5 
6 # Example: User listened to a song in your application
7 event_data = {
8     "user_id": "user123",
9     "user_name": "Jane Smith",
10     "event_type": "song_played",
11     "song_title": "Bohemian Rhapsody",
12     "artist": "Queen",
13     "duration_seconds": 354
14 }
15 
16 client.graph.add(
17     user_id="user123",
18     type="json",
19     data=json.dumps(event_data)
20 )

Retrieve Zep context block

After adding the user message to the thread and before generating the AI response, retrieve the Zep context block, which will contain the most relevant information to the user’s message from the user’s knowledge graph.

Use the default context block

Zep’s default Context Block is an optimized, automatically assembled string that combines semantic search, full text search, and breadth first search to return context that is highly relevant to the user’s current conversation slice, utilizing the past two messages.

The Context Block provides low latency (P95 < 200ms) while preserving detailed information from the user’s graph.

1 # Get context for the thread
2 user_context = client.thread.get_user_context(thread_id=thread_id)
3 
4 # Access the context block (for use in prompts)
5 context_block = user_context.context
6 print(context_block)

The Context Block includes a user summary and relevant facts:

# This is the user summary
<USER_SUMMARY>
Emily Painter is a user with account ID Emily0e62 who uses digital art tools for creative work. She maintains an active account with the service, though has recently experienced technical issues with the Magic Pen Tool. Emily values reliable payment processing and seeks prompt resolution for account-related issues. She expects clear communication and efficient support when troubleshooting technical problems.
</USER_SUMMARY>
# These are the most relevant facts and their valid date ranges
# format: FACT (Date range: from - to)
<FACTS>
  - Emily is experiencing issues with logging in. (2024-11-14 02:13:19+00:00 - present)
  - User account Emily0e62 has a suspended status due to payment failure. (2024-11-14 02:03:58+00:00 - present)
  - user has the id of Emily0e62 (2024-11-14 02:03:54 - present)
  - The failed transaction used a card with last four digits 1234. (2024-09-15 00:00:00+00:00 - present)
  - The reason for the transaction failure was 'Card expired'. (2024-09-15 00:00:00+00:00 - present)
  - user has the name of Emily Painter (2024-11-14 02:03:54 - present)
  - Account Emily0e62 made a failed transaction of 99.99. (2024-07-30 00:00:00+00:00 - 2024-08-30 00:00:00+00:00)
</FACTS>

Use a custom context block

Using custom context templates, you can easily design your own custom context block type and retrieve that from the thread.get_user_context() method instead.

Create your custom context template

Create your custom context template for your Zep project and save the template ID. See the Context Templates guide for more information on template syntax and variables.

1 from zep_cloud import Zep
2 
3 client = Zep(api_key="YOUR_API_KEY")
4 
5 client.context.create_context_template(
6     template_id="customer-support",
7     template="""# CUSTOMER PROFILE
8 %{user_summary}
9 
10 # RECENT INTERACTIONS
11 %{edges limit=10}
12 
13 # KEY ENTITIES
14 %{entities limit=5}"""
15 )

Retrieve custom context block using thread.get_user_context()

Retrieve your custom context block using the thread.get_user_context() method, passing in your template ID.

1 from zep_cloud import Zep
2 
3 client = Zep(api_key="YOUR_API_KEY")
4 
5 user_context = client.thread.get_user_context(
6     thread_id="thread_id",
7     template_id="customer-support"
8 )
9 context_block = user_context.context

Add context block to agent context window

As outlined in our retrieval philosophy, Zep optimizes for high recall over precision, meaning we err on the side of including more results even if some are less relevant. Most agents will automatically reference only the most relevant information when responding to the user message.

Once you’ve retrieved the Context Block, you can include this string in your agent’s context window.

Option 1: Add context block to system prompt

You can append the context block directly to your system prompt. Note that this means the system prompt dynamically updates on every chat turn.

MessageType	Content
`System`	Your system prompt `{Zep context block}`
`Assistant`	An assistant message stored in Zep
`User`	A user message stored in Zep
…	…
`User`	The latest user message

Option 2: Append context block as “context message”

Dynamically updating the system prompt on every chat turn has the downside of preventing prompt caching with LLM providers. In order to reap the benefits of prompt caching while still adding a new Zep context block in every chat, you can append the context block as a “context message” (technically a tool message) just after the user message in the chat history. On each new chat turn, remove the prior context message and replace it with the new one. This allows everything before the context message to be cached.

MessageType	Content
`System`	Your system prompt (static, cacheable)
`Assistant`	An assistant message stored in Zep
`User`	A user message stored in Zep
…	…
`User`	The latest user message
`Tool`	`{Zep context block}`

Add assistant response to Zep

After generating the assistant response, add it to Zep to continue building the user’s knowledge graph.

1 from zep_cloud.types import Message
2 from datetime import datetime, timezone
3 
4 messages = [
5     Message(
6         created_at=datetime.now(timezone.utc).isoformat(),
7         name="AI Assistant",
8         role="assistant",
9         content="Octavia Butler was an influential American science fiction writer...",
10     )
11 ]
12 
13 response = zep_client.thread.add_messages(thread_id, messages=messages)

Next steps

Now that you’ve integrated Zep into your application, you can explore additional features:

Customize graph structure to your domain - Define custom entity and edge types to structure domain-specific information.
Add user interactions and metadata - Any ongoing user interactions or one-time user profile information can be added to the user’s knowledge graph.
Custom context templates - Design custom context block formats tailored to your application’s needs.
User summary instructions - Customize how Zep generates summaries of user data in their knowledge graph.