Key Concepts
Understanding Zep’s Memory, Knowledge Graph, and Data Integration.
Looking to just get coding? Install a Zep SDK and build a simple chatbot.
Zep is a memory layer for AI assistants and agents that continuously learns from user interactions and changing business data. Zep ensures that your Agent has a complete and holistic view of the user, enabling you to build more personalized and accurate user experiences.
Using user chat histories and business data, Zep automatically constructs a knowledge graph for each of your users. The knowledge graph contains entities, relationships, and facts related to your user. As facts change or are superseded, Zep updates the graph to reflect their new state. Using Zep, you can build prompts that provide your agent with the information it needs to personalize responses and solve problems. Ensuring your prompts have the right information reduces hallucinations, improves recall, and reduces the cost of LLM calls.
This guide covers key concepts for using Zep effectively:
- How Zep fits into your application
- The Zep Knowledge Graph
- User and Group graphs
- Managing changes in facts over time
- Business data vs chat message data
- Chat sessions
- Adding memory
- Retrieving memory
- Building a prompt
- Using Zep as an agentic tool
- Additional Zep features
How Zep fits into your application
Your application sends Zep messages and other interactions your agent has with a human. Zep can also ingest data from your business sources in JSON, text, or chat message format. These sources may include CRM applications, emails, billing data, or conversations on other communication platforms like Slack.
Zep fuses this data together on a knowledge graph, building a holistic view of the user’s world and the relationships between entities. Zep offers a number of APIs for adding and retrieving memory. In addition to populating a prompt with Zep’s memory, Zep’s search APIs can be used to build agentic tools.
The example below shows Zep’s memory.context field resulting from a call to memory.get(). This is an opinionated, easy-to-use context string that can be added to your prompt and contains facts and graph entities relevant to the current conversation with a user. For more about the temporal context of facts, see Managing changes in facts over time.
Zep also returns a number of other artifacts in the memory.get() response, including raw facts objects. Zep’s search methods can also be used to retrieve nodes, edges, and facts.
Memory Context
Memory context is a string containing relevant facts and entities for the session. It is always present in the result of a memory.get() call and can optionally be returned in the response to a memory.add() call.
You can then include this context in your system prompt:
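A minimal sketch of that step, assuming you already hold the context string returned by memory.get(). The function name, the MEMORY delimiters, and the sample context are illustrative choices, not part of Zep:

```python
def build_system_prompt(base_instructions: str, memory_context: str) -> str:
    """Append a memory context string to the agent's base instructions."""
    context = memory_context.strip()
    if not context:
        # No memory yet (for example, a brand-new user): use the base prompt only.
        return base_instructions
    # The <MEMORY> delimiters are an arbitrary choice to separate the sections.
    return f"{base_instructions}\n\n<MEMORY>\n{context}\n</MEMORY>"


# Placeholder standing in for the memory.context string from memory.get().
example_context = "Kendra likes Puma shoes."
system_prompt = build_system_prompt(
    "You are a helpful shopping assistant.", example_context
)
```

Keeping the context in a clearly delimited block makes it easier to swap or truncate without touching the rest of the prompt.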
The Knowledge Graph
A knowledge graph is a network of interconnected facts, such as “Kendra loves Adidas shoes.” Each fact is a “triplet” represented by two entities, or nodes (“Kendra”, “Adidas shoes”), and their relationship, or edge (“loves”).
Knowledge Graphs have been explored extensively for information retrieval. What makes Zep unique is its ability to autonomously build a knowledge graph while handling changing relationships and maintaining historical context.
Zep automatically constructs a knowledge graph for each of your users. The knowledge graph contains entities, relationships, and facts related to your user, while automatically handling changing relationships and facts.
Here’s an example of how Zep might extract graph data from a chat message, and then update the graph once new information is available:
Each node and edge has a set of attributes. Notably, a fact is always stored as an edge attribute, and each edge also carries datetime attributes recording when the fact becomes valid and when it becomes invalid.
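The triplet-plus-attributes structure can be pictured with a small data model. This is a sketch only: the class and field names other than valid_at and invalid_at are hypothetical and do not reflect Zep's actual schema:

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass
class FactEdge:
    """Hypothetical model of an edge joining two entity nodes."""
    source: str                           # node, e.g. "Kendra"
    relation: str                         # edge label, e.g. "loves"
    target: str                           # node, e.g. "Adidas shoes"
    fact: str                             # the fact, stored on the edge
    valid_at: Optional[datetime] = None   # when the fact becomes valid
    invalid_at: Optional[datetime] = None # when the fact becomes invalid

    def is_valid(self, at: datetime) -> bool:
        """Return True if the fact holds at the given moment."""
        if self.valid_at is not None and at < self.valid_at:
            return False
        if self.invalid_at is not None and at >= self.invalid_at:
            return False
        return True


edge = FactEdge(
    "Kendra", "loves", "Adidas shoes", "Kendra loves Adidas shoes",
    valid_at=datetime(2024, 1, 1, tzinfo=timezone.utc),
)
```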
User vs Group graphs
Zep automatically creates a knowledge graph for each User of your application. You as the developer can also create a “group graph” for memory to be used by a group of Users.
For example, you could create a group graph for your company’s product information or even messages related to a group chat. This avoids having to add the same data to each user graph. To do so, you’d use the graph.add() and graph.search() methods (see Retrieving memory).
Group knowledge is not retrieved via the memory.get() method and is not included in the memory.context string. To use user and group graphs simultaneously, you need to add group-specific context to your prompt alongside the memory.context string.
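Because group knowledge has to be merged in your own code, a helper along these lines may be useful. It is a sketch that assumes you already hold the user's context string and a list of fact strings taken from a group graph search; the function name and section labels are illustrative:

```python
from typing import List


def combine_contexts(user_context: str, group_facts: List[str]) -> str:
    """Merge a user's memory context with facts from a group graph."""
    sections = []
    if user_context.strip():
        sections.append("USER MEMORY\n" + user_context.strip())
    if group_facts:
        bullets = "\n".join(f"- {fact}" for fact in group_facts)
        sections.append("GROUP KNOWLEDGE\n" + bullets)
    # Empty inputs simply drop out, so the result is safe to embed as-is.
    return "\n\n".join(sections)


combined = combine_contexts(
    "Kendra likes Puma shoes.",
    ["Puma running shoes ship in 2 days."],  # made-up product fact
)
```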
Managing changes in facts over time
When incorporating new data, Zep looks for existing nodes and edges in the graph and decides whether to add new nodes and edges or to update existing ones. An update could mean updating an edge (for example, indicating that the previous fact is no longer valid).
For example, in the animation above, Kendra initially loves Adidas shoes. She later is angry that the shoes broke and states a preference for Puma shoes. As a result, Zep invalidates the fact that Kendra loves Adidas shoes and creates two new facts: “Kendra’s Adidas shoes broke” and “Kendra likes Puma shoes”.
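The Kendra example can be mimicked in a few lines. This is a toy model of the idea, not Zep's implementation: the Fact class and the supersede function are hypothetical, and matching facts by exact text stands in for whatever matching Zep performs:

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import List, Optional


@dataclass
class Fact:
    text: str
    valid_at: datetime
    invalid_at: Optional[datetime] = None


def supersede(facts: List[Fact], old_text: str,
              new_texts: List[str], when: datetime) -> None:
    """Invalidate a still-valid fact and record the facts that replace it."""
    for fact in facts:
        if fact.text == old_text and fact.invalid_at is None:
            fact.invalid_at = when  # keep the old fact as history
    facts.extend(Fact(text, valid_at=when) for text in new_texts)


facts = [Fact("Kendra loves Adidas shoes",
              valid_at=datetime(2024, 1, 1, tzinfo=timezone.utc))]
changed_at = datetime(2024, 8, 1, tzinfo=timezone.utc)
supersede(
    facts,
    "Kendra loves Adidas shoes",
    ["Kendra's Adidas shoes broke", "Kendra likes Puma shoes"],
    changed_at,
)
current = [f.text for f in facts if f.invalid_at is None]
```

The superseded fact is retained with an invalid_at timestamp rather than deleted, which is what preserves historical context.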
Zep also looks for dates in all ingested data, such as the timestamp on a chat message or an article’s publication date, informing how Zep sets the following edge attributes. This assists your agent in reasoning with time.
The valid_at and invalid_at attributes for each fact are then included in the memory.context string, which is given to your agent:
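As a rough illustration of how a fact and its date range might be rendered into a line of text, consider the sketch below. The layout is hypothetical and is not claimed to match the exact format Zep emits in memory.context:

```python
from datetime import datetime, timezone
from typing import Optional


def format_fact(fact: str, valid_at: Optional[datetime],
                invalid_at: Optional[datetime]) -> str:
    """Render a fact with its validity range (hypothetical layout)."""
    start = valid_at.isoformat() if valid_at else "unknown"
    # A fact with no invalid_at is treated as still holding.
    end = invalid_at.isoformat() if invalid_at else "present"
    return f"- {fact} ({start} - {end})"


line = format_fact(
    "Kendra likes Puma shoes",
    datetime(2024, 8, 1, tzinfo=timezone.utc),
    None,
)
```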
Business data vs Chat Message data
Zep can ingest either unstructured text (e.g. documents, articles, chat messages) or JSON data (e.g. business data, or any other form of structured data). Conversational data is ingested through memory.add() in structured chat message format, and all other data is ingested through the graph.add() method.
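An application that receives mixed data might decide which ingestion path applies before calling Zep. A sketch under the assumption that chat messages arrive as dictionaries with role_type and content keys; the function name and return labels are hypothetical:

```python
from typing import Any


def classify_payload(data: Any) -> str:
    """Return 'chat', 'json', or 'text' for a piece of incoming data."""
    if isinstance(data, list) and data and all(
        isinstance(item, dict) and {"role_type", "content"}.issubset(item)
        for item in data
    ):
        return "chat"  # conversational data: send through memory.add()
    if isinstance(data, (dict, list)):
        return "json"  # structured business data: send through graph.add()
    return "text"      # anything else: send through graph.add() as text


route = classify_payload(
    [{"role": "Kendra", "role_type": "user", "content": "My shoes broke"}]
)
```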
Users and Chat Sessions
A Session is a series of chat messages (e.g., between a user and your agent). Users may have multiple Sessions.
Entities, relationships, and facts are extracted from the messages in a Session and added to the user’s knowledge graph. All of a user’s Sessions contribute to a single, shared knowledge graph for that user.
SessionIDs are arbitrary identifiers that you can map to relevant business objects in your app, such as users or a conversation a user might have with your app.
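One way to keep that mapping on the application side is sketched below. The in-memory dictionary and the function name are illustrative, and creating the session in Zep itself is not shown:

```python
import uuid
from collections import defaultdict
from typing import DefaultDict, List

# Application-side index of which sessions belong to which user.
sessions_by_user: DefaultDict[str, List[str]] = defaultdict(list)


def new_session_id(user_id: str) -> str:
    """Generate an arbitrary session identifier and record its owner."""
    session_id = uuid.uuid4().hex
    sessions_by_user[user_id].append(session_id)
    return session_id


first = new_session_id("kendra")
second = new_session_id("kendra")  # a user may have several sessions
```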
The Session Guide contains more information on working with Sessions, including how to delete a Session.
Adding Memory
There are two ways to add data to Zep: memory.add() and graph.add().
Using memory.add()
Add your chat history to Zep using the memory.add() method. memory.add() is session-specific and expects data in chat message format, including a role name (e.g., the user’s real name), role_type (AI, human, tool), and message content. Zep stores the chat history and builds a user-level knowledge graph from the messages.
For best results, add chat history to Zep on every chat turn. That is, add both the AI and human messages in a single operation and in the order that the messages were created.
The example below adds messages to Zep’s memory for the user in the given session:
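A sketch of preparing one chat turn before handing it to memory.add(). The plain dictionaries and the role_type strings ("user", "assistant", "tool") are assumptions made for illustration; check the SDK for its message type and accepted values. The call to Zep itself is omitted:

```python
from typing import Dict, List

# Assumed role_type values; verify against the SDK you are using.
ALLOWED_ROLE_TYPES = {"user", "assistant", "tool"}


def make_message(role: str, role_type: str, content: str) -> Dict[str, str]:
    """Build one chat message with a role name, role type, and content."""
    if role_type not in ALLOWED_ROLE_TYPES:
        raise ValueError(f"unsupported role_type: {role_type!r}")
    return {"role": role, "role_type": role_type, "content": content}


def make_turn(user_name: str, user_text: str,
              assistant_text: str) -> List[Dict[str, str]]:
    """Return the human and AI messages of one turn, in creation order."""
    return [
        make_message(user_name, "user", user_text),
        make_message("assistant", "assistant", assistant_text),
    ]


turn = make_turn("Kendra", "My Adidas shoes broke.", "Sorry to hear that.")
```

Building the pair together keeps both messages of a turn in a single add operation and in the order they were created.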
Include context in the add memory response
For latency-sensitive applications, you can request the memory context directly in the response to the memory.add() call.
This optimization eliminates the need for a separate memory.get() call if you only need the context.
Read more about Memory Context.
In this scenario, pass the return_context=True flag to the memory.add() method.
Zep will perform a user graph search right after persisting the memory and return the context relevant to the recently added memory.
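The saving is one round trip per turn. The sketch below makes that visible with a stub client that counts calls. StubMemoryClient is a hypothetical stand-in written for this illustration, not the Zep SDK, and its context is just the joined message text:

```python
from typing import Dict, List, Optional


class StubMemoryClient:
    """Hypothetical stand-in for a memory client; counts round trips."""

    def __init__(self) -> None:
        self.calls = 0
        self._messages: Dict[str, List[str]] = {}

    def add(self, session_id: str, messages: List[str],
            return_context: bool = False) -> Optional[str]:
        self.calls += 1
        self._messages.setdefault(session_id, []).extend(messages)
        return self._context(session_id) if return_context else None

    def get(self, session_id: str) -> str:
        self.calls += 1
        return self._context(session_id)

    def _context(self, session_id: str) -> str:
        return " | ".join(self._messages.get(session_id, []))


# Two round trips: add, then get.
two_step = StubMemoryClient()
two_step.add("s1", ["I like Puma shoes"])
context_a = two_step.get("s1")

# One round trip: add with return_context=True.
one_step = StubMemoryClient()
context_b = one_step.add("s1", ["I like Puma shoes"], return_context=True)
```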
Using graph.add()
The graph.add() method enables you to add business data as a JSON object or unstructured text. It also supports creating Group graphs by passing in a group_id as opposed to a user_id.
The example below adds JSON business data to Zep’s memory for the given user:
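A sketch of assembling the arguments for such a call, assuming the data is serialized to a JSON string and that exactly one of user_id or group_id is supplied. The helper name and the "type" and "data" keys are assumptions for illustration; the call to graph.add() itself is omitted:

```python
import json
from typing import Any, Dict, Optional


def build_graph_add_kwargs(data: Any, user_id: Optional[str] = None,
                           group_id: Optional[str] = None) -> Dict[str, str]:
    """Prepare keyword arguments targeting a user graph or a group graph."""
    if (user_id is None) == (group_id is None):
        raise ValueError("provide exactly one of user_id or group_id")
    if isinstance(data, str):
        kwargs = {"type": "text", "data": data}
    else:
        # Structured business data is serialized to a JSON string.
        kwargs = {"type": "json", "data": json.dumps(data)}
    if user_id is not None:
        kwargs["user_id"] = user_id
    else:
        kwargs["group_id"] = group_id
    return kwargs


user_kwargs = build_graph_add_kwargs({"plan": "pro"}, user_id="kendra")
```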
Retrieving memory
There are four ways to retrieve memory from Zep: memory.get(), memory.search_sessions(), graph.search(), and methods for retrieving specific nodes, edges, or episodes using UUIDs.
Using memory.get()
The memory.get() method is a user-friendly, high-level API for retrieving relevant context from Zep. It uses the latest messages of the given session to generate a context string for your prompt. It also returns recent chat messages and raw facts that may provide additional context for your agent. memory.get() is user- and session-specific and cannot retrieve data from group graphs.
The example below gets the memory.context string for the given session:
Using memory.search_sessions()
memory.search_sessions() is a convenience method for searching user- and session-specific facts and chat messages. The method returns a list of either facts or messages, depending on the search scope.
The search_sessions() method returns relevant facts from the graph, regardless of whether the source is chat history or business data.
The example below searches a user’s facts using the query text:
Using graph.search()
The graph.search() method lets you search the graph directly, returning raw edges and/or nodes, as opposed to facts. You can customize search parameters, such as the reranker used. For more on how search works, visit the Graph Search guide. This method works for both User and Group graphs.
The example below returns the most relevant edges based on the query text. Note that the search scope defaults to edges.
Retrieving specific nodes, edges, and episodes
Zep offers several utility methods for retrieving specific nodes, edges, or episodes by UUID, or all elements for a user or group. See the Graph API reference for more.
Using Zep as an agentic tool
Zep’s memory retrieval methods can be used as agentic tools, enabling your agent to query Zep for relevant information.
In the example below, a LangChain LangGraph tool is created to search for facts in a user’s graph.
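A framework-agnostic sketch of the same idea, without LangChain: a factory wraps a search callable in a function with a docstring that an agent framework could register as a tool. The search callable is injected, so fake_search here is a stand-in for a real Zep search call, and all names are hypothetical:

```python
from typing import Callable, List


def make_fact_search_tool(
    search_fn: Callable[[str, int], List[str]], limit: int = 5
) -> Callable[[str], str]:
    """Wrap a search callable as a single-argument tool function."""

    def search_facts(query: str) -> str:
        """Search the user's memory for facts relevant to the query."""
        facts = search_fn(query, limit)
        if not facts:
            return "No relevant facts found."
        return "\n".join(f"- {fact}" for fact in facts)

    return search_facts


def fake_search(query: str, limit: int) -> List[str]:
    """Stand-in for a real search: naive substring match over sample facts."""
    known = ["Kendra likes Puma shoes", "Kendra's Adidas shoes broke"]
    return [f for f in known if query.lower() in f.lower()][:limit]


tool = make_fact_search_tool(fake_search)
result = tool("puma")
```

Returning a plain string keeps the tool output easy for the model to read, and injecting the search function lets you bind the tool to a specific user before handing it to the agent.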
Other Zep Features
Zep also builds on its memory layer with tools to help you build more deterministic and accurate applications:
- Dialog Classification is a flexible low-latency API for understanding intent, segmenting users, determining the state of a conversation and more, allowing you to select appropriate prompts and models, and manage application flow.
- Structured Data Extraction extracts data from conversations with high fidelity and low latency, enabling you to confidently populate your data store, call third-party applications, and build custom workflows.