LangGraph Memory Example | Zep Documentation

A complete Notebook example of using Zep for LangGraph Memory may be found in the Zep Python SDK Repository.

The following example demonstrates building an agent using LangGraph. Zep is used to personalize agent responses based on information learned from prior conversations.

The agent implements:

Persistence of new chat turns to Zep and recall of relevant Facts using the most recent messages.
An in-memory MemorySaver to maintain agent state. We use this to add recent chat history to the agent prompt. As an alternative, you could use Zep for this.

Consider truncating MemorySaver’s chat history, because LangGraph state grows unbounded by default. We’ve included this in our example below. See the LangGraph documentation for details.
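To make the truncation idea concrete, here is a minimal pure-Python sketch of "keep only the last N messages, starting on a user message". It is an illustration only and not LangChain's `trim_messages` implementation; the `keep_last` helper and the dict-based message shape are assumptions made for this example.

```python
def keep_last(messages, max_messages=3):
    """Return at most `max_messages` trailing messages, starting on a user message."""
    window = messages[-max_messages:] if max_messages > 0 else []
    # Drop leading non-user messages so the window never starts mid-turn.
    while window and window[0]["role"] != "user":
        window = window[1:]
    return window


history = [
    {"role": "user", "content": "Hi there?"},
    {"role": "assistant", "content": "Hello! How are you feeling today?"},
    {"role": "user", "content": "I'm worried about my dog."},
    {"role": "assistant", "content": "I'm sorry to hear that."},
    {"role": "user", "content": "She ate my shoes."},
]

trimmed = keep_last(history, max_messages=3)
print([m["role"] for m in trimmed])
```

Older turns that fall outside the window are not lost in this design, since Zep is expected to supply the relevant Facts from them.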

Install dependencies

$ pip install zep-cloud langchain-openai langgraph ipywidgets

Configure Zep

Ensure that you’ve configured the following API key in your environment. We’re using Zep’s Async client here, but we could also use the non-async equivalent.

$ ZEP_API_KEY=

import os

from zep_cloud import Message
from zep_cloud.client import AsyncZep

zep = AsyncZep(api_key=os.environ.get("ZEP_API_KEY"))
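Because `os.environ.get` returns `None` when the variable is missing, a misconfigured environment may only surface later as an authentication error. A small guard like the one below can fail earlier with a clearer message. `require_env` is a hypothetical helper written for this example, not part of the Zep SDK.

```python
import os


def require_env(name: str) -> str:
    """Return the value of environment variable `name`, or raise if it is unset or empty."""
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(
            f"{name} is not set; export it before creating the Zep client."
        )
    return value
```

It could then be used when creating the client, for example `AsyncZep(api_key=require_env("ZEP_API_KEY"))`.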

import logging
from typing import Annotated

from langchain_core.messages import AIMessage, SystemMessage, trim_messages
from langchain_core.tools import tool
from langchain_openai import ChatOpenAI
from langgraph.checkpoint.memory import MemorySaver
from langgraph.graph import END, START, StateGraph, add_messages
from langgraph.prebuilt import ToolNode
from typing_extensions import TypedDict

logger = logging.getLogger(__name__)

Using Zep’s Search as a Tool

These are examples of simple Tools that search Zep for facts (from edges) or nodes.

from langgraph.prebuilt import InjectedState


class State(TypedDict):
    messages: Annotated[list, add_messages]
    first_name: str
    last_name: str
    session_id: str
    user_name: str


# `InjectedState` tells LangGraph to pass the graph state to the tool at run time,
# so the LLM is not asked to supply the `state` argument itself.
@tool
async def search_facts(
    state: Annotated[dict, InjectedState], query: str, limit: int = 5
) -> list[str]:
    """Search for facts in all conversations had with a user.

    Args:
        state (dict): The Agent's state, injected by LangGraph.
        query (str): The search query.
        limit (int): The number of results to return. Defaults to 5.

    Returns:
        list: A list of facts that match the search query.
    """
    edges = await zep.graph.search(
        user_id=state["user_name"], text=query, limit=limit, search_scope="edges"
    )
    return [edge.fact for edge in edges]


@tool
async def search_nodes(
    state: Annotated[dict, InjectedState], query: str, limit: int = 5
) -> list[str]:
    """Search for nodes in all conversations had with a user.

    Args:
        state (dict): The Agent's state, injected by LangGraph.
        query (str): The search query.
        limit (int): The number of results to return. Defaults to 5.

    Returns:
        list: A list of node summaries for nodes that match the search query.
    """
    nodes = await zep.graph.search(
        user_id=state["user_name"], text=query, limit=limit, search_scope="nodes"
    )
    return [node.summary for node in nodes]


tools = [search_facts, search_nodes]

tool_node = ToolNode(tools)

llm = ChatOpenAI(model="gpt-4o-mini", temperature=0).bind_tools(tools)

Chatbot Function Explanation

The chatbot uses Zep to provide context-aware responses. Here’s how it works:

Context Retrieval: It retrieves relevant facts for the user’s current conversation (session). Zep uses the most recent messages to determine what facts to retrieve.
System Message: It constructs a system message incorporating the facts retrieved during Context Retrieval, setting the context for the AI’s response.
Message Persistence: After generating a response, it asynchronously adds the user and assistant messages to Zep. New Facts are created and existing Facts updated using this new information.
Messages in State: We use LangGraph state to store the most recent messages and add these to the Agent prompt. We limit the message list to the most recent 3 messages for demonstration purposes.

We could also use Zep to recall the chat history, rather than LangGraph’s MemorySaver.

See memory.get in the Zep SDK documentation.
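If chat history were recalled from Zep instead, the stored messages would need to be mapped into the role/content shape that the graph accepts. The sketch below assumes each stored message exposes `role_type` and `content` attributes (as the `Message` objects created in this example do) and that the result of `memory.get` makes them available as `memory.messages`. `to_chat_messages` is a hypothetical helper, and the stub objects stand in for real Zep responses.

```python
from types import SimpleNamespace


def to_chat_messages(zep_messages, last_n=3):
    """Map Zep-style messages to role/content dicts, keeping only the last `last_n`."""
    recent = list(zep_messages)[-last_n:] if last_n > 0 else []
    return [{"role": m.role_type, "content": m.content} for m in recent]


# Stub messages standing in for `memory.messages` (an assumption for this sketch).
stored = [
    SimpleNamespace(role_type="user", content="Hi there?"),
    SimpleNamespace(role_type="assistant", content="Hello! How are you feeling today?"),
]

chat_history = to_chat_messages(stored)
print(chat_history)
```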

async def chatbot(state: State):
    # Retrieve the context Zep has assembled for this session
    memory = await zep.memory.get(session_id=state["session_id"])

    system_message = SystemMessage(
        content=f"""You are a compassionate mental health bot and caregiver. Review information about the user and their prior conversation below and respond accordingly.
        Keep responses empathetic and supportive. And remember, always prioritize the user's well-being and mental health.

        {memory.context}"""
    )

    messages = [system_message] + state["messages"]

    response = await llm.ainvoke(messages)

    # Add the new chat turn to the Zep graph
    messages_to_save = [
        Message(
            role_type="user",
            role=state["first_name"] + " " + state["last_name"],
            content=state["messages"][-1].content,
        ),
        Message(role_type="assistant", content=response.content),
    ]

    await zep.memory.add(
        session_id=state["session_id"],
        messages=messages_to_save,
    )

    # Truncate the chat history to keep the state from growing unbounded.
    # In this example, we keep the state small for demonstration purposes
    # and rely on Zep's Facts to maintain conversation context.
    # With token_counter=len, max_tokens counts messages rather than tokens.
    state["messages"] = trim_messages(
        state["messages"],
        strategy="last",
        token_counter=len,
        max_tokens=3,
        start_on="human",
        end_on=("human", "tool"),
        include_system=True,
    )

    logger.info(f"Messages in state: {state['messages']}")

    return {"messages": [response]}

Setting up the Agent

This section sets up the Agent’s LangGraph graph:

Graph Structure: It defines a graph with nodes for the agent (chatbot) and tools, connected in a loop.
Conditional Logic: The should_continue function determines whether to end the graph execution or continue to the tools node based on the presence of tool calls.
Memory Management: It uses a MemorySaver to maintain conversation state across turns. This is in addition to using Zep for facts.
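The loop described above can be sketched without LangGraph to show how the routing decision drives it. This is a simplified stand-in, not LangGraph's execution model: `route`, `run_loop`, and the stub `agent_step` and `tool_step` functions are hypothetical names used only for this illustration.

```python
from types import SimpleNamespace


def route(last_message):
    """Return "continue" when the last message requests a tool, otherwise "end"."""
    return "continue" if getattr(last_message, "tool_calls", None) else "end"


def run_loop(agent_step, tool_step, state, max_steps=10):
    """Alternate between the agent and the tools until the agent stops calling tools."""
    for _ in range(max_steps):
        state = agent_step(state)
        if route(state["messages"][-1]) == "end":
            return state
        state = tool_step(state)
    raise RuntimeError("agent did not finish within max_steps")


def agent_step(state):
    # Stub agent: request a tool once, then answer once a tool result is present.
    has_tool_result = any(m.role == "tool" for m in state["messages"])
    calls = [] if has_tool_result else [{"name": "search_facts"}]
    content = "Here is what I found." if has_tool_result else ""
    reply = SimpleNamespace(role="assistant", content=content, tool_calls=calls)
    return {"messages": state["messages"] + [reply]}


def tool_step(state):
    # Stub tool node: append a canned tool result.
    result = SimpleNamespace(role="tool", content="A stored fact.", tool_calls=[])
    return {"messages": state["messages"] + [result]}


start = {"messages": [SimpleNamespace(role="user", content="Hi", tool_calls=[])]}
final = run_loop(agent_step, tool_step, start)
print([m.role for m in final["messages"]])
```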

graph_builder = StateGraph(State)

memory = MemorySaver()


# Define the function that determines whether to continue or not
async def should_continue(state, config):
    messages = state["messages"]
    last_message = messages[-1]
    # If there is no function call, then we finish
    if not last_message.tool_calls:
        return "end"
    # Otherwise if there is, we continue
    else:
        return "continue"


graph_builder.add_node("agent", chatbot)
graph_builder.add_node("tools", tool_node)

graph_builder.add_edge(START, "agent")

graph_builder.add_conditional_edges("agent", should_continue, {"continue": "tools", "end": END})

graph_builder.add_edge("tools", "agent")


graph = graph_builder.compile(checkpointer=memory)

Our LangGraph agent graph is illustrated below.

[Figure: Agent Graph]

Running the Agent

We generate a unique user name and thread id (session id) and add these to Zep, associating the Session with the new User.

import uuid

first_name = "Daniel"
last_name = "Chalef"
user_name = first_name + uuid.uuid4().hex[:4]
thread_id = uuid.uuid4().hex

await zep.user.add(user_id=user_name, first_name=first_name, last_name=last_name)
await zep.memory.add_session(session_id=thread_id, user_id=user_name)


def extract_messages(result):
    output = ""
    for message in result["messages"]:
        if isinstance(message, AIMessage):
            role = "assistant"
        else:
            role = result["user_name"]
        output += f"{role}: {message.content}\n"
    return output.strip()


async def graph_invoke(
    message: str,
    first_name: str,
    last_name: str,
    thread_id: str,
    ai_response_only: bool = True,
):
    r = await graph.ainvoke(
        {
            "messages": [
                {
                    "role": "user",
                    "content": message,
                }
            ],
            "first_name": first_name,
            "last_name": last_name,
            "session_id": thread_id,
            # The search tools and extract_messages read user_name from state,
            # so pass the module-level user_name defined above.
            "user_name": user_name,
        },
        config={"configurable": {"thread_id": thread_id}},
    )

    if ai_response_only:
        return r["messages"][-1].content
    else:
        return extract_messages(r)

Let’s test the agent with a few messages:

r = await graph_invoke(
    "Hi there?",
    first_name,
    last_name,
    thread_id,
)

print(r)

Hello! How are you feeling today? I’m here to listen and support you.

r = await graph_invoke(
    """
    I'm fine. But have been a bit stressful lately. Mostly work related.
    But also my dog. I'm worried about her.
    """,
    first_name,
    last_name,
    thread_id,
)

print(r)

I’m sorry to hear that you’ve been feeling stressed. Work can be a significant source of pressure, and it sounds like your dog might be adding to that stress as well. If you feel comfortable sharing, what specifically has been causing you stress at work and with your dog? I’m here to help you through it.

Viewing The Context Value

memory = await zep.memory.get(session_id=thread_id)

print(memory.context)

The context value will look something like this:

FACTS and ENTITIES represent relevant context to the current conversation.
# These are the most relevant facts and their valid date ranges
# format: FACT (Date range: from - to)
<FACTS>
  - Daniel99db is worried about his sick dog. (2025-01-24 02:11:54 - present)
  - Daniel Chalef is worried about his sick dog. (2025-01-24 02:11:54 - present)
  - The assistant asks how the user is feeling. (2025-01-24 02:11:51 - present)
  - Daniel99db has been a bit stressful lately due to his dog. (2025-01-24 02:11:53 - present)
  - Daniel99db has been a bit stressful lately due to work. (2025-01-24 02:11:53 - present)
  - Daniel99db is a user. (2025-01-24 02:11:51 - present)
  - user has the id of Daniel99db (2025-01-24 02:11:50 - present)
  - user has the name of Daniel Chalef (2025-01-24 02:11:50 - present)
</FACTS>
# These are the most relevant entities
# ENTITY_NAME: entity summary
<ENTITIES>
  - worried: Daniel Chalef (Daniel99db) is feeling stressed lately, primarily due to work-related issues and concerns about his sick dog, which has made him worried.
  - Daniel99db: Daniel99db, or Daniel Chalef, is currently experiencing stress primarily due to work-related issues and concerns about his sick dog. Despite these challenges, he has shown a desire for interaction by initiating conversations, indicating his openness to communication.
  - sick: Daniel Chalef, also known as Daniel99db, is feeling stressed lately, primarily due to work-related issues and concerns about his sick dog. He expresses worry about his dog's health.
  - Daniel Chalef: Daniel Chalef, also known as Daniel99db, has been experiencing stress recently, primarily related to work issues and concerns about his sick dog. Despite this stress, he has been feeling generally well and has expressed a desire to connect with others, as indicated by his friendly greeting, "Hi there?".
  - dog: Daniel99db, also known as Daniel Chalef, mentioned that he has been feeling a bit stressed lately, which is related to both work and his dog.
  - work: Daniel Chalef, also known as Daniel99db, has been experiencing stress lately, primarily related to work.
  - feeling: The assistant initiates a conversation by asking how the user is feeling today, indicating a willingness to listen and provide support.
</ENTITIES>
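If you want to inspect the retrieved facts programmatically, the context string can be split on its tags. The sketch below assumes the context keeps the `<FACTS>` layout shown above, which is only an example of what the value may look like and not a guaranteed format; `extract_facts` is a hypothetical helper, not part of the Zep SDK.

```python
import re


def extract_facts(context: str) -> list[str]:
    """Return the bullet lines found between <FACTS> and </FACTS>."""
    match = re.search(r"<FACTS>(.*?)</FACTS>", context, flags=re.DOTALL)
    if match is None:
        return []
    facts = []
    for line in match.group(1).splitlines():
        line = line.strip()
        if line.startswith("- "):
            facts.append(line[2:])
    return facts


sample = """<FACTS>
  - Daniel99db is a user. (2025-01-24 02:11:51 - present)
  - user has the name of Daniel Chalef (2025-01-24 02:11:50 - present)
</FACTS>"""

for fact in extract_facts(sample):
    print(fact)
```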

r = await graph_invoke(
    "She ate my shoes which were expensive.",
    first_name,
    last_name,
    thread_id,
)

print(r)

That sounds really frustrating, especially when you care so much about your belongings and your dog’s health. It’s tough when pets get into things they shouldn’t, and it can add to your stress. How are you feeling about that situation? Are you able to focus on her health despite the shoe incident?

Let’s now test whether the Agent is correctly grounded with facts from the prior conversation.

r = await graph_invoke(
    "What are we talking about?",
    first_name,
    last_name,
    thread_id,
)

print(r)

We were discussing your concerns about your dog being sick and the situation with her eating your expensive shoes. It sounds like you’re dealing with a lot right now, and I want to make sure we’re addressing what’s on your mind. If there’s something else you’d like to talk about or if you want to share more about your dog, I’m here to listen.

Let’s go even further back to determine whether context is kept by referencing a user message that is not currently in the Agent State. Zep will retrieve Facts related to the user’s job.

r = await graph_invoke(
    "What have I said about my job?",
    first_name,
    last_name,
    thread_id,
)

print(r)

You’ve mentioned that you’ve been feeling a bit stressed lately, primarily due to work-related issues. If you’d like to share more about what’s been going on at work or how it’s affecting you, I’m here to listen and support you.