Retrieving Memory
There are two ways to retrieve memory from a User Graph: using Zep’s Context Block or searching the graph.
Using Zep’s Context Block
Zep’s Context Block is an optimized, automatically assembled string that you can directly provide as context to your agent. Zep’s Context Block combines semantic search, full text search, and breadth first search to return context that is highly relevant to the user’s current conversation slice, utilizing the past four messages.
The Context Block is returned by the thread.get_user_context()
method. This method uses the latest messages of the given thread to search the (entire) User Graph and then returns the search results in the form of the Context Block.
Note that although thread.get_user_context()
only requires a thread ID, it is able to return memory derived from any thread of that user. The thread is just used to determine what’s relevant.
The mode
parameter determines what form the Context Block takes (see below).
Summarized Context Block (default)
This Context Block type returns a short summary of the relevant context.
Benefits:
- Low token usage
- Easier for LLMs to understand
Trade-offs:
- Higher latency
- Some risk of missing important details
Example:
Basic Context Block (faster)
This Context Block type returns the relevant context in a more raw format, but faster.
Benefits:
- Lower latency (P95 < 200ms)
- More detailed information preserved
Trade-offs:
- Higher token usage
- May be harder for some LLMs to parse
Example:
Getting the Context Block Sooner
You can get the Context Block sooner by passing in the return_context=True
flag to the thread.add_messages()
method, but it will always return the basic Context Block type. Read more about this in our performance guide.
Searching the Graph
You can also directly search a User Graph using our highly customizable graph.search
method and construct a custom context block. Read more about this in our Searching the Graph guide.