Zep provides three methods for retrieving context from a User Graph, each offering different levels of control and customization.
Zep’s Context Block is an optimized, automatically assembled string that you can directly provide as context to your agent. It is built using Smart Context Assembly (i.e. auto search). The Context Block combines semantic search, full text search, and breadth first search to return context that is highly relevant to the user’s current conversation slice, utilizing the past two messages.
The Context Block is returned by the thread.get_user_context() method. This method uses the latest messages of the given thread to search the (entire) User Graph and then returns the search results in the form of the Context Block.
Note that although thread.get_user_context() only requires a thread ID, it is able to return context derived from any thread of that user. The thread is just used to determine what’s relevant.
The Context Block provides low latency (P95 < 200ms) while preserving detailed information from the user’s graph.
The Context Block returns a user summary along with relevant facts in a structured format:
The default Context Block can include the user summary, facts, entities, episodes, observations, and thread summaries. Smart Context Assembly selects which context types appear based on relevance to the current conversation. To pin specific types or counts, use a context template or advanced context block construction.
You can get the Context Block sooner by passing in the return_context=True flag to the thread.add_messages() method. Read more about this in our performance guide.
You can customize the format of the Context Block by using context templates. Templates allow you to define how context data is structured and presented while keeping Zep’s automatic relevance detection.
To use a template, pass the template_id parameter when retrieving context:
See the Context Templates guide to learn how to create and manage templates.
For maximum control over context retrieval, see our Advanced Context Block Construction cookbook. This approach lets you directly search the graph and assemble results with complete control over search queries, parameters, and formatting.
Once you’ve retrieved the Context Block, used a custom context template, or constructed your own context block, you can include this string in your system prompt:
You should also include the last 4 to 6 messages of the thread when calling your LLM provider. Because Zep’s ingestion can take a few minutes, the context block may not include information from the last few messages; and so the context block acts as the “long-term context,” and the last few messages serve as the raw, short-term context.