# Welcome to Zep!

Connect your AI coding assistant to Zep's docs: [MCP server & llms.txt](/coding-with-llms)

Zep is a context engineering platform that systematically assembles personalized context (user preferences, traits, and business data) for reliable agent applications. Zep combines agent memory, Graph RAG, and context assembly capabilities to deliver comprehensive personalized context that reduces hallucinations and improves accuracy.

* Learn about Zep's context engineering platform, temporal knowledge graphs, and agent memory capabilities.
* Get up and running with Zep in minutes, whether you code in Python, TypeScript, or Go.
* Discover practical recipes and patterns for common use cases with Zep.
* Comprehensive API documentation for Zep's SDKs in Python, TypeScript, and Go.
* Migrate from Mem0 to Zep in minutes.
* Learn about Graphiti, Zep's open-source temporal knowledge graph framework.

# Key Concepts

> Understanding Zep's context engineering platform and temporal knowledge graphs.

Looking to just get coding? Check out our [Quickstart](/quickstart).

Zep is a context engineering platform that systematically assembles personalized context (user preferences, traits, and business data) for reliable agent applications. Zep combines Graph RAG, agent memory, and context assembly capabilities to deliver comprehensive personalized context that reduces hallucinations and improves accuracy.

| Concept | Description | Docs |
| --- | --- | --- |
| Knowledge Graph | Zep's unified knowledge store for agents. Nodes represent entities, edges represent facts/relationships. The graph updates dynamically in response to new data. | [Docs](/understanding-the-graph) |
| Zep's Context Block | Optimized string containing a user summary and facts from the knowledge graph most relevant to the current thread. Also contains dates when facts became valid and invalid. Provide this to your chatbot as context. | [Docs](/retrieving-context#zeps-context-block) |
| Fact Invalidation | When new data invalidates a prior fact, the time the fact became invalid is stored on that fact's edge in the knowledge graph. | [Docs](/facts) |
| JSON/text/message | Types of data that can be ingested into the knowledge graph. Can represent business data, documents, chat messages, emails, etc. | [Docs](/adding-data-to-the-graph) |
| Custom Entity/Edge Types | Feature allowing use of Pydantic-like classes to customize creation/retrieval of entities and relations in the knowledge graph. | [Docs](/customizing-graph-structure#custom-entity-and-edge-types) |
| Graph | Represents an arbitrary knowledge graph for storing up-to-date knowledge about an object or system. For storing up-to-date knowledge about a user, a user graph should be used. | [Docs](/graph-overview) |
| User Graph | Special type of graph for storing personalized context for a user of your application. | [Docs](/users) |
| User | A user in Zep represents a user of your application, and has its own User Graph and thread history. | [Docs](/users) |
| Threads | Conversation threads of a user. By default, all messages added to any thread of that user are ingested into that user's graph. | [Docs](/threads) |
| `graph.add` & `thread.add_messages` | Methods for adding data to a graph and user graph respectively. | [Docs](/adding-data-to-the-graph) [Docs](/adding-messages) |
| `graph.search` & `thread.get_user_context` | Low level and high level methods respectively for retrieving from the knowledge graph. | [Docs](/searching-the-graph) [Docs](/retrieving-context) |
| User Summary Instructions | Customize how Zep generates entity summaries for users in their knowledge graph. Up to 5 custom instructions per user to guide summary generation. | [Docs](/users#user-summary-instructions) |
| Agentic Tool | Use Zep's context retrieval methods as agentic tools, enabling your agent to query for relevant information from the user's knowledge graph. | [Docs](/quickstart#use-zep-as-an-agentic-tool) |

# Zep use cases

> Common applications for Zep's context engineering platform.

| Use case | Purpose | Implementation |
| --- | --- | --- |
| Dynamic Graph RAG | Provide your agent with up-to-date knowledge of an object/system | Add/stream all relevant data to a Graph ([docs](/adding-data-to-the-graph)), chunking first if needed ([docs](/adding-data-to-the-graph#data-size-limit-and-chunking)), and retrieve from the graph by constructing a custom context block ([docs](/cookbook/advanced-context-block-construction)) |
| Agent memory | Provide your agent with up-to-date knowledge of a user | Add/stream user messages and user business data to a User Graph ([docs](/adding-messages)), and retrieve user context as the context block returned from `thread.get_user_context` ([docs](/retrieving-context)), and provide this context block to your agent before responding |
| Voice agents | Provide up-to-date knowledge with extremely low latency to a voice agent | Similar to other implementations, except incorporating latency optimizations ([docs](/performance)) |

# What is context engineering?
> The discipline of assembling all necessary information for reliable agent applications.

Context Engineering is the discipline of assembling all necessary information, instructions, and tools around an LLM to help it accomplish tasks reliably. Unlike simple prompt engineering, context engineering involves building dynamic systems that provide the right information in the right format so LLMs can perform consistently.

The core challenge: LLMs are stateless and only know what's in their immediate context window. Context engineering bridges this gap by systematically providing relevant background knowledge, user history, business data, and tool outputs.

Using [business data and/or user chat histories](/concepts#business-data-vs-chat-message-data), Zep automatically constructs a [temporal knowledge graph](/graph-overview) to reflect the state of an object/system or a user. The knowledge graph contains entities, relationships, and facts related to your object/system or user. As facts change or are superseded, [Zep updates the graph](/concepts#managing-changes-in-facts-over-time) to reflect their new state.

Through systematic context engineering, Zep provides your agent with the comprehensive information needed to deliver personalized responses and solve problems. This reduces hallucinations, improves accuracy, and reduces the cost of LLM calls.

# How Zep fits into your application

> Understanding how Zep integrates with your application architecture.

Your application sends Zep business data (JSON, unstructured text) and/or messages. Business data sources may include CRM applications, emails, billing data, or conversations on other communication platforms like Slack.

Zep automatically fuses this data together on a temporal knowledge graph, building a holistic view of the object/system or user and the relationships between entities. Zep offers a number of APIs for [adding and retrieving context](/retrieving-context). In addition to populating a prompt with Zep's engineered context, Zep's search APIs can be used to build [agentic tools](/concepts#using-zep-as-an-agentic-tool).

The example below shows Zep's context block resulting from a call to `thread.get_user_context()`. This is Zep's engineered context block that can be added to your prompt and contains a user summary and facts relevant to the current conversation with a user. For more about the temporal context of facts, see [Managing changes in facts over time](/concepts#managing-changes-in-facts-over-time).

## Context Block

[Zep's Context Block](/retrieving-context#zeps-context-block) is Zep's engineered context string containing a user summary and relevant facts for the thread. It is always present in the result of a `thread.get_user_context()` call and can optionally be [received with the response of a `thread.add_messages()` call](/performance#get-the-context-block-sooner).

The Context Block provides low latency (P95 < 200ms) while preserving detailed information from the user's graph. Read more about Zep's Context Block [here](/retrieving-context#zeps-context-block).
```python Python
# Get context for the thread
user_context = client.thread.get_user_context(thread_id=thread_id)

# Access the context block (for use in prompts)
context_block = user_context.context
print(context_block)
```

```typescript TypeScript
// Get context for the thread
const userContext = await client.thread.getUserContext(threadId);

// Access the context block (for use in prompts)
const contextBlock = userContext.context;
console.log(contextBlock);
```

```go Go
import (
	"context"
	"fmt"
	"log"

	v3 "github.com/getzep/zep-go/v3"
)

// Get context for the thread
userContext, err := client.Thread.GetUserContext(context.TODO(), threadId, nil)
if err != nil {
	log.Fatal("Error getting context:", err)
}

// Access the context block (for use in prompts)
contextBlock := userContext.Context
fmt.Println(contextBlock)
```

The Context Block includes a user summary and relevant facts:

```text
# This is the user summary
Emily Painter is a user with account ID Emily0e62 who uses digital art tools for creative work. She maintains an active account with the service, though has recently experienced technical issues with the Magic Pen Tool. Emily values reliable payment processing and seeks prompt resolution for account-related issues. She expects clear communication and efficient support when troubleshooting technical problems.

# These are the most relevant facts and their valid date ranges
# format: FACT (Date range: from - to)
- Emily is experiencing issues with logging in. (2024-11-14 02:13:19+00:00 - present)
- User account Emily0e62 has a suspended status due to payment failure. (2024-11-14 02:03:58+00:00 - present)
- user has the id of Emily0e62 (2024-11-14 02:03:54 - present)
- The failed transaction used a card with last four digits 1234. (2024-09-15 00:00:00+00:00 - present)
- The reason for the transaction failure was 'Card expired'. (2024-09-15 00:00:00+00:00 - present)
- user has the name of Emily Painter (2024-11-14 02:03:54 - present)
- Account Emily0e62 made a failed transaction of 99.99. (2024-07-30 00:00:00+00:00 - 2024-08-30 00:00:00+00:00)
```

You can then include this context in your system prompt:

| MessageType | Content |
| --- | --- |
| `System` | Your system prompt<br/><br/>`{Zep context block}` |
| `Assistant` | An assistant message stored in Zep |
| `User` | A user message stored in Zep |
| ... | ... |
| `User` | The latest user message |

# Retrieval philosophy

> Understanding Zep's approach to optimizing for recall and latency.

Zep's retrieval system is designed with two primary goals: **high recall** and **low latency**. This is a deliberate architectural choice that differs from systems optimized for precision.

## Understanding recall vs. precision

Think of recall and precision as two different ways to measure retrieval quality:

* **Recall** measures completeness: "Did we find all the relevant information?"
* **Precision** measures accuracy: "Is everything we returned actually relevant?"

In practical terms:

* **High recall** means you get all the relevant results, but might also get some less relevant ones
* **High precision** means everything returned is highly relevant, but you might miss important information

## The tradeoff in practice

| Approach | What You Get | What You Risk | Best For |
| --- | --- | --- | --- |
| **Optimize for Recall** (Zep's approach) | All relevant facts, plus some less relevant results | Larger context with some noise | Agents that need complete information to make decisions; real-time applications |
| **Optimize for Precision** | Only highly relevant results | Missing critical facts that could cause task failure | Use cases where context size is severely constrained; manual review workflows |

### Example scenario

User query: *"What did we discuss about the Q2 marketing budget?"*

| Retrieval Approach | Results Returned | Outcome |
| --- | --- | --- |
| **Recall-Optimized** (Zep) | • Q2 marketing budget discussion ✓<br/>• Related Q2 sales projections ✓<br/>• Q3 budget planning mention ⚠️<br/>• Q2 hiring costs mentioning marketing ⚠️ | Agent has complete context, including tangentially related information. Can successfully answer follow-up questions about budget revisions. |
| **Precision-Optimized** | • Q2 marketing budget discussion ✓<br/>• Related Q2 sales projections ✓ | Clean, focused results, but **missing** a separate conversation about budget revisions that didn't explicitly mention "marketing budget." Agent may provide incomplete information. |

## Why recall over precision?

Agents need comprehensive context to make informed decisions. Missing a critical fact can cause an agent to fail its task or provide incorrect information. By optimizing for recall, Zep ensures that relevant information is available to the agent, even if that means returning more results than strictly necessary.

The underlying principle: it's better to provide complete information and let the agent or downstream LLM filter what's relevant than to risk omitting something important.

## Why latency matters

Real-time applications like conversational AI, live customer support, and interactive agents require fast responses. Zep's retrieval architecture is optimized to return results in milliseconds, enabling seamless user experiences without perceptible delays.

## Tuning the recall-precision tradeoff

The recall-optimized approach described here is how Zep is tuned **out of the box**. However, Zep provides several mechanisms to adjust this tradeoff for different use cases:

* **Limit search results**: Control the maximum number of results returned
* **Apply filters**: Narrow retrieval to specific time ranges, Entity and/or Edge labels, or other criteria
* **Adjust search parameters**: Fine-tune ranking and relevance thresholds

These controls allow you to shift toward precision when your application demands it, while maintaining Zep's fast retrieval performance.

## Balancing context size

While recall is our priority, Zep does consider token count when returning results. We balance the size of the resulting context with the goal of providing complete information, but when in doubt, we err on the side of ensuring your agent has what it needs to succeed.
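To make the recall and precision measures from this section concrete, the sketch below computes both for a ranked result list truncated at two different result limits. It is a generic illustration, not Zep's ranking code: the function name `recall_and_precision`, the item IDs, and the relevance labels are all hypothetical.

```python
from typing import Sequence


def recall_and_precision(returned: Sequence[str], relevant: set[str]) -> tuple[float, float]:
    """Return (recall, precision) for one retrieval result.

    recall    = relevant items returned / all relevant items
    precision = relevant items returned / all items returned
    """
    hits = sum(1 for item in returned if item in relevant)
    # By convention, treat an empty denominator as a perfect score.
    recall = hits / len(relevant) if relevant else 1.0
    precision = hits / len(returned) if returned else 1.0
    return recall, precision


# Hypothetical ranked results and ground-truth relevance labels.
ranked = ["a", "b", "c", "d"]
relevant = {"a", "b", "d"}

for limit in (2, 4):
    recall, precision = recall_and_precision(ranked[:limit], relevant)
    print(f"limit={limit} recall={recall:.2f} precision={precision:.2f}")
# limit=2 recall=0.67 precision=1.00
# limit=4 recall=1.00 precision=0.75
```

With the tighter limit every returned item is relevant but one relevant item is missed; with the looser limit nothing is missed at the cost of one noisy result. This is the same shift that limiting search results produces.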
# Users and User Graphs

> Understanding user management and knowledge graph integration in Zep

## Overview

A User represents an individual interacting with your application. Each User can have multiple Threads associated with them, allowing you to track and manage their interactions over time. Additionally, each user has an associated User Graph which stores the context for that user.

## Users

The unique identifier for each user is their `UserID`. This can be any string value, such as a username, email address, or UUID.

**Users Enable Simple User Privacy Management**

Deleting a User will delete all Threads and thread artifacts associated with that User with a single API call, making it easy to handle Right To Be Forgotten requests.

### Ensuring Your User Data Is Correctly Mapped to the Knowledge Graph

Adding your user's `email`, `first_name`, and `last_name` ensures that chat messages and business data are correctly mapped to the user node in the Zep knowledge graph.

For example, if business data contains your user's email address, it will be related directly to the user node.

You can associate rich business context with a User:

* `user_id`: A unique identifier of the user that maps to your internal User ID.
* `email`: The user's email.
* `first_name`: The user's first name.
* `last_name`: The user's last name.

## User Graphs

Each user has an associated User Graph that stores their context across all threads. This graph-based context system provides several important capabilities:

### Cross-Thread Context Integration

The knowledge graph does not separate the data from different threads, but integrates the data together to create a unified picture of the user. So the `thread.get_user_context` method doesn't return context derived only from that thread, but instead returns whatever user-level context is most relevant to that thread, based on the thread's most recent messages.

This means that insights and information learned in one conversation thread are automatically available in all other threads for the same user, creating a coherent and continuous context experience.

### Privacy and RTBF Capabilities

When you delete a user, all associated data is removed:

* All threads belonging to that user
* All thread artifacts (messages, metadata)
* The entire user graph and all knowledge extracted from conversations

This single-operation approach makes it simple to handle Right To Be Forgotten (RTBF) requests and comply with privacy regulations.

### Default Ontology for User Graphs

User graphs utilize Zep's default ontology, consisting of default entity types and default edge types that affect how the graph is built. You can read more about default and custom graph ontology in the [Customizing Graph Structure](/customizing-graph-structure) guide.

Each user graph comes with default entity and edge types that help classify and structure information extracted from conversations. You can also disable the default entity and edge types for specific users if you need precise control over your graph structure.

### The User Node

**User summary and the user node**

Each user has a single unique user node in their graph representing the user themselves. The user summary generated from user summary instructions lives on this user node. You can retrieve the user node and its summary using the `get_node` method described in the SDK reference.

The user node serves as a central hub in the knowledge graph, connecting all information about that user. It stores a high-level summary of the user that can be customized through [User Summary Instructions](/user-summary-instructions).
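The relationship between users, threads, and the user graph can be illustrated with a small in-memory model. This is a conceptual toy only, not Zep's implementation or API: `ToyStore`, `ToyUser`, and every method name here are hypothetical, message text stands in for extracted facts, and the relevance ranking that Zep applies when returning context is omitted.

```python
from dataclasses import dataclass, field


@dataclass
class ToyUser:
    user_id: str
    # thread_id -> ordered messages for that thread
    threads: dict[str, list[str]] = field(default_factory=dict)
    # One shared store per user; facts are NOT partitioned by thread.
    graph_facts: list[str] = field(default_factory=list)


class ToyStore:
    """Hypothetical stand-in for a user store with per-user graphs."""

    def __init__(self) -> None:
        self.users: dict[str, ToyUser] = {}

    def add_user(self, user_id: str) -> None:
        self.users[user_id] = ToyUser(user_id)

    def add_message(self, user_id: str, thread_id: str, text: str) -> None:
        user = self.users[user_id]
        user.threads.setdefault(thread_id, []).append(text)
        # Stand-in for fact extraction: the message also lands in the user graph.
        user.graph_facts.append(text)

    def user_context(self, user_id: str) -> list[str]:
        # User-level context, regardless of which thread produced each fact.
        return list(self.users[user_id].graph_facts)

    def delete_user(self, user_id: str) -> None:
        # Removing the user drops its threads and its graph in one operation.
        del self.users[user_id]


store = ToyStore()
store.add_user("u1")
store.add_message("u1", "support-thread", "Card on file expired")
store.add_message("u1", "sales-thread", "Interested in the annual plan")
print(store.user_context("u1"))
# ['Card on file expired', 'Interested in the annual plan']
store.delete_user("u1")
print("u1" in store.users)
# False
```

Because facts hang off the user rather than the thread, context learned in one thread is visible when serving another, and a single delete removes everything tied to that user.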
## Next Steps

Now that you understand how Users and User Graphs work together, you can:

* Learn about [Threads](/threads) and how they relate to users
* Discover how to [add messages to threads](/adding-messages)
* Learn how to [retrieve context for your agent](/retrieving-context)
* Explore [customizing user summaries](/user-summary-instructions)
* Understand more about [Graph Concepts](/graph-overview)

# Threads

> Understanding conversation threading in Zep

## Overview

A Thread represents a conversation. Each User can have multiple threads, and each thread is a sequence of chat messages.

Chat messages are added to threads using [`thread.add_messages`](/adding-messages), which both adds those messages to the thread history and ingests those messages into the user-level knowledge graph. The user knowledge graph contains data from all of that user's threads to create an integrated understanding of the user.

## Relationship Between Users and Threads

`threadIds` are arbitrary identifiers that you can map to relevant business objects in your app, such as users or a conversation a user might have with your app. Before you create a thread, make sure you have created a user first.

## Automatic Cache Warming

When you create a new thread, Zep automatically warms the cache for that user's graph data in the background. This optimization improves query latency for graph operations on newly created threads by pre-loading the user's data into the hot cache tier.

The warming operation runs asynchronously and does not block the thread creation response. No additional action is required on your part; this happens automatically whenever you create a thread for a user with an existing graph.

For more information about Zep's multi-tier caching architecture and manual cache warming, see [Warming the User Cache](/performance#warming-the-user-cache).
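The "create the user before the thread" ordering described above can be sketched as a small helper. The helper name `start_thread` is hypothetical, and the call shapes `client.user.add(...)` and `client.thread.create(...)` are assumptions about the Python SDK that should be confirmed against the SDK reference. The stub client exists only so the sketch runs without network access or an API key.

```python
from types import SimpleNamespace


def start_thread(client, user_id, thread_id, *, email, first_name, last_name):
    """Create the user first, then a thread that belongs to that user."""
    # Assumed SDK call: registers the user along with identifying details.
    client.user.add(
        user_id=user_id,
        email=email,
        first_name=first_name,
        last_name=last_name,
    )
    # Assumed SDK call: the thread is tied to the user created above.
    client.thread.create(thread_id=thread_id, user_id=user_id)


# Dry run against a recording stub instead of a real client.
calls = []
stub_client = SimpleNamespace(
    user=SimpleNamespace(add=lambda **kw: calls.append(("user.add", kw))),
    thread=SimpleNamespace(create=lambda **kw: calls.append(("thread.create", kw))),
)

start_thread(
    stub_client,
    "user-123",
    "thread-abc",
    email="emily@example.com",
    first_name="Emily",
    last_name="Painter",
)
print([name for name, _ in calls])
# ['user.add', 'thread.create']
```

Passing the client in as a parameter keeps the ordering logic testable; in an application the same helper would receive a real SDK client.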
## Next Steps

Now that you understand how Threads work, you can:

* Learn about [Users and User Graphs](/users-and-user-graphs)
* Discover how to [add messages to threads](/adding-messages)
* Learn how to [retrieve context for your agent](/retrieving-context)
* Understand more about [Graph Concepts](/graph-overview)

# Graph Overview