AutoGen memory integration

Add persistent memory to Microsoft AutoGen agents using the zep-autogen package.

The zep-autogen package provides seamless integration between Zep and Microsoft AutoGen agents. Choose between user-specific conversation memory and structured knowledge graph memory for intelligent context retrieval.

Install dependencies

```shell
pip install zep-autogen zep-cloud autogen-core autogen-agentchat
```

Environment setup

Set your API keys as environment variables:

```shell
export ZEP_API_KEY="your_zep_api_key"
export OPENAI_API_KEY="your_openai_api_key"
```
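Since both keys are required at runtime, it can help to fail fast when one is missing rather than getting an opaque authentication error later. A minimal sketch (the `require_env` helper is ours, not part of any package):

```python
import os

def require_env(name: str) -> str:
    """Return the value of an environment variable, raising if it is unset or empty."""
    value = os.environ.get(name)
    if not value:
        raise RuntimeError(f"Missing required environment variable: {name}")
    return value
```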

Memory types

  • User Memory: Stores conversation history in user threads with automatic context injection
  • Knowledge Graph Memory: Maintains structured knowledge with custom entity models

User memory

Step 1: Set up required imports

```python
import os
import uuid
import asyncio
from autogen_agentchat.agents import AssistantAgent
from autogen_ext.models.openai import OpenAIChatCompletionClient
from autogen_core.memory import MemoryContent, MemoryMimeType
from zep_cloud.client import AsyncZep
from zep_autogen import ZepUserMemory
```
Step 2: Initialize client and create user

```python
# Initialize Zep client
zep_client = AsyncZep(api_key=os.environ.get("ZEP_API_KEY"))
user_id = f"user_{uuid.uuid4().hex[:16]}"
thread_id = f"thread_{uuid.uuid4().hex[:16]}"

# Create user (required before using memory)
try:
    await zep_client.user.add(
        user_id=user_id,
        email="alice@example.com",
        first_name="Alice"
    )
except Exception as e:
    print(f"User might already exist: {e}")

# Create thread (required for conversation memory)
try:
    await zep_client.thread.create(thread_id=thread_id, user_id=user_id)
except Exception as e:
    print(f"Thread creation failed: {e}")
```
Step 3: Create memory with configuration

```python
# Create user memory with configuration
memory = ZepUserMemory(
    client=zep_client,
    user_id=user_id,
    thread_id=thread_id,
    thread_context_mode="summary"  # "summary" or "basic"
)
```
Step 4: Create agent with memory

```python
# Create agent with Zep memory
agent = AssistantAgent(
    name="MemoryAwareAssistant",
    model_client=OpenAIChatCompletionClient(
        model="gpt-4.1-mini",
        api_key=os.environ.get("OPENAI_API_KEY")
    ),
    memory=[memory],
    system_message="You are a helpful assistant with persistent memory."
)
```
Step 5: Store messages and run conversations

```python
# Helper function to store messages with proper metadata
async def add_message(message: str, role: str, name: str | None = None):
    """Store a message in Zep memory following AutoGen standards."""
    metadata = {"type": "message", "role": role}
    if name:
        metadata["name"] = name

    await memory.add(MemoryContent(
        content=message,
        mime_type=MemoryMimeType.TEXT,
        metadata=metadata
    ))

# Example conversation with memory persistence
user_message = "My name is Alice and I love hiking in the mountains."
print(f"User: {user_message}")

# Store user message
await add_message(user_message, "user", "Alice")

# Run agent - it will automatically retrieve context via update_context()
response = await agent.run(task=user_message)
agent_response = response.messages[-1].content
print(f"Agent: {agent_response}")

# Store agent response
await add_message(agent_response, "assistant")
```

Automatic Context Injection: ZepUserMemory automatically injects relevant conversation history and context via the update_context() method. The agent receives up to 10 recent messages plus summarized context from Zep using the specified thread_context_mode (“basic” or “summary”).
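As a rough mental model, the injected system message can be thought of as a summary block followed by the most recent messages. The sketch below is an illustrative approximation only; `build_context` and the message shape are our assumptions, not the actual zep-autogen implementation:

```python
def build_context(summary: str, recent_messages: list[dict], max_messages: int = 10) -> str:
    """Illustrative sketch of how injected context might be assembled.

    Mirrors the behavior described above (summarized context plus up to 10
    recent messages); NOT the actual zep-autogen implementation.
    """
    lines = ["Relevant context from prior conversation:", summary, "", "Recent messages:"]
    for msg in recent_messages[-max_messages:]:
        lines.append(f"{msg['role']}: {msg['content']}")
    return "\n".join(lines)

context = build_context(
    "Alice enjoys hiking in the mountains.",
    [{"role": "user", "content": "My name is Alice and I love hiking."}],
)
```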

Knowledge graph memory

Step 1: Define custom entity models

```python
from zep_autogen.graph_memory import ZepGraphMemory
from zep_cloud.external_clients.ontology import EntityModel, EntityText
from pydantic import Field

# Define entity models using Pydantic
class ProgrammingLanguage(EntityModel):
    """A programming language entity."""
    paradigm: EntityText = Field(
        description="programming paradigm (e.g., object-oriented, functional)",
        default=None
    )
    use_case: EntityText = Field(
        description="primary use cases for this language",
        default=None
    )

class Framework(EntityModel):
    """A software framework or library."""
    language: EntityText = Field(
        description="the programming language this framework is built for",
        default=None
    )
    purpose: EntityText = Field(
        description="primary purpose of this framework",
        default=None
    )
```
Step 2: Set up graph with ontology

```python
from zep_cloud import SearchFilters

# Set ontology first
await zep_client.graph.set_ontology(
    entities={
        "ProgrammingLanguage": ProgrammingLanguage,
        "Framework": Framework,
    }
)

# Create graph
graph_id = f"graph_{uuid.uuid4().hex[:16]}"
try:
    await zep_client.graph.create(
        graph_id=graph_id,
        name="Programming Knowledge Graph"
    )
    print(f"Created graph: {graph_id}")
except Exception as e:
    print(f"Graph creation failed: {e}")
```
Step 3: Initialize graph memory with filters

```python
# Create graph memory with search configuration
graph_memory = ZepGraphMemory(
    client=zep_client,
    graph_id=graph_id,
    search_filters=SearchFilters(
        node_labels=["ProgrammingLanguage", "Framework"]
    ),
    facts_limit=20,  # Max facts in context injection (default: 20)
    entity_limit=5   # Max entities in context injection (default: 5)
)
```
Step 4: Add data and wait for indexing

```python
# Add structured knowledge
await graph_memory.add(MemoryContent(
    content="Python is excellent for data science and AI development",
    mime_type=MemoryMimeType.TEXT,
    metadata={"type": "data"}  # "data" stores in graph, "message" stores as episode
))

# Wait for graph processing (required)
print("Waiting for graph indexing...")
await asyncio.sleep(30)  # Allow time for knowledge extraction
```

<Callout intent="info">
**Graph Memory Context Injection**: ZepGraphMemory automatically retrieves the last 2 episodes from the graph and uses their content to query for relevant facts (up to `facts_limit`) and entities (up to `entity_limit`). This context is injected as a system message during agent interactions.
</Callout>
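A fixed 30-second sleep is fine for a demo, but real workloads may take more or less time to index. One alternative is to poll a readiness check with exponential backoff; a generic sketch (`wait_until` and its `check` predicate are illustrative placeholders — the actual readiness signal depends on your Zep setup):

```python
import asyncio

def backoff_delays(initial: float = 1.0, factor: float = 2.0,
                   max_delay: float = 30.0, attempts: int = 6):
    """Yield exponentially increasing delays, capped at max_delay."""
    delay = initial
    for _ in range(attempts):
        yield min(delay, max_delay)
        delay *= factor

async def wait_until(check, **kwargs) -> bool:
    """Poll an async predicate between backoff delays; True once it succeeds."""
    for delay in backoff_delays(**kwargs):
        if await check():
            return True
        await asyncio.sleep(delay)
    return False
```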
Step 5: Create agent with graph memory

```python
# Create agent with graph memory
agent = AssistantAgent(
    name="GraphMemoryAssistant",
    model_client=OpenAIChatCompletionClient(model="gpt-4.1-mini"),
    memory=[graph_memory],
    system_message="You are a technical assistant with programming knowledge."
)
```

Tools integration

Zep tools allow agents to search and add data directly to memory storage with manual control and structured responses.

Important: Tools must be bound to either graph_id OR user_id, not both. This determines whether they operate on knowledge graphs or user graphs.

Tool function parameters

Search Tool Parameters:

  • `query`: str (required) - Search query text
  • `limit`: int (optional, default 10) - Maximum results to return
  • `scope`: str (optional, default `"edges"`) - Search scope: `"edges"`, `"nodes"`, `"episodes"`

Add Tool Parameters:

  • `data`: str (required) - Content to store
  • `data_type`: str (optional, default `"text"`) - Data type: `"text"`, `"json"`, `"message"`
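If you wrap these tools in your own orchestration, validating arguments before dispatch can surface bad values early with clear errors. A hedged sketch (the `validate_search_params` helper and its error behavior are ours, not part of zep-autogen):

```python
VALID_SCOPES = {"edges", "nodes", "episodes"}

def validate_search_params(query: str, limit: int = 10, scope: str = "edges") -> dict:
    """Validate search tool arguments before dispatching a graph search."""
    if not query or not query.strip():
        raise ValueError("query must be a non-empty string")
    if limit < 1:
        raise ValueError("limit must be a positive integer")
    if scope not in VALID_SCOPES:
        raise ValueError(f"scope must be one of {sorted(VALID_SCOPES)}, got {scope!r}")
    return {"query": query, "limit": limit, "scope": scope}
```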

User graph tools

```python
from zep_autogen import create_search_graph_tool, create_add_graph_data_tool

# Create tools bound to user graph
search_tool = create_search_graph_tool(zep_client, user_id=user_id)
add_tool = create_add_graph_data_tool(zep_client, user_id=user_id)

# Agent with user graph tools
agent = AssistantAgent(
    name="UserKnowledgeAssistant",
    model_client=OpenAIChatCompletionClient(model="gpt-4.1-mini"),
    tools=[search_tool, add_tool],
    system_message="You can search and add data to the user's knowledge graph.",
    reflect_on_tool_use=True  # Enables tool usage reflection
)
```

Knowledge graph tools

```python
# Create tools bound to knowledge graph
search_tool = create_search_graph_tool(zep_client, graph_id=graph_id)
add_tool = create_add_graph_data_tool(zep_client, graph_id=graph_id)

# Agent with knowledge graph tools
agent = AssistantAgent(
    name="KnowledgeGraphAssistant",
    model_client=OpenAIChatCompletionClient(model="gpt-4.1-mini"),
    tools=[search_tool, add_tool],
    system_message="You can search and add data to the knowledge graph.",
    reflect_on_tool_use=True
)
```

Query memory

Both memory types support direct querying with different scope parameters.

User memory queries

```python
# Query user conversation history
results = await memory.query("What does Alice like?", limit=5)

# Process different result types
for result in results.results:
    content = result.content
    metadata = result.metadata

    if 'edge_name' in metadata:
        # Fact/relationship result
        print(f"Fact: {content}")
        print(f"Relationship: {metadata['edge_name']}")
        print(f"Valid: {metadata.get('valid_at', 'N/A')} - {metadata.get('invalid_at', 'present')}")
    elif 'node_name' in metadata:
        # Entity result
        print(f"Entity: {metadata['node_name']}")
        print(f"Summary: {content}")
    else:
        # Episode/message result
        print(f"Message: {content}")
        print(f"Role: {metadata.get('episode_role', 'unknown')}")

    print(f"Source: {metadata.get('source')}\n")
```

Graph memory queries

```python
# Query knowledge graph with scope control
facts_results = await graph_memory.query(
    "Python frameworks",
    limit=10,
    scope="edges"  # "edges" (facts), "nodes" (entities), "episodes" (messages)
)

print(f"Found {len(facts_results.results)} facts about Python frameworks:")
for result in facts_results.results:
    print(f"- {result.content}")

entities_results = await graph_memory.query(
    "programming languages",
    limit=5,
    scope="nodes"
)

print(f"\nFound {len(entities_results.results)} programming language entities:")
for result in entities_results.results:
    entity_name = result.metadata.get('node_name', 'Unknown')
    print(f"- {entity_name}: {result.content}")
```

Search result structure

Fact (edge) results:

```json
{
  "content": "fact text",
  "metadata": {
    "source": "graph" | "user_graph",
    "edge_name": "relationship_name",
    "edge_attributes": {...},
    "created_at": "timestamp",
    "valid_at": "timestamp",
    "invalid_at": "timestamp",
    "expired_at": "timestamp"
  }
}
```

Entity (node) results:

```json
{
  "content": "entity_name:\n entity_summary",
  "metadata": {
    "source": "graph" | "user_graph",
    "node_name": "entity_name",
    "node_attributes": {...},
    "created_at": "timestamp"
  }
}
```

Episode results:

```json
{
  "content": "episode_content",
  "metadata": {
    "source": "graph" | "user_graph",
    "episode_type": "source_type",
    "episode_role": "role_type",
    "episode_name": "role_name",
    "created_at": "timestamp"
  }
}
```
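The dispatch pattern shown under "User memory queries" can be factored into a small helper keyed on these metadata fields; a sketch assuming results shaped like the structures above (the `classify_result` name is ours, not part of the package):

```python
def classify_result(metadata: dict) -> str:
    """Classify a search result as a fact, entity, or episode by its metadata keys."""
    if "edge_name" in metadata:
        return "fact"      # relationship/edge result
    if "node_name" in metadata:
        return "entity"    # entity/node result
    return "episode"       # message/episode result
```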

Memory vs tools comparison

Memory Objects (ZepUserMemory/ZepGraphMemory):

  • Automatic context injection via update_context()
  • Attached to agent’s memory list
  • Transparent operation - happens automatically
  • Better for consistent memory across interactions

Function Tools (search/add tools):

  • Manual control - agent decides when to use
  • More explicit and observable operations
  • Better for specific search/add operations
  • Works with AutoGen’s tool reflection features
  • Provides structured return values

Note: Both approaches can be combined - using memory for automatic context and tools for explicit operations.