AutoGen memory integration

Add persistent memory to Microsoft AutoGen agents using the zep-autogen package.

The zep-autogen package provides seamless integration between Zep and Microsoft AutoGen agents. Choose between user-specific conversation memory and structured knowledge graph memory for intelligent context retrieval.

Install dependencies

```shell
pip install zep-autogen zep-cloud autogen-core autogen-agentchat
```

Environment setup

Set your API keys as environment variables:

```shell
export ZEP_API_KEY="your_zep_api_key"
export OPENAI_API_KEY="your_openai_api_key"
```
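Since both keys are required at runtime, it can help to fail fast when one is missing rather than getting an opaque authentication error later. A minimal sketch (the `require_env` helper is ours, not part of any package):

```python
import os

def require_env(name: str) -> str:
    """Return the value of an environment variable, raising if it is unset or empty."""
    value = os.environ.get(name)
    if not value:
        raise RuntimeError(f"Missing required environment variable: {name}")
    return value
```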

Memory types

  • User Memory: Stores conversation history in user threads with automatic context injection
  • Knowledge Graph Memory: Maintains structured knowledge with custom entity models

User memory

Step 1: Set up required imports

```python
import os
import uuid
import asyncio
from autogen_agentchat.agents import AssistantAgent
from autogen_ext.models.openai import OpenAIChatCompletionClient
from autogen_core.memory import MemoryContent, MemoryMimeType
from zep_cloud.client import AsyncZep
from zep_autogen import ZepUserMemory
```
Step 2: Initialize client and create user

```python
# Initialize Zep client
zep_client = AsyncZep(api_key=os.environ.get("ZEP_API_KEY"))
user_id = f"user_{uuid.uuid4().hex[:16]}"
thread_id = f"thread_{uuid.uuid4().hex[:16]}"

# Create user (required before using memory)
try:
    await zep_client.user.add(
        user_id=user_id,
        email="alice@example.com",
        first_name="Alice"
    )
except Exception as e:
    print(f"User might already exist: {e}")

# Create thread (required for conversation memory)
try:
    await zep_client.thread.create(thread_id=thread_id, user_id=user_id)
except Exception as e:
    print(f"Thread creation failed: {e}")
```
Step 3: Create memory with configuration

```python
# Create user memory with configuration
memory = ZepUserMemory(
    client=zep_client,
    user_id=user_id,
    thread_id=thread_id,
    thread_context_mode="summary"  # "summary" or "basic"
)
```
Step 4: Create agent with memory

```python
# Create agent with Zep memory
agent = AssistantAgent(
    name="MemoryAwareAssistant",
    model_client=OpenAIChatCompletionClient(
        model="gpt-4.1-mini",
        api_key=os.environ.get("OPENAI_API_KEY")
    ),
    memory=[memory],
    system_message="You are a helpful assistant with persistent memory."
)
```
Step 5: Store messages and run conversations

```python
# Helper function to store messages with proper metadata
async def add_message(message: str, role: str, name: str | None = None):
    """Store a message in Zep memory following AutoGen standards."""
    metadata = {"type": "message", "role": role}
    if name:
        metadata["name"] = name

    await memory.add(MemoryContent(
        content=message,
        mime_type=MemoryMimeType.TEXT,
        metadata=metadata
    ))

# Example conversation with memory persistence
user_message = "My name is Alice and I love hiking in the mountains."
print(f"User: {user_message}")

# Store user message
await add_message(user_message, "user", "Alice")

# Run agent - it will automatically retrieve context via update_context()
response = await agent.run(task=user_message)
agent_response = response.messages[-1].content
print(f"Agent: {agent_response}")

# Store agent response
await add_message(agent_response, "assistant")
```

Automatic Context Injection: ZepUserMemory automatically injects relevant conversation history and context via the update_context() method. The agent receives up to 10 recent messages plus summarized context from Zep using the specified thread_context_mode (“basic” or “summary”).
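As a rough mental model, the injected system message can be thought of as a summary block followed by the most recent messages. The sketch below is an illustrative approximation only; `build_context` and the message shape are our assumptions, not the actual zep-autogen implementation:

```python
def build_context(summary: str, recent_messages: list[dict], max_messages: int = 10) -> str:
    """Illustrative sketch of how injected context might be assembled.

    Mirrors the behavior described above (summarized context plus up to 10
    recent messages); NOT the actual zep-autogen implementation.
    """
    lines = ["Relevant context from prior conversation:", summary, "", "Recent messages:"]
    for msg in recent_messages[-max_messages:]:
        lines.append(f"{msg['role']}: {msg['content']}")
    return "\n".join(lines)

context = build_context(
    "Alice enjoys hiking in the mountains.",
    [{"role": "user", "content": "My name is Alice and I love hiking."}],
)
```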

Knowledge graph memory

Step 1: Define custom entity models

```python
from zep_autogen.graph_memory import ZepGraphMemory
from zep_cloud.external_clients.ontology import EntityModel, EntityText
from pydantic import Field

# Define entity models using Pydantic
class ProgrammingLanguage(EntityModel):
    """A programming language entity."""
    paradigm: EntityText = Field(
        description="programming paradigm (e.g., object-oriented, functional)",
        default=None
    )
    use_case: EntityText = Field(
        description="primary use cases for this language",
        default=None
    )

class Framework(EntityModel):
    """A software framework or library."""
    language: EntityText = Field(
        description="the programming language this framework is built for",
        default=None
    )
    purpose: EntityText = Field(
        description="primary purpose of this framework",
        default=None
    )
```
Step 2: Set up graph with ontology

```python
from zep_cloud import SearchFilters

# Set ontology first
await zep_client.graph.set_ontology(
    entities={
        "ProgrammingLanguage": ProgrammingLanguage,
        "Framework": Framework,
    }
)

# Create graph
graph_id = f"graph_{uuid.uuid4().hex[:16]}"
try:
    await zep_client.graph.create(
        graph_id=graph_id,
        name="Programming Knowledge Graph"
    )
    print(f"Created graph: {graph_id}")
except Exception as e:
    print(f"Graph creation failed: {e}")
```
Step 3: Initialize graph memory with filters

```python
# Create graph memory with search configuration
graph_memory = ZepGraphMemory(
    client=zep_client,
    graph_id=graph_id,
    search_filters=SearchFilters(
        node_labels=["ProgrammingLanguage", "Framework"]
    ),
    facts_limit=20,  # Max facts in context injection (default: 20)
    entity_limit=5   # Max entities in context injection (default: 5)
)
```
Step 4: Add data and wait for indexing

```python
# Add structured knowledge
await graph_memory.add(MemoryContent(
    content="Python is excellent for data science and AI development",
    mime_type=MemoryMimeType.TEXT,
    metadata={"type": "data"}  # "data" stores in graph, "message" stores as episode
))

# Wait for graph processing (required)
print("Waiting for graph indexing...")
await asyncio.sleep(30)  # Allow time for knowledge extraction
```

<Callout intent="info">
**Graph Memory Context Injection**: ZepGraphMemory automatically retrieves the last 2 episodes from the graph and uses their content to query for relevant facts (up to `facts_limit`) and entities (up to `entity_limit`). This context is injected as a system message during agent interactions.
</Callout>
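A fixed 30-second sleep is fine for a demo, but real workloads may take more or less time to index. One alternative is to poll a readiness check with exponential backoff; a generic sketch (`wait_until` and its `check` predicate are illustrative placeholders — the actual readiness signal depends on your Zep setup):

```python
import asyncio

def backoff_delays(initial: float = 1.0, factor: float = 2.0,
                   max_delay: float = 30.0, attempts: int = 6):
    """Yield exponentially increasing delays, capped at max_delay."""
    delay = initial
    for _ in range(attempts):
        yield min(delay, max_delay)
        delay *= factor

async def wait_until(check, **kwargs) -> bool:
    """Poll an async predicate between backoff delays; True once it succeeds."""
    for delay in backoff_delays(**kwargs):
        if await check():
            return True
        await asyncio.sleep(delay)
    return False
```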
Step 5: Create agent with graph memory

```python
# Create agent with graph memory
agent = AssistantAgent(
    name="GraphMemoryAssistant",
    model_client=OpenAIChatCompletionClient(model="gpt-4.1-mini"),
    memory=[graph_memory],
    system_message="You are a technical assistant with programming knowledge."
)
```

Tools integration

Zep tools allow agents to search and add data directly to memory storage with manual control and structured responses.

Important: Tools must be bound to either graph_id OR user_id, not both. This determines whether they operate on knowledge graphs or user graphs.

Tool function parameters

Search Tool Parameters:

  • `query`: str (required) - Search query text
  • `limit`: int (optional, default 10) - Maximum results to return
  • `scope`: str (optional, default `"edges"`) - Search scope: `"edges"`, `"nodes"`, `"episodes"`

Add Tool Parameters:

  • `data`: str (required) - Content to store
  • `data_type`: str (optional, default `"text"`) - Data type: `"text"`, `"json"`, `"message"`
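If you wrap these tools in your own orchestration, validating arguments before dispatch can surface bad values early with clear errors. A hedged sketch (the `validate_search_params` helper and its error behavior are ours, not part of zep-autogen):

```python
VALID_SCOPES = {"edges", "nodes", "episodes"}

def validate_search_params(query: str, limit: int = 10, scope: str = "edges") -> dict:
    """Validate search tool arguments before dispatching a graph search."""
    if not query or not query.strip():
        raise ValueError("query must be a non-empty string")
    if limit < 1:
        raise ValueError("limit must be a positive integer")
    if scope not in VALID_SCOPES:
        raise ValueError(f"scope must be one of {sorted(VALID_SCOPES)}, got {scope!r}")
    return {"query": query, "limit": limit, "scope": scope}
```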

User graph tools

```python
from zep_autogen import create_search_graph_tool, create_add_graph_data_tool

# Create tools bound to user graph
search_tool = create_search_graph_tool(zep_client, user_id=user_id)
add_tool = create_add_graph_data_tool(zep_client, user_id=user_id)

# Agent with user graph tools
agent = AssistantAgent(
    name="UserKnowledgeAssistant",
    model_client=OpenAIChatCompletionClient(model="gpt-4.1-mini"),
    tools=[search_tool, add_tool],
    system_message="You can search and add data to the user's knowledge graph.",
    reflect_on_tool_use=True  # Enables tool usage reflection
)
```

Knowledge graph tools

```python
# Create tools bound to knowledge graph
search_tool = create_search_graph_tool(zep_client, graph_id=graph_id)
add_tool = create_add_graph_data_tool(zep_client, graph_id=graph_id)

# Agent with knowledge graph tools
agent = AssistantAgent(
    name="KnowledgeGraphAssistant",
    model_client=OpenAIChatCompletionClient(model="gpt-4.1-mini"),
    tools=[search_tool, add_tool],
    system_message="You can search and add data to the knowledge graph.",
    reflect_on_tool_use=True
)
```

Query memory

Both memory types support direct querying with different scope parameters.

User memory queries

```python
# Query user conversation history
results = await memory.query("What does Alice like?", limit=5)

# Process different result types
for result in results.results:
    content = result.content
    metadata = result.metadata

    if 'edge_name' in metadata:
        # Fact/relationship result
        print(f"Fact: {content}")
        print(f"Relationship: {metadata['edge_name']}")
        print(f"Valid: {metadata.get('valid_at', 'N/A')} - {metadata.get('invalid_at', 'present')}")
    elif 'node_name' in metadata:
        # Entity result
        print(f"Entity: {metadata['node_name']}")
        print(f"Summary: {content}")
    else:
        # Episode/message result
        print(f"Message: {content}")
        print(f"Role: {metadata.get('episode_role', 'unknown')}")

    print(f"Source: {metadata.get('source')}\n")
```

Graph memory queries

```python
# Query knowledge graph with scope control
facts_results = await graph_memory.query(
    "Python frameworks",
    limit=10,
    scope="edges"  # "edges" (facts), "nodes" (entities), "episodes" (messages)
)

print(f"Found {len(facts_results.results)} facts about Python frameworks:")
for result in facts_results.results:
    print(f"- {result.content}")

entities_results = await graph_memory.query(
    "programming languages",
    limit=5,
    scope="nodes"
)

print(f"\nFound {len(entities_results.results)} programming language entities:")
for result in entities_results.results:
    entity_name = result.metadata.get('node_name', 'Unknown')
    print(f"- {entity_name}: {result.content}")
```

Search result structure

Fact (edge) results:

```json
{
  "content": "fact text",
  "metadata": {
    "source": "graph" | "user_graph",
    "edge_name": "relationship_name",
    "edge_attributes": {...},
    "created_at": "timestamp",
    "valid_at": "timestamp",
    "invalid_at": "timestamp",
    "expired_at": "timestamp"
  }
}
```

Entity (node) results:

```json
{
  "content": "entity_name:\n entity_summary",
  "metadata": {
    "source": "graph" | "user_graph",
    "node_name": "entity_name",
    "node_attributes": {...},
    "created_at": "timestamp"
  }
}
```

Episode results:

```json
{
  "content": "episode_content",
  "metadata": {
    "source": "graph" | "user_graph",
    "episode_type": "source_type",
    "episode_role": "role_type",
    "episode_name": "role_name",
    "created_at": "timestamp"
  }
}
```
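The dispatch pattern shown under "User memory queries" can be factored into a small helper keyed on these metadata fields; a sketch assuming results shaped like the structures above (the `classify_result` name is ours, not part of the package):

```python
def classify_result(metadata: dict) -> str:
    """Classify a search result as a fact, entity, or episode by its metadata keys."""
    if "edge_name" in metadata:
        return "fact"      # relationship/edge result
    if "node_name" in metadata:
        return "entity"    # entity/node result
    return "episode"       # message/episode result
```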

Memory vs tools comparison

Memory Objects (ZepUserMemory/ZepGraphMemory):

  • Automatic context injection via update_context()
  • Attached to agent’s memory list
  • Transparent operation - happens automatically
  • Better for consistent memory across interactions

Function Tools (search/add tools):

  • Manual control - agent decides when to use
  • More explicit and observable operations
  • Better for specific search/add operations
  • Works with AutoGen’s tool reflection features
  • Provides structured return values

Note: Both approaches can be combined - using memory for automatic context and tools for explicit operations.