Google ADK integration

Add persistent context and knowledge graphs to Google ADK agents

Google’s Agent Development Kit (ADK) agents equipped with Zep’s context layer can maintain context across conversations and access personalized knowledge graphs. The zep-adk package provides real-time message persistence and automatic context injection for ADK agents, and ships for Python, TypeScript, and Go.

Core benefits

  • Zero restructuring: Add Zep to an existing ADK agent without changing your agent architecture
  • Shared-agent architecture: One Agent definition serves all users. Per-user identity is resolved at runtime from ADK session state
  • Real-time persistence: Both user and assistant messages are persisted to Zep on every turn, not batched at session end
  • Automatic context injection: Zep’s context block — facts, relationships, and prior knowledge — is injected into the LLM prompt before each response
  • Lazy resource creation: Zep users and threads are created automatically on first use

How it works

The integration hooks into ADK’s agent lifecycle to persist the user’s message and inject relevant context before each model call, then persist the assistant’s reply afterward. Each language exposes the same loop through its idiomatic ADK extension points:

| Language | Context injection (per turn) | Assistant persistence |
|---|---|---|
| Python | ZepContextTool: a BaseTool that overrides process_llm_request() (the hook ADK’s own PreloadMemoryTool uses); never called by the model directly | create_after_model_callback |
| TypeScript | createZepBeforeModelCallback (a beforeModelCallback), or ZepContextTool as a tool-centric alternative | createZepAfterModelCallback |
| Go | NewBeforeModelCallback (a BeforeModelCallback) | NewAfterModelCallback |

On each turn the context hook resolves the user’s Zep identity, persists the user’s message, retrieves the relevant context block, and injects it into the model’s system instruction. Tool-loop continuations are skipped, so a turn is recorded in Zep exactly once. The Go integration additionally provides an ADK memory.Service (NewMemoryService) that tools reach through ToolContext.SearchMemory.
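The per-turn flow described above can be pictured with a small, self-contained sketch. Everything in it is hypothetical: FakeZep, LlmRequest, and before_model are illustrative stand-ins, not the zep-adk or ADK types, and detecting a tool-loop continuation by checking whether the latest part is a tool result is an assumption about how such a check could work.

```python
from dataclasses import dataclass, field


@dataclass
class FakeZep:
    """In-memory stand-in for Zep, for illustration only."""
    threads: dict = field(default_factory=dict)

    def add_message_and_get_context(self, thread_id: str, text: str) -> str:
        history = self.threads.setdefault(thread_id, [])
        history.append(("user", text))
        return f"{len(history)} user message(s) stored for this thread"


@dataclass
class LlmRequest:
    """Hypothetical, simplified model request."""
    system_instruction: str
    last_part_kind: str  # "user_text" or "tool_result"
    last_text: str = ""


def before_model(zep: FakeZep, thread_id: str, request: LlmRequest) -> LlmRequest:
    # Tool-loop continuation: the model is being called again with a tool
    # result, so the user's message was already recorded earlier in this turn.
    if request.last_part_kind != "user_text":
        return request
    context = zep.add_message_and_get_context(thread_id, request.last_text)
    request.system_instruction += "\n\n" + context
    return request


zep = FakeZep()
first = before_model(zep, "t1", LlmRequest("Be helpful.", "user_text", "Hi"))
continuation = before_model(zep, "t1", LlmRequest("Be helpful.", "tool_result"))
print(first.system_instruction)
print(len(zep.threads["t1"]))
```

The second call leaves both the thread and the system instruction untouched, which is the property that keeps each turn recorded once.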

Per-user setup (on_user_created) and the custom context builder below are Python-only — the TypeScript and Go packages don’t currently expose these hooks. The general callback and tool options (display names, ignore_roles, logger, search parameters) do have TypeScript constructor-option and Go With... equivalents; see the TypeScript and Go package READMEs for the exact signatures. The backfill strategy is a standalone script using the Zep SDK directly and applies to any language.

What gets persisted

Only the user’s message and the model’s final response are persisted to Zep on each turn. Intermediate model outputs — such as “thinking” text emitted alongside a tool call (e.g. “Let me look that up for you.”) — are not persisted. Tool calls and tool results are also excluded. This keeps the Zep thread clean: one user message and one assistant message per turn, reflecting the actual conversation rather than internal agent mechanics.

If the user message contains multiple text parts (e.g. text alongside an image), all text parts are joined. Non-text parts (images, files) are ignored — only text is sent to Zep.
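As a rough illustration of the text-only rule, the sketch below joins the text parts of a message and skips everything else. The Part class is a hypothetical stand-in rather than the real google.genai type, and the single-space separator is an assumption; the package may join parts differently.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Part:
    """Hypothetical stand-in for an ADK content part (not the real type)."""
    text: Optional[str] = None
    inline_data: Optional[bytes] = None  # e.g. image bytes


def extract_text(parts: list[Part], separator: str = " ") -> str:
    """Join every non-empty text part; skip parts that carry no text."""
    return separator.join(p.text for p in parts if p.text)


message_parts = [
    Part(text="What is in this picture?"),
    Part(inline_data=b"\x89PNG..."),
    Part(text="It was taken last week."),
]
print(extract_text(message_parts))
```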

This approach does not use ADK’s BaseMemoryService abstraction. Zep’s real-time, per-message memory model doesn’t fit ADK’s batch-at-session-end pattern. See Why not BaseMemoryService? for details.

Installation

pip install zep-adk

Requires a Zep Cloud API key — get yours from app.getzep.com — plus the ADK runtime for your language: Python 3.11+ with google-adk>=1.0.0, Node.js 20+ with @google/adk (built against 1.2.0), or Go 1.25+ with google.golang.org/adk v1.4.0. The Go package is imported as zepadk "github.com/getzep/zep/integrations/adk/go".

Set up your Zep API key and Gemini API key:

export ZEP_API_KEY="your-zep-api-key"
export GEMINI_API_KEY="your-gemini-api-key"

Adding Zep to an agent

Whether you’re building a new agent or adding Zep to an existing one, the setup is the same: wire up the context hook and the after-model callback. The Python example below shows the full runner flow; the TypeScript and Go tabs show the equivalent agent wiring.

import asyncio
import os
from uuid import uuid4

from google.adk.agents import Agent
from google.adk.runners import Runner
from google.adk.sessions import InMemorySessionService
from google.genai import types
from zep_cloud.client import AsyncZep
from zep_adk import ZepContextTool, create_after_model_callback

zep = AsyncZep(api_key=os.environ["ZEP_API_KEY"])

# One shared agent definition that serves all users
agent = Agent(
    name="my_agent",
    model="gemini-2.5-flash",
    instruction="You are a helpful assistant with long-term memory.",
    tools=[
        # Your existing tools stay as-is.
        ZepContextTool(
            zep_client=zep,
            ignore_roles=["assistant"],  # optional: excludes assistant messages from graph ingestion
        ),
    ],
    after_model_callback=create_after_model_callback(
        zep_client=zep,
        assistant_name="my_agent",  # name shown in Zep (default: "Assistant")
        ignore_roles=["assistant"],  # optional: matches the ZepContextTool setting
    ),
)

session_service = InMemorySessionService()
runner = Runner(agent=agent, app_name="my_app", session_service=session_service)


async def main() -> None:
    # Per-user session. Identity maps automatically to Zep:
    #   user_id    -> Zep user ID
    #   session_id -> Zep thread ID (Zep's knowledge graph spans all threads for a user)
    session_id = f"session-{uuid4().hex[:8]}"
    await session_service.create_session(
        app_name="my_app",
        user_id="user-123",
        session_id=session_id,
        state={
            "zep_first_name": "Jane",  # recommended: anchors the identity node
            "zep_last_name": "Smith",  # optional (default: "User")
            "zep_email": "[email protected]",  # optional
        },
    )

    # Send a message. Persistence and context injection happen automatically.
    content = types.Content(role="user", parts=[types.Part(text="Hi, I work at Acme Corp.")])
    async for event in runner.run_async(
        user_id="user-123", session_id=session_id, new_message=content
    ):
        if event.is_final_response() and event.content:
            print(event.content.parts[0].text)


# The awaits above need an event loop when run as a script.
asyncio.run(main())

That’s it. Every user message is persisted to Zep, relevant context is injected into the LLM prompt, and assistant responses are captured — all automatically.

The ignore_roles parameter shown above excludes specific message roles from graph ingestion while still storing them in the thread history. This is useful when assistant messages don’t add meaningful knowledge to the graph — they’re preserved for conversation context but don’t create nodes or edges. Both ZepContextTool and create_after_model_callback accept ignore_roles. See Ignore assistant messages in the Zep docs for more detail.

Identity and session state

The integration maps ADK session metadata to Zep automatically: user_id becomes the Zep user ID, and session_id becomes the Zep thread ID. Zep’s knowledge graph is per-user, not per-thread — it accumulates knowledge across all of a user’s conversations, so when they start a new session they get context from everything Zep has learned about them.

The following session state keys are recognized (all optional):

| Key | Default | Description |
|---|---|---|
| zep_first_name | "Anonymous" | User’s first name. Anchors the identity node in the knowledge graph. |
| zep_last_name | "User" | User’s last name. |
| zep_email | None | User’s email address. |
| zep_user_id | ADK user_id | Override if the Zep user ID differs from the ADK user ID. |
| zep_thread_id | ADK session_id | Override if the Zep thread ID differs from the ADK session ID. |

Identity resolves by precedence: explicit construction options (userId/threadId) take precedence over the zep_user_id/zep_thread_id session-state keys, which in turn take precedence over the ADK user_id/session_id.
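The precedence order can be sketched as a simple first-match lookup. The resolve_id helper below is hypothetical and exists only to illustrate the ordering; it is not part of the package.

```python
from typing import Optional


def resolve_id(explicit: Optional[str], state: dict, state_key: str, adk_id: str) -> str:
    """Return the first value that is set: explicit option, session-state key, ADK ID."""
    return explicit or state.get(state_key) or adk_id


# No overrides: fall back to the ADK user_id.
print(resolve_id(None, {}, "zep_user_id", "user-123"))
# Session-state key overrides the ADK user_id.
print(resolve_id(None, {"zep_user_id": "zep-9"}, "zep_user_id", "user-123"))
# Explicit construction option overrides both.
print(resolve_id("pinned-user", {"zep_user_id": "zep-9"}, "zep_user_id", "user-123"))
```

The same lookup applies to thread IDs, using zep_thread_id and the ADK session_id.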

Advanced usage

Per-user setup

Python only — the TypeScript and Go packages don’t currently expose an on_user_created hook.

When the integration creates a new Zep user for the first time, you can run a setup hook to configure per-user resources — such as a custom ontology, custom extraction instructions, or user summary instructions. Pass an on_user_created callback to ZepContextTool:

from pydantic import Field
from zep_cloud import CustomInstruction
from zep_cloud.client import AsyncZep
from zep_cloud.external_clients.ontology import EntityModel, EntityText
from zep_cloud.types import UserInstruction
from zep_adk import ZepContextTool


class Company(EntityModel):
    """A company or organization the user is associated with."""

    industry: EntityText = Field(description="The company's industry", default=None)


async def setup_user(zep_client: AsyncZep, user_id: str) -> None:
    """Runs once when a new Zep user is created."""
    # Set a custom ontology for this user's knowledge graph
    await zep_client.graph.set_ontology(
        entities={"Company": Company},
        user_ids=[user_id],
    )

    # Add custom extraction instructions
    await zep_client.graph.add_custom_instructions(
        user_ids=[user_id],
        instructions=[
            CustomInstruction(
                name="purchase_intent",
                text="Extract product preferences and purchase intent.",
            )
        ],
    )

    # Configure how user summaries are generated
    await zep_client.user.add_user_summary_instructions(
        user_ids=[user_id],
        instructions=[
            UserInstruction(
                name="work_focus",
                text="Focus on the user's role, team, and active projects.",
            )
        ],
    )


ZepContextTool(zep_client=zep, on_user_created=setup_user)

The hook fires only when the user is genuinely new — not for users that already exist. If the hook raises an exception, a warning is logged but the agent turn continues normally.
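The two guarantees above (the hook runs only for new users, and a failing hook does not abort the turn) can be sketched as follows. This is an illustration under stated assumptions, not the package's implementation: ensure_user and existing_users are hypothetical, and the hook here takes only a user ID, whereas the real hook also receives the Zep client.

```python
import asyncio
import logging
from typing import Awaitable, Callable

logger = logging.getLogger("zep_hook_sketch")

existing_users: set[str] = {"user-001"}  # hypothetical: users already in Zep


async def ensure_user(user_id: str, on_user_created: Callable[[str], Awaitable[None]]) -> bool:
    """Create the user if missing; return True only when a new user was created."""
    if user_id in existing_users:
        return False  # existing user: the hook does not fire
    existing_users.add(user_id)
    try:
        await on_user_created(user_id)
    except Exception as exc:
        # A failing hook is logged, and the agent turn carries on.
        logger.warning("on_user_created failed for %s: %s", user_id, exc)
    return True


async def failing_hook(user_id: str) -> None:
    raise RuntimeError("ontology service unavailable")


print(asyncio.run(ensure_user("user-001", failing_hook)))
print(asyncio.run(ensure_user("user-002", failing_hook)))
```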

See custom ontology, custom instructions, and user summary instructions for details on each API.

Custom context builder

Python only — the TypeScript and Go packages don’t currently expose a custom context builder.

By default, the integration uses thread.add_messages(return_context=True) — a single API call that persists the message and retrieves context. This works well for most use cases.

For advanced scenarios — multi-graph searches, custom context templates, or combining multiple Zep API calls — you can provide a context_builder. When set, message persistence and context building run in parallel for lower latency.

import asyncio
from zep_cloud.client import AsyncZep
from zep_adk import ZepContextTool, ContextBuilder


async def my_context_builder(
    zep_client: AsyncZep,
    user_id: str,
    thread_id: str,
    user_message: str,
) -> str | None:
    """Custom context: combine user context with a targeted graph search."""
    user_context, search_results = await asyncio.gather(
        zep_client.thread.get_user_context(thread_id),
        zep_client.graph.search(
            user_id=user_id,
            query=user_message,
            scope="edges",
            limit=10,
        ),
    )

    parts = []
    if user_context and user_context.context:
        parts.append(user_context.context)
    if search_results and search_results.edges:
        facts = [e.fact for e in search_results.edges if e.fact]
        if facts:
            parts.append("Additional facts:\n" + "\n".join(f"- {f}" for f in facts))

    return "\n\n".join(parts) if parts else None


# Pass it to ZepContextTool
tool = ZepContextTool(zep_client=zep, context_builder=my_context_builder)
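To see why running persistence and context building in parallel lowers latency, consider this minimal sketch. The two coroutines are placeholders that only sleep; they are assumptions standing in for the real network calls, and the 0.2-second delays are arbitrary.

```python
import asyncio
import time


async def persist_message(text: str) -> str:
    await asyncio.sleep(0.2)  # stands in for the call that stores the message
    return "persisted"


async def build_context(text: str) -> str:
    await asyncio.sleep(0.2)  # stands in for one or more Zep search calls
    return f"context for: {text}"


async def handle_turn(text: str) -> tuple[str, str, float]:
    start = time.perf_counter()
    # Both coroutines run concurrently, so the turn waits for the slower one
    # rather than for the sum of the two.
    status, context = await asyncio.gather(persist_message(text), build_context(text))
    return status, context, time.perf_counter() - start


status, context, elapsed = asyncio.run(handle_turn("Hi, I work at Acme Corp."))
print(status, context, f"{elapsed:.2f}s")
```

Run sequentially, the same two calls would take roughly the sum of their delays.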

The ContextBuilder and UserSetupHook type signatures (both importable from zep_adk):

# Arguments: client, user_id, thread_id, user_message
ContextBuilder = Callable[[AsyncZep, str, str, str], Awaitable[str | None]]

# Arguments: client, user_id
UserSetupHook = Callable[[AsyncZep, str], Awaitable[None]]

See advanced context block construction and context templates for more on assembling custom context.

Graph search tool

ZepContextTool injects context automatically on every turn. For cases where the model needs to actively search the knowledge graph — e.g. looking up specific facts, entities, or prior messages — you can add ZepGraphSearchTool. This is a model-callable tool: the model sees it in its tool list and decides when to invoke it.

from google.adk.agents import Agent
from zep_adk import ZepContextTool, ZepGraphSearchTool, create_after_model_callback

# zep is the AsyncZep client created in the earlier example
agent = Agent(
    name="my_agent",
    model="gemini-2.5-flash",
    instruction="...",
    tools=[
        ZepContextTool(zep_client=zep),  # automatic context every turn
        ZepGraphSearchTool(zep_client=zep),  # on-demand search
    ],
    after_model_callback=create_after_model_callback(zep_client=zep),
)

The tool automatically resolves the user identity from session state, so the model only needs to provide a search query. The model can also optionally choose the scope (edges, nodes, episodes), reranker, limit, and other search parameters.

Pinning parameters

Any search parameter can be locked at construction time. Pinned parameters are hidden from the model’s schema — it can’t override them.

# The model can only control query and scope; everything else is locked
ZepGraphSearchTool(
    zep_client=zep,
    reranker="cross_encoder",
    limit=5,
    search_filters={"node_labels": ["Person"]},
    bfs_origin_node_uuids=["node-uuid-1"],  # seed BFS traversal from specific nodes
)
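One way to picture pinning is as a filter over the tool's parameter schema plus a merge in which pinned values always win. The sketch below is hypothetical: FULL_SCHEMA, model_visible_schema, and merge_arguments are illustrative names, and the real tool's schema and internals may differ.

```python
from typing import Any

# Hypothetical parameter schema a search tool might expose to the model.
FULL_SCHEMA: dict[str, dict[str, Any]] = {
    "query": {"type": "string"},
    "scope": {"type": "string", "enum": ["edges", "nodes", "episodes"]},
    "reranker": {"type": "string"},
    "limit": {"type": "integer"},
}


def model_visible_schema(pinned: dict[str, Any]) -> dict[str, dict[str, Any]]:
    """Drop pinned parameters so the model never sees them."""
    return {name: spec for name, spec in FULL_SCHEMA.items() if name not in pinned}


def merge_arguments(pinned: dict[str, Any], model_args: dict[str, Any]) -> dict[str, Any]:
    """Pinned values always win, even if the model sends a conflicting value."""
    return {**model_args, **pinned}


pinned = {"reranker": "cross_encoder", "limit": 5}
print(sorted(model_visible_schema(pinned)))
print(merge_arguments(pinned, {"query": "Acme", "limit": 50}))
```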

Shared documentation graph

To search a fixed graph that all users share (e.g. a documentation knowledge base), pass graph_id. The tool will search that graph instead of the current user’s personal graph. Use distinct name and description values when combining multiple instances:

# Assumes Agent, zep, and the zep_adk imports from the previous examples
agent = Agent(
    name="my_agent",
    model="gemini-2.5-flash",
    instruction="...",
    tools=[
        ZepContextTool(zep_client=zep),
        ZepGraphSearchTool(
            zep_client=zep,
            name="search_user_memory",
            description="Search the user's knowledge graph for information from previous conversations, known facts, or general context about the user.",
        ),
        ZepGraphSearchTool(
            zep_client=zep,
            name="search_docs",
            description="Search the shared documentation knowledge base.",
            graph_id="docs-graph-123",
        ),
    ],
    after_model_callback=create_after_model_callback(zep_client=zep),
)

The model sees two distinct tools and chooses which to call based on the user’s query.

Backfill strategy for existing users

If you have existing users with conversation history, you can backfill their data into Zep so they get rich context from day one.

ID matching

Use the same user IDs and thread IDs. The backfill script must create Zep users and threads with the exact same IDs used in ADK:

  • User IDs must match what you pass as user_id to ADK’s create_session(). This links live sessions to the correct knowledge graph. Mismatched user IDs mean backfilled history is orphaned.
  • Thread IDs must match the ADK session_id for each conversation. If a user continues an existing session after cutover, the integration uses that session ID as the Zep thread ID. If the backfill used a different thread ID, the conversation history is split — the continued thread won’t see the backfilled messages in its thread context.

Example backfill script

This runs outside of ADK as a standalone script using the Zep Python SDK directly:

import asyncio
from zep_cloud.client import AsyncZep
from zep_cloud import Message

zep = AsyncZep(api_key="your-zep-api-key")


async def backfill_user(
    user_id: str,  # must match ADK create_session() user_id
    first_name: str,
    last_name: str,
    conversations: list[dict],  # list of {session_id, messages} dicts
):
    # 1. Create the user
    try:
        await zep.user.add(user_id=user_id, first_name=first_name, last_name=last_name)
    except Exception as e:
        if "already exists" not in str(e).lower():
            raise

    # 2. Load each conversation, using the original ADK session ID as the Zep thread ID
    for convo in conversations:
        thread_id = convo["session_id"]  # must match ADK session_id
        try:
            await zep.thread.create(thread_id=thread_id, user_id=user_id)
        except Exception as e:
            if "already exists" in str(e).lower():
                continue
            raise

        messages = [
            Message(
                role=msg["role"],
                content=msg["content"],
                name=f"{first_name} {last_name}" if msg["role"] == "user" else "Assistant",
            )
            for msg in convo["messages"]
        ]
        await zep.thread.add_messages(thread_id=thread_id, messages=messages)

    print(f"Backfilled {len(conversations)} conversations for {user_id}")


async def main():
    users = [
        {
            "user_id": "user-123",  # same ID used in ADK sessions
            "first_name": "Jane",
            "last_name": "Smith",
            "conversations": [
                {
                    "session_id": "session-abc",  # original ADK session ID
                    "messages": [
                        {"role": "user", "content": "I need help with my account settings."},
                        {"role": "assistant", "content": "I can help. What would you like to change?"},
                        {"role": "user", "content": "I want to enable two-factor authentication."},
                        {"role": "assistant", "content": "Go to Settings > Security > 2FA to enable it."},
                    ],
                },
            ],
        },
    ]
    for user in users:
        await backfill_user(**user)


asyncio.run(main())

After backfilling, allow time for Zep to process the messages and build knowledge graphs. Zep processes messages asynchronously — the graph won’t be available instantly. For large backfills, add delays between users to avoid rate limits.
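For larger backfills, the delay between users could look like the sketch below. backfill_all and fake_backfill_user are hypothetical helpers; in practice the second argument would be your real per-user backfill function, and the delay should be tuned to your plan's rate limits (the one-second default is an arbitrary assumption).

```python
import asyncio
from typing import Awaitable, Callable


async def backfill_all(
    users: list[dict],
    backfill_one: Callable[..., Awaitable[None]],
    delay_seconds: float = 1.0,
) -> list[str]:
    """Backfill users one at a time, pausing between them; return the IDs that failed."""
    failed: list[str] = []
    for index, user in enumerate(users):
        try:
            await backfill_one(**user)
        except Exception as exc:
            print(f"Backfill failed for {user['user_id']}: {exc}")
            failed.append(user["user_id"])
        if index < len(users) - 1:
            await asyncio.sleep(delay_seconds)  # spread requests out
    return failed


async def fake_backfill_user(user_id: str, **_: object) -> None:
    """Stand-in for a real backfill function; fails for one ID to show error handling."""
    if user_id == "user-bad":
        raise ValueError("simulated API error")


demo_users = [{"user_id": "user-1"}, {"user_id": "user-bad"}, {"user_id": "user-2"}]
failed = asyncio.run(backfill_all(demo_users, fake_backfill_user, delay_seconds=0.01))
print(failed)
```

Collecting the failed IDs rather than stopping at the first error lets you rerun the backfill for just those users.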

Transition gap

There is a window between when the backfill runs and when the Zep-integrated agent goes live. Any messages sent to existing threads during this window won’t be in Zep. For most use cases this is acceptable — the knowledge graph catches up quickly once the agent is live. But if thread-level continuity is critical, consider a dual-write period: after the backfill completes but before full cutover, have your application write new messages to Zep (via the SDK directly) alongside the existing system. This ensures no messages are missed in the transition.
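A dual-write period might be wired up roughly as follows. This is a sketch under assumptions: make_dual_writer and both writer functions are hypothetical, the in-memory lists stand in for your existing store and for Zep, and in a real deployment zep_write would call the Zep SDK directly.

```python
import asyncio
from typing import Awaitable, Callable

Writer = Callable[[str, str, str], Awaitable[None]]  # (thread_id, role, content)


def make_dual_writer(legacy_write: Writer, zep_write: Writer) -> Writer:
    """Return a writer that stores each message in both systems."""

    async def write(thread_id: str, role: str, content: str) -> None:
        # The existing store stays the system of record until cutover.
        await legacy_write(thread_id, role, content)
        try:
            await zep_write(thread_id, role, content)
        except Exception as exc:
            # A Zep outage should not break the live application.
            print(f"Zep write failed for {thread_id}: {exc}")

    return write


legacy_log: list[tuple[str, str, str]] = []
zep_log: list[tuple[str, str, str]] = []


async def legacy_write(thread_id: str, role: str, content: str) -> None:
    legacy_log.append((thread_id, role, content))


async def zep_write(thread_id: str, role: str, content: str) -> None:
    # Placeholder: a real implementation would call the Zep SDK here.
    zep_log.append((thread_id, role, content))


dual_write = make_dual_writer(legacy_write, zep_write)
asyncio.run(dual_write("session-abc", "user", "Hello again"))
print(legacy_log == zep_log)
```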

Cutover checklist

  1. Run the backfill script — Zep now has knowledge graphs for existing users
  2. Update your agent — Add ZepContextTool and the after-model callback
  3. Add session state keys — Include zep_first_name and zep_last_name in create_session() calls
  4. Deploy — Existing users get rich context from their first message; new users build context over time

Why not BaseMemoryService?

The Python and TypeScript integrations inject context through the lifecycle hooks above rather than through ADK’s BaseMemoryService. (The Go integration additionally exposes an ADK memory.Service via NewMemoryService for tool-driven search_memory calls, but context injection still happens through the before-model callback.)

ADK’s BaseMemoryService abstraction has two methods: add_session_to_memory() (called at session end) and search_memory() (called when the agent queries memory). For automatic context injection this is a poor fit for Zep:

  • Real-time, not batch: Zep persists messages immediately on every turn. Batching at session end would delay knowledge graph updates and prevent the agent from using newly extracted facts mid-conversation.
  • Thread-based context, not keyword search: ADK’s search_memory() passes a query string. Zep’s context retrieval takes a thread ID and returns a pre-assembled context block built from conversation history, extracted facts, and the user’s knowledge graph.

Next steps