Pydantic AI integration
Add long-term agent memory to Pydantic AI agents
Pydantic AI agents using Zep gain long-term memory backed by a temporal knowledge graph. The zep-pydantic-ai package persists each user turn, injects relevant context into the model prompt using Pydantic AI’s native ProcessHistory capability, and adds an on-demand graph-search tool.
Core benefits
- Native
ProcessHistorycapability: Uses the current Pydantic AI history-processor hook, not a deprecated kwarg - Single round-trip: Persists the user turn and retrieves context in one
add_messagescall - Correct under tool calls: Dedupes per run (keyed by the run ID), so a run that makes tool calls records the turn exactly once
- On-demand graph search: A model-callable tool over
graph.searchfor explicit lookups - Lazy resource creation: The Zep user and thread are created on first use
- Graceful degradation: A Zep failure is logged but never crashes the agent run
How it works
The integration plugs into Pydantic AI through three components:
ZepDeps— a dataclass used as the agent’sdeps_type. It carries the Zep client and the user/thread identity. Construct one per conversation and pass it toagent.run(..., deps=deps); both the history processor and the search tool reach it throughRunContext.deps.zep_history_processor— registered viacapabilities=[ProcessHistory(zep_history_processor)]. Pydantic AI runs it before every model request: on the user’s turn it persists the latest message viathread.add_messages(return_context=True)and prepends Zep’s context block as a system message. BecauseProcessHistoryfires once per model request (not once per run), the processor dedupes per run, keyed by the run ID. It persists and retrieves only on the first request of a run and replays the cached context on later requests within that same run, so tool-calling runs never create duplicate episodes.create_zep_search_tool— a factory returning a model-callable tool overgraph.search. The model decides when to search the knowledge graph; search parameters are pinned at construction.
Call persist_run after agent.run to persist the assistant’s reply. The history processor runs before each model request, so the assistant’s reply does not exist yet when it fires; persist_run writes that reply once the run completes. Only assistant text is sent, so Zep records one clean assistant message per turn.
Installation
Requires Python 3.11+, pydantic-ai>=1.107,<2, and a Zep Cloud API key. Get your API key from app.getzep.com.
Set up your environment variables:
Usage
Register the history processor and search tool when building the agent, then pass ZepDeps to each run:
On-demand graph search
Beyond the automatic context injection, create_zep_search_tool() adds a model-callable tool over graph.search. The model decides when to look up specific facts, entities, or prior episodes; it supplies only the query, while scope, reranker, and limit are pinned at construction. The tool returns a formatted text summary of the matching results. By default it searches the current user’s graph; pass graph_id=... to target a shared standalone graph.
Memory vs tools
The integration combines two retrieval paths on the same agent:
Injection grounds each turn with cross-session context; the search tool lets the model actively dig for specific details.
Configuration options
ZepDeps
create_zep_search_tool
Best practices
- Construct one
ZepDepsper conversation and reuse a singleAsyncZepclient across runs - Pass real names so Zep can anchor the user’s identity node in the graph
- Always call
persist_runafter a run so the assistant’s reply reaches the graph - Allow time for indexing — Zep extracts knowledge asynchronously, so facts from a turn are not instantly searchable
Next steps
- Explore customizing graph structure for advanced knowledge organization
- Learn about searching the graph and how to tune search
- See code examples for additional patterns