The Batch API is available to enterprise customers. Contact your Zep account team to enable it for your project.
The Batch API is the recommended way to load large historical datasets — backfills, document collections, archived conversations, migrations from another system — into your Context Graphs.
Calling graph.add or thread.add_messages once per item works for live data but becomes hard to manage at scale. Compared to issuing those calls one at a time, the Batch API gives you:
graph.add and thread.add_messages ingestion serving your agents, so a large backfill can run alongside production.
A batch follows a three-step lifecycle: create the batch, add items to it with batch.add, then start processing with batch.process.
Items in a batch are grouped by destination graph and processed in the order they were added. Episodes and messages added through the Batch API are priced the same as those added through graph.add or thread.add_messages.
Each batch.add call accepts up to 500 items. To ingest more than 500 items, make multiple batch.add calls against the same batch ID before calling batch.process.
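The 500-item limit means a large dataset has to be split before it is sent. The following is a minimal sketch of that splitting step only; chunk_items and the sample items are hypothetical names used for illustration, and no Zep SDK call is made here.

```python
from typing import Iterator, List, Sequence

MAX_ITEMS_PER_ADD = 500  # per-call limit for batch.add stated in this guide


def chunk_items(items: Sequence[dict], size: int = MAX_ITEMS_PER_ADD) -> Iterator[List[dict]]:
    """Yield consecutive slices of at most `size` items, preserving order."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield list(items[start:start + size])


# Hypothetical items; real items carry the fields described in this guide.
items = [{"type": "thread_message", "thread_id": "thread-1"} for _ in range(1234)]
chunks = list(chunk_items(items))
chunk_sizes = [len(chunk) for chunk in chunks]
print(chunk_sizes)  # [500, 500, 234]
```

Each chunk would then be passed to its own batch.add call against the same batch ID. Because the slices are taken in order, the original item order is preserved across calls.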
The example below creates a batch, adds a mix of graph episodes and thread messages, starts processing, and polls until the batch finishes.
Each item in a batch is one of two types:
graph_episode: equivalent to a single graph.add call. Targets a graph by graph_id or a user graph by user_id.
thread_message: equivalent to one message inside a thread.add_messages call. Targets a thread by thread_id.
The fields below mirror the equivalent fields on graph.add and thread.add_messages. See Adding business data and Adding messages for the underlying semantics.
Graph episode (type: "graph_episode")
Thread message (type: "thread_message")
Pass created_at on each item to give Zep accurate temporal information for historical data. This is important for backfills: Zep uses these timestamps in its fact invalidation process to determine the valid_at and invalid_at values on extracted facts (edges).
The created_at value should be in RFC3339 format (e.g., "2024-06-15T10:30:00Z").
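Historical records often store timestamps as native datetime objects or in mixed time zones. As a small sketch, the helper below normalizes a timezone-aware datetime to a UTC RFC3339 string like the one shown above. The to_rfc3339 helper and the thread ID are hypothetical, and the item shows only the type, thread_id, and created_at fields named in this guide.

```python
from datetime import datetime, timedelta, timezone


def to_rfc3339(ts: datetime) -> str:
    """Format a timezone-aware datetime as a UTC RFC3339 string."""
    if ts.tzinfo is None:
        # Naive datetimes are ambiguous, so refuse to guess the zone.
        raise ValueError("timestamp must be timezone-aware")
    return ts.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


# A message originally recorded at 12:30 in a UTC+02:00 zone.
sent_at = datetime(2024, 6, 15, 12, 30, tzinfo=timezone(timedelta(hours=2)))

# Hypothetical item; the remaining message fields are omitted here.
item = {
    "type": "thread_message",
    "thread_id": "thread-1",
    "created_at": to_rfc3339(sent_at),
}
print(item["created_at"])  # 2024-06-15T10:30:00Z
```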
Two methods report on a running or completed batch:
batch.get(batch_id) returns a summary of the whole batch, including a progress object with counts for total_items, queued_items, processing_items, succeeded_items, failed_items, skipped_items, and percent_complete. Before batch.process is called, the batch is in draft and the progress counts are unpopulated; once processing starts, the counts begin to update.
batch.list_items(batch_id) returns each item with its individual status (pending, queued, processing, succeeded, failed, skipped).
When polling batch.get, an interval of a few seconds (e.g., 5 seconds) is appropriate for small batches. For batches with thousands of items or more, polling becomes impractical; subscribe to the ingest.batch.completed webhook instead to be notified when a batch reaches a terminal state. The payload includes the batch_id so you can match it back to the batch you submitted.
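Matching a webhook delivery back to a submitted batch can be sketched as follows. This is an illustration under assumptions, not the documented payload schema: it assumes the body is JSON with a top-level batch_id key, and it omits signature verification and HTTP handling. The handle_batch_completed function, the batch ID, and the label are hypothetical.

```python
import json
from typing import Dict, Optional

# Hypothetical registry of batches this service has submitted,
# keyed by batch ID, with a human-readable label as the value.
pending_batches: Dict[str, str] = {"batch-123": "2024 archive backfill"}


def handle_batch_completed(raw_body: str, pending: Dict[str, str]) -> Optional[str]:
    """Return the label of the matching pending batch, or None if unknown.

    Assumes a JSON body with a top-level "batch_id" key (an assumption;
    check the webhook reference for the actual payload layout).
    """
    payload = json.loads(raw_body)
    batch_id = payload.get("batch_id")
    # pop() removes the entry, so a repeated delivery is not handled twice.
    return pending.pop(batch_id, None)


label = handle_batch_completed('{"batch_id": "batch-123"}', pending_batches)
unknown = handle_batch_completed('{"batch_id": "batch-999"}', pending_batches)
print(label, unknown)
```

After a match, a follow-up batch.get call for that batch ID can confirm which terminal state was reached.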
The status field on BatchSummary is one of:
Once a batch reaches succeeded, partial, or failed, no further state changes occur. invalid is also non-progressing — the batch never starts processing, but the state persists until you delete the batch. When polling, exit on any of succeeded, partial, failed, or invalid.
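The exit condition above can be sketched as a small polling loop. To stay independent of any SDK signature, the loop takes a fetch_status callable. In real use that callable would wrap batch.get(batch_id) and return the status field, and how that field is read depends on the SDK. The wait_for_batch name, the max_polls cap, and the "processing" value used as a non-terminal status in the demo are assumptions.

```python
import time
from typing import Callable

# Statuses this guide says to stop polling on.
TERMINAL_STATUSES = {"succeeded", "partial", "failed", "invalid"}


def wait_for_batch(
    fetch_status: Callable[[], str],
    interval_seconds: float = 5.0,
    max_polls: int = 120,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll until a terminal status is seen or max_polls is exhausted."""
    for attempt in range(max_polls):
        status = fetch_status()
        if status in TERMINAL_STATUSES:
            return status
        if attempt < max_polls - 1:
            sleep(interval_seconds)
    raise TimeoutError(f"batch not finished after {max_polls} polls")


# Demo with a stubbed status sequence and a no-op sleep so it runs instantly.
# "processing" is an assumed non-terminal value used only for this demo.
stub_statuses = iter(["processing", "processing", "partial"])
final_status = wait_for_batch(lambda: next(stub_statuses), sleep=lambda _: None)
print(final_status)  # partial
```

Capping the number of polls keeps a stuck batch from blocking the caller indefinitely. A partial result is returned like any other terminal state, so the caller still needs to inspect the individual items for failures.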
The status field on each BatchItemDetail is one of: pending, queued, processing, succeeded, failed, skipped.
Use batch.list to enumerate batches in your project, optionally filtered by status. Use batch.delete to remove a batch that has not yet been processed — once a batch has been processed, it cannot be deleted.
The Zep web dashboard provides a batches view showing all batches in your project, their status, item counts, and processing progress. Click into a batch to inspect its individual items and any errors.
The following methods are deprecated and no longer recommended. Use the Batch API described above for all new ingestion work.
The deprecated methods continue to work but will be removed in a future release.