Adding batch data
The batch add method enables efficient concurrent ingestion of large volumes of data into your graph. This experimental feature is designed for scenarios where you need to add many episodes quickly, such as backfills, document collections, or historical data imports.
This is an experimental feature. While faster than sequential ingestion, batch processing may produce a slightly different graph structure due to its concurrent nature.
How batch processing works
The batch add method processes episodes concurrently for improved performance while still preserving temporal relationships between them. Unlike sequential processing, where episodes are handled one at a time, batch processing handles up to 20 episodes simultaneously. A single batch may mix episode types (text, json, message), and the method also works with temporally ordered data such as evolving chat histories.
When to use batch processing
Batch processing is ideal for:
- Historical data backfills
- Document collection imports
- Large datasets where processing speed is prioritized
- Data with a temporal dimension
Batch processing works for all data types, including temporally ordered data such as evolving chat histories.
Usage example
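A minimal sketch of a batch call, assuming a hypothetical Python client whose `graph.add_batch` method accepts a list of episode dicts. The client object, the `add_batch` method name, and the field names are illustrative assumptions; adapt them to your SDK.

```python
# Episodes of mixed types (text, json, message) in one batch.
episodes = [
    {"type": "text", "data": "Kendra loves Adidas shoes."},
    {"type": "json", "data": '{"name": "Kendra", "favorite_brand": "Adidas"}'},
    {"type": "message", "data": "user: What shoes should I buy?"},
]

MAX_BATCH_SIZE = 20  # hard cap on episodes per batch


def add_batch(client, graph_id: str, episodes: list[dict]) -> None:
    """Send up to 20 mixed-type episodes in one concurrent batch."""
    if len(episodes) > MAX_BATCH_SIZE:
        raise ValueError(f"Batch exceeds {MAX_BATCH_SIZE} episodes")
    # Hypothetical SDK call -- replace with your client's actual method.
    client.graph.add_batch(graph_id=graph_id, episodes=episodes)
```

The size check mirrors the 20-episode cap described above, so oversized batches fail fast on the client side rather than at the API.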
Important details
- Maximum of 20 episodes per batch
- Episodes can be of mixed types (text, json, message)
- As an experimental feature, batch ingestion may produce a slightly different graph structure than sequential processing
- Each episode still respects the 10,000 character limit
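Because a single call accepts at most 20 episodes, larger imports must be split into successive batches. A minimal, SDK-independent sketch of that splitting:

```python
from typing import Iterator

MAX_BATCH_SIZE = 20  # cap per batch, per the limits above


def batches(episodes: list, size: int = MAX_BATCH_SIZE) -> Iterator[list]:
    """Yield successive batches of at most `size` episodes, preserving order."""
    for start in range(0, len(episodes), size):
        yield episodes[start : start + size]
```

For example, 45 episodes yield three batches of 20, 20, and 5, which can then be submitted one batch at a time.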
Data size and chunking
The same data size limits apply to batch processing as to sequential processing: each episode in the batch is limited to 10,000 characters. Chunk larger documents into smaller episodes before adding them to a batch.
For chunking strategies and best practices, see the data size limit and chunking section in the main adding data guide.
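A minimal sketch of chunking a large document under the 10,000-character limit. The limit comes from this guide; the paragraph-boundary strategy is an illustrative assumption, not a prescribed approach.

```python
MAX_EPISODE_CHARS = 10_000  # per-episode limit from this guide


def chunk_document(text: str, limit: int = MAX_EPISODE_CHARS) -> list[str]:
    """Split a document into episodes of at most `limit` characters,
    preferring paragraph boundaries so related sentences stay together."""
    chunks: list[str] = []
    current = ""
    for para in text.split("\n\n"):
        # Hard-split any single paragraph that itself exceeds the limit.
        while len(para) > limit:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(para[:limit])
            para = para[limit:]
        candidate = f"{current}\n\n{para}" if current else para
        if len(candidate) <= limit:
            current = candidate
        else:
            chunks.append(current)
            current = para
    if current:
        chunks.append(current)
    return chunks
```

Each resulting chunk can then be submitted as its own episode in a batch.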