Threads

Overview

Threads represent a conversation. Each User can have multiple threads, and each thread is a sequence of chat messages.

Chat messages are added to threads using thread.add_messages, which both adds those messages to the thread history and ingests those messages into the user-level knowledge graph. The user knowledge graph contains data from all of that user's threads to create an integrated understanding of the user.
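As a rough sketch of the call described above, the helper below turns (name, role, content) tuples into message objects and sends them in a single thread.add_messages call. The helper name add_turns and its make_message parameter are illustrative only and not part of the SDK. With the Python SDK you would most likely pass the Message class from zep_cloud.types as make_message; the name, role, and content fields are an assumption here.

```python
from typing import Any, Callable, Iterable, List, Tuple

Turn = Tuple[str, str, str]  # (speaker name, role, content)


def add_turns(
    client: Any,
    thread_id: str,
    turns: Iterable[Turn],
    make_message: Callable[..., Any],
) -> List[Any]:
    """Build message objects from turns and add them to a thread in one call.

    make_message is a hypothetical factory parameter; it is assumed to
    accept name, role, and content keyword arguments.
    """
    messages = [
        make_message(name=name, role=role, content=content)
        for name, role, content in turns
    ]
    # Skip the API call entirely when there is nothing to add.
    if messages:
        client.thread.add_messages(thread_id, messages=messages)
    return messages
```

Passing the factory in keeps the helper independent of any one SDK version, and sending a whole exchange in one call adds the user and assistant messages to the thread history together.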

The knowledge graph does not separate the data from different threads, but integrates the data together to create a unified picture of the user. So the get thread user context endpoint and the associated thread.get_user_context method don't return memory derived only from that thread, but instead return whatever user-level memory is most relevant to that thread, based on the thread's most recent messages.
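One way to use this, sketched under assumptions: fetch the user context for a thread and append it to a system prompt before calling your LLM. The helper name build_system_prompt is hypothetical, and the sketch assumes the object returned by thread.get_user_context exposes the context text as a context attribute.

```python
from typing import Any


def build_system_prompt(client: Any, thread_id: str, base_prompt: str) -> str:
    """Append the user-level context most relevant to this thread to a prompt."""
    memory = client.thread.get_user_context(thread_id=thread_id)
    # Assumption: the response carries the context text in `.context`.
    context = getattr(memory, "context", None)
    if not context:
        # A new user may have no graph data yet; fall back to the base prompt.
        return base_prompt
    return f"{base_prompt}\n\n{context}"
```

Because the returned context is user-level, the same facts can appear in prompts built for different threads belonging to the same user.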

Adding a Thread

threadIds are arbitrary identifiers that you can map to relevant business objects in your app, such as users or a conversation a user might have with your app. Before you create a thread, make sure you have created a user first. Then create a thread with:

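import uuid

from zep_cloud.client import Zep  # Assumes the zep-cloud Python SDK is installed

# API_KEY and user_id are assumed to be defined earlier in your application.
client = Zep(
    api_key=API_KEY,
)
thread_id = uuid.uuid4().hex  # A new thread identifier

client.thread.create(
    thread_id=thread_id,
    user_id=user_id,
)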
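Because thread IDs are arbitrary, you could also derive them deterministically from an identifier your app already has, instead of generating a random one. The following is a minimal sketch of that alternative using uuid5; the namespace string and the thread_id_for helper are hypothetical, and nothing here is required by Zep.

```python
import uuid

# Hypothetical namespace for this application; any fixed UUID would work.
APP_NAMESPACE = uuid.uuid5(uuid.NAMESPACE_DNS, "threads.example.com")


def thread_id_for(conversation_key: str) -> str:
    """Return a stable 32-character hex thread ID for a business-object key."""
    return uuid.uuid5(APP_NAMESPACE, conversation_key).hex
```

The same key always yields the same thread ID, so the app can recompute it without storing a separate mapping from conversations to thread IDs.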

Getting Messages of a Thread

messages = client.thread.get(thread_id)

Deleting a Thread

Deleting a thread deletes it and its associated messages. It does not, however, delete the associated data in the user's knowledge graph. To remove data from the graph, see deleting data from the graph.

client.thread.delete(thread_id)

Listing Threads

You can list all Threads in the Zep Memory Store with page_size and page_number parameters for pagination.

# List the first 10 Threads
result = client.thread.list_all(page_size=10, page_number=1)
for thread in result.threads:
    print(thread)
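To walk every page rather than only the first, the pagination parameters can be wrapped in a generator. This is a sketch, not an SDK feature: iter_threads is a hypothetical helper, and it assumes that page numbers start at 1 (as in the example above) and that a page shorter than page_size marks the end of the listing.

```python
from typing import Any, Iterator


def iter_threads(client: Any, page_size: int = 10) -> Iterator[Any]:
    """Yield threads one by one, requesting pages until a short page appears."""
    page_number = 1
    while True:
        result = client.thread.list_all(
            page_size=page_size, page_number=page_number
        )
        # Treat a missing or None `threads` attribute as an empty page.
        threads = getattr(result, "threads", None) or []
        yield from threads
        if len(threads) < page_size:
            break
        page_number += 1
```

Since this is a generator, pages are fetched lazily: a caller that stops iterating early does not request the remaining pages.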

Automatic Cache Warming

When you create a new thread, Zep automatically warms the cache for that user's graph data in the background. This optimization improves query latency for graph operations on newly created threads by pre-loading the user's data into the hot cache tier.

The warming operation runs asynchronously and does not block the thread creation response. No additional action is required on your part; this happens automatically whenever you create a thread for a user with an existing graph.

For more information about Zep's multi-tier caching architecture and manual cache warming, see Warming the User Cache.