Thread summaries
Per-thread, incremental summaries of a conversation’s messages
A thread summary is a natural-language summary of the messages in a single thread, generated and incrementally updated by Zep. There is one summary per thread, and it is persisted on the user’s knowledge graph alongside the data Zep already extracts from messages.
Thread summaries give your agent an additional view of a user’s history: for example, what problem the user had in a given conversation and how it was resolved. Where facts and entity summaries describe the user across all their threads, a thread summary describes the arc of one specific conversation.
Thread summaries are generated and updated automatically as new messages arrive in a thread. There is no manual “summarize now” call: clients only read summaries and never create them.
A thread that has never received messages will not have a summary, and the single-thread endpoint will return 404 until the first summary has been generated.
Use the single-thread endpoint when you have a specific thread in hand and want its summary.
The last_summarized_at field on the returned object is the timestamp of the most recent summary update. A 404 response means Zep has not yet generated a summary for this thread (for example, a thread with no messages yet).
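The read path described above can be sketched as follows, treating a 404 as "no summary yet" rather than as a failure. This is a minimal illustration, not the Zep SDK's API: the `fetch` callable, the `NotFound` exception, the `thread_id` and `summary` keys, and the sample data are all hypothetical stand-ins. Only the `last_summarized_at` field name and the 404 behavior come from the text.

```python
from typing import Callable, Optional


class NotFound(Exception):
    """Hypothetical stand-in for whatever your HTTP client or SDK raises on a 404."""


def read_thread_summary(
    fetch: Callable[[str], dict], thread_id: str
) -> Optional[dict]:
    """Return the summary for thread_id, or None if none has been generated yet."""
    try:
        return fetch(thread_id)
    except NotFound:
        # A 404 is an expected state (e.g. a thread with no messages), not an error.
        return None


# Made-up data standing in for the single-thread endpoint (illustration only).
_FAKE_STORE = {
    "thread-1": {
        "thread_id": "thread-1",
        "summary": "User could not log in; resolved by resetting the password.",
        "last_summarized_at": "2025-01-01T12:00:00Z",
    }
}


def fake_fetch(thread_id: str) -> dict:
    """Fake fetcher: raises NotFound for threads without a stored summary."""
    if thread_id not in _FAKE_STORE:
        raise NotFound(thread_id)
    return _FAKE_STORE[thread_id]


found = read_thread_summary(fake_fetch, "thread-1")
missing = read_thread_summary(fake_fetch, "thread-empty")
```

In real code, `fetch` would wrap the actual single-thread call, and `NotFound` would be replaced by the 404 error type your client raises.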
Use the list endpoints to retrieve summaries across many threads, for example when building a per-user dashboard. Both endpoints return a flat array of ThreadSummary objects and accept an optional pagination body.
graph.search accepts scope="thread_summaries" to search over thread summary content directly. Results are returned in a thread_summaries field on the search response.
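A minimal sketch of searching with this scope and reading the results. The `scope="thread_summaries"` value and the `thread_summaries` response field come from the text. The `user_id` and `query` parameter names, the shape of each hit, and the stub client are assumptions made so the example runs without a network connection.

```python
from types import SimpleNamespace


def search_thread_summaries(client, user_id: str, query: str) -> list:
    """Search thread summary content and return the matching summaries."""
    response = client.graph.search(
        user_id=user_id,  # assumed parameter name
        query=query,      # assumed parameter name
        scope="thread_summaries",
    )
    # Results for this scope arrive in the thread_summaries field;
    # fall back to an empty list if the field is absent or None.
    return list(getattr(response, "thread_summaries", None) or [])


def _stub_search(user_id: str, query: str, scope: str) -> SimpleNamespace:
    """Stub standing in for graph.search; returns one made-up hit."""
    hits = [
        SimpleNamespace(
            thread_id="thread-1",
            summary="Login issue resolved by a password reset.",
        )
    ]
    return SimpleNamespace(thread_summaries=hits if scope == "thread_summaries" else None)


stub_client = SimpleNamespace(graph=SimpleNamespace(search=_stub_search))
results = search_thread_summaries(stub_client, user_id="user-123", query="login problems")
```

With a real client in place of `stub_client`, the helper would be called the same way.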
Thread summaries may be included in the default Context Block when Smart Context Assembly selects them as relevant to the current conversation. To always include them, or to pin a specific count, use a context template. For full control over which summaries appear and how they’re formatted, retrieve them directly with the SDK methods above, or build a custom block via advanced context block construction.
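For the "full control" route, one possible approach is to format retrieved summaries into your own block of text before passing it to the agent. The sketch below is illustrative only: the `<THREAD_SUMMARIES>` wrapper, the `summary` key, the function name, and the sample data are hypothetical, and it assumes `last_summarized_at` is an ISO 8601 string in a single time zone so that string sorting matches chronological order.

```python
from typing import List


def build_summary_block(summaries: List[dict], max_count: int = 3) -> str:
    """Format the most recently updated summaries as a plain-text block."""
    # ISO 8601 timestamps in the same zone sort chronologically as strings.
    recent = sorted(
        summaries, key=lambda s: s["last_summarized_at"], reverse=True
    )[:max_count]
    lines = ["<THREAD_SUMMARIES>"]  # hypothetical wrapper tag
    for s in recent:
        lines.append(f"- [{s['last_summarized_at']}] {s['summary']}")
    lines.append("</THREAD_SUMMARIES>")
    return "\n".join(lines)


# Made-up summaries (illustration only).
sample = [
    {"summary": "Asked about billing cycles.", "last_summarized_at": "2025-01-01T00:00:00Z"},
    {"summary": "Reported a sync bug; fixed by update.", "last_summarized_at": "2025-03-01T00:00:00Z"},
    {"summary": "Requested a data export.", "last_summarized_at": "2025-02-01T00:00:00Z"},
]

block = build_summary_block(sample, max_count=2)
```

Capping the count and ordering by recency is one design choice among many. Ranking by search relevance instead would only require changing the sort key.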