Detecting Patterns | Zep Documentation

Experimental API

Pattern detection is an experimental feature. The API may change in future releases.

Introduction

Zep’s pattern detection analyzes the structure of your knowledge graph to discover recurring patterns: frequent relationship types, multi-hop paths, co-occurring entities, highly connected hubs, and tightly interconnected clusters. Unlike graph search, which retrieves content matching a query, pattern detection reveals the shape of your data — surfacing structural insights that aren’t visible from individual nodes or edges.

Pattern detection supports two modes:

Seed mode: Provide explicit seed nodes, labels, or edge types to analyze patterns around specific parts of the graph
Query mode: Provide a natural-language query to automatically discover relevant nodes and return relationship patterns with relevance-scored edges

What It Finds

Pattern Type	What It Detects	Example
Relationship	Common `(source_label, edge_type, target_label)` triples	`Person -[WORKS_AT]-> Company` appears 47 times
Path	Frequent multi-hop connection chains	`Person -> Project -> Technology` is a recurring 2-hop path
Co-occurrence	Node types that appear together within k hops	`Decision` and `Stakeholder` nodes consistently co-occur
Hub	Highly connected nodes (star topology)	A `Project` node with 12 `ASSIGNED_TO` edges
Cluster	Tightly interconnected groups (triangle topology)	Three `Person` nodes all connected to each other

Query mode only detects relationship patterns. Use seed mode for path, co-occurrence, hub, and cluster detection.
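
To make the relationship pattern type concrete, the following is a minimal local sketch of counting `(source_label, edge_type, target_label)` triples over a toy edge list. It is not Zep's implementation; the graph data and the helper name `count_relationship_patterns` are made up for illustration.

```python
from collections import Counter

# Toy graph (invented data): node name -> label, plus (source, edge_type, target) edges.
NODE_LABELS = {
    "alice": "Person",
    "bob": "Person",
    "acme": "Company",
    "apollo": "Project",
}
EDGES = [
    ("alice", "WORKS_AT", "acme"),
    ("bob", "WORKS_AT", "acme"),
    ("alice", "ASSIGNED_TO", "apollo"),
]


def count_relationship_patterns(node_labels, edges, min_occurrences=2):
    """Count label-level triples and keep those seen at least min_occurrences times."""
    counts = Counter(
        (node_labels[src], edge_type, node_labels[dst])
        for src, edge_type, dst in edges
    )
    return {triple: n for triple, n in counts.items() if n >= min_occurrences}


patterns = count_relationship_patterns(NODE_LABELS, EDGES)
for (src_label, edge_type, dst_label), n in patterns.items():
    print(f"{src_label} -[{edge_type}]-> {dst_label} ({n}x)")
```

The `ASSIGNED_TO` triple occurs only once, so it falls below the threshold of 2 and is not reported.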

Use Cases

Knowledge graph auditing: Understand what types of information your graph captures most frequently
Schema discovery: Identify dominant relationship patterns to inform ontology design
Anomaly context: Establish baselines of normal graph structure to help detect anomalies
Data quality: Find unexpected patterns that may indicate ingestion issues
Agent-driven Q&A: Use query mode to find relevant graph context for answering questions (e.g., “clothing purchases” returns scored edge facts about purchase patterns)

Basic Usage

Seed mode

Provide seeds to focus analysis around specific nodes, node labels, or edge types. At least one seed field (`node_uuids`, `node_labels`, or `edge_types`) is required.

import os

from zep_cloud import Zep

# Read the API key from the environment rather than hard-coding it
API_KEY = os.environ.get("ZEP_API_KEY")

client = Zep(
    api_key=API_KEY,
)

# Seed mode: analyze patterns around every node labeled "Decision"
result = client.graph.detect_patterns(
    user_id="alice",
    seeds={
        "node_labels": ["Decision"],
    },
)

for pattern in result.patterns:
    print(f"{pattern.type}: {pattern.description} ({pattern.occurrences}x)")

Query mode

Provide a query string to let Zep automatically discover relevant nodes and detect relationship patterns. The response includes relevance-scored edges per pattern and a deduplicated nodes array.

result = client.graph.detect_patterns(
    user_id="alice",
    query="clothing purchases",
)

for pattern in result.patterns:
    print(f"{pattern.description} ({pattern.occurrences}x)")
    for edge in pattern.edges:
        print(f"  {edge.fact} (score: {edge.score:.2f})")

# Deduplicated nodes referenced by pattern edges
for node in result.nodes:
    print(f"  Node: {node.name} ({node.labels})")
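
For agent-driven Q&A, the scored edges can be flattened into a context string. The sketch below assumes only the response fields shown above (`patterns`, `edges`, `fact`, `score`) and uses `SimpleNamespace` stand-ins instead of a live response. The helper name `build_context` and the `0.5` threshold are illustrative assumptions, and the source does not state the range of `score`.

```python
from types import SimpleNamespace


def build_context(result, min_score=0.5):
    """Collect edge facts scoring at least min_score, best first, without duplicates."""
    scored = [
        (edge.score, edge.fact)
        for pattern in result.patterns
        for edge in pattern.edges
        if edge.score >= min_score
    ]
    seen = set()
    facts = []
    for _score, fact in sorted(scored, key=lambda pair: pair[0], reverse=True):
        if fact not in seen:
            seen.add(fact)
            facts.append(fact)
    return "\n".join(facts)


# Stand-in for a query-mode response; facts and scores are invented.
result = SimpleNamespace(
    patterns=[
        SimpleNamespace(
            description="User -[PURCHASED]-> Item",
            occurrences=2,
            edges=[
                SimpleNamespace(fact="Alice bought a wool coat", score=0.91),
                SimpleNamespace(fact="Alice bought a phone charger", score=0.32),
            ],
        )
    ]
)
context = build_context(result)
print(context)
```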

Configurable Parameters

Parameter	Type	Description	Default	Required
`user_id`	string	Detect patterns in a user’s graph	-	Yes*
`graph_id`	string	Detect patterns in a named graph	-	Yes*
`query`	string	Natural-language search query for discovering relevant nodes automatically. Only detects relationship patterns.	-	Yes**
`query_limit`	integer	Maximum nodes to discover from the query (1-50). Only used with `query`.	`10`	No
`edge_limit`	integer	Maximum resolved edges per pattern (1-100). Only used with `query`.	`10`	No
`seeds`	object	Seed nodes to focus analysis around (at least one of `node_uuids`, `node_labels`, or `edge_types` must be provided).	-	Yes**
`detect`	object	Which pattern types to detect and their configuration. Ignored when `query` is set.	all types	No
`limit`	integer	Maximum patterns to return (1-200)	`50`	No
`min_occurrences`	integer	Minimum occurrence count to report a pattern	`2`	No
`recency_weight`	string	Temporal decay half-life: `"none"`, `"7_days"`, `"30_days"`, `"90_days"`	`"none"`	No
`search_filters`	object	Filter which edges/nodes participate (same format as graph search). Also applied to seed node discovery in query mode.	-	No

*Either `user_id` or `graph_id` is required

**Either `query` or `seeds` is required, but not both
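
The constraints in the table and footnotes can be checked client-side before sending a request. The helper below is a hypothetical sketch, not part of the Zep SDK; it encodes only the rules stated above, and the server remains the authority on what is valid.

```python
SEED_FIELDS = ("node_uuids", "node_labels", "edge_types")


def validate_detect_patterns_args(
    user_id=None,
    graph_id=None,
    query=None,
    seeds=None,
    limit=50,
    query_limit=10,
    edge_limit=10,
):
    """Return a list of problems; an empty list means the arguments look valid."""
    problems = []
    if not (user_id or graph_id):
        problems.append("either user_id or graph_id is required")
    if query and seeds:
        problems.append("query and seeds cannot be used together")
    elif not query and not seeds:
        problems.append("either query or seeds is required")
    if seeds and not any(seeds.get(field) for field in SEED_FIELDS):
        problems.append("seeds needs node_uuids, node_labels, or edge_types")
    if not 1 <= limit <= 200:
        problems.append("limit must be between 1 and 200")
    if query:
        # query_limit and edge_limit only apply in query mode
        if not 1 <= query_limit <= 50:
            problems.append("query_limit must be between 1 and 50")
        if not 1 <= edge_limit <= 100:
            problems.append("edge_limit must be between 1 and 100")
    return problems


ok = validate_detect_patterns_args(user_id="alice", seeds={"node_labels": ["Decision"]})
bad = validate_detect_patterns_args(
    user_id="alice", query="clothing purchases", seeds={"node_labels": ["Decision"]}
)
print(ok, bad)
```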

Selecting Pattern Types

Use the `detect` parameter to choose which pattern types to find. Each key enables that type; its value provides type-specific configuration. Omit `detect` entirely to run all types with defaults.

result = client.graph.detect_patterns(
    user_id="alice",
    seeds={
        "node_labels": ["Person"],
    },
    detect={
        "relationships": {},
        "paths": {"max_hops": 4},
        "hubs": {"min_degree": 5},
    },
)

Type-Specific Configuration

Pattern Type	Config Field	Type	Description	Default
`paths`	`max_hops`	integer (1-5)	Maximum path length to search	3
`co_occurrences`	`max_hops`	integer (1-5)	Proximity window for co-occurrence	3
`hubs`	`min_degree`	integer (2+)	Minimum connections to qualify as a hub	3
`relationships`	(none)	-	-	-
`clusters`	(none)	-	-	-
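
As an illustration of what `min_degree` controls, here is a small local sketch that flags hubs in a toy edge list. It assumes degree counts both incoming and outgoing edges; the source does not say how Zep counts connections, so treat this as one plausible reading rather than the actual algorithm. The data and the name `find_hubs` are invented.

```python
from collections import defaultdict

# Toy edges (invented): (source, edge_type, target)
EDGES = [
    ("alice", "ASSIGNED_TO", "apollo"),
    ("bob", "ASSIGNED_TO", "apollo"),
    ("carol", "ASSIGNED_TO", "apollo"),
    ("alice", "KNOWS", "bob"),
]


def find_hubs(edges, min_degree=3):
    """Return {node: degree} for nodes with at least min_degree connections."""
    degree = defaultdict(int)
    for src, _edge_type, dst in edges:
        # Assumption: each edge counts once for both of its endpoints
        degree[src] += 1
        degree[dst] += 1
    hubs = {node: d for node, d in degree.items() if d >= min_degree}
    # Highest degree first, ties broken by name for stable output
    return dict(sorted(hubs.items(), key=lambda kv: (-kv[1], kv[0])))


hubs = find_hubs(EDGES)
print(hubs)
```

Lowering `min_degree` to 2 would also admit `alice` and `bob`, which shows how the setting trades precision for coverage.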

Seed Nodes

Use `seeds` to specify the starting points for pattern detection in seed mode. At least one seed field is required. When multiple seed fields are provided, seeds are combined (union). `seeds` cannot be used together with `query`.

# Focus on patterns around Decision nodes
result = client.graph.detect_patterns(
    user_id="alice",
    seeds={
        "node_labels": ["Decision"],
    },
    detect={
        "relationships": {},
        "paths": {"max_hops": 3},
    },
)

Seed Options

Field	Type	Description
`node_uuids`	array of strings	Specific node UUIDs to analyze around
`node_labels`	array of strings	All nodes with these labels become seeds (e.g., `["Decision", "Person"]`)
`edge_types`	array of strings	All endpoints of these edge types become seeds (e.g., `["CHOSE", "REJECTED"]`)
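
The union behavior of the seed fields can be sketched locally as follows. This is an illustration over a toy graph, not Zep's resolution logic; the node IDs, labels, edge types, and the helper name `resolve_seeds` are assumptions made for the example.

```python
# Toy graph (invented): node UUID -> labels, plus (source, edge_type, target) edges
NODES = {
    "n1": ["Person"],
    "n2": ["Decision"],
    "n3": ["Option"],
    "n4": ["Option"],
    "n5": ["Company"],
}
EDGES = [
    ("n1", "MADE", "n2"),
    ("n2", "CHOSE", "n3"),
    ("n2", "REJECTED", "n4"),
    ("n1", "WORKS_AT", "n5"),
]


def resolve_seeds(nodes, edges, seeds):
    """Return the union of nodes selected by each provided seed field."""
    resolved = set(seeds.get("node_uuids", []))

    # node_labels: every node carrying one of the labels becomes a seed
    wanted_labels = set(seeds.get("node_labels", []))
    resolved |= {uuid for uuid, labels in nodes.items() if wanted_labels & set(labels)}

    # edge_types: both endpoints of every matching edge become seeds
    wanted_edge_types = set(seeds.get("edge_types", []))
    for src, edge_type, dst in edges:
        if edge_type in wanted_edge_types:
            resolved.update((src, dst))
    return resolved


seed_nodes = resolve_seeds(NODES, EDGES, {"node_labels": ["Person"], "edge_types": ["CHOSE"]})
print(sorted(seed_nodes))
```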

Recency Weighting

Apply temporal decay to favor recently created edges. The `recency_weight` value sets the exponential decay half-life applied to each edge’s `created_at` timestamp.

Value	Effect
`"none"`	All edges weighted equally (default)
`"7_days"`	Edges lose half their weight every 7 days
`"30_days"`	Edges lose half their weight every 30 days
`"90_days"`	Edges lose half their weight every 90 days

When recency weighting is enabled, the `weighted_score` in each result reflects the decayed sum, while `occurrences` always reports the raw unweighted count.

result = client.graph.detect_patterns(
    user_id="alice",
    seeds={
        "node_labels": ["Decision"],
    },
    recency_weight="30_days",
)

for pattern in result.patterns:
    print(f"{pattern.description}: {pattern.occurrences} occurrences, weighted score {pattern.weighted_score:.1f}")
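
The half-life decay described above can be worked through by hand. The sketch below assumes each edge contributes `0.5 ** (age_days / half_life_days)` and that age is measured from the time of the request; the source states the half-life semantics but not the reference time, so that part is an assumption. The function names are illustrative.

```python
from datetime import datetime, timedelta, timezone

HALF_LIFE_DAYS = {"7_days": 7, "30_days": 30, "90_days": 90}


def edge_weight(created_at, now, recency_weight="none"):
    """Weight of one edge: 1.0 without decay, halving every half-life otherwise."""
    if recency_weight == "none":
        return 1.0
    half_life = HALF_LIFE_DAYS[recency_weight]
    age_days = (now - created_at).total_seconds() / 86400
    return 0.5 ** (max(age_days, 0.0) / half_life)


def score_pattern(created_ats, now, recency_weight="none"):
    """Return (occurrences, weighted_score) for the edges behind one pattern."""
    occurrences = len(created_ats)
    weighted_score = sum(edge_weight(t, now, recency_weight) for t in created_ats)
    return occurrences, weighted_score


now = datetime(2025, 6, 1, tzinfo=timezone.utc)
created = [now, now - timedelta(days=30), now - timedelta(days=60)]
occurrences, weighted = score_pattern(created, now, "30_days")
print(occurrences, weighted)  # 3 1.75
```

With a 30-day half-life, edges aged 0, 30, and 60 days contribute 1, 0.5, and 0.25, so the weighted score is 1.75 while `occurrences` stays at 3.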

Search Filters

Use `search_filters` to restrict which nodes and edges participate in pattern detection. This uses the same filter format as graph search.

result = client.graph.detect_patterns(
    user_id="alice",
    seeds={
        "node_labels": ["Decision"],
    },
    search_filters={
        "node_labels": ["Decision", "Person"],
        "edge_types": ["CHOSE", "REJECTED"],
        "created_at": [[{">": "2025-01-01T00:00:00Z"}]],
    },
)

Response Structure

Each pattern in the response contains:

Field	Type	Description
`type`	string	Pattern type: `"relationship"`, `"path"`, `"co_occurrence"`, `"hub"`, or `"cluster"`
`description`	string	Human-readable description (e.g., `"Person -[WORKS_AT]-> Company"`)
`occurrences`	integer	Raw count of this pattern in the graph (always unweighted)
`weighted_score`	float	Weighted sum after recency decay (equals `occurrences` when `recency_weight` is `"none"`)
`node_labels`	array	Node labels involved in the pattern structure
`edge_types`	array	Edge types involved in the pattern structure
`edges`	array	Resolved edges sorted by relevance score (only populated in query mode)

The top-level response also includes:

Field	Type	Description
`nodes`	array	Deduplicated nodes referenced by pattern edges (only populated in query mode)
`metadata`	object	Statistics about the detection run

The metadata object contains:

Field	Type	Description
`nodes_analyzed`	integer	Number of unique nodes analyzed
`edges_analyzed`	integer	Number of edges analyzed
`elapsed_ms`	integer	Server-side processing time in milliseconds

Patterns are sorted by `weighted_score` in descending order.
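
Because the response mixes all pattern types in one list, it can help to group them before display. The sketch below uses `SimpleNamespace` stand-ins carrying the fields documented above; the sample values are invented and `group_patterns_by_type` is a hypothetical helper, not an SDK function.

```python
from collections import defaultdict
from types import SimpleNamespace


def group_patterns_by_type(patterns):
    """Group patterns by their type, preserving the incoming order within each group."""
    grouped = defaultdict(list)
    for pattern in patterns:
        grouped[pattern.type].append(pattern)
    return dict(grouped)


# Stand-in patterns, already sorted by weighted_score descending (invented values)
patterns = [
    SimpleNamespace(type="relationship", description="Person -[WORKS_AT]-> Company",
                    occurrences=47, weighted_score=47.0),
    SimpleNamespace(type="path", description="Person -> Project -> Technology",
                    occurrences=12, weighted_score=12.0),
    SimpleNamespace(type="relationship", description="Person -[ASSIGNED_TO]-> Project",
                    occurrences=9, weighted_score=9.0),
]

grouped = group_patterns_by_type(patterns)
for pattern_type, items in grouped.items():
    print(pattern_type)
    for item in items:
        print(f"  {item.description} ({item.occurrences}x, score {item.weighted_score:.1f})")
```

Since the input is already ordered by `weighted_score`, each group keeps its strongest pattern first without re-sorting.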