Detecting Patterns

Analyze the structure of a knowledge graph to discover recurring relationship patterns, paths, hubs, and clusters
Experimental API

Pattern detection is an experimental feature. The API may change in future releases.

Introduction

Zep’s pattern detection analyzes the structure of your knowledge graph to discover recurring patterns: frequent relationship types, multi-hop paths, co-occurring entities, highly connected hubs, and tightly interconnected clusters. Unlike graph search, which retrieves content matching a query, pattern detection reveals the shape of your data, surfacing structural insights that aren’t visible from individual nodes or edges.

Pattern detection supports two modes:

  • Seed mode: Provide explicit seed nodes, labels, or edge types to analyze patterns around specific parts of the graph
  • Query mode: Provide a natural-language query to automatically discover relevant nodes and return relationship patterns with relevance-scored edges

What It Finds

| Pattern Type | What It Detects | Example |
| --- | --- | --- |
| Relationship | Common (source_label, edge_type, target_label) triples | Person -[WORKS_AT]-> Company appears 47 times |
| Path | Frequent multi-hop connection chains | Person -> Project -> Technology is a recurring 2-hop path |
| Co-occurrence | Node types that appear together within k hops | Decision and Stakeholder nodes consistently co-occur |
| Hub | Highly connected nodes (star topology) | A Project node with 12 ASSIGNED_TO edges |
| Cluster | Tightly interconnected groups (triangle topology) | Three Person nodes all connected to each other |

Query mode only detects relationship patterns. Use seed mode for path, co-occurrence, hub, and cluster detection.
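
To make the relationship pattern type concrete, here is a minimal local sketch that counts (source_label, edge_type, target_label) triples in a toy edge list. It does not call Zep and is not how the service is implemented; the edge data and the count_relationship_patterns helper are hypothetical, and the threshold of 2 simply mirrors the documented min_occurrences default.

```python
from collections import Counter

# Toy edges as (source_label, edge_type, target_label); all data is made up.
edges = [
    ("Person", "WORKS_AT", "Company"),
    ("Person", "WORKS_AT", "Company"),
    ("Person", "ASSIGNED_TO", "Project"),
    ("Person", "WORKS_AT", "Company"),
    ("Project", "USES", "Technology"),
]


def count_relationship_patterns(edges, min_occurrences=2):
    """Count each triple and keep those seen at least min_occurrences times."""
    counts = Counter(edges)
    return [
        (f"{src} -[{etype}]-> {dst}", n)
        for (src, etype, dst), n in counts.most_common()
        if n >= min_occurrences
    ]


patterns = count_relationship_patterns(edges)
for description, occurrences in patterns:
    print(f"{description} ({occurrences}x)")
```

Triples that appear only once fall below the threshold and are dropped, which is why only the WORKS_AT pattern is reported here.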

Use Cases

  • Knowledge graph auditing: Understand what types of information your graph captures most frequently
  • Schema discovery: Identify dominant relationship patterns to inform ontology design
  • Anomaly context: Establish baselines of normal graph structure to help detect anomalies
  • Data quality: Find unexpected patterns that may indicate ingestion issues
  • Agent-driven Q&A: Use query mode to find relevant graph context for answering questions (e.g., “clothing purchases” returns scored edge facts about purchase patterns)

Basic Usage

Seed mode

Provide seeds to focus analysis around specific nodes, node labels, or edge types. At least one seed field (node_uuids, node_labels, or edge_types) is required.

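import os

from zep_cloud import Zep

# Assumption: the API key is stored in the ZEP_API_KEY environment variable
API_KEY = os.environ.get("ZEP_API_KEY")

client = Zep(
    api_key=API_KEY,
)

# Analyze patterns around every node labeled "Decision"
result = client.graph.detect_patterns(
    user_id="alice",
    seeds={
        "node_labels": ["Decision"],
    },
)

for pattern in result.patterns:
    print(f"{pattern.type}: {pattern.description} ({pattern.occurrences}x)")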

Query mode

Provide a query string to let Zep automatically discover relevant nodes and detect relationship patterns. The response includes relevance-scored edges per pattern and a deduplicated nodes array.

result = client.graph.detect_patterns(
    user_id="alice",
    query="clothing purchases",
)

for pattern in result.patterns:
    print(f"{pattern.description} ({pattern.occurrences}x)")
    for edge in pattern.edges:
        print(f"  {edge.fact} (score: {edge.score:.2f})")

# Deduplicated nodes referenced by pattern edges
for node in result.nodes:
    print(f"  Node: {node.name} ({node.labels})")

Configurable Parameters

| Parameter | Type | Description | Default | Required |
| --- | --- | --- | --- | --- |
| user_id | string | Detect patterns in a user’s graph | - | Yes* |
| graph_id | string | Detect patterns in a named graph | - | Yes* |
| query | string | Natural-language search query for discovering relevant nodes automatically. Only detects relationship patterns. | - | Yes** |
| query_limit | integer | Maximum nodes to discover from the query (1-50). Only used with query. | 10 | No |
| edge_limit | integer | Maximum resolved edges per pattern (1-100). Only used with query. | 10 | No |
| seeds | object | Seed nodes to focus analysis around (at least one of node_uuids, node_labels, or edge_types must be provided). | - | Yes** |
| detect | object | Which pattern types to detect and their configuration. Ignored when query is set. | all types | No |
| limit | integer | Maximum patterns to return (1-200) | 50 | No |
| min_occurrences | integer | Minimum occurrence count to report a pattern | 2 | No |
| recency_weight | string | Temporal decay half-life: "none", "7_days", "30_days", "90_days" | "none" | No |
| search_filters | object | Filter which edges/nodes participate (same format as graph search). Also applied to seed node discovery in query mode. | - | No |

*Either user_id or graph_id is required

**Either query or seeds is required, but not both
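
The requirement rules above can be checked before a request is sent. The following is a minimal sketch of such a pre-flight check; validate_detect_request is a hypothetical helper, not part of the Zep SDK, and it only encodes the constraints listed in the parameter table (the server remains the source of truth).

```python
def validate_detect_request(user_id=None, graph_id=None, query=None, seeds=None,
                            limit=50, query_limit=10, edge_limit=10):
    """Hypothetical client-side check mirroring the documented constraints.

    Returns a list of human-readable problems; an empty list means the
    arguments satisfy the rules in the parameter table.
    """
    errors = []
    if not (user_id or graph_id):
        errors.append("either user_id or graph_id is required")
    # query and seeds are mutually exclusive, and one of them must be set
    if bool(query) == bool(seeds):
        errors.append("exactly one of query or seeds is required")
    seed_fields = ("node_uuids", "node_labels", "edge_types")
    if seeds and not any(seeds.get(field) for field in seed_fields):
        errors.append("seeds needs node_uuids, node_labels, or edge_types")
    if not 1 <= limit <= 200:
        errors.append("limit must be between 1 and 200")
    # query_limit and edge_limit only apply in query mode
    if query and not 1 <= query_limit <= 50:
        errors.append("query_limit must be between 1 and 50")
    if query and not 1 <= edge_limit <= 100:
        errors.append("edge_limit must be between 1 and 100")
    return errors


print(validate_detect_request(user_id="alice", query="clothing purchases"))
print(validate_detect_request(user_id="alice", query="x", seeds={"node_labels": ["A"]}))
```

Returning a list of problems rather than raising on the first one lets a caller report every issue at once.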

Selecting Pattern Types

Use the detect parameter to choose which pattern types to find. Each key enables that type; its value provides type-specific configuration. Omit detect entirely to run all types with defaults.

result = client.graph.detect_patterns(
    user_id="alice",
    seeds={
        "node_labels": ["Person"],
    },
    detect={
        "relationships": {},
        "paths": {"max_hops": 4},
        "hubs": {"min_degree": 5},
    },
)

Type-Specific Configuration

| Pattern Type | Config Field | Type | Description | Default |
| --- | --- | --- | --- | --- |
| paths | max_hops | integer (1-5) | Maximum path length to search | 3 |
| co_occurrences | max_hops | integer (1-5) | Proximity window for co-occurrence | 3 |
| hubs | min_degree | integer (2+) | Minimum connections to qualify as a hub | 3 |
| relationships | (none) | - | - | - |
| clusters | (none) | - | - | - |
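
As an illustration of what min_degree controls, the sketch below counts connections per node in a toy undirected edge list and keeps nodes that meet the threshold. This is plain degree counting under assumed data; how Zep computes hubs internally is not documented here, and find_hubs is a hypothetical helper.

```python
from collections import defaultdict

# Toy undirected edges between made-up node names.
edges = [
    ("project-1", "alice"),
    ("project-1", "bob"),
    ("project-1", "carol"),
    ("project-2", "alice"),
]


def find_hubs(edges, min_degree=3):
    """Return nodes with at least min_degree connections, highest degree first."""
    degree = defaultdict(int)
    for a, b in edges:
        degree[a] += 1
        degree[b] += 1
    hubs = [node for node, d in degree.items() if d >= min_degree]
    # Sort by degree (descending), then by name for a stable order
    return sorted(hubs, key=lambda node: (-degree[node], node))


print(find_hubs(edges))                # default threshold of 3
print(find_hubs(edges, min_degree=2))  # looser threshold admits more nodes
```

Raising min_degree narrows results to the most connected nodes; lowering it toward the minimum of 2 admits nodes that are only modestly connected.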

Seed Nodes

Use seeds to specify the starting points for pattern detection in seed mode. At least one seed field is required. When multiple seed fields are provided, seeds are combined (union). Seeds cannot be used together with query.

# Focus on patterns around Decision nodes
result = client.graph.detect_patterns(
    user_id="alice",
    seeds={
        "node_labels": ["Decision"],
    },
    detect={
        "relationships": {},
        "paths": {"max_hops": 3},
    },
)

Seed Options

| Field | Type | Description |
| --- | --- | --- |
| node_uuids | array of strings | Specific node UUIDs to analyze around |
| node_labels | array of strings | All nodes with these labels become seeds (e.g., ["Decision", "Person"]) |
| edge_types | array of strings | All endpoints of these edge types become seeds (e.g., ["CHOSE", "REJECTED"]) |
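
The union behavior of the seed fields can be pictured with a small local model. In this sketch the graph, the node UUIDs, and the resolve_seeds helper are all hypothetical; it only illustrates how the three fields could each contribute nodes to one combined seed set.

```python
# Toy graph: node UUID -> set of labels, plus (source, edge_type, target) edges.
nodes = {
    "n1": {"Decision"},
    "n2": {"Person"},
    "n3": {"Option"},
    "n4": {"Option"},
    "n5": {"Company"},
}
edges = [
    ("n2", "CHOSE", "n3"),
    ("n2", "REJECTED", "n4"),
    ("n2", "WORKS_AT", "n5"),
]


def resolve_seeds(nodes, edges, node_uuids=(), node_labels=(), edge_types=()):
    """Combine all seed fields into a single set of seed node UUIDs (union)."""
    wanted_labels = set(node_labels)
    wanted_types = set(edge_types)
    seeds = set(node_uuids)
    # Every node carrying one of the requested labels becomes a seed
    seeds |= {uuid for uuid, labels in nodes.items() if labels & wanted_labels}
    # Both endpoints of every edge with a requested type become seeds
    for src, etype, dst in edges:
        if etype in wanted_types:
            seeds.update((src, dst))
    return seeds


print(sorted(resolve_seeds(nodes, edges, node_labels=["Decision"], edge_types=["CHOSE"])))
```

Because the fields are unioned rather than intersected, adding a second seed field can only widen the analysis, never narrow it.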

Recency Weighting

Apply temporal decay to favor recently created edges. The recency_weight value sets the exponential decay half-life applied to each edge’s created_at timestamp.

| Value | Effect |
| --- | --- |
| "none" | All edges weighted equally (default) |
| "7_days" | Edges lose half their weight every 7 days |
| "30_days" | Edges lose half their weight every 30 days |
| "90_days" | Edges lose half their weight every 90 days |

When recency weighting is enabled, the weighted_score in each result reflects the decayed sum, while occurrences always reports the raw unweighted count.

result = client.graph.detect_patterns(
    user_id="alice",
    seeds={
        "node_labels": ["Decision"],
    },
    recency_weight="30_days",
)

for pattern in result.patterns:
    print(f"{pattern.description}: {pattern.occurrences} occurrences, weighted score {pattern.weighted_score:.1f}")
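
The relationship between occurrences and weighted_score can be worked through by hand. The sketch below assumes the standard half-life formula, weight = 0.5 ** (age_in_days / half_life_days), measured against a fixed reference time; the exact formula and reference time Zep uses are assumptions here, and edge_weight is a hypothetical helper.

```python
from datetime import datetime, timedelta, timezone

# Half-life in days for each documented recency_weight value
HALF_LIFE_DAYS = {"7_days": 7, "30_days": 30, "90_days": 90}


def edge_weight(created_at, now, recency_weight="none"):
    """Weight of one edge under exponential decay (assumed formula)."""
    if recency_weight == "none":
        return 1.0
    age_days = (now - created_at).total_seconds() / 86400
    return 0.5 ** (age_days / HALF_LIFE_DAYS[recency_weight])


# Three made-up edges of the same pattern: created today, 30 and 60 days ago
now = datetime(2025, 6, 1, tzinfo=timezone.utc)
created = [now, now - timedelta(days=30), now - timedelta(days=60)]

occurrences = len(created)  # raw count, never decayed
weighted_score = sum(edge_weight(ts, now, "30_days") for ts in created)

print(occurrences, weighted_score)  # 3 1.75
```

With a 30-day half-life the three edges contribute 1, 0.5, and 0.25, so the pattern still reports 3 occurrences but scores 1.75, and it would rank below a pattern with two edges created today.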

Search Filters

Use search_filters to restrict which nodes and edges participate in pattern detection. This uses the same filter format as graph search.

result = client.graph.detect_patterns(
    user_id="alice",
    seeds={
        "node_labels": ["Decision"],
    },
    search_filters={
        "node_labels": ["Decision", "Person"],
        "edge_types": ["CHOSE", "REJECTED"],
        "created_at": [[{">": "2025-01-01T00:00:00Z"}]],
    },
)

Response Structure

Each pattern in the response contains:

| Field | Type | Description |
| --- | --- | --- |
| type | string | Pattern type: "relationship", "path", "co_occurrence", "hub", or "cluster" |
| description | string | Human-readable description (e.g., "Person -[WORKS_AT]-> Company") |
| occurrences | integer | Raw count of this pattern in the graph (always unweighted) |
| weighted_score | float | Weighted sum after recency decay (equals occurrences when recency_weight is "none") |
| node_labels | array | Node labels involved in the pattern structure |
| edge_types | array | Edge types involved in the pattern structure |
| edges | array | Resolved edges sorted by relevance score (only populated in query mode) |

The top-level response also includes:

| Field | Type | Description |
| --- | --- | --- |
| nodes | array | Deduplicated nodes referenced by pattern edges (only populated in query mode) |
| metadata | object | Statistics about the detection run |

The metadata object contains:

| Field | Type | Description |
| --- | --- | --- |
| nodes_analyzed | integer | Number of unique nodes analyzed |
| edges_analyzed | integer | Number of edges analyzed |
| elapsed_ms | integer | Server-side processing time in milliseconds |

Patterns are sorted by weighted_score in descending order.
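
One way to consume a response is to group patterns by type and print the run statistics. The sketch below uses SimpleNamespace objects as a stand-in for a real response so it runs without a Zep client; the field names follow the tables above, while the values and the group_by_type helper are invented for illustration.

```python
from collections import defaultdict
from types import SimpleNamespace

# Stand-in for a detect_patterns response; every value here is made up.
result = SimpleNamespace(
    patterns=[
        SimpleNamespace(type="relationship", description="Person -[WORKS_AT]-> Company",
                        occurrences=47, weighted_score=47.0),
        SimpleNamespace(type="path", description="Person -> Project -> Technology",
                        occurrences=12, weighted_score=12.0),
        SimpleNamespace(type="relationship", description="Person -[ASSIGNED_TO]-> Project",
                        occurrences=9, weighted_score=9.0),
    ],
    metadata=SimpleNamespace(nodes_analyzed=120, edges_analyzed=340, elapsed_ms=85),
)


def group_by_type(patterns):
    """Group patterns by their type, preserving the incoming score order."""
    grouped = defaultdict(list)
    for pattern in patterns:
        grouped[pattern.type].append(pattern)
    return dict(grouped)


grouped = group_by_type(result.patterns)
for pattern_type, items in grouped.items():
    print(f"{pattern_type}: {len(items)} pattern(s)")
    for item in items:
        print(f"  {item.description} ({item.occurrences}x, score {item.weighted_score:.1f})")

meta = result.metadata
print(f"Analyzed {meta.nodes_analyzed} nodes and {meta.edges_analyzed} edges in {meta.elapsed_ms} ms")
```

Since the input is already ordered by weighted_score, each group keeps its highest-scoring pattern first without re-sorting.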