Depending on the source data and the IR ruleset configuration, users may encounter a warning during an Identity Resolution job run that reads:
“Default IR are matching more than 50,000 profiles.”
This usually indicates that a specific value (such as an email address or phone number) is linked to an unusually high number of profiles. These large clusters can affect match quality, performance, and the accuracy of profile unification.
To understand which values are leading to these large clusters, you can write a query that joins the ssot__Individual__dlm table with the relevant DMO involved in your match rule (e.g., Contact Point Email, Phone, etc.). This allows you to identify high-frequency values and take corrective action — such as reviewing source system quality or adjusting your match rules.
If your Identity Resolution configuration uses the Contact Point Email DMO and matches on the Email Address field, use the following query to identify values with high match volumes:
SELECT COUNT(1), e.ssot__EmailAddress__c
FROM ssot__Individual__dlm i
JOIN ssot__ContactPointEmail__dlm e
ON i.ssot__Id__c = e.ssot__PartyId__c
GROUP BY e.ssot__EmailAddress__c
ORDER BY COUNT(1) DESC
This query will help you identify:
The most commonly repeated email addresses
The number of profiles associated with each email address
You can adapt this query based on the DMOs and fields defined in your match rules—such as Contact Point Phone, Party Identification, or other attributes relevant to your configuration.
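As one possible adaptation, the sketch below applies the same pattern to the Contact Point Phone DMO. The object name ssot__ContactPointPhone__dlm, the field ssot__TelephoneNumber__c, and the 1,000-row cutoff are assumptions for illustration; confirm the DMO and field that your match rule actually uses in your org's data model before running it.

```sql
-- Sketch: count profiles per phone number, keeping only high-frequency values.
-- Assumed names: ssot__ContactPointPhone__dlm, ssot__TelephoneNumber__c.
-- The 1000 cutoff is an arbitrary illustration, not a product threshold.
SELECT COUNT(1) AS profile_count, p.ssot__TelephoneNumber__c
FROM ssot__Individual__dlm i
JOIN ssot__ContactPointPhone__dlm p
  ON i.ssot__Id__c = p.ssot__PartyId__c
GROUP BY p.ssot__TelephoneNumber__c
HAVING COUNT(1) >= 1000
ORDER BY COUNT(1) DESC
```

The HAVING clause only trims the result to values worth reviewing; lower or remove it if your data set is small.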
In addition to querying raw data, you can further leverage Calculated Insights to monitor profile quality and consolidation rates at scale. These insights can help surface underlying patterns or anomalies in match behavior that may not be immediately visible through SQL alone.
For practical examples and step-by-step guidance, refer to this article:
Using Calculated Insights to Investigate Profile Quality and Consolidation Rates
Important Clarification: Indirect Grouping via Transitive Matching
Even if no single email address or identifier exceeds 50,000 matches, the issue may still arise due to indirect grouping. Here’s how:
Some identifiers (e.g., a shared email or phone number) can match 5,000 or more records.
These records may, in turn, match with other identifiers (like a common phone number or address).
This chaining effect forms a large cluster or “bucket” where all interconnected matches are grouped together.
Collectively, this merged group can exceed 50,000 profiles, even though no single match rule triggered it directly.
Transitive matching can also occur when fuzzy match rules are configured, particularly with medium or low fuzzy precision. For example, suppose you define two rules: one for fuzzy name and address matching, and another for email matching. A vague name like "L" could fuzzily match "Luis" and "Lateefa" under medium precision. If that record also uses a generic email like "noemail@gmail.com", it becomes a bridge that links unrelated profiles and contributes to a large transitive cluster.
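To look for candidate bridge records of the kind described in this section, a query along the following lines may help. It is a rough sketch rather than a definitive diagnostic: it assumes the query engine accepts IN subqueries, that phone data lives in ssot__ContactPointPhone__dlm with a ssot__TelephoneNumber__c field (both assumed names), and that 5,000 is a reasonable cutoff for your data. It does not reproduce the clustering that Identity Resolution performs; it only lists profiles that carry both a high-frequency email and a high-frequency phone number.

```sql
-- Sketch: profiles that carry both a high-frequency email and a
-- high-frequency phone number, and so may bridge two clusters.
-- Assumed names: ssot__ContactPointPhone__dlm, ssot__TelephoneNumber__c.
-- The 5000 cutoff mirrors the example figure above; adjust as needed.
SELECT e.ssot__PartyId__c,
       e.ssot__EmailAddress__c,
       p.ssot__TelephoneNumber__c
FROM ssot__ContactPointEmail__dlm e
JOIN ssot__ContactPointPhone__dlm p
  ON e.ssot__PartyId__c = p.ssot__PartyId__c
WHERE e.ssot__EmailAddress__c IN (
        SELECT ssot__EmailAddress__c
        FROM ssot__ContactPointEmail__dlm
        GROUP BY ssot__EmailAddress__c
        HAVING COUNT(1) >= 5000)
  AND p.ssot__TelephoneNumber__c IN (
        SELECT ssot__TelephoneNumber__c
        FROM ssot__ContactPointPhone__dlm
        GROUP BY ssot__TelephoneNumber__c
        HAVING COUNT(1) >= 5000)
LIMIT 100
```

The LIMIT keeps the sample small for inspection; swap the phone join for whichever second identifier your ruleset matches on.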
Corrective Actions:
Once these records are identified, you can take corrective actions, such as:
Auditing data quality in source systems.
Ensuring that match rules do not rely on placeholder or default values, such as: "NULL", "N/A", "test@test.com", "unknown@email.com". These placeholder values often sneak into records during testing or data ingestion and can inadvertently link thousands of profiles together — even if each individual group seems small. These "mini-clusters" are then chained into larger match groups, collectively pushing the group size past the 50,000 profile threshold and triggering the warning.
Reviewing or refining your match rule logic.
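As a starting point for the placeholder check, the sketch below counts Contact Point Email records whose address equals one of the example placeholder values listed above. The value list and the LOWER/TRIM normalization are illustrative assumptions; extend the list with whatever defaults your source systems emit. It counts contact point records, which can differ from the profile count if one profile holds duplicate email records.

```sql
-- Sketch: count records per suspected placeholder email value.
-- The value list comes from the examples above; extend it for your data.
SELECT LOWER(TRIM(e.ssot__EmailAddress__c)) AS email_value,
       COUNT(1) AS record_count
FROM ssot__ContactPointEmail__dlm e
WHERE LOWER(TRIM(e.ssot__EmailAddress__c)) IN
      ('null', 'n/a', 'test@test.com', 'unknown@email.com')
GROUP BY LOWER(TRIM(e.ssot__EmailAddress__c))
ORDER BY COUNT(1) DESC
```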
Note: Running queries using the Query Editor will consume Data Queries credits. You can find more details on how this is billed under Data 360 Billable Usage Types.
004637133
