
Data 360: Understanding Record Drops from Duplicate Primary Keys in Accelerated Data Streams

Publish Date: Feb 3, 2026
Description

In Bring Your Own Lake (BYOL) data streams with acceleration enabled, Data 360 (formerly Data Cloud) caches incoming data to support faster activation and querying. As part of this caching process, Data 360 enforces primary key uniqueness, so no two cached records share the same primary key.

If the incoming dataset contains duplicates (that is, multiple records share the same primary key), only one record per key is retained. The rest are dropped during ingestion, so the "Added" count is lower than the "Processed" count.
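As a rough illustration of how the two counts can diverge, the following sketch compares the total row count with the number of distinct key values in the source. It reuses the placeholder names primary_key_column and your_source_table from the validation query in this article. The column aliases are illustrative only, and treating the result as an estimate of the "Processed" and "Added" counts is an assumption, not documented Data 360 behavior.

```sql
-- Sketch only: estimates how many rows would be surplus if one row per key is kept.
-- primary_key_column and your_source_table are placeholders; the aliases are illustrative.
SELECT
  COUNT(*) AS total_rows,                                   -- all rows in the source
  COUNT(DISTINCT primary_key_column) AS distinct_keys,      -- one per non-NULL key value
  COUNT(*) - COUNT(DISTINCT primary_key_column) AS surplus_rows
FROM
  your_source_table;
```

COUNT(DISTINCT ...) ignores NULL values, so any rows with a NULL key are included in surplus_rows. If surplus_rows is 0, every row has a distinct, non-NULL key.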

This differs from Direct Access mode, where Data 360 does not cache or manage the data. Instead, it queries the data directly from the external source, such as Snowflake or BigQuery, without applying any primary key constraints. As a result, duplicates are not dropped in Direct Access mode.

Resolution

Before enabling acceleration, validate that your source data has unique primary keys. Otherwise, duplicate records are silently excluded during ingestion, which shows up as unexpected data drops.

How to Check for Duplicate Primary Key Values in the Source

Run the following representative SQL query against your source table to identify duplicate primary key values:

SELECT 
  primary_key_column, 
  COUNT(*) AS record_count
FROM 
  your_source_table
GROUP BY 
  primary_key_column
HAVING 
  COUNT(*) > 1;

What to Replace:

  • primary_key_column: Substitute this with the actual primary key column used in your stream.

  • your_source_table: Replace with the name of the source table being streamed into Data 360.

Note:

This is a generic query intended as a starting point. The exact SQL syntax may vary depending on your source platform (e.g., BigQuery, Snowflake, Redshift). Be sure to tailor the query to match your environment’s SQL dialect and schema.

This query helps identify keys that appear more than once in your source data. These duplicates may be automatically dropped by Data 360 when acceleration is enabled, as the platform enforces primary key uniqueness during the caching process.
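Once duplicate keys are known, it can help to look at the full rows behind them. The following is a minimal sketch that uses a window function, which Snowflake, BigQuery, and Redshift generally support. It reuses the placeholders primary_key_column and your_source_table. The aliases t, counted, and key_count are illustrative names, not part of any Data 360 schema.

```sql
-- Sketch only: lists every source row whose key appears more than once.
-- primary_key_column and your_source_table are placeholders.
SELECT *
FROM (
  SELECT
    t.*,
    COUNT(*) OVER (PARTITION BY primary_key_column) AS key_count  -- rows sharing this key
  FROM your_source_table AS t
) AS counted
WHERE key_count > 1
ORDER BY primary_key_column;
```

Comparing the rows side by side shows whether the duplicates are exact copies or conflicting versions of the same record.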

This restriction does not apply to Direct Access mode, where data is queried live from the source and primary key constraints are not enforced.
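If duplicates exist and acceleration is still required, one possible approach is to expose a deduplicated view or query at the source so that the choice of surviving record is explicit. The sketch below keeps the most recent row per key. It assumes a hypothetical updated_at timestamp column, which may not exist in your table, and it does not reflect how Data 360 itself selects the retained record.

```sql
-- Sketch only: keeps one row per key, preferring the latest by a hypothetical updated_at column.
-- primary_key_column and your_source_table are placeholders; updated_at is an assumption.
SELECT *
FROM (
  SELECT
    t.*,
    ROW_NUMBER() OVER (
      PARTITION BY primary_key_column
      ORDER BY updated_at DESC          -- newest row gets row_rank = 1
    ) AS row_rank
  FROM your_source_table AS t
) AS ranked
WHERE row_rank = 1;
```

The result includes the helper column row_rank. Select explicit columns instead of * if it should not reach the stream. If two rows share the same key and the same updated_at value, the row that is kept is arbitrary, so add a further tie-breaking column to the ORDER BY where that matters.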

Additional Resources

Caching in Data Federation

Knowledge Article Number

005036922

 