You are here:
Normalize Data with UNION Use Case
Normalize data ingested into Data 360 with streaming data transforms.
Let’s look at an example. The source data lake object CRMContacts contains up to three phone numbers for each customer.
| CustomerId | Name | MobilePhone | HomePhone | WorkPhone |
|---|---|---|---|---|
| C1 | John | 408-512-2234 | 650-342-4612 | |
| C2 | Peter | 310-453-3421 | 915-870-9900 |
If the data lake object CRMContacts is directly mapped to the data model object Contact Point Phone, only one contact point—MobilePhone or HomePhone or WorkPhone—can be mapped. As a result, only one of the three phone numbers is available in product areas that consume the data model object, such as calculated insights, identity resolution, and segmentation.
Here’s how to use a streaming data transform to make all three phone numbers available.
SELECT CONCAT(CRMContacts.CustomerId, "_Mobile") as PhoneId,
CRMContacts.CustomerId as CustomerId,
CRMContacts.MobilePhone as PhoneNumber,
"Mobile" as PhoneType
FROM CRMContacts
WHERE ISNOTNULL(CRMContacts.MobilePhone) AND CRMContacts.MobilePhone <> ""
UNION
SELECT CONCAT(CRMContacts.CustomerId, "_Home") as PhoneId,
CRMContacts.CustomerId as CustomerId, CRMContacts.HomePhone as PhoneNumber,
"Home" as PhoneType
FROM CRMContacts
WHERE ISNOTNULL(CRMContacts.HomePhone) AND CRMContacts.HomePhone <> ""
UNION
SELECT CONCAT(CRMContacts.CustomerId, "_Work") as PhoneId,
CRMContacts.CustomerId as CustomerId,
CRMContacts.WorkPhone as PhoneNumber,
"Work" as PhoneType
FROM CRMContacts
WHERE ISNOTNULL(CRMContacts.WorkPhone) AND CRMContacts.WorkPhone <> ""
After the data transform runs, the target DLO contains one record for each type of phone number.
| PhoneId | CustomerId | PhoneNumber | PhoneType |
|---|---|---|---|
| C1_Mobile | C1 | 408-512-2234 | Mobile |
| C1_Work | C1 | 650-342-4612 | Work |
| C2_Home | C2 | 310-453-3421 | Home |
| C2_Work | C2 | 915-870-9900 | Work |
Then, you can map the target Phones DLO to the Contact Point Phone DMO. The Contact Point Phone DMO now contains all three phone numbers for each customer.
To use the UNION operator in a streaming data transform,
each SELECT statement must:
- Reference the same source data lake object. In the example above, all three
SELECTstatements use the same source data lake object CRMContacts. - Map to the same fields for the target DLO. Each
SELECTstatement must have the same schema. In our example, all threeSELECTstatements map to the same set of fields:PhoneId,CustomerId,PhoneNumber, andPhoneType. - Generate a unique value for the primary key field. If non-unique values are generated for the
primary key field, one of the non-unique values is selected randomly. In our example, the
CONCAT()function generates unique values for the primary key field on the target DLO. - Have a unique field mapped to a primary key, or a set of fields mapped to a primary key in an expression.
UNION can be used with a maximum of five SELECT statements.

