Data 360 Auto Mapping Model Card

The Auto Mapping model automatically maps data lake object (DLO) source fields to data model object (DMO) target fields. The model reduces the amount of manual labor and time required to configure new data streams in Data 360.

Model Details

Person or Organization

Salesforce Data 360

Model Date and Version

A pretrained model based on MPNet is incorporated into the Auto Mapping mechanism.
Anonymous metadata collection begins in June 2023 (Summer ’23).
Separate instances of the pretrained model will be introduced per org in Winter ’24.
Minor changes can occur throughout a release.
Major changes can occur and are communicated via release notes.

Model Type

The Auto Mapping model is a multi-class semantic textual similarity (STS) model that assigns a score based on the similarity of two texts. The model is pretrained on open datasets and provides a foundation for fine-tuning on your data for use within your environment. The model provides auto mapping capabilities even if you have little to no data. The downstream models are built using only your individual data. No global models are trained using customers’ data.

General Information

The solution is based on the sentence-transformers pretrained model “all-mpnet-base-v2”, which is fine-tuned on over 1 billion sentence pairs. The pretrained model is available on Hugging Face and is incorporated into the Auto Mapping solution so that it can be customized for each customer based on their data and needs.

Licenses

Auto Mapping is available to Salesforce customers that have a Data Cloud license.

Questions or Comments

Send questions or comments about the model to support@salesforce.com.

Intended Use

The Data 360 Auto Mapping model is intended for these use cases.

Primary Intended Uses

You can use the Auto Mapping model to map metadata. The primary use cases are:

Platform admins mapping DLOs to the data model
Solution architects implementing Data 360 solutions

Out-of-Scope Use Cases

Anything other than the primary use case is out of scope and not recommended.

Relevant Factors

These factors are associated with the Data 360 Auto Mapping model.

Model Inputs

DLO field header names and data types (metadata)
DMO field header names and data types (both standard DMOs and custom DMOs)

String Similarity and Semantic Similarity

The Auto Mapping model is based on finding both string similarity and semantic similarity between fields. STS assigns a score to the similarity of two texts.
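
As a rough illustration of the two signals, the sketch below scores a pair of field headers with a character-level string similarity and a cosine similarity over embedding vectors. It is not the Auto Mapping implementation: the helper names, the equal weighting, and the toy vectors are all assumptions. In practice the vectors would come from a sentence-embedding model such as the one named under General Information.

```python
import math
from difflib import SequenceMatcher


def normalize(header: str) -> str:
    """Lowercase a field header and keep only letters and digits."""
    return "".join(ch for ch in header.lower() if ch.isalnum())


def string_similarity(a: str, b: str) -> float:
    """Character-level similarity in [0, 1], using difflib's ratio."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def cosine_similarity(u: list[float], v: list[float]) -> float:
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


def combined_score(a: str, b: str, emb_a: list[float], emb_b: list[float],
                   weight: float = 0.5) -> float:
    """Blend both signals; `weight` is a hypothetical tuning knob."""
    return (weight * string_similarity(a, b)
            + (1 - weight) * cosine_similarity(emb_a, emb_b))


# Toy, hand-written vectors standing in for real sentence embeddings.
score = combined_score("Email_Address", "email address", [0.9, 0.1], [0.8, 0.2])
```

String similarity alone would miss pairs such as a header abbreviated beyond recognition, while semantic similarity alone can confuse fields with related meanings, which is one reason to consider both.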

Previous Mappings

The model uses usage data, such as previous mappings performed by a specific tenant. These previous mappings are offered as possible suggested mappings.
All models are single-customer models. There are no cross-customer or global models, and mappings aren’t shared between customers.
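
The tenant isolation described above can be pictured with a small in-memory store keyed by tenant. This is only a sketch under assumptions: the class name, method names, and tenant and field identifiers are hypothetical, and the real service's storage is not described in this model card.

```python
from collections import defaultdict


class PreviousMappings:
    """Keeps each tenant's past mappings separate from every other tenant's."""

    def __init__(self) -> None:
        # tenant_id -> source field -> target fields, most recent first
        self._by_tenant: dict[str, dict[str, list[str]]] = defaultdict(
            lambda: defaultdict(list)
        )

    def record(self, tenant_id: str, source_field: str, target_field: str) -> None:
        """Remember that this tenant mapped source_field to target_field."""
        targets = self._by_tenant[tenant_id][source_field]
        if target_field in targets:
            targets.remove(target_field)
        targets.insert(0, target_field)

    def suggest(self, tenant_id: str, source_field: str) -> list[str]:
        """Return this tenant's earlier targets for the field, newest first."""
        tenant_history = self._by_tenant.get(tenant_id, {})
        return list(tenant_history.get(source_field, []))


history = PreviousMappings()
history.record("tenant_a", "cust_email", "Email Address")   # hypothetical names
own_suggestions = history.suggest("tenant_a", "cust_email")
other_suggestions = history.suggest("tenant_b", "cust_email")  # sees nothing
```

Because lookups always start from the tenant key, one tenant's history can never surface as a suggestion for another.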

Metrics

Data 360 evaluates and monitors model performance metrics to maintain and improve the quality of the model.

Model Performance Measures

Aggregated model performance metrics are gathered to monitor, maintain, and improve the quality of the model. Model performance is based on the number of automapping recommendations that you’ve approved, rejected, or modified.

Recall = number of mapping recommendations approved by the user / final number of correct mappings
Precision = number of mapping recommendations approved by the user / total number of mapping recommendations
Accuracy = 1 - (modified + added) / (total number of mapping recommendations) = Recall * (2 - 1/Precision)
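
A small worked example can make the definitions concrete. The sketch below is illustrative only and rests on assumptions that this model card does not state: that every recommendation is approved, modified, or rejected, and that the final set of correct mappings is the approved, modified, and user-added mappings together. It evaluates only the first form of the accuracy expression. The class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MappingReview:
    approved: int  # recommendations accepted as-is
    modified: int  # recommendations the user changed
    rejected: int  # recommendations the user discarded
    added: int     # mappings the user created without a recommendation

    @property
    def total_recommendations(self) -> int:
        # Assumption: each recommendation ends up in exactly one bucket.
        return self.approved + self.modified + self.rejected

    @property
    def final_correct_mappings(self) -> int:
        # Assumption: the final mapping set is approved + modified + added.
        return self.approved + self.modified + self.added


def _ratio(numerator: int, denominator: int) -> float:
    """Divide, returning 0.0 when there is nothing to measure against."""
    return numerator / denominator if denominator else 0.0


def recall(r: MappingReview) -> float:
    return _ratio(r.approved, r.final_correct_mappings)


def precision(r: MappingReview) -> float:
    return _ratio(r.approved, r.total_recommendations)


def accuracy(r: MappingReview) -> float:
    # First form of the expression: 1 - (modified + added) / total recommendations.
    return 1 - _ratio(r.modified + r.added, r.total_recommendations)


# Made-up counts for one data stream.
review = MappingReview(approved=8, modified=1, rejected=1, added=1)
```

With these made-up counts there are 10 recommendations and 10 final mappings, so recall and precision are both 8/10.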

All metrics are aggregated and anonymized. The mapping recommendations are only shown above a certain threshold. The exact threshold won’t be determined until Winter ’24.
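
The thresholding step can be sketched as a simple filter over scored candidates. The function name, field names, scores, and threshold value below are all hypothetical, since this model card does not state the actual threshold.

```python
def visible_recommendations(scored, threshold):
    """Keep (source, target, score) tuples scoring strictly above threshold, best first."""
    kept = [rec for rec in scored if rec[2] > threshold]
    return sorted(kept, key=lambda rec: rec[2], reverse=True)


# Hypothetical candidates: (DLO field, DMO field, similarity score)
candidates = [
    ("fname", "First Name", 0.71),
    ("cust_email", "Email Address", 0.92),
    ("col_17", "Birth Date", 0.18),
]
shown = visible_recommendations(candidates, threshold=0.5)
```

Raising the threshold shows fewer recommendations and tends to trade recall for precision.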

Ethical Considerations

The model is based on metadata and doesn’t include demographic data. The headers referenced are standardized within the Salesforce Platform or customized per organization. We recommend not using specific values for gender, race, or other demographic data as your field label.

Caveats and Recommendations

One of the main challenges is understanding and modeling the elements within a variable context. The semantic similarity of DLO and DMO fields depends on context, resulting in ambiguity.

Customer Data Privacy

The Data 360 model complies with all relevant regulations and ensures customer privacy by encrypting data using OrgService. Only data scientists can access customer data, and only for debugging, validation, and verification purposes, in a secured environment and without any data extraction.
