Data 360 Auto Mapping Model Card
The Auto Mapping model automatically maps data lake object (DLO) source fields to data model object (DMO) target fields. The model reduces the amount of manual labor and time required to configure new data streams in Data 360.
Model Details
The Auto Mapping model automatically maps data lake object (DLO) source fields to data model object (DMO) target fields. The model reduces the amount of manual labor and time required to configure new data streams in Data 360.
Person or Organization
Salesforce Data 360
Model Date and Version
- A pretrained model based on MPNet is incorporated into the Auto Mapping mechanism.
- Anonymous metadata collection begins in June 2023 (Summer ’23).
- Separate instances of the pre-trained model will be introduced per org in Winter ’24.
- Minor changes can occur throughout a release.
- Major changes can occur and are communicated via release notes.
Model Type
The Auto Mapping model is a multi-class semantic textual similarity (STS) model that assigns a score based on the similarity of two texts. The model is pretrained on open datasets and provides a foundation that can be fine-tuned on your data for use within your environment. The model provides auto mapping capabilities even if you have little to no data. The downstream models are built to use only your individual data. No global models are trained using customers’ data.
General Information
The solution is based on the sentence-transformers pretrained model “all-mpnet-base-v2”, which is fine-tuned on over 1 billion sentence pairs. The pretrained model is available on Hugging Face and is incorporated into the Auto Mapping solution so that it can be customized for each customer based on their data and needs.
Licenses
Auto Mapping is available to Salesforce customers that have a Data Cloud license.
Questions or Comments
Send questions or comments about the model to support@salesforce.com.
Intended Use
The Data 360 Auto Mapping model is intended for these use cases.
Primary Intended Uses
You can use the Auto Mapping model to map metadata. The primary use cases are:
- Platform admins mapping DLOs to the data model
- Solution architects implementing Data 360 solutions
Out-of-Scope Use Cases
- Anything other than the primary use case is out of scope and not recommended.
Relevant Factors
These factors are associated with the Data 360 Auto Mapping model.
Model Inputs
- DLO field header names and data types (metadata)
- DMO field header names and data types (both standard DMOs and custom DMOs)
String Similarity and Semantic Similarity
- The Auto Mapping model finds both string similarity and semantic similarity between fields. STS assigns a score to the similarity of two texts.
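As a rough illustration of how string similarity and semantic similarity can be combined into one score, here is a minimal sketch. It isn't the Auto Mapping implementation. The function names, the equal weighting, and the tiny hand-written vectors in TOY_EMBEDDINGS are all assumptions made for the example; in practice, the embeddings would come from a sentence-embedding model rather than a lookup table.

```python
import math
from difflib import SequenceMatcher

# Hypothetical, hand-written embedding vectors used only for this sketch.
# A real system would obtain embeddings from a sentence-embedding model.
TOY_EMBEDDINGS = {
    "email_addr": [0.90, 0.10, 0.00],
    "Email Address": [0.85, 0.15, 0.05],
    "Birth Date": [0.00, 0.20, 0.95],
}


def string_similarity(a: str, b: str) -> float:
    """Case-insensitive character-level similarity in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def cosine_similarity(u: list[float], v: list[float]) -> float:
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


def combined_score(source: str, target: str, weight: float = 0.5) -> float:
    """Weighted mix of string and semantic similarity (weight is an assumption)."""
    string_sim = string_similarity(source, target)
    semantic_sim = cosine_similarity(TOY_EMBEDDINGS[source], TOY_EMBEDDINGS[target])
    return weight * string_sim + (1 - weight) * semantic_sim


def best_match(source: str, targets: list[str]) -> tuple[str, float]:
    """Return the target field with the highest combined score."""
    scored = [(target, combined_score(source, target)) for target in targets]
    return max(scored, key=lambda pair: pair[1])


if __name__ == "__main__":
    print(best_match("email_addr", ["Email Address", "Birth Date"]))
```

In this sketch, a source field named email_addr scores higher against Email Address than against Birth Date on both measures, so Email Address is returned as the suggested target.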
Previous Mappings
- The model uses usage data, such as previous mappings performed by a specific tenant. The previous mappings are used as possible suggested mappings.
- All models are single-customer models. There are no cross-customer or global models, and mappings aren’t shared between customers.
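The tenant isolation described above can be pictured with a small sketch. The MappingHistory class and its method names are hypothetical and exist only for this example; the sketch shows one way to keep previous mappings scoped to a single tenant, not how Data 360 stores them.

```python
from collections import defaultdict
from typing import Optional


class MappingHistory:
    """Hypothetical per-tenant store of previously confirmed field mappings."""

    def __init__(self) -> None:
        # tenant_id -> {source_field: target_field}
        self._by_tenant: dict[str, dict[str, str]] = defaultdict(dict)

    def record(self, tenant_id: str, source_field: str, target_field: str) -> None:
        """Remember a mapping that this tenant performed."""
        self._by_tenant[tenant_id][source_field] = target_field

    def suggest(self, tenant_id: str, source_field: str) -> Optional[str]:
        """Return this tenant's earlier mapping, never another tenant's."""
        return self._by_tenant.get(tenant_id, {}).get(source_field)


if __name__ == "__main__":
    history = MappingHistory()
    history.record("tenant_a", "cust_email", "Email Address")
    print(history.suggest("tenant_a", "cust_email"))
    print(history.suggest("tenant_b", "cust_email"))
```

Because every lookup is keyed by tenant first, a mapping recorded by one tenant never surfaces as a suggestion for another.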
Metrics
Data 360 evaluates and monitors model performance metrics to maintain and improve the quality of the model.
Model Performance Measures
Aggregated model performance metrics are gathered to monitor, maintain, and improve the quality of the model. Model performance is based on the number of automapping recommendations that you’ve approved, rejected, or modified.
- Recall = number of mapping recommendations approved by the user / final number of correct mappings
- Precision = number of mapping recommendations approved by the user / total number of mapping recommendations
- Accuracy = 1 – (modified + added) / (total number of mapping recommendations) = Recall*(2–1/Precision)
All metrics are aggregated and anonymized. Mapping recommendations are shown only when they score above a certain threshold. The exact threshold won’t be determined until Winter ’24.
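To make the recall, precision, and accuracy definitions concrete, here is a minimal sketch that computes them from review counts. The function name, parameter names, and sample numbers are assumptions for illustration only. The sketch uses the first form of the accuracy formula, 1 – (modified + added) / (total number of mapping recommendations).

```python
def mapping_metrics(
    approved: int,
    modified: int,
    added: int,
    total_recommended: int,
    final_correct: int,
) -> dict[str, float]:
    """Compute recall, precision, and accuracy from review counts.

    approved          -- recommendations the user approved
    modified          -- recommendations the user changed
    added             -- mappings the user added manually
    total_recommended -- total number of mapping recommendations
    final_correct     -- final number of correct mappings
    """
    if total_recommended <= 0 or final_correct <= 0:
        raise ValueError("total_recommended and final_correct must be positive")
    return {
        "recall": approved / final_correct,
        "precision": approved / total_recommended,
        "accuracy": 1 - (modified + added) / total_recommended,
    }


if __name__ == "__main__":
    # Hypothetical counts for a single data stream.
    print(mapping_metrics(approved=8, modified=1, added=1,
                          total_recommended=10, final_correct=10))
```

With these sample counts, 8 of 10 recommendations are approved and 10 mappings are ultimately correct, so recall and precision are both 8/10, and accuracy is 1 – 2/10.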
Ethical Considerations
The model is based on metadata and doesn’t include demographic data. The headers referenced are standardized within the Salesforce Platform or customized per organization. We recommend not using specific values for gender, race, or other demographic data as your field label.
Caveats and Recommendations
One of the main challenges is understanding and modeling elements whose meaning varies with context. The semantic similarity of DLO and DMO fields depends on context, which can make a mapping ambiguous.
Customer Data Privacy
The Data 360 model complies with all relevant regulations and ensures customer privacy by encrypting data using OrgService. Only data scientists can access customer data, and only for debugging, validation, and verification purposes in a secured environment without any data extraction.

