Detect Sentiment Model Card
In CRM Analytics, the Detect Sentiment transformation uses the model to determine whether the sentiment of text is positive, negative, or neutral. Use this model card to better understand the model, how it’s trained, its capabilities, its intended use, and its limitations.
Get basic information about the Detect Sentiment model. This table provides details about the current model.
| Detail | Description |
|---|---|
| Model Developer | CRM Analytics |
| Model Date | August 12, 2020 |
| Model Version | 4 |
| Model Type | Sentiment Analysis |
| Model Input | Text (dimension) column |
| Model Output | For each field in a text (dimension) column of a Data Prep recipe, the model returns the sentiment: positive, negative, or neutral. The model returns null if the input is null and returns neutral if the input is an empty string (''). |
| License | Detect Sentiment is available to customers with any CRM Analytics user license. |
Minor changes to the model can occur at any time during a release. Major changes, when they occur, are communicated through the Release Notes.
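The null and empty-string rules in the Model Output row can be illustrated with a small sketch. This is not the product's implementation: `detect_sentiment` and `toy_classify` are hypothetical names, and the keyword-based classifier is only a stand-in for the real model.

```python
from typing import Callable, Optional

SENTIMENTS = ("positive", "negative", "neutral")


def detect_sentiment(text: Optional[str],
                     classify: Callable[[str], str]) -> Optional[str]:
    """Wrap a classifier with the documented null and empty-string rules."""
    if text is None:
        return None        # null input returns null
    if text == "":
        return "neutral"   # empty string returns neutral
    label = classify(text)
    if label not in SENTIMENTS:
        raise ValueError(f"unexpected label: {label!r}")
    return label


def toy_classify(text: str) -> str:
    """Hypothetical stand-in for the model, for illustration only."""
    return "positive" if "great" in text.lower() else "neutral"


# One value per row of a text (dimension) column.
column = ["Great service", "", None]
results = [detect_sentiment(value, toy_classify) for value in column]
```

The wrapper handles the two special cases before the classifier is called, so the classifier itself never sees a null or empty value.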
Model Training
We trained the Detect Sentiment model based on the GloVe dictionary and training data. To learn sentiments of text, we used the GloVe dictionary to convert text from the training data to word vectors, and then identified the sentiment based on the rating associated with each comment in the training data.
We used the following process to train the model.
- Gather the training data. The training data consists of product review comments and their ratings, like 4 stars out of 5.
- Preprocess the training data into the format required by GloVe.
- Run the training data through GloVe to convert it to vectors. We used the Wikipedia 2014 + English Gigaword Fifth Edition GloVe dictionary to convert each string to a vector.
- Pass the vectors and their associated ratings through a Multilayer Perceptron (MLP) neural network model to train the model to predict the sentiment of text.
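The first three steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not the actual training code: `TOY_VECTORS` holds made-up 3-dimensional values in place of the real GloVe dictionary, averaging word vectors is one common way to turn a string into a single vector (the source doesn't say which method was used), and the star thresholds in `rating_to_sentiment` are assumed. The final MLP training step is omitted.

```python
from typing import Dict, List

# Made-up 3-dimensional vectors standing in for the GloVe dictionary.
TOY_VECTORS: Dict[str, List[float]] = {
    "great": [0.9, 0.1, 0.0],
    "product": [0.1, 0.5, 0.2],
    "broken": [-0.8, 0.2, 0.1],
}
DIM = 3


def text_to_vector(text: str) -> List[float]:
    """Average the vectors of known words (assumed method); skip unknown words."""
    vectors = [TOY_VECTORS[w] for w in text.lower().split() if w in TOY_VECTORS]
    if not vectors:
        return [0.0] * DIM
    return [sum(component) / len(vectors) for component in zip(*vectors)]


def rating_to_sentiment(stars: int) -> str:
    """Assumed thresholds; the source says only that high ratings are positive."""
    if stars >= 4:
        return "positive"
    if stars <= 2:
        return "negative"
    return "neutral"


reviews = [("Great product", 5), ("Broken product", 1), ("Product arrived", 3)]
training_rows = [
    (text_to_vector(text), rating_to_sentiment(stars)) for text, stars in reviews
]
```

Each entry in `training_rows` pairs a feature vector with a sentiment label, which is the shape of input a classifier such as an MLP would be trained on.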
For more information about the GloVe dictionary, see this website.
The GloVe dictionary is made available under the Public Domain Dedication and License v1.0 whose full text can be found at: http://opendatacommons.org/licenses/pddl/1.0/.
Use Cases
Understand the primary intended and out-of-scope use cases associated with the Detect Sentiment model.
The primary use case is to detect the sentiment of unstructured text passed to the Detect Sentiment transformation in a Data Prep recipe. Unstructured text can include customer surveys, product and service feedback, and unstructured communication, such as chat messages, text messages, email correspondence, and social media posts.
These use cases are out of scope:
- The Detect Sentiment transformation can’t be invoked outside of Data Prep recipes.
- The input text can’t bypass Salesforce data centers. The data must be available in Salesforce to be evaluated by the Detect Sentiment transformation.
- Detect Sentiment doesn’t support non-English input text. It processes non-English text as English and ignores images, including emoji.
Factors
After considering the deep-learning and lexicon-based approaches for sentiment analysis, we decided to use a hybrid of both approaches to mitigate bias in the sentiment model.
We chose to train the model on publicly available, labeled product review datasets rather than other datasets because of their relevance to business use cases. Other publicly available, labeled datasets, such as movie reviews or social media posts, are generally not as relevant and can hamper accuracy. Because the sentiment model is trained on specific sets of data, the accuracy of the sentiment can vary based on how similar your data is to the training data. Also, because the training data currently is in English, the Detect Sentiment model supports only English text at this time.
To learn how well the model works with different domains of data, check out the accuracies and F1 scores. When we evaluated the model on product reviews and customer service survey responses, it achieved sentiment accuracy between 69.09% and 83.12% and F1 scores between 0.34 and 0.9. When we performed bias testing, the model achieved 92.31% accuracy and an F1 score of 0.9. The results vary because each data domain can have unique terms that weren’t used to train the Detect Sentiment model, so the model’s accuracy depends on the domain of the data.
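For readers unfamiliar with the two metrics, here is a minimal sketch of how accuracy and a per-label F1 score are computed from actual and predicted sentiments. The sample labels are invented, and the source doesn't state whether its F1 scores are per label or averaged, so the per-label form shown here is an assumption.

```python
from typing import List


def accuracy(actual: List[str], predicted: List[str]) -> float:
    """Fraction of rows where the predicted sentiment matches the actual one."""
    matches = sum(a == p for a, p in zip(actual, predicted))
    return matches / len(actual)


def f1_for_label(actual: List[str], predicted: List[str], label: str) -> float:
    """Harmonic mean of precision and recall for one sentiment label."""
    pairs = list(zip(actual, predicted))
    tp = sum(a == label and p == label for a, p in pairs)
    fp = sum(a != label and p == label for a, p in pairs)
    fn = sum(a == label and p != label for a, p in pairs)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Invented labels, for illustration only.
actual = ["positive", "positive", "negative", "neutral"]
predicted = ["positive", "negative", "negative", "neutral"]

overall_accuracy = accuracy(actual, predicted)
positive_f1 = f1_for_label(actual, predicted, "positive")
```

Accuracy counts every row equally, while F1 is sensitive to how well a single label is handled, which is one reason the two numbers can diverge across data domains.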
The model is trained using 80% of openly available product review data, which consists of about 100,000 ratings and reviews. The sentiments of the training data are determined based on the ratings. For example, highly rated comments are considered positive. To evaluate the model, we compared the generated sentiments of the remaining 20% of the product reviews against their actual ratings. We also evaluated the model using proprietary survey responses.
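The 80/20 split described above can be sketched as follows. The function name `split_train_test`, the shuffling step, and the fixed seed are assumptions for illustration; the source doesn't say how the rows were assigned to each portion.

```python
import random
from typing import List, Sequence, Tuple

Review = Tuple[str, int]  # (comment text, star rating)


def split_train_test(rows: Sequence[Review],
                     train_fraction: float = 0.8,
                     seed: int = 0) -> Tuple[List[Review], List[Review]]:
    """Shuffle the rows reproducibly, then cut them into two portions."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    cutoff = int(len(shuffled) * train_fraction)
    return shuffled[:cutoff], shuffled[cutoff:]


# Invented reviews, for illustration only.
reviews = [(f"review {i}", 1 + i % 5) for i in range(10)]
train_rows, test_rows = split_train_test(reviews)
```

The held-out portion is never shown to the model during training, so comparing its predicted sentiments against the actual ratings gives an estimate of how the model behaves on unseen reviews.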
Ethical Considerations
We have taken additional measures and precautions to correct bias against terms representing gender identity, sexual orientation, religious affiliation, and race. Twelve identity terms were included in this mitigation (atheist, queer, gay, lesbian, homosexual, feminist, black, white, heterosexual, islam, muslim, and bisexual). Using synthetic data, we trained the model to always predict neutral sentiments for these terms. Although text surrounding these identity terms can change the sentiment, these terms alone have a neutral sentiment. With the introduction of mostly handcrafted, high-quality data, we were able to reduce bias for the specified terms to a great extent without harming model performance. We recognize that the list of identity terms included in this mitigation is not exhaustive, and that future work must be undertaken to expand the list to be more inclusive of a broader range of stakeholders.
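As a rough illustration of this mitigation, the sketch below builds neutral-labeled rows for the identity terms listed above and checks a classifier against them. It is not the actual mitigation code: the function names are hypothetical, `toy_classify` is a stub, and the real synthetic data was mostly handcrafted rather than generated this simply.

```python
from typing import Callable, List, Tuple

# Identity terms taken from the list in the text above.
IDENTITY_TERMS = [
    "atheist", "queer", "gay", "lesbian", "homosexual", "feminist",
    "black", "white", "heterosexual", "islam", "muslim", "bisexual",
]


def synthetic_neutral_rows(terms: List[str]) -> List[Tuple[str, str]]:
    """Pair each term on its own with the neutral label."""
    return [(term, "neutral") for term in terms]


def non_neutral_terms(classify: Callable[[str], str],
                      terms: List[str]) -> List[str]:
    """Return the terms that a classifier does not label as neutral."""
    return [term for term in terms if classify(term) != "neutral"]


def toy_classify(text: str) -> str:
    """Hypothetical stub that stands in for the trained model."""
    return "neutral"


rows = synthetic_neutral_rows(IDENTITY_TERMS)
failures = non_neutral_terms(toy_classify, IDENTITY_TERMS)
```

An empty `failures` list means every term on its own was labeled neutral, which is the behavior the mitigation aims for.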
The model typically generates the correct sentiments, but the results can vary based on the data. For example, the model doesn’t accurately process new language, slang, or sarcasm. It also returns unexpected results for text without sentiment, such as IDs, nouns, addresses, and alphanumeric values. Because the accuracy can vary, consider making decisions based on trends in sentiment scores over time rather than on scores at a particular moment in time.
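One way to look at trends rather than individual results is to aggregate sentiments by time period. The sketch below computes the share of positive sentiments per month. The period format, the sample rows, and the function name are assumptions for illustration, and in CRM Analytics you would more likely build this aggregation in a recipe or lens.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, Tuple


def positive_share_by_period(rows: Iterable[Tuple[str, str]]) -> Dict[str, float]:
    """Map each period to the fraction of its rows labeled positive."""
    counts: Dict[str, Counter] = defaultdict(Counter)
    for period, sentiment in rows:
        counts[period][sentiment] += 1
    return {
        period: tally["positive"] / sum(tally.values())
        for period, tally in sorted(counts.items())
    }


# Invented (period, sentiment) rows, for illustration only.
rows = [
    ("2024-01", "positive"), ("2024-01", "negative"),
    ("2024-02", "positive"), ("2024-02", "positive"), ("2024-02", "neutral"),
]
trend = positive_share_by_period(rows)
```

A single misclassified comment barely moves a monthly share, which is why a trend is more robust than any one row's sentiment.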