          Cross-Validation Tab for Binary Classification Use Cases


          To test a model’s ability to make predictions, Einstein Discovery uses k-fold cross-validation, a process that reduces sampling bias when validating a model. This tab summarizes the results of the cross-validation process for this model, as well as some of the underlying computational details.

          Note: Einstein Discovery stories are now models. We wish we could snap our fingers to update the name everywhere, but you can expect to see the previous name in a few places until we replace it.

          Navigate to the Cross Validation Tab

          In Performance, click Model Evaluation, then Cross Validation.

          Cross Validation tab, showing 4-fold cross-validation results

          Columns on the Cross Validation Tab

          The following table describes the columns in the Cross Validation tab.

          Column Name       Description
          Metric Name       Name of the metric.
          Training Set      Metrics for the set of observations in the CRM Analytics dataset that Einstein Discovery uses to train the model.
          Validation Set    Metrics for the set of observations in the CRM Analytics dataset that Einstein Discovery uses to validate the predictions generated by the trained model.
          Fold #1           Metrics for the first fold.
          Fold #2           Metrics for the second fold.
          Fold #3           Metrics for the third fold.
          Fold #4           Metrics for the fourth fold.

          Metrics on the Cross Validation Tab

          The following table describes the metrics in the Cross Validation tab.

          Metric Name       Description
          Number of Rows

          Number of observations. The meaning of the value depends on the column.

          • For the Training Set and Validation Set columns, the numbers are the same. This value represents the total number of observations in the entire dataset used in the creation of the model.
          • For the Fold #1 through Fold #4 columns, this value represents how many observations fell in that fold (approximately 25% of the entire dataset).
          AUC

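          Area Under the Curve, where the curve is the receiver operating characteristic (ROC) curve. Represents how well the model separates the two outcome classes: the probability that a randomly chosen positive observation receives a higher predicted score than a randomly chosen negative one. AUC is a common summary metric for classification use cases because it does not depend on a single classification threshold.

          Range:

          • 0.5 means that the model does no better than random guessing.
          • 1.0 means that the model ranks every positive observation above every negative one (which is suspicious).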
          GINI

          Gini index. Quantifies how closely the model's performance approaches that of a theoretically perfect model.

          Range:

          • 0 means that the model is no better than random.
          • 1 means that the predictions match observations exactly (view with skepticism).
          Log Loss

          Logarithmic Loss. Measures how closely the predicted probabilities match the actual outcomes. Values start at 0, which represents a perfect model (every predicted probability matches the actual observation), and have no upper bound. The further the predicted probabilities are from the actual observations (lower performance), the higher the log loss. Log loss accounts for the confidence of each prediction: a confident but wrong prediction is penalized more heavily than a hesitant one.
          Mean Per Class Error

          Measures how often the predictions are wrong, computed as the error rate within each outcome class and then averaged across the classes. Lower values mean the predictions are wrong less often, and therefore the model is better at making predictions.
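
To make the metric descriptions concrete, here is a minimal Python sketch that computes each one from a handful of made-up predictions. It is illustrative only and is not Einstein Discovery's implementation. Three points are assumptions rather than facts from this article: AUC is treated as the area under the ROC curve, Gini is derived with the conventional relationship Gini = 2 × AUC − 1, and mean per class error uses a classification threshold of 0.5. The function names and sample data are hypothetical.

```python
import math

def auc(y_true, p_pred):
    """Share of (positive, negative) pairs where the positive scores higher.

    Ties count as half a win. This equals the area under the ROC curve.
    """
    pos = [p for y, p in zip(y_true, p_pred) if y == 1]
    neg = [p for y, p in zip(y_true, p_pred) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def gini(y_true, p_pred):
    """Assumed conventional relationship: Gini = 2 * AUC - 1."""
    return 2 * auc(y_true, p_pred) - 1

def log_loss(y_true, p_pred, eps=1e-15):
    """Average negative log-likelihood of the actual outcomes."""
    total = 0.0
    for y, p in zip(y_true, p_pred):
        p = min(max(p, eps), 1 - eps)  # clip to avoid log(0)
        total -= y * math.log(p) + (1 - y) * math.log(1 - p)
    return total / len(y_true)

def mean_per_class_error(y_true, p_pred, threshold=0.5):
    """Error rate within each class, averaged over the two classes.

    Assumes both classes are present; the 0.5 threshold is an assumption.
    """
    errors = []
    for cls in (0, 1):
        rows = [(y, p) for y, p in zip(y_true, p_pred) if y == cls]
        wrong = sum(1 for y, p in rows if (p >= threshold) != (y == 1))
        errors.append(wrong / len(rows))
    return sum(errors) / len(errors)

# Hypothetical observations: actual outcome and predicted probability.
y_true = [1, 1, 1, 0, 0, 0]
p_pred = [0.9, 0.8, 0.4, 0.6, 0.2, 0.1]

print("AUC:", auc(y_true, p_pred))
print("GINI:", gini(y_true, p_pred))
print("Log Loss:", log_loss(y_true, p_pred))
print("Mean Per Class Error:", mean_per_class_error(y_true, p_pred))
```

In this sample, eight of the nine positive/negative pairs are ranked correctly, and one observation in each class falls on the wrong side of the threshold. A single confident miss, such as predicting 0.01 for an observation that turned out positive, yields a log loss well above 1 on its own.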

          Model Validation Methodology

          Einstein Discovery conducts k-fold cross-validation (k=4) on your model. This process involves the following steps:

          • Randomly divide all the observations in the CRM Analytics dataset into four separate partitions of approximately equal size.
          • Conduct four test passes (folds) in which three of the partitions serve as the training set and one partition serves as the validation set.
            Note: After completing the four test passes, each partition has served one time as the validation set and three times as part of the training set.
          • For each fold, compile model metrics.
          • Average the metrics from the four folds to produce an overall score.
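
The steps above can be sketched in a few lines of Python. This is a simplified illustration of k-fold cross-validation with k=4, not Einstein Discovery's actual code. The helper names (`k_fold_indices`, `cross_validate`, `fit_majority`, `accuracy`), the fixed random seed, the toy majority-class "model", and the sample rows are all hypothetical and chosen only to keep the example self-contained.

```python
import random

def k_fold_indices(n_rows, k=4, seed=0):
    """Shuffle row indices and deal them into k near-equal partitions."""
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]

def cross_validate(rows, fit, score, k=4, seed=0):
    """Run k passes; each partition is held out for validation exactly once."""
    folds = k_fold_indices(len(rows), k, seed)
    fold_scores = []
    for i, held_out in enumerate(folds):
        # The other k-1 partitions form the training set for this pass.
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        model = fit([rows[j] for j in train_idx])
        fold_scores.append(score(model, [rows[j] for j in held_out]))
    # The overall score is the average of the per-fold scores.
    return fold_scores, sum(fold_scores) / k

def fit_majority(train_rows):
    """Toy stand-in for model training: always predict the most common label."""
    labels = [label for _, label in train_rows]
    return max(set(labels), key=labels.count)

def accuracy(model, validation_rows):
    """Toy stand-in for a metric: share of held-out rows predicted correctly."""
    return sum(1 for _, label in validation_rows if label == model) / len(validation_rows)

# Hypothetical dataset of (feature, label) rows.
rows = [(i, 1 if i % 3 == 0 else 0) for i in range(20)]
fold_scores, overall = cross_validate(rows, fit_majority, accuracy)
print("Per-fold scores:", fold_scores)
print("Overall score:", overall)
```

With 20 rows, each partition holds 5 observations, which matches the roughly 25% per fold shown in the Number of Rows metric. When the row count is not divisible by four, the partitions differ in size by at most one row.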
           