Cross Validation Tab for Numeric Use Cases

To test a model’s ability to make predictions, Einstein Discovery uses k-fold cross-validation, a process that reduces sampling bias when validating a model. This tab summarizes the results of the cross-validation process for this model, as well as some of the underlying computational details.

Note: Einstein Discovery stories are now models. We wish we could snap our fingers to update the name everywhere, but you can expect to see the previous name in a few places until we replace it.

Navigate to the Cross Validation Tab

In Performance, click Model Evaluation, then Cross Validation.

[Figure: Cross Validation tab, showing 4-fold cross-validation results]

Columns on the Cross Validation Tab

Column Name	Description
Metric Name	Name of the metric.
Training Set	Metrics for the set of observations in the CRM Analytics dataset that Einstein Discovery used to train the model.
Validation Set	Metrics for the set of observations in the CRM Analytics dataset that Einstein Discovery used to validate the predictions generated by the trained model.
Fold #1	Metrics for the first fold.
Fold #2	Metrics for the second fold.
Fold #3	Metrics for the third fold.
Fold #4	Metrics for the fourth fold.
Average	Average of the metrics across the four folds.

Metrics on the Cross Validation Tab

The Cross Validation tab shows the following metrics for a model.

Metric Name	Description
Number of rows	Number of observations. The meaning of this value varies by column. For the Training Set and Validation Set columns, the value is the same: the total number of observations in the entire dataset used to create the model. For the Fold #1 through Fold #4 columns, the value is the number of observations that fell in that fold (approximately 25% of the entire dataset). For the Average column, the value is the average across the folds, which is the number of observations in the dataset divided by 4.
R²	Measures the model's ability to explain variation in the outcome, which indicates how predictive the model is. Range: 0 means that the model explains none of the variability in the outcome (random chance). 1 means that the model explains all of the variability (perfect model).
MSE	Mean Squared Error. Measures the average squared error of the model’s predictions. MSE computes the squared difference between each observed (actual) outcome and its predicted value, and then averages those squared differences.
RMSE	Root Mean Squared Error. Represents the square root of MSE. RMSE measures the difference between the values predicted by the model and the observed (actual) values. You can think of this value as the “standard deviation of errors.”
MAE	Mean Absolute Error. Measures the average absolute difference between the actual values and the predictions. All differences are weighted equally in this average, which makes MAE less sensitive to outliers than MSE.
RMSLE	Root-Mean-Squared Logarithmic Error. Compare with RMSE: RMSLE is less sensitive to outliers than RMSE. RMSE penalizes only for the overall scale of the error, while RMSLE penalizes for the scale of the error relative to the actual value. RMSLE penalizes under-predictions more than over-predictions. This factor makes RMSLE more useful than RMSE when you want to err on the side of over-estimating. For example, to optimize stock levels of non-perishables, you prefer a surplus over potentially running out of stock.
Residual Deviance	Measures how well the developed model performs on your dataset. As with Null Deviance, lower values mean better fit. Thus, if Residual Deviance is low, the model fits the data well. The larger the gap between Residual Deviance and Null Deviance, the better the model is doing at picking up the complexities in the data.
Mean Residual Deviance	Measures how well the model performs on your dataset. Lower values indicate a better fit.
Null Deviance	Measures how well a simple model performs on your dataset. Lower values mean a better fit. Thus, if Null Deviance is low, your data is not particularly complex.
Null Degrees of Freedom	Degrees of freedom of the chi-square distribution associated with Null Deviance.
Residual Degrees of Freedom	Degrees of freedom of the chi-square distribution associated with Residual Deviance.
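
To make the error metrics in the table concrete, here is a minimal Python sketch that computes R², MSE, RMSE, MAE, and RMSLE from actual and predicted values using their standard textbook definitions. The function name `regression_metrics` and the sample numbers are hypothetical and chosen only for illustration. This is not Einstein Discovery's implementation, and the product may compute or round these values differently.

```python
import math


def regression_metrics(actual, predicted):
    """Standard textbook definitions of common regression metrics.

    Assumptions: both lists have the same non-zero length, the actual
    values are not all identical (needed for R²), and all values are
    non-negative (needed for RMSLE).
    """
    n = len(actual)
    errors = [a - p for a, p in zip(actual, predicted)]

    # MSE: average of the squared differences.
    mse = sum(e * e for e in errors) / n
    # RMSE: square root of MSE, in the same units as the outcome.
    rmse = math.sqrt(mse)
    # MAE: average of the absolute differences, each weighted equally.
    mae = sum(abs(e) for e in errors) / n

    # R²: 1 minus (residual sum of squares / total sum of squares).
    mean_actual = sum(actual) / n
    ss_total = sum((a - mean_actual) ** 2 for a in actual)
    ss_residual = sum(e * e for e in errors)
    r2 = 1.0 - ss_residual / ss_total

    # RMSLE: RMSE computed on log(1 + value), so errors are judged
    # relative to the size of the actual value.
    squared_log_errors = [
        (math.log1p(p) - math.log1p(a)) ** 2
        for a, p in zip(actual, predicted)
    ]
    rmsle = math.sqrt(sum(squared_log_errors) / n)

    return {"R2": r2, "MSE": mse, "RMSE": rmse, "MAE": mae, "RMSLE": rmsle}


# Hypothetical sample data.
metrics = regression_metrics([10, 20, 30, 40], [12, 18, 33, 37])
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")

# Same absolute error (50), but the under-prediction gets a larger RMSLE.
under = regression_metrics([100, 200], [50, 150])["RMSLE"]
over = regression_metrics([100, 200], [150, 250])["RMSLE"]
print(f"RMSLE under-predicting: {under:.4f}, over-predicting: {over:.4f}")
```

The last two lines illustrate the asymmetry described for RMSLE: with the same absolute error, predicting too low produces a larger RMSLE than predicting too high.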

Model Validation Methodology

Einstein Discovery conducts k-fold cross-validation (k=4) on your model. This process involves the following steps:

Randomly divide all the observations in the CRM Analytics dataset into four separate partitions of equal size.
Conduct four test passes (folds) in which three of the partitions serve as the training set and the remaining partition serves as the validation set.
Note: After completing the four test passes, each partition has served one time as the validation set and three times as part of the training set.
For each fold, compile model metrics.
Take the average of the four folds for an overall score.
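
The steps above can be sketched in a few lines of Python. This is an illustrative outline of generic 4-fold cross-validation, not Einstein Discovery's actual code. The helper names (`k_fold_indices`, `cross_validate`), the fixed random seed, the toy mean-predictor model, and the sample rows are all assumptions made for the example.

```python
import random


def k_fold_indices(n_rows, k=4, seed=0):
    """Randomly split row indices 0..n_rows-1 into k near-equal partitions."""
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)
    # Taking every k-th index keeps partition sizes within one row of each other.
    return [indices[i::k] for i in range(k)]


def cross_validate(rows, fit, score, k=4, seed=0):
    """Run k passes; each partition serves once as the validation set."""
    folds = k_fold_indices(len(rows), k, seed)
    fold_scores = []
    for i, held_out in enumerate(folds):
        # The other k-1 partitions together form the training set.
        train = [rows[j] for f, fold in enumerate(folds) if f != i for j in fold]
        validation = [rows[j] for j in held_out]
        model = fit(train)
        fold_scores.append(score(model, validation))
    # Overall score: the average of the per-fold scores.
    return fold_scores, sum(fold_scores) / k


# Toy "model" (hypothetical): always predict the mean outcome of the training rows.
def fit_mean(train_rows):
    return sum(y for _, y in train_rows) / len(train_rows)


# Score a fold with MSE on the held-out rows.
def mse_score(model, validation_rows):
    return sum((y - model) ** 2 for _, y in validation_rows) / len(validation_rows)


rows = [(x, 2.0 * x) for x in range(12)]  # hypothetical (feature, outcome) pairs
scores, average = cross_validate(rows, fit_mean, mse_score)
print("Per-fold MSE:", [round(s, 2) for s in scores])
print("Average MSE:", round(average, 2))
```

With 12 rows and k=4, each fold holds 3 rows, which mirrors the "approximately 25% of the entire dataset" figure given for the fold columns.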
