Kraken uses both cross validation and automatic holdout data when training your model to provide a reliable estimate of "real world" performance.
What is cross validation?
One of the biggest challenges in predictive analytics is knowing how a trained model will perform on data it has never seen before or - put another way - how well the model has learned true patterns versus simply memorizing the training data. There are several ways to approach this problem; cross validation is one of the most common. Cross validation takes a dataset and randomly sorts it into multiple buckets called "folds". Kraken uses five folds - a practice known as "five-fold cross validation" - for binary, multi-class, regression, and time series models.
Let's look at an example of a binary classification model. Imagine splitting a dataset into five pieces at random. Cross validation then tests each fold against a model trained on the other four folds. In other words, each trained model is tested on a piece of the data it has never seen before; this is done in rotation for all five segments of the data. The outcome of cross validation is a set of test metrics that gives a reasonable forecast of how a model trained on all of the data will perform when predicting on a record it has never seen before.
Computers are exceptionally good at helping a mathematical equation memorize data so that the equation fits the historical data perfectly; cross validation is an effective technique to make sure that the model isn't just memorizing, but is actually learning generalized patterns. The performance measure reported by five-fold cross validation is the average of the values computed across the five folds. This approach can be computationally expensive, but it does not waste much data, which is a major advantage in problems such as inverse inference where the number of samples is very small.
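If you'd like to see the mechanics spelled out, here is a minimal sketch of five-fold cross validation using scikit-learn. It illustrates the general technique described above, not Kraken's internal implementation; the dataset and model are placeholders.

```python
# Minimal sketch of five-fold cross validation with scikit-learn.
# Illustrative only; this is not Kraken's internal implementation.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Toy binary classification dataset standing in for your training data.
X, y = make_classification(n_samples=1000, n_features=20, random_state=42)

model = LogisticRegression(max_iter=1000)

# Split the data into five folds; train on four, test on the held-out fold,
# and rotate so that every fold is used for testing exactly once.
fold_scores = cross_val_score(model, X, y, cv=5, scoring="accuracy")

print("Per-fold accuracy:", fold_scores)
# The reported cross validation score is the average across the five folds.
print("Mean cross validation accuracy:", fold_scores.mean())
```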
Until late November 2020, cross validation was the sole method that Kraken used to generate the model metrics during training, which reflect the overall accuracy of a model and are often referred to as "model score" or "model accuracy".
What is automatic holdout?
Kraken now includes an additional method to better evaluate the "real world" performance of binary, multi-class, and regression models: automatic holdout. Before Kraken begins the model training process, it extracts 20% of your dataset to be used for final model evaluation. (For binary and multi-class models, Kraken uses random stratified sampling based on the target column.) Kraken "holds" that data until after model training, at which time it is used to evaluate the performance of the trained model. The benefit of the holdout data is that it's never seen by the model during training (unlike cross validation data), so it's ideal for validating the trained model's performance.
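To make the idea concrete, here is a small sketch of how a 20% stratified holdout split can be taken before training, using scikit-learn. The proportions mirror the description above, but the code is an illustrative assumption, not Kraken's internal implementation.

```python
# Sketch of a 20% holdout split with random stratified sampling on the target.
# Illustrative only; this is not Kraken's internal implementation.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=1000, n_features=20, random_state=42)

# Set aside 20% of the data before training begins. For a classification
# target, stratify=y keeps the class proportions the same in both pieces.
X_train, X_holdout, y_train, y_holdout = train_test_split(
    X, y, test_size=0.20, stratify=y, random_state=42
)

# X_train / y_train go through training (including cross validation);
# X_holdout / y_holdout are only touched after training, for final evaluation.
print("Training rows:", len(X_train), "| Holdout rows:", len(X_holdout))
```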
Where can I access the metrics generated from cross validation and automatic holdout?
From the Analysis overview screen, you can access Kraken model metrics by clicking on the Driver Influence card, the Fit card or the Correlations card and then clicking the Model Metrics button with the yellow submarine icon.
It's important to note that metrics displayed on the model metrics overview screen are based on the automatic holdout data and NOT on the cross validation training data.
From there, click on an algorithm/model name to get to the details. You will see two rows of metrics across the top, known as comparative model metrics: holdout metrics above, and cross validation training metrics below. Both sets of metrics are presented so you can compare the two.
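As a rough illustration of why both numbers are shown, the sketch below computes a cross validation score on the training portion and a separate score on the holdout portion so the two can be compared side by side. Again, this is a generic scikit-learn example, not Kraken's code.

```python
# Sketch of comparative metrics: a cross validation score on the training data
# versus a score on held-out data the model never saw during training.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score, train_test_split

X, y = make_classification(n_samples=1000, n_features=20, random_state=0)
X_train, X_holdout, y_train, y_holdout = train_test_split(
    X, y, test_size=0.20, stratify=y, random_state=0
)

model = LogisticRegression(max_iter=1000)

# Cross validation metric: averaged over five folds of the training data.
cv_accuracy = cross_val_score(model, X_train, y_train, cv=5).mean()

# Holdout metric: train on all of the training data, then score on the
# holdout set, which was never seen during training.
model.fit(X_train, y_train)
holdout_accuracy = model.score(X_holdout, y_holdout)

print(f"Cross validation accuracy: {cv_accuracy:.3f}")
print(f"Holdout accuracy:          {holdout_accuracy:.3f}")
```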
Do I need to do anything to use automatic holdout?
Nope! It’s enabled for all new binary, multi-class and regression models that you build in Kraken and everything happens, well, automatically.
What about models I built before automatic holdout was available in Kraken?
If your model was built before November 25, 2020, it is not utilizing automatic holdout. The model metrics displayed are based on the cross validation training data (since holdout metrics don't exist for your model). Additionally, if you click on an algorithm to dive into the model metrics details, you will see only one row of metrics across the top. At the top of the page, you will see a notification that your model requires retraining if you want to access the comparative model metrics that include automatic holdout metrics.
How do I retrain an existing model to use automatic holdout?
Step 1: From the analysis overview screen, click the "REFINE" option to the right of your analysis.
Step 2: Click the pink "ANALYZE" button in the upper right. (You do not need to add or modify any Data Pipeline steps before you analyze to retrain, but you're free to do so.)
Step 3: (Optional, but recommended) Verify that your model is now using automatic holdout by clicking into Model Metrics for your analysis, then clicking on an algorithm/model name to see the details. You will see two rows of model metrics across the top.
How do I get Kraken to apply predictions using my retrained model with automatic holdout enabled?
Step 1: From the analysis overview screen, click the "DEPLOY" option to the right of your analysis.
Step 2: Select the "Publish" action for the new version of your analysis.
Does automatic holdout work with Kraken’s automatic Hyperparameter Optimization (HPO) option?
Absolutely! If your company has purchased Kraken’s automatic Hyperparameter Optimization (HPO) option and you create a model with HPO enabled, Kraken will use both the optimization of hyperparameter values and automatic holdout for the most effective, accurate model results available from Kraken.
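For readers curious how hyperparameter optimization and a holdout set fit together in general, the sketch below runs a cross-validated grid search on the training data and then evaluates the best model on the untouched holdout data. It is a generic scikit-learn illustration of the combined workflow, not Kraken's HPO implementation, and the parameter grid is an arbitrary example.

```python
# Sketch of combining hyperparameter optimization (cross-validated grid search)
# with a holdout set for final evaluation. Illustrative only; not Kraken's HPO.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, train_test_split

X, y = make_classification(n_samples=1000, n_features=20, random_state=0)
X_train, X_holdout, y_train, y_holdout = train_test_split(
    X, y, test_size=0.20, stratify=y, random_state=0
)

# Search over candidate hyperparameter values using five-fold cross validation
# on the training data only; the holdout data is never touched by the search.
search = GridSearchCV(
    LogisticRegression(max_iter=1000),
    param_grid={"C": [0.01, 0.1, 1.0, 10.0]},
    cv=5,
)
search.fit(X_train, y_train)

# Evaluate the best model found by the search on the untouched holdout data.
print("Best hyperparameters:", search.best_params_)
print("Holdout accuracy:", search.score(X_holdout, y_holdout))
```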