As we roll into August, it's a perfect time to release some cool, fresh Kraken goodness. For enhanced model training insights, we've introduced a Feature Importance card group that includes the newly renamed Permutation Importance chart (formerly known as Driver Influence) as well as an all-new SHAP Importance chart. We've also made it easier than ever to create predictions from your training data, and Kraken now provides a lot more detail on how your training data will be processed - and what algorithms will be used - during Analysis creation.
New Feature Importance card group
After running an Analysis, you now have access to a new Feature Importance card group that provides additional insight beyond what the former Driver Influence card provided. The charts are accessible both in thumbnail view on the Project Overview screen and in full screen view, which you open by clicking the appropriate card name:
- Permutation Importance (formerly known as Driver Influence), which indicates how much a feature impacts the prediction. It is a measure of how sensitive the prediction is to changes in the value of that feature. The higher the sensitivity, the greater the impact.
- SHAP Importance, which is an aggregation of SHAP values (formerly known as Prediction Influencers) on your training data. A SHAP value represents how a feature influences the prediction of a single row relative to the other features in that row and to the average outcome in the dataset. When aggregated, SHAP Importance provides a general indication of relative influence among the features in the training dataset.
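To make the idea of "sensitivity to changes in a feature" concrete, here is a minimal sketch of the general permutation importance technique in plain Python. It is not Kraken's implementation, which these notes do not describe. The `model` function, its coefficients, and the synthetic data are all hypothetical and exist only for illustration. The sketch shuffles one feature column at a time and measures how much the prediction error grows.

```python
import random

def model(row):
    # Hypothetical fitted model (an assumption for this sketch):
    # strong dependence on feature 0, weak on feature 1, none on feature 2.
    return 3.0 * row[0] + 0.5 * row[1] + 0.0 * row[2]

def mean_squared_error(rows, targets):
    return sum((model(r) - t) ** 2 for r, t in zip(rows, targets)) / len(rows)

def permutation_importance(rows, targets, n_repeats=5, seed=0):
    """Average increase in error after shuffling each feature column."""
    rng = random.Random(seed)
    baseline = mean_squared_error(rows, targets)
    importances = []
    for col in range(len(rows[0])):
        increases = []
        for _ in range(n_repeats):
            # Shuffle only this column, breaking its link to the target.
            shuffled = [r[col] for r in rows]
            rng.shuffle(shuffled)
            permuted = [r[:col] + [v] + r[col + 1:] for r, v in zip(rows, shuffled)]
            increases.append(mean_squared_error(permuted, targets) - baseline)
        importances.append(sum(increases) / n_repeats)
    return importances

# Synthetic data: three features drawn uniformly from [0, 1].
data_rng = random.Random(42)
rows = [[data_rng.uniform(0, 1) for _ in range(3)] for _ in range(200)]
targets = [model(r) for r in rows]

scores = permutation_importance(rows, targets)
for name, score in zip(["feature_0", "feature_1", "feature_2"], scores):
    print(f"{name}: {score:.4f}")
```

In this toy setup, the feature the model leans on most gets the largest score, and the feature the model ignores scores zero, which matches the "higher sensitivity, greater impact" reading of the chart.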
NOTE: SHAP Importance is calculated when you create a new Analysis or refine an existing Analysis. For your current Analyses, SHAP Importance will not be available until you refine/retrain your Analysis. Kraken will visually indicate that you need to retrain your model (by refining your Analysis) so that SHAP Importance can be generated on your new Analysis version.
TIP: After you retrain an existing Analysis to generate SHAP Importance values and then leave that Analysis, the next time you access the Analysis you may not see the SHAP Importance values. Ensure that you select and publish a version of your Analysis for which SHAP Importance values were created (it will be the most recent version if you only retrained once) via the DEPLOY tab from the Project overview screen.
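The aggregation step behind SHAP Importance can be sketched as follows. This is an illustration under stated assumptions, not Kraken's method: it uses a hypothetical linear model (`WEIGHTS`), for which the SHAP value of a feature in a row is the coefficient times the feature's deviation from its dataset mean (assuming independent features), and it aggregates with the mean absolute SHAP value, which is one common choice. These notes do not say which aggregation Kraken uses.

```python
WEIGHTS = [3.0, 0.5, 0.0]  # hypothetical linear model coefficients (assumption)

def predict(row):
    return sum(w * x for w, x in zip(WEIGHTS, row))

def linear_shap_values(rows):
    """Per-row SHAP values for a linear model with independent features:
    weight * (feature value - feature mean over the dataset)."""
    n = len(rows)
    means = [sum(r[j] for r in rows) / n for j in range(len(WEIGHTS))]
    return [[w * (x - m) for w, x, m in zip(WEIGHTS, r, means)] for r in rows]

def shap_importance(shap_rows):
    """Aggregate per-row SHAP values into one score per feature
    using the mean absolute value (an assumed aggregation)."""
    n = len(shap_rows)
    return [sum(abs(r[j]) for r in shap_rows) / n for j in range(len(shap_rows[0]))]

# Tiny made-up training set with three features.
rows = [
    [1.0, 10.0, 5.0],
    [2.0, 20.0, 1.0],
    [3.0, 30.0, 9.0],
    [4.0, 40.0, 2.0],
]

shap_rows = linear_shap_values(rows)
importance = shap_importance(shap_rows)
average_prediction = sum(predict(r) for r in rows) / len(rows)

print(importance)
```

Each row's SHAP values sum to the difference between that row's prediction and the average prediction, which is the "relative to the average outcome in the dataset" property described above. Taking absolute values before averaging keeps positive and negative influences from cancelling out.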
One-click predictions from training data
After running an Analysis on a dataset with no nulls in the target column, you can now create predictions with a single click using the same dataset you used for training. Kraken will simply ignore the values in the target column, treating your training data as an apply dataset.
Additional model training data and Pipeline insights
Kraken now provides additional information as you're preparing your Analysis:
- Kraken will visually indicate when a feature is using Impact encoding (which occurs when there are 14 or more values in your categorical feature).
- Kraken will visually indicate how it is imputing null values (on columns with less than 50% nulls).
- Kraken will visually indicate if one or more algorithms will be excluded from your Analysis, along with the reason for the exclusion.
- In the Preprocessing Steps section of the Pipeline, Kraken now provides informational icons that, when hovered over, provide more detail on each preprocessing step.
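The two thresholds mentioned above (14 or more categorical values, less than 50% nulls) can be sketched as simple column checks, along with a basic form of impact encoding. This is only an illustration: the function names are hypothetical, nulls are assumed not to count as a category value, and the encoding shown is the plain per-category mean of the target, which may differ from what Kraken computes.

```python
IMPACT_ENCODING_MIN_VALUES = 14      # from the notes: 14 or more distinct values
IMPUTATION_MAX_NULL_FRACTION = 0.5   # from the notes: less than 50% nulls

def uses_impact_encoding(values):
    # Assumption: nulls (None) are not counted as a category value.
    distinct = {v for v in values if v is not None}
    return len(distinct) >= IMPACT_ENCODING_MIN_VALUES

def null_fraction(values):
    return sum(1 for v in values if v is None) / len(values)

def qualifies_for_imputation(values):
    # A column needs at least one null, and fewer than half of its rows null.
    return 0 < null_fraction(values) < IMPUTATION_MAX_NULL_FRACTION

def impact_encode(categories, targets):
    """Replace each category with the mean target value observed for it.
    Plain mean with no smoothing; a simplification for illustration."""
    totals, counts = {}, {}
    for c, t in zip(categories, targets):
        totals[c] = totals.get(c, 0.0) + t
        counts[c] = counts.get(c, 0) + 1
    means = {c: totals[c] / counts[c] for c in totals}
    return [means[c] for c in categories]

encoded = impact_encode(["a", "b", "a", "b"], [1, 0, 0, 0])
print(encoded)
print(uses_impact_encoding([f"store_{i}" for i in range(14)]))
print(qualifies_for_imputation([1, None, 3, 4]))
```

Impact encoding is useful for high-cardinality features because it produces one numeric column instead of one column per category.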
Easier access to creating additional Analyses and Predictions, and sharing options
We have moved the "+ NEW" button to the top of the page for better visibility, and we revamped the available options in the "three dot" menu.
New visual Prediction Sync indicator
There is a new visual Prediction Sync indicator in the Prediction section of your Analysis that is green when Prediction Sync is enabled and grey when it is disabled. NOTE: This is not a button that controls sync; it is simply a visual indication of Prediction Sync status. To enable, modify, or disable Prediction Sync, click the SYNC text on the right-hand side of the screen.
We've also implemented lots of bug fixes and performance improvements. Some noteworthy bugs resolved in this release include:
- Occasionally, one or more rows of Prediction data that should contain SHAP Importance "K_" values (formerly known as Prediction Influencers) would not contain them.
- Occasionally, after deleting a dataset in a provider, when refreshing the dataset list in Kraken, new datasets added to that provider after you deleted the dataset would fail to appear in the list.
- In Time Series Analyses, Kraken would occasionally display an error without any details.
Reminder: If Kraken is misbehaving, check KIKI first.
All Kraken users have access to KIKI, a.k.a. "Kraken Important Known Issues". If you're experiencing an issue with Kraken, KIKI is your first stop to determine if we're aware of the issue. Where possible, we provide suggested workarounds while we work to eliminate issues. KIKI will be updated as we eliminate issues through Kraken product updates, as well as when new issues are identified.
Reminder: SONAR© Guide content is just a click away!
As a Kraken user, you have access to SONAR© Guide, the proven training methodology from Big Squid and your key to learning everything you need to know to use Kraken effectively and become a citizen data scientist hero in the process. A link to the SONAR© Guide content library is accessible whenever you access Kraken help screens; just look for the SONAR© Guide logo on the right-hand side. It's also available in the SONAR© Guide on the Big Squid Support site.