Data Explorer lets you visually explore your model training data and gain insights from it as you prepare it for analysis in Kraken. Based on Microsoft SandDance, Data Explorer makes it easy to visualize your training data and to spot patterns and trends in it. With easy-to-consume views, Data Explorer helps you build modeling scenarios based on evidence, identify outliers, test modeling hypotheses and look beyond surface-level explanations.
Accessing Data Explorer: After selecting your training dataset, click the “three bars” bar chart icon in the upper right, near the pink “Analyze” button. Use the tabs and controls on the left to select and customize the visualization(s) of your choice.
Chart tab: This is the default tab when accessing Data Explorer. Use the buttons on the left to select the type of visualization you would like. Use the Column Mapping options to choose:
Axis data
Data bin size
Data colors
Data sorting
Data facet
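To make the bin size and facet options more concrete, the sketch below shows one way the two ideas can be expressed in plain Python. It is an illustration only, not how Kraken or SandDance implements them: the function names, the equal-width binning rule and the sample columns ("age", "plan") are all assumptions made for this example.

```python
from collections import defaultdict


def bin_index(value, lo, hi, bins):
    """Return the 0-based equal-width bin for a value in the range [lo, hi]."""
    if hi == lo:
        return 0
    width = (hi - lo) / bins
    # Clamp so that a value equal to hi lands in the last bin.
    return min(int((value - lo) / width), bins - 1)


def facet_and_bin(rows, axis, facet, bins):
    """Count rows for each (facet value, bin index) pair."""
    values = [row[axis] for row in rows]
    lo, hi = min(values), max(values)
    counts = defaultdict(int)
    for row in rows:
        counts[(row[facet], bin_index(row[axis], lo, hi, bins))] += 1
    return dict(counts)


# Hypothetical sample data: "age" is the axis column, "plan" is the facet column.
rows = [
    {"age": 20, "plan": "basic"},
    {"age": 35, "plan": "basic"},
    {"age": 50, "plan": "pro"},
    {"age": 60, "plan": "pro"},
]
result = facet_and_bin(rows, axis="age", facet="plan", bins=2)
```

Here each facet value corresponds to its own small chart, and each bin corresponds to one bar or group within that chart. Choosing a larger number of bins gives a finer-grained view of the same column.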
Chart color palette tab: Includes preset color schemes and color binning options.
Data browser tab: Browse data values for every column, one row at a time.
Select-by-search tab: Choose data to include in your visualizations, and group that data as desired.
Snapshot tab: Create snapshots of your visualizations for easy reference of different views while you are in Data Explorer.
Options tab: Includes options for opacity, text display, chart behavior and more.
Important Notes about Data Explorer
You can show or hide Data Pipeline while in Data Explorer. To do so, click the “curvy arrows” button to the left of the “Analyze” button in the upper right of the Kraken screen. Any filters or other changes you make to your Data Pipeline are reflected in your Data Explorer visualizations.
Data Explorer visualizations and snapshots are not saved. If you leave Data Explorer by switching to Data View or Schema View, any work performed in Data Explorer will be lost. If you plan to explore your training data in detail, stay in Data Explorer until you are finished to avoid losing your visualizations and snapshots.
Data Explorer visualizations do not impact your Data Pipeline actions. If you apply filters or make other changes to your visualizations in Data Explorer, those changes do not create or modify any Data Pipeline actions.
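The notes above describe a one-way relationship: Data Pipeline changes flow into Data Explorer, but nothing done in Data Explorer flows back. The sketch below is a conceptual model of that behavior in plain Python, not Kraken's actual implementation. The names apply_pipeline and explorer_view, the "amount" column and the use of simple row filters as pipeline actions are assumptions made for this example.

```python
def apply_pipeline(rows, actions):
    """Run each pipeline action (modeled here as a row filter) in order."""
    for keep in actions:
        rows = [row for row in rows if keep(row)]
    return rows


def explorer_view(rows, actions, view_filter=None):
    """Return the rows a visualization would show, leaving actions untouched."""
    visible = apply_pipeline(rows, actions)
    if view_filter is not None:
        # A filter applied in the visualization narrows the view only.
        visible = [row for row in visible if view_filter(row)]
    return visible


# Hypothetical training data and a single pipeline filter.
rows = [{"amount": 5}, {"amount": 50}, {"amount": 500}]
pipeline_actions = [lambda row: row["amount"] >= 10]

# The pipeline filter is reflected in the view; the view filter is not
# added to the pipeline.
shown = explorer_view(rows, pipeline_actions,
                      view_filter=lambda row: row["amount"] < 100)
```

In this model the pipeline filter removes the row with amount 5 before the visualization sees it, while the view filter hides the row with amount 500 without adding anything to pipeline_actions.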