As we saw in the previous chapter, for our purposes machine learning reduces to three steps: aggregating a table of historical data, picking a target column that represents what we want to predict, and feeding that table to a machine learning algorithm. The algorithm creates a function that predicts the value of the target column using all the other columns, the features, as inputs. The Qlik AutoML platform looks at the data in the target column and decides whether we are dealing with a classification problem (categorical target) or a regression problem (numerical target). It then selects all the appropriate machine learning algorithms from our library, trains each of them on the input dataset, and selects the best one. Later, we will see that different algorithms work better with different types of datasets, and we will explain how we select the “best” algorithm for the problem at hand. For now, our concern is just asking a business question and aggregating a dataset that can be used to answer that question.
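To make the idea of a target column concrete, here is a minimal sketch in plain Python. The table, its column names, and the helper functions `split_features_and_target` and `infer_problem_type` are all hypothetical, invented for illustration. The rule used to tell classification from regression is a simplification and should not be read as the logic the Qlik AutoML platform actually applies.

```python
# Hypothetical historical table: one dict per row, one key per column.
rows = [
    {"tenure_months": 3, "plan": "basic", "monthly_spend": 20.0, "churned": "yes"},
    {"tenure_months": 26, "plan": "premium", "monthly_spend": 75.5, "churned": "no"},
    {"tenure_months": 11, "plan": "basic", "monthly_spend": 22.0, "churned": "no"},
    {"tenure_months": 1, "plan": "premium", "monthly_spend": 80.0, "churned": "yes"},
]


def split_features_and_target(rows, target):
    """Separate the target column from the remaining feature columns."""
    features = [{k: v for k, v in row.items() if k != target} for row in rows]
    targets = [row[target] for row in rows]
    return features, targets


def infer_problem_type(targets):
    """Simplified assumption: numeric target -> regression, otherwise classification."""
    # bool is a subclass of int in Python, so exclude it from the numeric case.
    is_numeric = all(
        isinstance(v, (int, float)) and not isinstance(v, bool) for v in targets
    )
    return "regression" if is_numeric else "classification"


# Business question 1: will a customer churn? The target is categorical.
churn_features, churn_targets = split_features_and_target(rows, "churned")
churn_problem = infer_problem_type(churn_targets)

# Business question 2: how much will a customer spend? The target is numerical.
spend_features, spend_targets = split_features_and_target(rows, "monthly_spend")
spend_problem = infer_problem_type(spend_targets)

print("churned ->", churn_problem)
print("monthly_spend ->", spend_problem)
```

The same table supports both questions. Only the choice of target column changes, and with it the type of problem.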
We will show, through a couple of examples, how to ask a business question and prepare a dataset to answer it.