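The model is built, or “trained”, on a training dataset that is a subset of the original dataset you select. Kraken automatically splits your dataset at random and performs five-fold cross-validation: the data is divided into five parts, and each part in turn is held out as the test dataset while the model is trained on the remaining four.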
Predictions are made for each row of data in the test dataset and compared with the actual result, producing the accuracy measures used to score the models.
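The split-and-score procedure described above can be sketched in a few lines of Python. This is only an illustration of generic five-fold cross-validation, not Kraken's actual implementation: the function names, the sample labels, and the trivial "predict the most common label" stand-in model are all assumptions made for the example.

```python
import random

def k_fold_indices(n_rows, k=5, seed=0):
    """Shuffle the row indices and deal them into k roughly equal folds."""
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]

def majority_label(labels):
    """Hypothetical stand-in model: always predict the most common training label."""
    return max(set(labels), key=labels.count)

def cross_validate(labels, k=5, seed=0):
    """Return the test accuracy obtained on each of the k held-out folds."""
    folds = k_fold_indices(len(labels), k, seed)
    scores = []
    for test_idx in folds:
        held_out = set(test_idx)
        # Train on every row that is NOT in the current test fold.
        train_labels = [labels[i] for i in range(len(labels)) if i not in held_out]
        prediction = majority_label(train_labels)
        # Compare the prediction with the actual result for each test row.
        correct = sum(1 for i in test_idx if labels[i] == prediction)
        scores.append(correct / len(test_idx))
    return scores

# Made-up historical outcomes: 14 "yes" rows and 6 "no" rows.
labels = ["yes"] * 14 + ["no"] * 6
fold_scores = cross_validate(labels)
mean_accuracy = sum(fold_scores) / len(fold_scores)
```

Because every row lands in exactly one test fold, each row is scored exactly once by a model that never saw it during training. Averaging the five fold scores gives a single accuracy figure that depends less on any one lucky or unlucky split.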
It may seem slightly counterintuitive that the model can’t get it completely right against the historical data – after all, those events already happened. All that really means is that the model is not predicting with 100% accuracy, so some of the “predictions” (on a historical data point) don’t match what actually happened. This is not necessarily a bad thing; in fact, any model that predicts with 100% accuracy against the test dataset should at least be scrutinized further to see if overfitting or other errors may be occurring.
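As a rough illustration of the point above, the sketch below scores predictions against historical outcomes and flags a perfect result for a closer look. The helper names, the sample labels, and the threshold of 1.0 are assumptions chosen for the example; they do not describe how Kraken itself reports or checks accuracy.

```python
def accuracy(predictions, actuals):
    """Fraction of test rows where the prediction matches what actually happened."""
    if len(predictions) != len(actuals):
        raise ValueError("predictions and actuals must have the same length")
    if not actuals:
        raise ValueError("cannot score an empty test dataset")
    matches = sum(1 for p, a in zip(predictions, actuals) if p == a)
    return matches / len(actuals)

def needs_scrutiny(test_accuracy, threshold=1.0):
    """Flag scores at or above the threshold as worth checking for overfitting."""
    return test_accuracy >= threshold

# Made-up historical outcomes and the model's "predictions" for the same rows.
actuals = ["churn", "stay", "stay", "churn", "stay"]
predictions = ["churn", "stay", "churn", "churn", "stay"]

score = accuracy(predictions, actuals)  # 4 of the 5 rows match
flagged = needs_scrutiny(score)
```

A flag here is only a prompt to investigate, for example by checking whether a column in the data leaks the outcome being predicted. It is not proof that anything is wrong.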