Scorecard creation for ensemble decision tree models
As data mining plays an ever larger role in corporate decision making, the need arises to compare the available data mining methods and to understand and reason about their results.
Scorecards are simple tables or graphs reflecting the effects of input variables on the model score. They provide a simple and quick way to evaluate an observation without difficult computations and can help reasoning about the models by shedding light on their inner workings.
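To illustrate how a scorecard lets one score an observation by simple table lookup, here is a minimal sketch. The variable names, bin edges, point values and base score are hypothetical and chosen only for illustration; they are not taken from any fitted model in this work.

```python
from bisect import bisect_right

# Hypothetical scorecard: for each input variable, the bin edges and the
# points awarded for each bin (one more bin than there are edges).
SCORECARD = {
    "age": {"edges": [30, 50], "points": [-10, 5, 15]},
    "income": {"edges": [2000], "points": [-5, 20]},
}
BASE_SCORE = 100  # hypothetical starting score


def score(observation):
    """Return the base score plus the points of the bin each variable falls in."""
    total = BASE_SCORE
    for name, table in SCORECARD.items():
        # bisect_right selects the bin; a value equal to an edge
        # falls into the higher bin.
        bin_index = bisect_right(table["edges"], observation[name])
        total += table["points"][bin_index]
    return total


example_score = score({"age": 42, "income": 1500})
```

Scoring needs only one lookup per variable and a sum, which is what makes the table easy to apply and to inspect by hand.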
This work begins with a short introduction to the data mining process and provides the necessary background on scorecards and common data mining methods, with an emphasis on ensemble decision tree models. I also present some practical data mining tools.
I introduce and implement algorithms for scorecard creation and visualization for additive decision tree models. These are the two-level algorithm, which is a true scoring technique, and the partial dependency method, which provides valuable insight into the models.
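The idea behind the partial dependency method can be sketched as follows, assuming the common definition: fix one input variable at a grid value for every observation, then average the model score. The toy ensemble of decision stumps, the data and all names below are hypothetical and serve only as an illustration; this is not the implementation used in this work.

```python
from statistics import mean

# Hypothetical additive ensemble of decision stumps, each given as
# (feature_index, threshold, value_if_less_or_equal, value_if_greater).
STUMPS = [
    (0, 0.5, -1.0, 1.0),
    (1, 2.0, 0.25, -0.25),
    (0, 1.5, 0.0, 0.5),
]


def predict(row):
    """Additive model score: the sum of the outputs of all stumps."""
    return sum(
        left if row[feature] <= threshold else right
        for feature, threshold, left, right in STUMPS
    )


def partial_dependence(predict_fn, rows, feature, grid):
    """Average score over the data with `feature` forced to each grid value."""
    curve = []
    for value in grid:
        modified = [row[:feature] + [value] + row[feature + 1:] for row in rows]
        curve.append(mean(predict_fn(row) for row in modified))
    return curve


# Hypothetical data set with two input variables.
DATA = [[0.0, 1.0], [1.0, 3.0], [2.0, 1.0], [1.0, 3.0]]
curve = partial_dependence(predict, DATA, feature=0, grid=[0.0, 1.0, 2.0])
```

Plotting `curve` against the grid values shows how the average score responds to the chosen variable while the other variables keep their observed values.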
I present regression and classification problems and solve them with ensemble decision tree models. With the help of scorecards, I draw conclusions about the fitted models and the effects of certain attributes on the target variable.