We can guess the outcome of a sporting event by simply tossing a coin or relying on our own intuition. There are, however, more reliable and precise methods for predicting how a game will end and who will win a match. Data mining can be used for problems like this; it is now applied in many fields to find and verify relationships that previously required complex solutions or could only be judged from human experience. In sports it is used not just for predicting results but also, for example, for determining a team's line-up, for estimating the risk of injuries based on performance in training, and for helping coaches discover hidden relationships. All of these applications are made possible by the large amount of data generated at every game, which is indispensable for building a well-functioning model.
In my thesis I implement a project for predicting the outcome of NBA matches. To this end, in the first half of this paper I provide an overview of the main steps of data mining, with particular attention to proper data preparation and to handling deficiencies in the data. I then review existing applications of data mining in sports and give a short overview of basketball. It is not obvious what the decisive factor in the result of a basketball match is, so in the second half of the paper I obtain a suitable data set and create attributes that help build a data mining model capable of estimating the winning team.