Data mining is currently one of the most dynamically developing research fields. Several important tasks in telecommunications, business and finance are based entirely on this new field of science. Its importance has recently been recognized in sports as well, and its spread appears unstoppable, especially in team sports. Data mining techniques help us find hidden connections in data and help us understand and process the large volumes of data recorded in sports. In sports, the results are used not only for prediction but also, for example, for determining a team's line-up (helping coaches), for assessing the risk of injuries (helping doctors), or for managing the whole team. Betting offices use the results too, to set appropriate odds.
In my thesis I implement a project for analyzing the roster depth of American professional basketball teams. In the long term, such a project could help coaches and general managers (GMs) plan team rosters. In the first half of this thesis I provide an overview of the main steps of data mining. I then present existing applications of data mining in sports and give a short review of basketball. In the second half, after collecting the data and creating the new variables, I build a database that data mining models can use to produce useful information about team rosters.