Visualization of LogitBoost classifier

Nagy István
Department of Telecommunications and Media Informatics

Modern people encounter the enormous flow of information that sweeps the world in many ways: on TV, on the radio, in various printed forms, and on increasingly accessible websites. The effort to filter and interpret the relevant information in the available data more easily and more effectively has fundamentally shaped the development of computer science.

This effort gave rise to an independent scientific field called information visualization (IV), which comprises the methods and techniques for representing data graphically with the help of a computer. Another interdisciplinary field, data mining (DM), began to develop almost simultaneously; it seeks patterns in vast data sets and extracts useful information using statistical methods or other algorithms. These two fields are increasingly intertwined, laying the groundwork for visual data analysis, which is a branch of IV.

My thesis has a dual goal: to convey the importance of IV and to show how applying IV can help in better understanding the results of DM.

On the one hand, this paper provides an overview of the IV solutions currently in use, analyzes their advantages and disadvantages, and compares conventional methods of analysis with those supported by IV. On the other hand, it gives an example of how the two fields can work together, presented through an application that I created to support visual data analysis. The program displays the run and results of a new classification algorithm, LogitBoost, on an example data set, and thereby makes the algorithm easier to interpret.

In this paper I describe the steps of development and how the technologies used work together. Besides analyzing the results, I highlight the problems that arose during my work, from design through implementation. Finally, I summarize the further possibilities that my application offers.


