Supervised Classification Learning Methods with Extended Confidence Metric

Gáspár Csaba
Department of Telecommunications and Media Informatics

Advances in technology have made it much easier to collect and store huge amounts of data. This data contains far more information than one might think at first. A separate discipline, data mining, has emerged to extract that information. Its goal is to find non-trivial and useful relationships using various learning algorithms. The models produced by these algorithms are not perfect: they are sometimes inaccurate, and they can become obsolete when the data changes. These problems make the models harder to use.

In this thesis I present some research fields related to these problems, along with four additional methods of my own that can help increase a model's accuracy and, in some cases, recognize when a model gives a false prediction.

Two of the four methods rely on an operator, whose role is to supervise the model's decisions for the entities selected by the methods.

The other two methods work without an operator: they select the entities for which the original model made a wrong prediction, and they are also able to correct those predictions automatically.

I implemented these methods in the SAS environment and also used RapidMiner for some tasks. To compare them, I prepared five benchmark data sets from various fields, each with its own properties. These data sets help show how the methods perform on real-life data.

At the end of the thesis I compare the results of these methods. It turned out that no single method achieved the best result on every data set; instead, the best-performing method depended on the data set. This means that each of the methods can be useful in real tasks.

