Customer complaint classification using statistical data analysis guided by case-based hypotheses

Dr. Antal Péter
Department of Measurement and Information Systems

Problems frequently occur while using open-source software. In such cases customers contact the support team and, if necessary, a ticket is created for the issue. These tickets are stored in several databases and in different forms, so a large amount of data is available about them. However, this data does not by itself reveal the cause of an issue.

Since solving these problems takes a great deal of time, it is worth applying data science to the historical data to extract useful information and find out the reasons behind these issues.

To do so, I first had to get to know the software components and their functions. I then searched the available databases for data that might help me reach my goal. At first I needed only a small part of this data, so that I could come up with some hypotheses as a starting point for further investigation.

Then I chose a suitable programming language and an easy-to-use environment for my purpose. With the code I wrote, I tried to answer the questions that came up while searching the databases, plotted the results, and examined the hypotheses formulated earlier.

After accepting or rejecting the hypotheses, I drew the conclusions and proposed, from among many possible solutions, how to improve the result.

Based on the results of the data analysis, the suggested changes can be expected to reduce the number of errors.

