Developing a knowledge-based text mining system

Dr. Mészáros Tamás Csaba
Department of Measurement and Information Systems

Data analysis tools are gaining importance and becoming more widely used every day, and they pose new challenges for computer science. In my work I focus on two increasingly relevant aspects. First, the efficiency of text analysis tools and search engines often does not meet the requirements. Second, the majority of users do not have the IT background needed to operate these tools. This is the case in the natural sciences and in literature, where the quality of the data is often insufficient for the currently widespread tools.

In this work I propose a solution to these problems that relies on some less widely known tools. To this end, I have developed a knowledge-based text analysis system with a natural language interface.

The system supports knowledge-based analysis of uploaded documents. The necessary semantic information is stored in RDF format. I present the improvement in results that this approach achieves over classical keyword-based search, and illustrate that text analysis and stylometric tools also benefit from it.

Another important benefit of the software is its natural language interface, which processes user requests and translates them. This makes it easier for users who lack programming skills to use these tools. In this case I have created an interface for text analysis tools that are often used by literary scholars.

