Document classification with data mining methods

Supervisor:
Nagy Gábor
Department of Telecommunications and Media Informatics

In this work, I created a web application that analyses the text content of an online news portal.

The application parses and stores the contents of the index.hu site using data and text mining tools. It categorizes text supplied by the user according to the news stream it has learned, relying only on the given texts.
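The categorization step can be illustrated with a small sketch. The abstract does not name the classification algorithm, so the multinomial naive Bayes classifier below is only an assumed stand-in, and the class name, the category labels, and the toy training sentences are all hypothetical. It shows how a model trained on labelled news texts could assign a category to new user input using nothing but the words in the texts.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into runs of letters."""
    return re.findall(r"[^\W\d_]+", text.lower())


class NaiveBayesClassifier:
    """Multinomial naive Bayes with add-one smoothing (assumed technique)."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # category -> word frequencies
        self.doc_counts = Counter()              # category -> number of documents
        self.vocabulary = set()                  # all words seen in training

    def train(self, text, category):
        tokens = tokenize(text)
        self.word_counts[category].update(tokens)
        self.doc_counts[category] += 1
        self.vocabulary.update(tokens)

    def classify(self, text):
        """Return the category with the highest log-probability, or None if untrained."""
        tokens = tokenize(text)
        total_docs = sum(self.doc_counts.values())
        vocab_size = len(self.vocabulary)
        best_category, best_score = None, -math.inf
        for category, n_docs in self.doc_counts.items():
            counts = self.word_counts[category]
            total_words = sum(counts.values())
            # Log prior of the category plus smoothed log likelihood of each word.
            score = math.log(n_docs / total_docs)
            for token in tokens:
                score += math.log((counts[token] + 1) / (total_words + vocab_size))
            if score > best_score:
                best_category, best_score = category, score
        return best_category


# Hypothetical toy data standing in for the stored news articles.
classifier = NaiveBayesClassifier()
classifier.train("The team won the match after a late goal", "sport")
classifier.train("The striker scored twice in the cup final", "sport")
classifier.train("Parliament passed the new budget law", "politics")
classifier.train("The minister announced an election date", "politics")

print(classifier.classify("A late goal decided the cup match"))
```

In a real deployment the training texts would come from the parsed articles, with the category taken from whatever label the news site attaches to each article; that mapping is an assumption here as well.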

To implement this, I had to become familiar with web application technology. I acquired the necessary knowledge of the Python programming language and studied the use of the Django web framework. I used the results of my earlier work to implement an automated data mining process, which I later integrated into the web application.

Through the web interface, the user can control the parsing of the news site and enter their own text. With the help of the integrated data and text mining procedures, web crawler technologies, and the editor interfaces of the web framework, the user can process the given data and obtain results without outside help.
