Sentiment Analysis in Hungarian

Ács Judit
Department of Automation and Applied Informatics

Data volume is growing significantly in the age of the information society, which encourages new technologies to spread into every part of our lives. The era of advanced analytics has arrived, making it possible to fully automate tasks across a wide range of use cases. Natural Language Processing (NLP) and its subfield, sentiment analysis, are among the most thriving research topics, as shown by the increasing number of scientific and enterprise implementations. There is a growing market need for sophisticated entity-based sentiment analysis solutions that can extract more detail about the entities (proper, geographical and institution names) mentioned in a text. One interesting use case comes from one of the world's largest video streaming companies, which integrated a real-time tone analyzer to support interactive streaming.

My thesis presents implementation experiments for an entity-oriented sentiment analysis solution that can process Hungarian text. I created a machine learning model, trained on the open-source OpinHuBank corpus, and an application that extracts the polarity expressed toward the entities occurring in the text. Throughout development I aimed for a solution that performs adequately on real-life examples, so at every model evaluation I double-checked whether its precision was acceptable on such samples. After selecting the optimal configuration, I implemented an Application Programming Interface for intuitive usage, and the whole solution was packaged and published as a Docker image. As no similar implementation existed, I decided to make it open source in the hope that it will help others.

