Nowadays the Internet offers an immense number of ways for people to share their opinions and feelings with each other, whether these concern a product they bought, a service they used, or some news they heard. Since people influence one another with their opinions, it would be of great use to automatically detect, analyse, and rank or classify these opinions in the vast number of text documents people post on the web.
After a general overview of the field of sentiment analysis, this thesis presents the design, implementation and evaluation of a sentiment analysis application. The main goal of sentiment analysis is to determine the feelings or opinions expressed in text documents with respect to a specific topic. Sentiment analysis works with documents written in natural language, so varied phrasing, sarcasm, and subtle or devious expressions make the analysis extremely difficult. Solving the subtasks of sentiment analysis requires techniques from data mining, natural language processing and machine learning.
The sentiment analysis application introduced in this thesis uses its own database of Twitter messages written in Hungarian. By giving a search expression, the user can restrict the set of messages to be analysed to those that concern the topic described by this expression. The application then assigns to each selected message an integer from the interval [-100, 100] that should reflect the author's feelings or opinion regarding the given topic. Here -100 represents the most negative opinion and +100 the most positive one, while 0 denotes a neutral opinion. A neutral label can mean either that the message contains no subjective elements or that its author is largely indifferent to the chosen topic.
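The input and output contract described above can be illustrated with a minimal sketch. It is not the method used in the thesis: the substring-based topic filter, the tiny word lexicon, and the names `LEXICON`, `matches`, `score_message` and `analyse` are all assumptions made purely for illustration. The sketch only shows how a search expression narrows the message set and how each remaining message receives an integer in [-100, 100], with 0 meaning neutral.

```python
from dataclasses import dataclass

# Hypothetical toy lexicon (assumption): +1 for positive, -1 for negative words.
# The thesis does not specify its scoring method; this is only a stand-in.
LEXICON = {"jó": 1, "szuper": 1, "rossz": -1, "borzalmas": -1}


@dataclass
class ScoredMessage:
    text: str
    score: int  # integer in [-100, 100]; 0 means neutral


def matches(text: str, query: str) -> bool:
    """Assumed topic filter: case-insensitive substring match."""
    return query.lower() in text.lower()


def score_message(text: str) -> int:
    """Map a message to an integer in [-100, 100] using the toy lexicon."""
    words = [w.strip(".,!?") for w in text.lower().split()]
    hits = [LEXICON[w] for w in words if w in LEXICON]
    if not hits:
        return 0  # no subjective words found: neutral
    raw = sum(hits) / len(hits)  # average polarity in [-1, 1]
    return max(-100, min(100, round(raw * 100)))


def analyse(messages: list[str], query: str) -> list[ScoredMessage]:
    """Keep only messages about the topic, then score each of them."""
    return [ScoredMessage(m, score_message(m)) for m in messages if matches(m, query)]


if __name__ == "__main__":
    sample = [
        "Az új telefon szuper!",
        "A telefon akkumulátora borzalmas.",
        "Ma esik az eső.",
    ]
    for item in analyse(sample, "telefon"):
        print(item.score, item.text)
```

In this sketch the third sample message is dropped by the filter because it does not mention the search expression, while a message that mentions the topic but contains no lexicon word would receive the neutral label 0.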