Digital Currency Rate Prediction using Twitter Data

Gáspár Csaba
Department of Telecommunications and Media Informatics

In our information-based world, digital cryptocurrencies are slowly but surely taking an ever larger place on the financial market. Several main characteristics set cryptocurrencies apart from conventional currencies: they are anonymous, transparent, decentralized, created by a community, not authorized by a third party, and not yet traded on a stock exchange. The goal is to find the best prediction model using text mining results from the posts (tweets) of the biggest social media microblogging site, Twitter.

Another major difference is that, in fundamental financial analysis, different kinds of news and information affect the rates. Conventional currencies react to, for example, GDP, unemployment rates, statements by leading politicians and central banks, central bank interventions, and unforeseen events (terrorism, natural disasters).

Cryptocurrencies, by contrast, react more to security-related issues such as the vulnerability of storage, lost currencies, bank or government restrictions, the Chinese market, taxation, and the situation of exchange centers.

This is why the regular forecasting methods of conventional currency markets cannot be used. On the stock exchange, traders receive the information that shapes the market. The cryptocurrency market is more hectic: rates are moved mainly by unforeseen events. Such information spreads widely on social media, where user-generated web content, through which users can interact and collaborate with each other, is highly significant. The leading platforms are blogs, microblogs, Wikipedia pages and forums. As of May 2014, the most popular microblog is Twitter, with almost 1 billion users and about 60 million tweets per day. This is the reason for analysing the content of tweets about cryptocurrencies, mainly Bitcoin. The goal is to predict the rate for the next hour using the previous rates of the currency, the previous rates of other currencies, and the previous contents or other attributes of the tweets. The methods are evaluated by SSE (sum of squared errors); the best prediction model is the one with the smallest SSE.
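The evaluation criterion described above can be illustrated with a minimal sketch. It assumes each candidate model has already produced hourly predictions for the same test period; the model names (`rates_only`, `rates_plus_tweets`) and all numbers are hypothetical placeholders, not models or results from the thesis.

```python
from typing import Dict, Sequence


def sse(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Sum of squared errors between observed and predicted hourly rates."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    return sum((a - p) ** 2 for a, p in zip(actual, predicted))


def best_model(actual: Sequence[float],
               predictions: Dict[str, Sequence[float]]) -> str:
    """Return the name of the model whose predictions give the smallest SSE."""
    return min(predictions, key=lambda name: sse(actual, predictions[name]))


# Made-up illustrative numbers (assumption): observed rates for four hours.
actual = [100.0, 102.0, 101.0, 105.0]

# Hypothetical candidate models and their next-hour predictions.
predictions = {
    "rates_only": [100.0, 100.0, 102.0, 101.0],
    "rates_plus_tweets": [101.0, 102.0, 102.0, 104.0],
}

for name, predicted in predictions.items():
    print(name, sse(actual, predicted))
print("best:", best_model(actual, predictions))
```

Because SSE is a plain sum rather than an average, it is only comparable between models that are evaluated on the same set of hours.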

