Analysis and Modeling of Short Texts with Deep Learning

Dr. Gyires-Tóth Bálint Pál
Department of Telecommunications and Media Informatics

Due to the revolutionary increase in the amount of available data, the rise of high-performance GPUs, and novel scientific results on neural networks, deep learning has received considerable attention among machine learning scientists. The numerous layers of deep architectures are able to extract and learn different abstractions of the input data and, in many application fields, model them more efficiently than previous machine learning methods. Deep learning is one of the main technologies of language understanding, natural language processing, and time series analysis.

One promising theoretical question is the feasibility of deep learning for weakly coherent signals, where feature extraction, and therefore analysis and modeling, is difficult with classical methods. Examples of weakly coherent signals include the market demand or price movement of a financial asset together with the corresponding news sentiment, or the signals of an IoT sensor network.

Digital currencies built on a peer-to-peer, decentralized payment system have grown considerably in recent years. Most coins reward so-called miners, who offer their computational capacity to run and secure the decentralized services. These rewards have real value on the open market, so the growing popularity of cryptocurrencies implies a rise in the price of these rewards. As a consequence of this considerable market presence, exchange services for these currencies have appeared. These exchanges execute a huge number of trades and expose open APIs. Market analysis is difficult because volatility is high, the signal is noisy, and the lack of classical features requires rethinking well-tried strategies. Furthermore, crypto markets are not regulated.

Due to the large amount of public data from the blockchain and crypto exchanges, data-driven machine learning approaches may uncover pricing anomalies. Besides the trading data, additional sources of information (such as political or financial news) may enhance the predictive capacity of a machine learning model.

Twitter-like short texts are typically within the modeling capacity of such analytical systems. As social media, including Twitter, has become one of the primary sources of news (e.g. political, technological, and financial news), it has a great influence on the masses. Thus, it is critical to be able to model, classify, and cluster these texts, for example to extract sentiment and detect fake news.

I hypothesize that the price movement of cryptocurrencies and publicly available data on Twitter are weakly coherent signals; therefore, social media activity may affect the price of cryptocurrencies.
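
One simple way to probe the hypothesis above is to check whether a sentiment series correlates with later price returns. The following is only a minimal sketch and not the method used in this work: the function names `pearson` and `lagged_correlation`, the series names, and the choice of Pearson correlation are all assumptions made for illustration.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)


def lagged_correlation(sentiment, returns, max_lag):
    """Correlate sentiment[t] with returns[t + lag] for lag = 0..max_lag.

    Both inputs are hypothetical, equally sampled series (e.g. hourly
    aggregated tweet sentiment and hourly price returns).
    """
    result = {}
    for lag in range(max_lag + 1):
        y = returns[lag:]                      # returns shifted forward by `lag`
        n = min(len(sentiment), len(y))        # overlapping part of the two series
        result[lag] = pearson(sentiment[:n], y[:n])
    return result
```

A correlation that is small at lag 0 but clearly nonzero at some positive lag would be consistent with sentiment leading price, although a linear measure like this can miss exactly the kind of weak, nonlinear dependence that motivates using deep learning.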

The goal of this work is to use deep learning methods to analyze the behavior of crypto markets, including price movement and Twitter activity.

I apply deep learning based natural language processing (NLP) techniques to Twitter feeds, build a deep learning model for cryptocurrency price movement, and extend this model with the NLP methods. An evaluation of the feasibility of deep learning for weakly coherent signals is also presented.

