Applications of data mining in empirical finance

Zlatniczki Ádám
Department of Computer Science and Information Theory

In my thesis I forecast stock price movements in order to create a profitable investment strategy. For the forecasting I use data mining techniques, primarily classification. My goal is to build a binary classifier that predicts whether the price of a given stock will increase or decrease on the next day. The models are built from the closing prices and the movement directions of the preceding days. I try several kinds of classifiers for the strategy: decision tree, random forest, Bayes classifier and Support Vector Machine (SVM). I compare the results of the different classifiers using various performance indices, then use the best model to build an investment strategy and compare that strategy against a benchmark.
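The input described above can be illustrated with a minimal sketch of how a series of closing prices might be turned into feature rows and binary labels. The function name `make_dataset`, the parameter `n_lags` and the choice to encode an unchanged price as 0 are assumptions made for illustration only; the abstract does not state how many previous days are used or how ties are handled.

```python
from typing import List, Tuple


def make_dataset(closes: List[float], n_lags: int = 3) -> Tuple[List[List[float]], List[int]]:
    """Build (features, labels) from daily closing prices.

    Features for day t: the previous n_lags closing prices followed by the
    previous n_lags movement directions (1 = up, 0 = down or unchanged).
    Label for day t: 1 if the close on day t is above the close on day t-1.
    """
    # directions[i] describes the move from day i-1 to day i.
    # Index 0 has no previous day, so it holds a placeholder that is never used.
    directions = [0] + [
        1 if closes[i] > closes[i - 1] else 0 for i in range(1, len(closes))
    ]

    features: List[List[float]] = []
    labels: List[int] = []
    # Start at n_lags + 1 so that every lagged direction is well defined.
    for t in range(n_lags + 1, len(closes)):
        lag_prices = closes[t - n_lags:t]
        lag_dirs = [float(d) for d in directions[t - n_lags:t]]
        features.append(list(lag_prices) + lag_dirs)
        labels.append(directions[t])
    return features, labels
```

Each row contains only information available before day t, so a classifier trained on these rows does not see the price it is asked to predict.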

To achieve these results, I read articles on similar research to collect ideas and analyse their results. I studied the theoretical background of the applied techniques and wrote a summary of them. I then downloaded the AT&T stock prices from Yahoo! Finance and computed descriptive statistics to become familiar with the data. After this I used the AT&T data through several iterations to tune and improve the results of the four classifiers. In the end I identified the Support Vector Machine as the best classifier, so I used it to predict movements and built an investment strategy on its predictions. I compared my strategy against the Dow Jones Industrial Average. Out of curiosity, I also searched for association rules among the movements of the stocks, with the aim of exploring connections between stock price changes.
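As a rough illustration of comparing a prediction-based strategy with a benchmark, the sketch below computes the growth of one unit of capital under a simple rule: hold the stock over the next day when an increase is predicted, otherwise stay in cash. This rule, the function names and the absence of transaction costs are assumptions for illustration; the abstract does not specify the trading rule, and the benchmark here is a plain buy-and-hold of the supplied price series rather than the actual Dow Jones Industrial Average data.

```python
from typing import List


def strategy_growth(closes: List[float], predicted_up: List[int]) -> float:
    """Growth factor of 1 unit of capital under the assumed trading rule.

    predicted_up[t] is the prediction, made at the close of day t, for the
    move from day t to day t+1 (1 = increase, 0 = decrease).
    Assumption: no transaction costs and no return on cash.
    """
    capital = 1.0
    for t in range(len(closes) - 1):
        if predicted_up[t] == 1:
            # Invested over this day: capital follows the price change.
            capital *= closes[t + 1] / closes[t]
    return capital


def buy_and_hold_growth(closes: List[float]) -> float:
    """Growth factor of buying on the first day and selling on the last."""
    return closes[-1] / closes[0]


# Hypothetical prices and predictions, not data from the thesis.
prices = [100.0, 110.0, 99.0, 108.9]
predictions = [1, 0, 1]
strategy = strategy_growth(prices, predictions)
benchmark = buy_and_hold_growth(prices)
```

With these made-up numbers the strategy avoids the one falling day, so its growth factor exceeds that of buy-and-hold; on real data the outcome depends entirely on the accuracy of the classifier.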


