Interaction of timed data attributes in machine learning

Gáspár Csaba
Department of Telecommunications and Media Informatics

A time series is a sequence of observations in which not only the values of the data but also their order in time carries important information. Time series analysis therefore requires special data mining techniques. This thesis gives a short overview of the data mining techniques used to evaluate time series, based on the available literature.

The partial dependence of a prediction produced by supervised machine learning on selected subsets of the input variables can be visualized with “partial dependence plots”.
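For reference, the commonly used empirical estimate of partial dependence can be written as follows. The notation is an assumption introduced here, not taken from the thesis: \hat{f} is the fitted model, x_S the selected input variables, x_C^{(i)} the remaining input variables of the i-th observation, and n the number of observations.

```latex
\hat{f}_S(x_S) = \frac{1}{n} \sum_{i=1}^{n} \hat{f}\left(x_S,\, x_C^{(i)}\right)
```

In words, the selected variables are fixed at a chosen value, the model is evaluated for every observation with its other variables left unchanged, and the predictions are averaged.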

In this work, a Python script was developed to plot the partial dependence, and it was first tested on a generated artificial dataset. The training dataset consisted of three input variables: two sine waves and the time points of the observations. The target values were generated by a predefined equation containing the input variables. The partial dependence was plotted together with the target values of the artificial dataset. A second Python script was written to generate an animation of the partial dependence plot computed from the input values in different time windows. In this way, the effect of the input values on the predictions can be monitored continuously.
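The procedure described above can be sketched roughly as follows. This is a minimal illustration and not the script from the thesis: the sine periods, the target equation, the window length and the function names `partial_dependence` and `predict` are all assumptions made for the example, and `predict` is a hypothetical stand-in for the prediction function of a fitted model.

```python
import math


def partial_dependence(predict, rows, feature, grid):
    """Average prediction when column `feature` is forced to each grid value."""
    curve = []
    for value in grid:
        total = 0.0
        for row in rows:
            modified = list(row)
            modified[feature] = value  # overwrite only the selected input
            total += predict(modified)
        curve.append(total / len(rows))
    return curve


# Artificial data: two sine waves and the time index (periods are assumed).
n = 200
rows = [
    (math.sin(2 * math.pi * t / 50), math.sin(2 * math.pi * t / 20), float(t))
    for t in range(n)
]


def predict(row):
    """Hypothetical stand-in for a trained model (assumed target equation)."""
    x1, x2, t = row
    return 2.0 * x1 + x2 + 0.01 * t


# Grid of values for the first sine wave, from -1.0 to 1.0.
grid = [-1.0 + 0.2 * k for k in range(11)]

# One partial dependence curve per sliding time window (sizes are assumed).
window, step = 100, 50
curves = [
    partial_dependence(predict, rows[start:start + window], 0, grid)
    for start in range(0, n - window + 1, step)
]
```

Each entry of `curves` corresponds to one time window, so drawing the entries one after another as frames would give an animation of the kind described. In practice, the prediction function of any trained regressor could be passed in place of `predict`.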

A real dataset from HUPX (the HUngarian Power eXchange) containing electricity prices between 2013 and 2016 was also used to test the algorithm. The generated animation clearly illustrated the partial dependence for this dataset.

