New approach in analytics of transactional data

Supervisor:
Gáspár Csaba
Department of Telecommunications and Media Informatics

Analysing transactional data is a widely used customer analytics technique that can yield tangible practical and financial benefits. Predicting our clients' likely future behaviour requires a specialised data mining approach: the timestamps must be handled appropriately, and the records must be grouped by their owner entities as part of the data preparation. We then create new attributes using a novel method alongside two 'traditional' ones. The novel method is based on cursor variables, which retrieve hidden information from the transactional events by means of a cursor that always points to a certain position within the event series.

This cursor operates on a small segment of the series at a time and creates new attributes from it. A cursor variable consists of an operator part and a calculator part: the operator moves the cursor, and the calculator returns a value calculated or found at the new position. By repeating the motion recursively, we generate a large number of attributes, from which we choose the most useful ones through feature selection. Finally, our models are built with conventional model-based classifiers.
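To make the operator/calculator split concrete, the following is a minimal sketch in plain Python, not the implementation used in the thesis. The `Event` record, the two operators (`previous_event`, `previous_larger`), the `amount_at` calculator and the attribute naming scheme are all hypothetical choices made for illustration; the abstract does not specify them.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Sequence


@dataclass(frozen=True)
class Event:
    """Hypothetical transaction record of a single owner entity."""
    timestamp: int  # assumed: day index, sorted ascending within the series
    amount: float


# Operator: moves the cursor. Returns the new position, or None if it cannot move.
Operator = Callable[[Sequence[Event], int], Optional[int]]
# Calculator: returns a value calculated or found at the new position.
Calculator = Callable[[Sequence[Event], int], float]


def previous_event(events: Sequence[Event], pos: int) -> Optional[int]:
    """Step back by one event."""
    return pos - 1 if pos >= 1 else None


def previous_larger(events: Sequence[Event], pos: int) -> Optional[int]:
    """Move back to the nearest earlier event with a larger amount."""
    for i in range(pos - 1, -1, -1):
        if events[i].amount > events[pos].amount:
            return i
    return None


def amount_at(events: Sequence[Event], pos: int) -> float:
    """Read the amount of the event under the cursor."""
    return events[pos].amount


def generate_attributes(
    events: Sequence[Event],
    start: int,
    operators: Dict[str, Operator],
    calculators: Dict[str, Calculator],
    max_depth: int,
) -> Dict[str, float]:
    """Recursively move the cursor and record a value after every move."""
    attributes: Dict[str, float] = {}

    def walk(pos: int, path: tuple, depth: int) -> None:
        if depth == max_depth:
            return
        for op_name, op in operators.items():
            new_pos = op(events, pos)
            if new_pos is None:  # the cursor could not move this way
                continue
            new_path = path + (op_name,)
            for calc_name, calc in calculators.items():
                name = "/".join(new_path) + ":" + calc_name
                attributes[name] = calc(events, new_pos)
            walk(new_pos, new_path, depth + 1)

    walk(start, (), 0)
    return attributes


# Illustrative data only: four events of one entity, cursor starts at the last one.
events = [Event(1, 50.0), Event(3, 20.0), Event(7, 35.0), Event(9, 10.0)]
attrs = generate_attributes(
    events,
    start=3,
    operators={"prev": previous_event, "prev_larger": previous_larger},
    calculators={"amount": amount_at},
    max_depth=2,
)
```

With two operators, one calculator and a depth of 2, this toy series already yields six attributes (for example `prev/prev_larger:amount`). The count grows quickly with depth, which is why a feature selection step follows.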

One of our goals is to examine the best attributes thoroughly, to understand what made them the best by observing the type and the depth of the variable that created each of them. We also aim to develop a methodology, based on our experimental metrics and results, that can be applied to the analysis of other transactional datasets. Another important result of our investigation is the efficiency of the classification models built on this set of attributes.

All of this is implemented in Python, using the scikit-learn and pandas packages, which are essential for carrying out the data preparation and data mining processes.
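As a rough illustration of how such a workflow might look with these two packages, the sketch below groups a tiny, invented transaction log by owner entity with pandas, derives a few simple aggregate attributes, and passes them through feature selection into a classifier with scikit-learn. The column names, the `churned` label, the choice of `SelectKBest` with `f_classif`, and the decision tree are assumptions made for the example; the abstract does not state which selectors or classifiers were actually used, and the aggregates stand in for the 'traditional' attributes rather than the cursor-based ones.

```python
import pandas as pd
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.pipeline import make_pipeline
from sklearn.tree import DecisionTreeClassifier

# Hypothetical transaction log: one row per event (invented data).
transactions = pd.DataFrame({
    "customer_id": [1, 1, 1, 2, 2, 3, 3, 4, 4, 4, 5, 5, 5, 6, 6],
    "timestamp": pd.to_datetime([
        "2023-01-02", "2023-01-10", "2023-02-01",
        "2023-01-05", "2023-01-06",
        "2023-01-03", "2023-03-01",
        "2023-01-04", "2023-01-20", "2023-01-25",
        "2023-01-01", "2023-02-15", "2023-03-10",
        "2023-01-08", "2023-01-30",
    ]),
    "amount": [20.0, 35.0, 15.0, 80.0, 60.0, 12.0, 18.0, 55.0,
               70.0, 65.0, 25.0, 10.0, 30.0, 90.0, 40.0],
})
# Hypothetical target: one label per owner entity.
labels = pd.Series({1: 1, 2: 0, 3: 1, 4: 0, 5: 1, 6: 0}, name="churned")

# Data preparation: order events in time and group them by their owner entity.
transactions = transactions.sort_values(["customer_id", "timestamp"])
features = transactions.groupby("customer_id").agg(
    n_events=("amount", "size"),
    total_amount=("amount", "sum"),
    last_amount=("amount", "last"),
    days_active=("timestamp", lambda s: (s.max() - s.min()).days),
)

# Feature selection followed by a conventional classifier.
model = make_pipeline(
    SelectKBest(f_classif, k=2),
    DecisionTreeClassifier(random_state=0),
)
model.fit(features, labels.loc[features.index])

# Names of the attributes kept by the selection step.
selected = features.columns[model.named_steps["selectkbest"].get_support()]
predictions = model.predict(features)
```

Fitting and predicting on the same six entities only shows the mechanics; any real evaluation would need held-out data.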
