Analyzing user behavior with Elasticsearch

Dr. Dudás Ákos
Department of Automation and Applied Informatics

As IT systems expand and the number of users grows, companies are paying increasing attention to handling and analyzing large amounts of data. When developing a system, it is not enough to rely solely on customer requirements; the needs and behavior of users must also be understood in order to make the most effective changes. The Elasticsearch search engine can be a good solution for the distributed storage and efficient analysis of large amounts of data.

In my dissertation, I implement the storage and analysis of user interactions recorded in a digital teaching-learning portal (ProBono) using Elasticsearch and Spark. In Chapter 1, I present the purpose of the dissertation and the requirements of the system; in the following chapter, I examine the implementation options. Among the many log management tools, I choose the Elasticsearch, Logstash, and Kibana (hereinafter ELK) stack. In addition, I use Spark for the machine learning processes and for implementing the recommender system. In Chapter 3, I describe the Elastic Stack tools and their main functions and usage. In the middle of the thesis, I go through a series of design and system configuration steps, during which I build the entire ELK system to process log files from different systems and produce relevant reports from them. At the end of Chapter 4, I introduce a more complex pipeline that makes a rough estimate of the time users spend in the system, based solely on audit logs. In Chapter 5, I present the basics and main types of the recommender systems currently in use. In the last major chapter of my dissertation, I create a recommender system for ProBono with the help of Elasticsearch and Spark.

At the end of my work, I evaluate my experience of using Elasticsearch and identify areas for further development that could have a positive impact on workflows and make the tool more secure.

