Modern distributed graph processing frameworks can process large datasets rapidly, and their use in internet applications is growing steadily. Because graph analysis is a relatively new discipline, few comparisons of these frameworks exist. If we run a web store, which platform should we choose to analyze user behavior? It is not well understood how the performance of graph processing systems differs depending on the choice of parameters.
In this thesis I present the theoretical background of graph processing and compare the performance of two graph processing systems, Apache Flink and Apache Giraph. I describe the implementation details of the Java application I developed, which implements 14 multidimensional metrics in both systems, measures their execution times, and then compares and analyzes them. The results show that when the metrics are run in a non-distributed environment, on a single computer and with moderately large graphs (on the order of tens of thousands), Apache Flink achieves better performance.