In this thesis I aim to familiarize the reader with the basics of real-time stream processing by introducing some of the currently popular systems in this area.
I start by introducing two open-source solutions, Apache Storm and Apache Samza: their architecture, how they operate, and their configuration options. I then point out the similarities and differences between the two systems.
I will design and implement a universal performance measurement tool for stream processing systems. Using this tool, I will test the two systems introduced above as well as a third, commercial product, IBM InfoSphere Streams. The results will be visualized and evaluated in terms of performance.
Finally, in order to test the parallelized operating mode of Storm and Samza, I will create an internal testing solution with which I can measure the bandwidth limits and scalability of these two systems. The results of these measurements will also be visualized and evaluated.