Distributed Processing of ATLAS Job Monitoring Data

Dr. Ekler Péter
Department of Automation and Applied Informatics

This thesis describes the work I did at CERN as a technical student. As part of the monitoring team, I was assigned to replace the system that calculates historical and real-time statistics for the jobs submitted to perform physics simulation and analysis tasks.

For this work we chose the Spark distributed processing framework, on which I implemented three different applications in the Scala programming language.

These applications were tested on internal clusters, processing the same data as the current live systems, and their results were compared with the output of those systems.

Furthermore, my work included configuring virtual machines and additional software, such as Apache Flume, Apache Kafka, Mesosphere Marathon, Mesosphere Chronos, and Docker.

As a result of my work, a fully automated system now ensures that the three Spark jobs run as intended, processing around 30 gigabytes of data per day and performing multiple aggregations and enrichments with external data sources.

In the first chapter I give a short introduction to the context of my project and to computing at CERN.

Chapter 2 states the problem: it describes the issue that motivated my work and the solution I implemented for this thesis.

The third chapter explains the history of running Spark jobs in our team, advancing from very basic manual submission to dedicated frameworks.

Chapter 4 covers collecting the input data needed for further processing. It gives a detailed introduction to using Flume to periodically fetch rows from databases and forward events to multiple destinations.

Chapter 5 introduces how job monitoring currently works on Oracle and on the Spark processing pipeline, and then covers the implementation of the Spark job responsible for the real-time monitoring data.

After Chapter 5, the next two chapters summarize the implementation of the two accounting processing jobs.

Chapter 8 covers the obstacles I encountered while working with the Spark framework, along with some of my findings and the solutions to them.

Finally, in Chapter 9, I draw a short conclusion about the work covered in this thesis.

