Model partition algorithms for distributed model transformation

OData support
Dr. Mezei Gergely
Department of Automation and Applied Informatics

In the field of model processing, for certain tasks, input datasets can be too large, operations can be too compute-intensive to handle on common workstations/servers, and the most cost-effective, or the only possible method of solving these problems is with the use of distributed computing. To handle the input model data on separate computing nodes, we need to distribute the model efficiently, which is a difficult task in itself.

As part of my thesis, I explain the basic concepts of modeling, pattern matching and model transformation, using graphs as model representation. I choose graph partitioning as the way of dividing the input model between the computing nodes and review some of the existing algorithms in this field.

In my thesis, I examined the problem of model partitioning with the help of an existing distributed model processing framework (DMT) created for research purposes. I replaced the originally inefficient partitioning component, and integrated a well-known graph partitioning software (METIS) into the framework as the partitioning engine.

During my research, I modified METIS's source code and broadened its existing set of graph partitioning algorithms. I created a benchmark suite to measure the effectiveness of these different partitioning methods; the ones included originally in METIS and the ones implemented by me. I examined their combinations, and the effect of their parameter settings.

As input, I generated randomized graphs in different sizes and characteristics, and created subgraphs from a real-life dataset, the online available IMDb movie database.

During these benchmarks I ran METIS in stand-alone mode, measuring only the partitioning phase. I also ran pattern matching in DMT with METIS embedded as the partitioning module, measuring the runtime of the matching.

I present the summarized results of my measurements with diagrams and interpret them in my thesis. Finally, I draw conclusions and choose the best resulting partitioning method combination based on the measured data and my own experience.


Please sign in to download the files of this thesis.