Distributed systems are a frequently studied approach to the problem of computational scalability. They are an enduring design pattern for solving large computational problems that exceed the capacity of a single machine, even the most powerful one that current technology offers. Computation tasks and data are split across several machines to avoid resource exhaustion, and data is exchanged over the network. However, the efficient allocation of computational resources, which in this context means the assignment of resources to tasks, is not trivial, because different allocations yield different system characteristics, such as performance (computation time) and other desired properties. Optimizing the allocation can therefore lead to better system characteristics.
The goal of this thesis is to give a mathematical formalization of the allocation problem and to solve it for two objectives: communication optimization, which aims to reduce the network communication overhead caused by data transmission, and cost optimization, which aims to reduce the monetary cost of the computational infrastructure required for the computation (for example, resource consumption in a public cloud).
The allocation optimization problems are formalized as combinatorial optimization problems and are solved with techniques for constraint satisfaction problems (CSPs).
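To illustrate what such a combinatorial formulation can look like, the following is a minimal sketch and not the model used in the thesis. It assumes a single hypothetical resource (memory), a per-machine capacity constraint, and a communication objective that sums the traffic between tasks placed on different machines. It uses plain exhaustive search instead of a CSP solver, and all names (best_allocation, task_mem, machine_cap, traffic) are illustrative assumptions.

```python
from itertools import product


def best_allocation(task_mem, machine_cap, traffic):
    """Exhaustively search task-to-machine assignments.

    task_mem[t]     -- memory demand of task t (hypothetical unit)
    machine_cap[m]  -- memory capacity of machine m (same unit)
    traffic[(a, b)] -- data volume exchanged between tasks a and b

    Returns (cost, assignment) with minimal cross-machine traffic,
    or None if no assignment satisfies the capacity constraints.
    """
    tasks = sorted(task_mem)
    machines = sorted(machine_cap)
    best = None
    # Every candidate maps each task to exactly one machine.
    for choice in product(machines, repeat=len(tasks)):
        assign = dict(zip(tasks, choice))
        # Constraint: total demand on a machine must not exceed its capacity.
        used = {m: 0 for m in machines}
        for t, m in assign.items():
            used[m] += task_mem[t]
        if any(used[m] > machine_cap[m] for m in machines):
            continue
        # Objective: only traffic between different machines crosses the network.
        cost = sum(v for (a, b), v in traffic.items() if assign[a] != assign[b])
        if best is None or cost < best[0]:
            best = (cost, assign)
    return best


# Illustrative input: three tasks that do not fit on a single machine.
result = best_allocation(
    task_mem={"A": 2, "B": 2, "C": 2},
    machine_cap={"m1": 4, "m2": 4},
    traffic={("A", "B"): 10, ("B", "C"): 1, ("A", "C"): 1},
)
```

In this example the heavily communicating tasks A and B end up on the same machine. Exhaustive search grows exponentially with the number of tasks, which is one reason to hand such models to a dedicated solver. A cost objective could be sketched in the same way by replacing the traffic sum with the price of the machines that are actually used.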
The proposed allocation solutions are evaluated in a case study of distributed model queries, which are data- and resource-intensive applications of distributed systems.
IncQuery-D, a distributed incremental model query engine, will be used as the target for
implementing the proposed allocation models. We will use heuristics based on the
behaviour of the system to estimate the characteristics required by the
allocation. A monitoring system is also developed for IncQuery-D to validate the
results of the allocation and to help implement the heuristics.
We aim to demonstrate the impact of the optimization techniques on query engine
performance through measurements. The results can be viewed in the IncQuery-D
monitoring system that we created for the query engine. A further achievement is
that the monitoring system and the optimization facilities are fully integrated
into IncQuery-D's own development environment. In addition, the performance of
the solution algorithms will be observed and conclusions will be made in