Distributed application management with container-based virtualization

Supervisor:
Dr. Ráth István Zoltán
Department of Measurement and Information Systems

The goal of this thesis was to design a distributed, Docker-based infrastructure that supports a predefined set of paradigms: Infrastructure as Code, Immutable Infrastructure, and Continuous Integration & Deployment.

A prerequisite for the design phase was choosing a distributed Docker implementation to suit the needs of an existing application environment. Two alternatives were examined: Docker Swarm and Kubernetes. As the two offer very similar features, the decision was based on how widely each is adopted and how much support is available for it. On these criteria, Kubernetes was chosen as the framework for the infrastructure.

Resource allocation for the machines running the Kubernetes infrastructure was based on the current application environment. CoreOS was chosen as the operating system for these machines because it has built-in provisioning and configuration automation capabilities. An external HTTP server was set up to provide access to the binaries and configuration files needed during machine provisioning. The number of components responsible for cluster management tasks was likewise sized to the needs of the current environment; on that basis, a single Master Node was deemed sufficient.
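The external HTTP server described above can be illustrated with a minimal sketch. It assumes nothing about the server actually used in the thesis: it only serves a directory of provisioning files over HTTP with the Python standard library. The function name `start_file_server` and the directory layout are hypothetical.

```python
import functools
import http.server
import threading


def start_file_server(directory: str, port: int = 0) -> http.server.ThreadingHTTPServer:
    """Serve the files under `directory` over HTTP in a background thread.

    Hypothetical helper for illustration only. Port 0 lets the operating
    system pick a free port; read it back from `server.server_address[1]`.
    """
    # Bind the request handler to the directory that holds the binaries
    # and configuration files needed during machine provisioning.
    handler = functools.partial(
        http.server.SimpleHTTPRequestHandler, directory=directory
    )
    server = http.server.ThreadingHTTPServer(("127.0.0.1", port), handler)
    # A daemon thread does not keep the interpreter alive on exit.
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

A machine being provisioned would then fetch its files from a URL under this server. In a real deployment the bind address would have to be reachable from the cluster machines rather than the loopback address used here.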

To ensure high availability for the application running on the Kubernetes cluster, it was necessary to choose a distributed storage implementation. Two alternatives were examined: Ceph and Gluster. The I/O performance of local LVM-backed storage served as the baseline for the comparison, and measurements were taken with the fio tool. Based on the gathered data, Ceph proved to be the better alternative and was chosen as the storage backend for the Kubernetes cluster.
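A comparison against a local baseline can be sketched as follows. The sketch assumes the measurements were exported with fio's JSON output format (`--output-format=json`), whose reports contain a `jobs` list with per-job `read` and `write` sections. The backend names and all numbers are placeholders, not results from the thesis, and the helper names are hypothetical.

```python
def total_iops(report: dict) -> float:
    """Sum read and write IOPS over all jobs in a parsed fio JSON report."""
    return sum(
        job["read"]["iops"] + job["write"]["iops"] for job in report["jobs"]
    )


def relative_to_baseline(baseline: dict, candidates: dict) -> dict:
    """Express each candidate's total IOPS as a fraction of the baseline's."""
    base = total_iops(baseline)
    return {name: total_iops(rep) / base for name, rep in candidates.items()}


def make_report(read_iops: float, write_iops: float) -> dict:
    """Build a stripped-down report with the fields used above (placeholder data)."""
    return {
        "jobs": [
            {
                "jobname": "randrw",
                "read": {"iops": read_iops},
                "write": {"iops": write_iops},
            }
        ]
    }


# Placeholder numbers only; they do not come from the thesis measurements.
baseline = make_report(1000.0, 1000.0)
candidates = {
    "backend_a": make_report(600.0, 400.0),
    "backend_b": make_report(300.0, 200.0),
}
ratios = relative_to_baseline(baseline, candidates)
best = max(ratios, key=ratios.get)
```

Normalizing against the local baseline makes the overhead of each distributed backend directly comparable, independent of the absolute speed of the underlying disks. Real reports would be loaded with `json.load` from the files fio writes.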

The first application to be run on the Kubernetes cluster was called Replay. In preparation for its migration, all Kubernetes configuration object descriptors were defined in advance, together with helper scripts for automating the migration and a migration plan. The test environment was migrated according to the steps described in this document, and no significant deficiencies were discovered in the process. Post-migration analysis revealed several optimization opportunities, the most significant being a change to the database migration method.
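To illustrate what a configuration object descriptor looks like, the sketch below builds a Deployment descriptor as a Python dictionary and serializes it to JSON, a format `kubectl apply -f` accepts alongside YAML. It uses the current `apps/v1` schema, which may differ from the API version used in the thesis. The image reference, labels, and replica count are hypothetical and are not taken from the actual Replay configuration.

```python
import json


def deployment_descriptor(name: str, image: str, replicas: int = 1) -> dict:
    """Build a minimal apps/v1 Deployment descriptor (illustrative only)."""
    labels = {"app": name}
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name, "labels": labels},
        "spec": {
            "replicas": replicas,
            # The selector must match the labels on the pod template.
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {"containers": [{"name": name, "image": image}]},
            },
        },
    }


# Hypothetical image reference, used only to show the descriptor's shape.
descriptor = deployment_descriptor(
    "replay", "registry.example.com/replay:1.0", replicas=2
)
manifest = json.dumps(descriptor, indent=2)
```

Keeping descriptors as generated, versioned files fits the Infrastructure as Code paradigm named in the thesis goals, since the cluster state can then be reproduced from the repository.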
