Monitoring and Performance Measurement of Fission Serverless Platform

Supervisor:
Dr. Maliosz Markosz
Department of Telecommunications and Media Informatics

The terms Function-as-a-Service and serverless have gained considerable popularity in recent years. Together, they describe platforms that make a developer's work considerably easier by running parts of a program in a distributed system, without the developer having to manage the underlying infrastructure or even be familiar with it. In addition to commercial solutions, several open-source projects have emerged, one of which is Fission.

Fission is a Kubernetes-native serverless platform that has gained many features since its launch, making it a noteworthy option. My task was to study Fission in depth: to understand its internal mechanics, assess its usability, and examine the efficiency of function execution in Kubernetes pods with and without autoscaling.

The first step was to build a Kubernetes cluster from virtual machines in a home environment and to configure it properly. I then installed Fission together with the related monitoring services: Prometheus, Grafana and Jaeger. I documented every installation step, including the solutions to the problems that arose. Next, I examined how the platform operates and how it can be used. In preparation for the timing measurements, I developed solutions to save the results of the benchmark applications (Hey, Wrk) and to export all data generated during the measurements from the Prometheus monitoring system as a snapshot.

The goal was to investigate the efficiency of the Fission platform, for which I was given access to a high-performance server cluster well suited to accurate measurements. Accuracy was further helped by running the measurements from a separate server outside the cluster, so that they did not interfere with the cluster's operation. I wrote a script that spares me from carrying out the complete measurement manually and makes it easy to repeat the whole process, even on another cluster. I configured the monitoring services to track CPU usage and function execution times during the measurements.

The measurements show that Fission can operate with low latency, but under higher load requests may return errors unless the pods are scalable and have sufficient resource limits. The default settings without autoscaling are adequate for average usage, and even the first function executions have low latency. Autoscaling works as intended, so on-demand services can be built with it. Overall, Fission is a strong open-source project with useful features and fast function invocation, but anyone who wants to use it must invest considerable time in it, covering cluster maintenance, installation of the monitoring services and the learning curve of Fission itself.
