Machine learning methods for increased usability of user interfaces

Dr. Csorba Kristóf
Department of Automation and Applied Informatics

This thesis introduces the CV4Sensorhub general image processing framework from both the architectural and the application point of view. I have chosen two tools that can be enhanced with machine learning. The first tool is a Monte Carlo-based segmentation algorithm called RJMCMC (Reversible Jump Markov Chain Monte Carlo). Its goal is to find the borders of the grains in marble thin section images in order to support marble provenance analysis. I have implemented this algorithm and tested it on both artificial and real marble images. Furthermore, some slight modifications enabled me to segment slate images with good results as well. Unfortunately, marble images that contain twin crystals are difficult to cope with, and further refinement of the energy terms (on both the parametric and the structural side) would be necessary to handle them. Machine learning could help here in two ways. First, there are many hyperparameters to adjust, and a machine learning algorithm may be able to automate their adjustment and find better parameter settings. Second, the whole segmentation could be addressed by a neural network-based algorithm. The second tool is the so-called live-polyline, a smart drawing tool. It is also intended for marble thin section image segmentation, but it is more general than RJMCMC. Generality in this context means that it should be able to operate on different types of images (marble, white blood cells, concrete, etc.) and under different contrast conditions. The thesis proposes a traditional approach without machine learning and another one with machine learning.
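As an illustration of how an energy-driven Monte Carlo sampler decides whether to keep a proposed segmentation change, the sketch below shows the standard Metropolis acceptance criterion. It is not taken from the thesis: the function name, the `temperature` parameter and the scalar energies are assumptions made for this example, and a full reversible jump sampler would additionally include the proposal ratio and a Jacobian term for moves that change the number of objects.

```python
import math
import random


def metropolis_accept(energy_current, energy_proposed, temperature, rng=random):
    """Decide whether a proposed configuration replaces the current one.

    Hypothetical helper for illustration only; lower energy means a better fit.
    """
    delta = energy_proposed - energy_current
    if delta <= 0:
        # A proposal that does not increase the energy is always accepted.
        return True
    # A worse proposal is accepted with probability exp(-delta / temperature).
    return rng.random() < math.exp(-delta / temperature)
```

Occasionally accepting a worse configuration lets the sampler escape local minima of the energy, and lowering `temperature` over time makes such uphill moves increasingly rare.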

Applying machine learning methods inside CV4Sensorhub required a suitable framework. Because reinforcement learning (RL) is the most general learning framework, I focused closely on RL algorithms during the design process. The currently available state-of-the-art machine learning technologies (TensorFlow, PyTorch) and their relation to the .NET Windows environment were taken into account as well. The principal idea of the framework is to separate training from usage. Training is executed on a Linux-based GPU server and the code is written in Python. Usage happens in the C# environment, for which I have created a dedicated library named NNSharp.
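The separation of training from usage can be pictured with a small sketch: the training side serializes the learned parameters, and the usage side evaluates the network from that serialized form without any training framework. The JSON layout and both function names below are hypothetical and chosen only for illustration; the abstract does not describe the format that NNSharp actually reads, and the usage side in the thesis is written in C#, not Python.

```python
import json


def export_dense_layer(weights, biases):
    """Serialize one dense layer to a JSON string (hypothetical format).

    `weights` holds one row of input weights per output unit.
    """
    return json.dumps({"weights": weights, "biases": biases})


def dense_forward(layer_json, inputs):
    """Evaluate a ReLU dense layer from its serialized form, framework-free."""
    layer = json.loads(layer_json)
    outputs = []
    for row, bias in zip(layer["weights"], layer["biases"]):
        activation = sum(w * x for w, x in zip(row, inputs)) + bias
        outputs.append(max(0.0, activation))  # ReLU
    return outputs
```

Because the usage side depends only on the exported numbers, the inference code can be reimplemented in another language without bringing the training stack along.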

The machine learning-supported versions of live-polyline comprise a supervised learning-based and a reinforcement learning-based approach. To find candidate algorithms, I thoroughly investigated the field of reinforcement learning. As a starting point, I decided to work with DQN (deep Q-network) and tried to replicate its results on the game Breakout in the Atari test bed. This took considerably more time than I expected, but the results are acceptable. I have implemented both machine learning-supported versions of live-polyline, but they have not produced satisfactory results.
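For readers unfamiliar with DQN, the core of its training step is the one-step temporal-difference target, y = r + gamma * max_a' Q_target(s', a'), with the bootstrap term dropped at the end of an episode. The sketch below computes these targets for a batch in plain Python. The function name, argument layout and default discount factor are assumptions for this example and are not taken from the thesis implementation.

```python
def dqn_targets(rewards, next_q_values, dones, gamma=0.99):
    """Compute one-step Q-learning targets for a batch of transitions.

    rewards:        list of immediate rewards r
    next_q_values:  list of per-action Q-value lists for the next state,
                    as estimated by the target network
    dones:          list of booleans, True if the episode ended
    """
    targets = []
    for reward, q_next, done in zip(rewards, next_q_values, dones):
        # No bootstrapping from the next state once the episode has ended.
        bootstrap = 0.0 if done else gamma * max(q_next)
        targets.append(reward + bootstrap)
    return targets
```

The online network is then regressed toward these targets for the actions that were actually taken, while the target network is refreshed only periodically to keep the targets stable.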

