Deep autoencoders and convolutional neural networks for image classification in practice

Dr. Tóth Bálint Pál
Department of Telecommunications and Media Informatics

My thesis consists of two main parts: the first covers supervised learning based on deep neural networks, and the second covers unsupervised learning. Supervised learning is a much more advanced field of deep learning than unsupervised learning; for the latter, I therefore mainly research the theoretical aspects.

My supervised learning research focuses on creating convolutional neural networks capable of automatically detecting and identifying invasive plants. Exterminating these invasive species is crucial for both economic and health reasons: having no competitor species, they can spread rapidly and endanger local flora, and in some cases their pollen may trigger serious allergic reactions.

This part of my research is based on deep convolutional neural networks. In the last decade, convolutional neural networks have become one of the most significant techniques for automated image recognition and image classification. With the dramatic improvement of graphics processing units (GPUs), the abundance of digital data, and new scientific results, today’s convolutional neural networks can surpass humans on certain image recognition tasks.

In my study, I review the theoretical background of different image recognition architectures to find an adequate method for this task. I use three types of mainstream convolutional neural networks that are reported in the literature to be highly accurate. I either train the networks from scratch or take pretrained networks and train them further with new classification layers. I optimize the training for two GPUs of different performance.

I evaluate the networks’ output using two metrics: accuracy and MAP (Mean Average Precision). According to my current results, these networks can achieve 70% accuracy in identifying invasive plants on the test dataset.
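
To make the two metrics concrete, here is a minimal sketch in plain Python. It is not the evaluation code of the thesis: the function names are hypothetical, and it assumes that MAP is computed as the unweighted mean of per-class average precision, where each class ranks all test images by the score the network gives that class. Other definitions of MAP exist, and the thesis may use a different one.

```python
from typing import List, Sequence


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of samples whose predicted label equals the true label."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and equally long")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def average_precision(relevant: Sequence[bool], scores: Sequence[float]) -> float:
    """Average precision for one class.

    Samples are ranked by descending score; precision is measured at every
    rank that holds a relevant sample, and those precisions are averaged.
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    hits = 0
    precision_sum = 0.0
    for rank, i in enumerate(order, start=1):
        if relevant[i]:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / hits if hits else 0.0


def mean_average_precision(
    y_true: Sequence[int], class_scores: Sequence[Sequence[float]]
) -> float:
    """Mean of per-class average precision (assumed definition of MAP).

    class_scores[i][c] is the score the network assigns to class c for
    sample i. Classes that never occur in y_true are skipped.
    """
    n_classes = len(class_scores[0])
    aps: List[float] = []
    for c in range(n_classes):
        relevant = [t == c for t in y_true]
        if any(relevant):
            aps.append(average_precision(relevant, [row[c] for row in class_scores]))
    if not aps:
        raise ValueError("no class in y_true matches a score column")
    return sum(aps) / len(aps)
```

Accuracy only looks at the top prediction per image, while MAP also rewards a network for ranking the correct images high even when they are not ranked first, which is why the two numbers can differ noticeably on the same test set.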

The second part of my thesis concentrates on unsupervised learning, mainly on the theory and use of autoencoders. Unsupervised learning means that the training data is not labelled. The main objective of my research is to examine autoencoders. With the knowledge obtained, I try to sort unlabelled images into groups using various types of autoencoders and methods. I also measure the performance on labelled datasets.

In the end, I could not solve the sorting of the pictures with unsupervised learning. Training a separate autoencoder for each class did succeed, but this cannot qualify as unsupervised learning, because training on the classes separately requires knowing the class of each picture. Nevertheless, my research can be useful for future development of this field.
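
The per-class approach can be illustrated with a small sketch. It assumes that a picture is assigned to the class whose autoencoder reconstructs it with the smallest error, which is a common way to use per-class autoencoders but is not confirmed by the abstract. To stay self-contained, the sketch replaces the deep autoencoders of the thesis with a one-unit linear autoencoder fitted by power iteration (equivalent to keeping the first principal component). All names are hypothetical.

```python
import math
from typing import Dict, Hashable, List, Sequence

Vector = Sequence[float]


def _dot(a: Vector, b: Vector) -> float:
    return sum(x * y for x, y in zip(a, b))


class LinearAutoencoder:
    """Stand-in for a deep autoencoder: one linear code unit.

    encode(x) = w . (x - mean), decode(code) = mean + code * w.
    The weight vector w is found by power iteration on the scatter matrix.
    Sketch only: the fixed start vector fails if it happens to be exactly
    orthogonal to the data's main direction.
    """

    def __init__(self, points: Sequence[Vector], iterations: int = 100) -> None:
        dim = len(points[0])
        n = len(points)
        self.mean = [sum(p[d] for p in points) / n for d in range(dim)]
        centered = [[x - m for x, m in zip(p, self.mean)] for p in points]
        w = [1.0 / math.sqrt(dim)] * dim
        for _ in range(iterations):
            new_w = [0.0] * dim
            for c in centered:
                proj = _dot(c, w)
                for d in range(dim):
                    new_w[d] += proj * c[d]
            norm = math.sqrt(_dot(new_w, new_w))
            if norm == 0.0:
                break  # degenerate data; keep the current direction
            w = [v / norm for v in new_w]
        self.w = w

    def reconstruction_error(self, x: Vector) -> float:
        """Squared distance between x and its encode-then-decode image."""
        c = [xi - m for xi, m in zip(x, self.mean)]
        code = _dot(c, self.w)
        return sum((ci - code * wi) ** 2 for ci, wi in zip(c, self.w))


def fit_per_class(
    samples: Sequence[Vector], labels: Sequence[Hashable]
) -> Dict[Hashable, LinearAutoencoder]:
    """Train one autoencoder per class; this step needs the labels."""
    by_class: Dict[Hashable, List[Vector]] = {}
    for x, y in zip(samples, labels):
        by_class.setdefault(y, []).append(x)
    return {y: LinearAutoencoder(pts) for y, pts in by_class.items()}


def classify(models: Dict[Hashable, LinearAutoencoder], x: Vector) -> Hashable:
    """Pick the class whose autoencoder reconstructs x best."""
    return min(models, key=lambda y: models[y].reconstruction_error(x))
```

The call to `fit_per_class` makes the limitation visible: the samples have to be grouped by label before any autoencoder is trained, so the procedure as a whole is supervised even though each individual autoencoder is trained without targets.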

