Artificial Neural Networks with Tensorflow

Dr. Strausz György
Department of Measurement and Information Systems

My B.Sc. thesis had two main goals. First, I wanted to deepen my theoretical knowledge of artificial neural networks. Second, I wanted to master TensorFlow, Google's distributed numerical framework based on computational graphs, by programming neural networks in it. Covering this area without constraints would have been an enormous task, so I chose image classification as my focus.

First, I explain the motivation behind my work: why I think a profound knowledge of neural architectures is still relevant for anyone who wants to use them in a complex application, even though many ready-made architectures, some even pretrained for a specific task, are available. I then put my thesis into context with a brief historical review.

In the literature review I survey the TensorFlow ecosystem and the theory behind neural image classification, covering the topics I consider most important: multilayer perceptrons with and without pretraining, convolutional networks, and hybrid architectures. I briefly discuss two approaches to hybrid solutions. In the first, a generative model and a discriminative classifier are kept separate; in my case these are a restricted Boltzmann machine and a support vector machine, respectively. In the second, the two are fused into a single model, which I find fascinating. I found two such models: the hybrid restricted Boltzmann machine (Larochelle 2008) and its stacked version, the stacked Boltzmann expert network (Alexander G. 2015). The last part of the review highlights some newer, intriguing advances in the field.

In the measurements section of my thesis I experimented extensively with some of the models mentioned above, using the MNIST and CIFAR-10 datasets for evaluation. I had to decide which parts of a full experimental setup to neglect. I chose to pay little to no attention to data preprocessing, optimal early-stopping algorithms for training, and effective search and evaluation in the hyperparameter space. Each of these could be a good thesis topic on its own, but they would have drawn too much attention away from my main goal, which was to experiment with a wide range of models.

I conclude my thesis with a brief summary of what I learned in the previous sections.

