Designing effective convolutional neural networks for processing image information

Hadházi Dániel
Department of Measurement and Information Systems

In recent years, convolutional neural networks have transformed the field of image processing. In many cases, they have matched and then surpassed the accuracy and efficiency of classical methods designed by human experts. At present, there is no sign that the spread of the neural paradigm is slowing down.

However, these new solutions are practically black boxes, even for those who generally understand the tools of the neural paradigm. Although the learning process itself is well understood, the representation of the learned knowledge is hard to interpret. Thus we can only measure the accuracy of a solution on a limited available dataset, while it remains hard to answer questions such as why and in which cases the output will be incorrect. Such uncertainty is not acceptable in safety-critical systems such as self-driving cars, where the output of the system can cause or prevent fatal accidents. It has also become clear that, besides random failures of neural networks, we also have to guard against malicious manipulation of the input data.

A further problem is that state-of-the-art results rely on neural networks with many layers, in some cases hundreds, and millions of convolutional kernels. This high complexity raises the problem of efficiency in addition to the interpretability issue mentioned above. Typically, hundreds of millions of parameters have to be learned and stored. As a result, both training and computing the output for a given input are resource-intensive. Large and therefore slow networks can be a disadvantage in certain industrial applications. Consider self-driving cars again: if the decision whether to brake is made too slowly, the suitability of the network becomes questionable. For practical applications, efficiency in terms of parameter count and resource requirements is therefore a key issue, and it will remain one in the future.

In my work, I designed and examined neural network structures that can ease these problems. I examined in detail the network compression method based on Knowledge Distillation, and the use of the Spatial Transformer Network from this viewpoint. Finally, I proposed an adaptation of widely used direct regularization methods for this purpose. I analysed the effects of all the discussed modifications qualitatively and quantitatively and, as a result, showed that a significant gain in the efficiency of the constructed network can be achieved with only a minimal reduction in precision.
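As an illustration of the Knowledge Distillation idea mentioned above, the following is a minimal sketch of the commonly used distillation loss, in which a small student network is trained against both the true label and the temperature-softened outputs of a large teacher network. The function names, the temperature value and the weighting factor alpha are assumptions chosen for this example; the abstract does not state the exact formulation used in the thesis.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax of logits divided by a temperature."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, true_label,
                      temperature=4.0, alpha=0.5):
    """Weighted sum of a hard-label term and a soft teacher-matching term.

    temperature and alpha are illustrative defaults (assumptions),
    not values taken from the thesis.
    """
    # Hard term: cross-entropy between the true label and the student
    # prediction at temperature 1.
    p_student = softmax(student_logits)
    hard = -math.log(p_student[true_label])

    # Soft term: KL divergence from the softened teacher distribution
    # to the softened student distribution.
    q_teacher = softmax(teacher_logits, temperature)
    q_student = softmax(student_logits, temperature)
    soft = sum(t * (math.log(t) - math.log(s))
               for t, s in zip(q_teacher, q_student) if t > 0)

    # The T^2 factor keeps the gradient scale of the soft term
    # comparable across different temperatures.
    return alpha * hard + (1 - alpha) * temperature ** 2 * soft
```

When the student reproduces the teacher logits exactly, the soft term vanishes and only the weighted hard-label term remains; any mismatch with the teacher increases the loss.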

The thesis is structured as follows: in Chapter 1 I survey the approaches published in the literature, in Chapter 2 I describe parameter reduction based on Knowledge Distillation, in Chapter 3 I discuss the elimination of redundant filters, and in Chapter 4 I summarize the results achieved and draw the related conclusions. The thesis also includes a two-page appendix.

