Object Localization in Images using Deep Neural Network

Dr. Szűcs Gábor
Department of Telecommunications and Media Informatics

In the 21st century, data is one of our most valuable resources. The accelerating pace of life, growing sales of technical devices, the services built on these devices, and new corporate and user behaviours all lead to increased data generation. Handling this data raises many questions in technology, ethics, law and economics. Some of these remain unanswered, and many more challenges will presumably arise in the coming years. As a computer science student, my aim in this work was to examine one area of today's technical challenges, namely image analysis.

As part of the project, I designed and implemented a neural-network-based system that, trained on a dataset of photos and labels carrying semantic information, recognizes and localizes objects in new images. In my task these objects are different kinds of cars, and I predict their location and brand.

The system I implemented is based on an object localization system called YOLO (You Only Look Once). It is part of the Darknet neural network framework, which is written in C and CUDA. YOLO is currently one of the most accurate and fastest solutions for object localization in images and videos. I implemented a workflow that contains all the steps needed to create a trained model, including data processing, model training, validation and testing. I then created an automated version of the workflow, which makes the system easier to configure and increases its reusability.
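
To illustrate the data processing step, the following minimal sketch converts a pixel-coordinate bounding box into the normalized label line that Darknet expects for YOLO training (class index, box centre and box size, each relative to the image dimensions). The function name `to_yolo_label` and the input layout `(x_min, y_min, x_max, y_max)` are assumptions made for this example; it is not the code of the thesis.

```python
def to_yolo_label(class_id, box, img_w, img_h):
    """Return a Darknet-style label line for one object.

    box is (x_min, y_min, x_max, y_max) in pixels (assumed input layout).
    The output is "<class> <x_center> <y_center> <width> <height>",
    with the four box values normalized to the range [0, 1].
    """
    x_min, y_min, x_max, y_max = box
    x_center = (x_min + x_max) / 2.0 / img_w
    y_center = (y_min + y_max) / 2.0 / img_h
    width = (x_max - x_min) / img_w
    height = (y_max - y_min) / img_h
    return f"{class_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"


if __name__ == "__main__":
    # A 200x100 px box centred in a 400x200 px image, class index 0.
    print(to_yolo_label(0, (100, 50, 300, 150), 400, 200))
```

In a setup like this, each image would typically have a text file with one such line per labelled car, where the class index stands for the brand.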

After an overview of the current literature on object recognition and localization, my thesis presents the implementation, training and fine-tuning of my own system based on YOLO. I also present my measurements of the accuracy of object recognition and localization.
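
Localization accuracy is commonly quantified with intersection over union (IoU), the overlap between a predicted and a ground-truth box divided by the area of their union. The sketch below shows this measure as an illustration only; the abstract does not state which metric the measurements use, and the function name `iou` and the `(x_min, y_min, x_max, y_max)` box layout are assumptions.

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes.

    Each box is (x_min, y_min, x_max, y_max); this layout is an assumption.
    Returns a value between 0.0 (no overlap) and 1.0 (identical boxes).
    """
    # Corners of the overlapping rectangle, if there is one.
    ix_min = max(box_a[0], box_b[0])
    iy_min = max(box_a[1], box_b[1])
    ix_max = min(box_a[2], box_b[2])
    iy_max = min(box_a[3], box_b[3])

    # Clamp to zero so disjoint boxes give an empty intersection.
    inter = max(0.0, ix_max - ix_min) * max(0.0, iy_max - iy_min)

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter

    return inter / union if union > 0 else 0.0


if __name__ == "__main__":
    predicted = (1, 1, 3, 3)
    ground_truth = (0, 0, 2, 2)
    print(iou(predicted, ground_truth))
```

A prediction is usually counted as a correct localization when its IoU with the ground-truth box exceeds a chosen threshold, with 0.5 being a common choice.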

