The goal of object tracking is to determine the position of a moving object from frame to frame in a video. Tracking algorithms have grown in importance in recent years, driven by the spread of intelligent vehicles and robots. Since a tracker should be as efficient, fast and accurate as possible, deep learning is a natural choice for the task. Object recognition based on deep learning can outperform the human baseline, and I aim to use this capability to build a robust object tracking application.
The aim of the thesis is to implement object tracking with the aid of a deep neural network. Tracking is performed with a convolutional neural network pre-trained for image recognition, which is used as a fixed feature vector generator. The idea is that by exploiting the generalization ability of the convolutional network through the feature vectors it generates, the resulting algorithm will be more robust to the various distortions that cause difficulties for traditional algorithms that do not use a neural network.
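As an illustration of the fixed-feature-generator idea, the following minimal sketch matches candidate windows in a new frame against a template feature vector. It is not the implementation used in the thesis: the tiny randomly initialised network stands in for a pre-trained one, and the names `extractor`, `features` and `track_step`, the cosine-similarity score, and the `search` and `stride` parameters are all assumptions made for the example.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

# Stand-in for the pre-trained network (assumption): a tiny, randomly
# initialised CNN. In practice this would be a network pre-trained for
# image recognition with its classification head removed.
extractor = nn.Sequential(
    nn.Conv2d(3, 8, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.AdaptiveAvgPool2d(4),
    nn.Flatten(),
)
extractor.eval()
for p in extractor.parameters():
    p.requires_grad_(False)  # the network is fixed; it is never trained here


def features(frame, box):
    """Crop box=(x, y, w, h) from a (3, H, W) frame and return its feature vector."""
    x, y, w, h = box
    patch = frame[:, y:y + h, x:x + w].unsqueeze(0)
    with torch.no_grad():
        return extractor(patch).squeeze(0)


def track_step(frame, prev_box, template, search=8, stride=4):
    """Return the candidate box near prev_box whose features best match template."""
    x0, y0, w, h = prev_box
    _, height, width = frame.shape
    best_box, best_score = prev_box, float("-inf")
    for dy in range(-search, search + 1, stride):
        for dx in range(-search, search + 1, stride):
            x, y = x0 + dx, y0 + dy
            if x < 0 or y < 0 or x + w > width or y + h > height:
                continue  # skip windows that leave the frame
            score = F.cosine_similarity(
                features(frame, (x, y, w, h)), template, dim=0
            ).item()
            if score > best_score:
                best_box, best_score = (x, y, w, h), score
    return best_box, best_score


# Synthetic two-frame example: a textured square moves by (4, 4) pixels.
obj = torch.rand(3, 16, 16)
frame0 = torch.zeros(3, 48, 48)
frame0[:, 8:24, 8:24] = obj
frame1 = torch.zeros(3, 48, 48)
frame1[:, 12:28, 12:28] = obj

template = features(frame0, (8, 8, 16, 16))
new_box, score = track_step(frame1, (8, 8, 16, 16), template)
print(new_box, score)
```

On this synthetic pair the best-scoring window should coincide with the shifted square. With a real pre-trained network the same loop applies, but the features are expected to tolerate changes in lighting, scale and appearance far better than this stand-in.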
I implemented the project with the PyTorch framework, a Python package that is an enhanced, extended version of the Torch framework. It enables efficient implementation of machine learning algorithms, interoperates with popular Python packages such as NumPy and SciPy, and supports running on the GPU. I tested the completed algorithm on several videos, and based on these tests the performance of the tracking method is satisfactory.
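The NumPy interoperability and GPU support mentioned above can be sketched as follows. This is only an illustration under assumed conventions: the placeholder frame, the height-width-channel layout and the scaling to the range [0, 1] are choices made for the example, not details taken from the thesis.

```python
import numpy as np
import torch

# Use the GPU when one is available, otherwise fall back to the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Placeholder video frame in H x W x C uint8 layout (assumption).
frame_np = np.zeros((4, 6, 3), dtype=np.uint8)
frame_np[1, 2] = (255, 128, 0)

# torch.from_numpy wraps the array without copying; the tensor is then
# rearranged to C x H x W, scaled to [0, 1] and moved to the chosen device.
frame_t = torch.from_numpy(frame_np).permute(2, 0, 1).float().div(255.0).to(device)

# Convert back to a NumPy array in the original layout and value range.
restored = (frame_t.cpu() * 255.0).round().byte().permute(1, 2, 0).numpy()
```

The round trip leaves the pixel values unchanged, so frames read with a NumPy-based library can be passed to the network and the results handed back to NumPy or SciPy code.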