Parking Spot Recognition and Visualization with Semantic Segmentation

OData support
Kovács Viktor
Department of Automation and Applied Informatics

Automated parking is an important step towards fully driverless cars. However, there are several tasks that need to be accomplished in order to achieve automated parking. First, the car needs to be able to detect nearby free parking space accurately, based on its sensors (camera, ultrasonic sensor, etc.). Then, the trajectory of the parking manoeuvre has to be computed. Finally, the car has to be parked while avoiding collision with any other objects. My work focuses on recognizing free parking space based on the video feed of a car camera.

Deep neural network based models have become prevalent for tasks involving image processing, like object recognition. Furthermore, convolutional neural networks (CNNs) have achieved impressive results in image semantic segmentation. In semantic segmentation the goal is to classify each pixel of an input image into different categories, for example cars, road, pedestrians, buildings etc. With today's computing power, semantic segmentation techniques based on CNNs can achieve sufficiently accurate performance in real-time.

The aim of this work is twofold. First, I design a user interface where the user can interact with the video stream of the camera. I implement features, like allowing the user to intuitively specify parking spot locations on the camera picture. Furthermore, I implement an algorithm that computes the specified parking spot's location and orientation in the 3D world relative to the camera. I implement this user interface in python using the OpenGL library.

The next section of my paper focuses on experimenting with an implementation of the SegNet neural network model. First, I integrate the aforementioned user interface program with the semantic segmentation model. Then, I adapt the implementation to the task of segmenting free parking space. Furthermore, I make several changes to the architecture of the model, and I experiment with a number of training setups. One of such changes is the replacement of normal convolutional layers with depthwise separable convolutions. I also look at how changing the size of the training dataset affects the performance of the model. Finally, based on hundreds of trainings, I conduct an in-depth analysis of how the changes that I have made affect the model's accuracy and inference speed.


Please sign in to download the files of this thesis.