My thesis is about applying reinforcement learning in complex video game environments. Combining reinforcement learning with high dimensional image inputs was a major breakthrough in the field of machine learning, in my thesis I describe the necessary knowledge and related practice to implement such a program. The methods described are based on the publication by DeepMind, in which they achieved outstanding results.
The goal of reinforcement learning is to build an optimal strategy in an environment. One method is to optimize the assigned Q values to state-action pairs, based on the information received from the environment. Similarly to the publication this is achieved using a deep convolutional neural network for learning Q values from images and rewards provided by the environment. After adequate learning the predicted Q values are optimal, meaning that choosing the action with the highest value results in an optimal strategy.
In my thesis I briefly describe different ways of improvement, which can further accelerate the learning process. These methods were published after the original publication and their application would results in remarkably faster learning.
Due to the fact that this implementation needs a very high and expensive computing capacity, I adopted a simplified solution. Choosing this option is not an obstacle to achieve results close to human performance. Finally I present how the learned Q values could be understood as part of a strategy.