Performing interactions on web user interfaces with deep Q-learning

Dr. Gyires-Tóth Bálint Pál
Department of Telecommunications and Media Informatics

In recent years, GPUs with increasingly parallel computing capabilities have made it possible to solve computationally demanding problems much faster. Training neural networks on GPUs has sped up experiments considerably compared to using CPUs alone, because these models contain a huge number of parameters that have to be calculated, and the parameters within a layer are independent of one another, so they can be computed in parallel. At the same time, larger data sets have become available, which also contributed to the success of deep architectures.

With deep learning, many problems can be solved with less feature engineering and better accuracy. This is one of the reasons it has attracted growing attention from both researchers and the public. Today, deep learning is one of the most actively researched areas of machine learning.

In recent years, the company DeepMind achieved a significant breakthrough by combining deep learning with Q-learning. With this new method they developed agents that could play Atari games better than humans, using only the raw pixels as input. Research interest in reinforcement learning has grown as well, and new methods have been developed in this area too.

Deep reinforcement learning can be applied to other problems as well. In my thesis I investigate how useful deep reinforcement learning techniques are for agents that interact with graphical user interfaces. I use the Mini World of Bits benchmark to train and evaluate these techniques.

I experiment with the Deep Q-Network algorithm, evaluate its effectiveness, and apply it to problems of varying difficulty.
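
For readers unfamiliar with Q-learning, the following is a minimal sketch of the two ingredients a Deep Q-Network style agent typically relies on: the one-step learning target and epsilon-greedy action selection. It is an illustration only, not the implementation used in the thesis. The function names (td_target, epsilon_greedy), the discount factor of 0.99, and the plain Python lists standing in for the network's Q-value outputs are all assumptions made for this example.

```python
import random


def td_target(reward, next_q_values, done, gamma=0.99):
    """One-step Q-learning target: r + gamma * max_a' Q(s', a').

    next_q_values stands in for the Q-values a network would predict
    for the next state (hypothetical input, one value per action).
    """
    if done:
        # No future reward after a terminal step.
        return reward
    return reward + gamma * max(next_q_values)


def epsilon_greedy(q_values, epsilon, rng=random):
    """Pick a random action with probability epsilon, else the greedy one."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])


# Example: two possible actions in the next state, episode not finished.
target = td_target(reward=1.0, next_q_values=[0.5, 2.0], done=False, gamma=0.5)
action = epsilon_greedy([0.1, 0.9, 0.3], epsilon=0.0)
```

In a full agent, the network would be trained to move its prediction for the chosen action toward this target, and epsilon would usually be decreased over time so that the agent explores less as it learns.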

Finally, I analyze the possibility of extending my solution to web pages outside the benchmark.

