Application of artificial neural networks on music style transfer

Dr. Pataki Béla
Department of Measurement and Information Systems

One of the most popular and rapidly advancing fields of artificial intelligence today is deep learning. This is partly due to ever-increasing hardware capabilities, which allow programs with growing computational and memory needs to run in reasonable time, and partly due to several theoretical and practical breakthroughs. One of the most important of these is the relatively recent rediscovery of convolutional networks, which can describe a locally coherent problem with vastly fewer parameters than a fully connected neural network would need.
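The parameter saving mentioned above can be made concrete with a small count. The sketch below compares one fully connected layer with one convolutional layer on the same input; the input size (a 128 x 128 single-channel patch), the 3 x 3 kernel and the 16 feature maps are illustrative assumptions and are not taken from the thesis.

```python
def dense_params(n_in, n_out):
    """Weights plus biases of a fully connected layer."""
    return n_in * n_out + n_out


def conv2d_params(kernel_h, kernel_w, channels_in, channels_out):
    """Weights plus biases of a 2-D convolutional layer.

    The same kernel is reused at every position, so the count does not
    depend on the height or width of the input.
    """
    return kernel_h * kernel_w * channels_in * channels_out + channels_out


# Assumed input: a 128 x 128 single-channel patch (e.g. a spectrogram excerpt).
height, width = 128, 128

# Fully connected layer producing one output per input position.
dense = dense_params(height * width, height * width)

# Convolutional layer with 16 feature maps and 3 x 3 kernels (assumed values).
conv = conv2d_params(3, 3, 1, 16)

print(f"fully connected: {dense:,} parameters")
print(f"convolutional:   {conv:,} parameters")
```

Under these assumptions the fully connected layer needs hundreds of millions of parameters, while the convolutional layer needs only a few hundred, because its weights are shared across all positions of the input.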

Another important property of these structures is that in certain classification problems, a neural network comprising a convolutional part and a fully connected part may learn, in its convolutional layers, to select and extract domain-specific features that help the classification. This makes convolutional networks well suited to understanding and computing features of complex data structures; in the case of pictures or music, some of these features are known as style.

Such classifier networks have often been used to modify the ‘style’ of a picture so that it resembles that of another. Several advances have been made in this field, but style transfer has so far only been rated subjectively.

In my thesis I present the technologies necessary for music style transfer, then implement an algorithm and evaluate it using two new, objective metrics. To achieve this, I design and train a convolutional network for music classification and optimise it with a genetic algorithm. Other parameters are optimised with an exhaustive search of the problem domain.
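As a rough illustration of how a genetic algorithm can optimise a classifier, the sketch below evolves a small set of network hyperparameters. Everything in it is an assumption made for illustration: the search space, the operators (uniform crossover, random-reset mutation, elitism) and above all `toy_fitness`, which stands in for the validation accuracy that training a real network would provide. It is not the configuration used in the thesis.

```python
import random

# Hypothetical search space; the abstract does not say which parameters were evolved.
SEARCH_SPACE = {
    "filters": [8, 16, 32, 64],
    "kernel_size": [3, 5, 7],
    "dense_units": [32, 64, 128, 256],
}


def random_individual(rng):
    """Draw one candidate configuration uniformly from the search space."""
    return {name: rng.choice(values) for name, values in SEARCH_SPACE.items()}


def toy_fitness(individual):
    """Stand-in for validation accuracy: fraction of genes matching an arbitrary target."""
    target = {"filters": 32, "kernel_size": 5, "dense_units": 128}
    matches = sum(1 for name in target if individual[name] == target[name])
    return matches / len(target)


def crossover(parent_a, parent_b, rng):
    """Uniform crossover: each gene is copied from one of the two parents."""
    return {name: rng.choice((parent_a[name], parent_b[name])) for name in SEARCH_SPACE}


def mutate(individual, rng, rate=0.2):
    """Random-reset mutation: each gene is redrawn with probability `rate`."""
    return {
        name: rng.choice(SEARCH_SPACE[name]) if rng.random() < rate else value
        for name, value in individual.items()
    }


def evolve(fitness, generations=20, pop_size=12, elite=2, seed=0):
    """Run a simple generational genetic algorithm and return the best individual found."""
    rng = random.Random(seed)
    population = [random_individual(rng) for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        parents = population[: pop_size // 2]  # truncation selection
        children = []
        while len(children) < pop_size - elite:
            parent_a, parent_b = rng.sample(parents, 2)
            children.append(mutate(crossover(parent_a, parent_b, rng), rng))
        population = population[:elite] + children  # elitism keeps the best unchanged
    return max(population, key=fitness)


best = evolve(toy_fitness)
print(best, toy_fitness(best))
```

In a real setting the fitness call would train and validate a network for each candidate, so it would dominate the running time; caching fitness values for configurations that have already been evaluated is a common way to reduce that cost.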

I achieved a considerable increase in music classification accuracy by using a genetic algorithm, and both the style transfer and the objective metrics show promising results. The methodology presented here can also be applied to other problems in the field of machine learning.

