Information technology and digitization are spreading into more and more areas of life, not only in business applications but also in everyday life, and they generate ever more data. The volume of generated data is growing exponentially, driven by the extremely large number of social network users. Through active use of Web 2.0 features, every activity of these communities contributes heavily to the rapid multiplication of data. Since the costs of data storage and data mining keep falling, many companies and non-profit organizations have realized that the collected data can be a rich source of valuable information that they cannot afford to lose. Big data refers to the aspect of data handling that can reveal and predict the behaviour of individual users based on the huge amount of collected data. The most common applications of big data are recommendation systems. These systems give users practical help and guidance in finding relevant content and in discovering new content surprisingly easily.
My diploma thesis presents the planning, design and implementation of a music recommendation system. It introduces Apache Hadoop, the technology that pioneered the big data revolution; nowadays Hadoop and the term big data are treated as practically synonymous. The world of big data is dominated by Hadoop today, since its distributed operation and scalability make it well suited to tasks that involve big data. Apache Spark is also described in detail, because it was the tool used to implement the recommendation system. After a brief overview of recommendation systems in general and of the widespread implementation methods of music recommendation systems in particular, the design decisions are explained. The original data set, the way data flows through the system, and the applied technologies are also presented extensively. The last chapter covers the complete solution, including the web application that displays the recommendations, and finally evaluates the accuracy of the recommendation mechanism.