Data mining has recently come into worldwide use because it lets people find new, useful, and non-trivial connections between common things.
One of data mining’s best-known methodologies is CRISP-DM (Cross-Industry Standard Process for Data Mining), which looks at a project’s objectives from a business point of view and finds new techniques that way. This thesis covers only the modeling and evaluation phases of this methodology.
Recommender systems are among the best-known products of data mining. They not only suggest films, music, and books to us, but also recommend people on social networks whom we might know.
Because markets are being flooded with more and more products, we desperately need ever better recommender systems: we don’t want to waste our money or time on things that even a machine can tell we won’t like, and we don’t want to miss things we would like just because we haven’t heard of them.
Building these systems would be much harder without Hadoop, which helps us process massive amounts of data on systems with distributed resources.
The goal of this thesis is to introduce this platform and to present the basic algorithms of recommender systems.