Web mining of used car classified advertisements
Consultant: Gáspár-Papanek Csaba
BME Dept. of Telecommunications and Media Informatics
The used car market is a typical area in which large amounts of data are available. Take, for example, an online used car advertisement site, which contains tens of thousands of used car listings.
Data mining, a relatively young and interdisciplinary field of computer science, is the process of discovering new patterns in large data sets. Its goal is to extract knowledge from a data set and present it in a human-understandable structure.
The goal of this thesis is to analyze whether the price of a used car can be predicted from its features, using learning methods trained on the data of other used cars.
Predictive analytics is a typical application of data mining. The approaches used for predictive analytics fall into several groups, one of which is regression techniques. A linear regression model describes the relationship between the response (dependent) variable and a set of independent (predictor) variables. Throughout the data mining process I follow the CRISP-DM (Cross Industry Standard Process for Data Mining) methodology.
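To make the linear regression idea concrete, the following is a minimal sketch, not the model built in the thesis. It assumes a single predictor (car age) instead of a full feature set, and the function name, variable names and data values are all made up for illustration. It fits the line price = intercept + slope * age by ordinary least squares.

```python
from statistics import mean


def fit_simple_linear_regression(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x (one predictor)."""
    x_bar, y_bar = mean(xs), mean(ys)
    # Sum of squared deviations of the predictor.
    sxx = sum((x - x_bar) ** 2 for x in xs)
    # Sum of cross-deviations between predictor and response.
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


# Hypothetical, made-up data: car age in years and asking price (arbitrary units).
ages = [1, 2, 3, 4, 5]
prices = [9000, 8000, 7000, 6000, 5000]

intercept, slope = fit_simple_linear_regression(ages, prices)

# Predict the price of a hypothetical six-year-old car.
predicted_price = intercept + slope * 6
```

With several predictors (mileage, engine size, and so on) the same least squares principle applies, but the coefficients are usually estimated with a numerical library rather than by hand.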
The thesis documents the whole data mining process, starting with business understanding, followed by the data understanding, preparation, modeling and testing phases, and ending with the presentation of the results.