The emergence of data mining over the past decade has made it possible, in business, engineering, and science, to rely more efficiently on data that accumulates at incredible speed. Data mining is a decision support process that discovers valid, useful, concise, and previously unknown information in huge databases. Besides uncovering hidden correlations, the various tools of data mining can often automate expert decisions, which reduces the human resources required and shortens business or technological processes.
In my thesis I create a database of the real estate market from raw data and then analyze it with various data mining methods using the SPSS Modeler 14.1 software. The purpose of the analysis is to create a model that can give an efficient forecast of the selling price per square meter of a real estate property.
The main peculiarity and complexity of my subject come from the target variable, which is continuous. This raises some questions and problems, for example: when can the model be considered efficient and acceptable?
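One possible way to make this question concrete for a continuous target is to measure how far the forecasts fall from the actual prices. The following is only a minimal sketch in Python, not the procedure used in the thesis or a feature of SPSS Modeler: the function name `regression_errors`, the choice of error measures (mean absolute error, root mean squared error, mean absolute percentage error), and the sample prices are all assumptions made for illustration.

```python
import math


def regression_errors(actual, predicted):
    """Return common error measures for a continuous target.

    Hypothetical helper for illustration only. `actual` and `predicted`
    are equally long sequences of prices per square meter; actual
    values must be nonzero because MAPE divides by them.
    """
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")

    n = len(actual)
    abs_errors = [abs(a - p) for a, p in zip(actual, predicted)]

    # Mean absolute error: average deviation in the unit of the target.
    mae = sum(abs_errors) / n
    # Root mean squared error: penalizes large deviations more strongly.
    rmse = math.sqrt(sum(e * e for e in abs_errors) / n)
    # Mean absolute percentage error: deviation relative to the actual price.
    mape = 100.0 * sum(e / abs(a) for e, a in zip(abs_errors, actual)) / n

    return {"mae": mae, "rmse": rmse, "mape": mape}


# Made-up prices per square meter, not data from the thesis.
actual = [1000.0, 2000.0, 4000.0]
predicted = [1100.0, 1800.0, 4000.0]
errors = regression_errors(actual, predicted)
```

Under such a scheme, a model could be called acceptable when, for instance, its percentage error stays below a threshold agreed on with domain experts; where that threshold lies is a judgment call rather than a property of the measure itself.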
In the first chapters I present the functions of the software I used. I then review the database itself, including its structure and the cleaning of the available data.
The second part of my thesis presents the algorithms that I used during the modeling process and describes how the models were created. I will compare the models in order to map their accuracy and efficiency.
Finally, I will summarize the results, which help me to choose the best method for reaching the target: the forecast of the selling price per square meter of a real estate property. I will also elaborate on possibilities for future use.
I presented this thesis and its results at the SPSS Summer Conference in Veszprém, Hungary, in July 2010.