The goal of this thesis is to inspect and model the processes behind the housing market using machine learning. This is achieved through the analysis of classified advertisements from a prominent housing ad website in Hungary. Since I concentrate on flats situated in Budapest, the conclusions I draw are relevant first and foremost on a local level. Nonetheless, the findings might be applicable on a macro scale as well.
My approach is to train models from three different algorithm families and compare the results they achieve. My main goal is to predict the price of a flat with given parameters as accurately as possible. Furthermore, by analysing the output and the internal variables of the trained models, I aim to uncover the connections between different features of a property, as well as how these features influence its market price.
First, I describe how the data was collected, then discuss the data preparation and processing stage, where I add computed features to the dataset based on domain-specific properties. After that, I introduce the three models in detail and show why each is suited to this problem. Finally, I examine the circumstances and environment of the training process, inspect the interactions learned by the models, and present the final results.
The most accurate of the trained models predicts prices with an error low enough to be relevant even from a business perspective. The ability to reliably compute housing prices can improve market efficiency by closing the gap between supply and demand. If advertisers can determine a realistic price for the property being sold, the time it spends on the market can be shortened significantly. Likewise, if prospective buyers can find out the accurate value of a property, they can avoid overpriced homes and locate favourably priced ones.