The tasks of my thesis were to present several cloud database systems, to create an application that estimates the value of real estate, to move the resulting database into a cloud-based system, and to take measurements that evaluate the difference in performance between the database systems.
In the thesis I investigate three cloud database systems, and I describe and evaluate their main components and architecture.
I created a real estate price estimator application in C# on the .NET Framework. The application can download a large number of property listings and build a database from the data. It can run every day (for example at 9 pm) and download the property data from the first five pages of the current search results.
The estimated prices are calculated by an algorithm trained on a randomly selected training set. To describe the results of the algorithm, I used statistical indicators of the difference between the calculated and the stored prices: the mean, the median, the standard deviation and the mode. The average error of the price estimator is about 2-3%, but it depends on the filter parameters set by the user (e.g. restricting the data to flats in a given district within a given price range, such as 20-25 million HUF). The price estimator is more accurate when it is applied to a smaller, more narrowly filtered part of the database. Some possible further developments that could improve the accuracy of the appraisal are also included in the thesis.
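As a minimal sketch of how such indicators could be computed, the following Python snippet summarizes the relative difference between calculated and stored prices. It is not the code of the thesis application (which is written in C#): the function name `error_summary`, the 1% bucket used to obtain a mode from continuous values, the use of the population standard deviation, and the sample prices are all assumptions made for illustration.

```python
from statistics import mean, median, multimode, pstdev


def error_summary(estimated, stored, bucket=1.0):
    """Summarize the relative difference (in percent) between two price lists.

    `bucket` is the width (in percentage points) used to round the
    differences before taking the mode; this is an illustrative choice.
    """
    if len(estimated) != len(stored) or not estimated:
        raise ValueError("price lists must be non-empty and of equal length")

    # Signed relative difference of each estimate, in percent of the stored price.
    diffs = [(e - s) / s * 100.0 for e, s in zip(estimated, stored)]

    # Continuous values rarely repeat, so round to buckets before taking the mode.
    rounded = [round(d / bucket) * bucket for d in diffs]

    return {
        "mean": mean(diffs),
        "mean_abs": mean(abs(d) for d in diffs),  # average size of the error
        "median": median(diffs),
        "deviation": pstdev(diffs),               # population standard deviation
        "mode": multimode(rounded)[0],            # first of the most frequent buckets
    }


# Hypothetical prices in million HUF, invented for this example.
summary = error_summary([21.0, 21.0, 24.0, 22.0], [20.0, 20.0, 25.0, 22.0])
```

Reporting the mean of the absolute differences next to the signed mean is a design choice of this sketch: over- and underestimates cancel in the signed mean, so it can look small even when individual errors are large.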
To improve performance, I copied the database into SAP HANA, the in-memory database system I chose. The necessary methods were also implemented in HANA so that the performance of the two databases could be compared. I created an abstract model of the price estimation, which was validated by experimental results. Finally, I present the performance differences between the two systems. In my experience, HANA can be nearly ten times faster than MS-SQL in certain operations, whereas some operations, such as complex mathematical calculations, are faster in MS-SQL.