For webmasters who run websites for business purposes, it is indispensable to closely monitor the performance of their own sites and those of their competitors. A user looking at a search engine results page decides with a single click which company to choose. In the introductory part of my thesis I surveyed the tools that help decision-makers monitor websites. It turned out that the available solutions are either hard to customize or unable to handle large amounts of data.
To address this need, I set out to design a system that can handle large amounts of data and is flexible enough to be extended easily with new data sources.
For storage I chose MongoDB, which keeps data as JSON-like documents inside collections, in contrast to traditional relational database management systems, which keep data as rows in tables. I chose this technology mainly for its flexibility, but it is also well suited to a distributed architecture: the database can be scaled horizontally without changing the application layer built on top of it.
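The difference between the two storage models can be illustrated with a minimal sketch. It is written in Python purely for illustration (the system itself was built with PHP), it uses plain dictionaries instead of a live MongoDB connection, and every field name and value in it is a hypothetical example rather than data from the thesis.

```python
import json

# Hypothetical measurement of one site, kept as a single nested document,
# the way a document store would hold it inside a collection.
site_document = {
    "url": "http://example.com",
    "metrics": {"load_time_ms": 840, "page_size_kb": 312},
    "keywords": [
        {"term": "web analytics", "rank": 4},
        {"term": "site monitoring", "rank": 11},
    ],
}

# The same information in a relational layout is spread over several
# tables whose rows are linked by a site id (first column).
sites_rows = [(1, "http://example.com")]
metrics_rows = [(1, 840, 312)]
keyword_rows = [(1, "web analytics", 4), (1, "site monitoring", 11)]


def add_source(document, name, payload):
    """Return a copy of the document with a new data source under its own key.

    No schema change is needed; other documents are unaffected.
    """
    extended = dict(document)
    extended[name] = payload
    return extended


# Attaching a new (hypothetical) data source only touches this document.
extended = add_source(site_document, "social", {"shares": 57})
print(json.dumps(extended, indent=2))
```

In the relational layout, the same extension would typically require a new table or column before the first row could be written.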
In the remaining part of my thesis I presented this new technology and built a flexible, extensible system using PHP and MongoDB. Its core consists of a queue system and a modularly extensible website analyzer class.
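How a queue and a modular analyzer can work together is sketched below. This is an illustrative Python stand-in, not the PHP implementation from the thesis: the class name `WebsiteAnalyzer`, the `register` and `analyze` methods, the two example modules, and the `fake_fetch` helper are all assumptions introduced here for the example.

```python
from collections import deque


class WebsiteAnalyzer:
    """Hypothetical analyzer that runs every registered module on a page."""

    def __init__(self):
        self._modules = {}

    def register(self, name, module):
        # A module is any callable that takes the page HTML and returns a value.
        self._modules[name] = module

    def analyze(self, url, html):
        result = {"url": url}
        for name, module in self._modules.items():
            result[name] = module(html)
        return result


def page_size(html):
    # Example module: size of the page in characters.
    return len(html)


def has_title(html):
    # Example module: whether the page contains a <title> tag.
    return "<title>" in html.lower()


def run_queue(urls, analyzer, fetch):
    """Process queued URLs in first-in, first-out order."""
    queue = deque(urls)
    results = []
    while queue:
        url = queue.popleft()
        results.append(analyzer.analyze(url, fetch(url)))
    return results


def fake_fetch(url):
    # Stand-in for a real HTTP request so the sketch runs offline.
    return "<html><head><title>Demo</title></head><body>Hello</body></html>"


analyzer = WebsiteAnalyzer()
analyzer.register("page_size", page_size)
analyzer.register("has_title", has_title)
results = run_queue(["http://example.com", "http://example.org"], analyzer, fake_fetch)
```

Adding a new data source then amounts to registering one more module; the queue loop and the existing modules stay unchanged.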
Testing the system revealed some drawbacks of this NoSQL technology, such as the lack of supporting tools, a consequence of its immaturity, and the greater responsibility that denormalization places on the application logic.
Although the software was able to solve real-world problems, it still has considerable room for improvement.