Partial automatization of design and implementation of data marts by semantic technologies

Dr. Gajdos Sándor
Department of Telecommunications and Media Informatics

In my thesis I demonstrate how semantic technologies can be used in designing and building data warehouses, and more specifically in implementing the data loading processes that move data from a normalized source to dimensional data marts.

I explain the background and the fundamental concepts of both fields: data warehousing and semantic technologies. I provide a detailed description of the data loading process, which can serve as a starting point in developing an ontology expressing the conceptual model of the domain.

The most important consideration during the development of the conceptual model was to produce something that is more than just a technical solution to the issue. The model is consistent with the fundamental terms of data warehousing, and it is extensible so that it can support both the initial construction and the long-term maintenance (including continuous upgrading) of data warehouses.

Several semantic formalisms (and some related tools) are described and evaluated against the needs of the model. OWL and Protégé Frames fit these needs best, so I implemented the ontology using these technologies. I have developed a Protégé Frames plugin that processes the ontology and generates the SQL scripts.

Through an example, I explain how the ontology can be used to automate the data loading process, and I give a detailed description of the SQL scripts that are generated to execute the actual load.

The ontology files, the Oracle test database, and both the executable and the source code of the Protégé plugin are provided as enclosures to the thesis.


