Virtual and federated databases over implicit and explicit data sources

OData support
Supervisor:
Dr. Ráth István Zoltán
Department of Measurement and Information Systems

Abstract

In a complex enterprise system, application and subsystem data are stored in a distributed manner and in various forms. In most cases there is a need to collect and summarize information about performance and business processes that is based on this distributed data. This can be a complex task because of the heterogeneous nature of the data, the non-uniform storage and access methods, and semantic differences. Processing big and distributed data requires an integration technique that gives applications high-level access based on new paradigms and implements efficient, high-speed data management in native data models. The appropriate level of integration for this purpose is a middleware layer that provides real-time data processing. In enterprise environments, a common and widely used data integration solution is building data warehouses. The disadvantage of this approach is that it relies on scheduled data migration jobs, so it cannot guarantee that the most current information is always available. For real-time data processing, a more suitable solution is to construct a virtual database over the available data sets.

In my thesis I prepare the data integration of two existing enterprise systems: an IT service management system and a project management system. The chosen integration technology is Teiid, a data virtualization application from the JBoss.org projects. My goal is to study real-time data integration solutions based on virtual databases and, in addition, to develop a real-time dashboard application that is able to evaluate different key performance indicators based on data from both systems. I demonstrate feasibility by performing the necessary data integration and evaluating a few performance indicators.

As a further improvement of the real-time data integration, I extend the implemented system with an additional layer, placed between Teiid and the client applications, that makes incremental evaluation possible. This extension layer is based on the EMF-IncQuery tool, which is known from model-driven engineering.

To determine the limits of the data integration system and to demonstrate the effectiveness of the incremental approach, I evaluate the system in terms of scalability. The main aim of the measurement is to learn how well the real-time data integration performs. Accordingly, during the load tests I measured the realization time of the data change events that occurred in the source data stores.
