Model-based data preprocessing for analysis

Dr. Pataricza András
Department of Measurement and Information Systems

In my thesis I sought a way to simplify, accelerate and automate the processing of measurement data.

The main research objective is a solution that simplifies and automates the processing of measured data. The most frequent and time-consuming problem in this process accounts for 80% of the data processing time.

Syntax checks on the data can be easily automated, but detecting data that is syntactically correct yet semantically erroneous often requires external knowledge. Integrating this knowledge into the automated process becomes critical when the data sets are extremely large.
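The difference between the two kinds of check can be illustrated with a minimal sketch. The quantity name `cpu_load_percent`, the `PLAUSIBLE_RANGE` table and both function names are hypothetical and stand in for whatever external knowledge a real data set would need; they are not taken from the thesis.

```python
def is_syntactically_valid(raw: str) -> bool:
    """Syntax check: the raw field can be parsed as a number."""
    try:
        float(raw)
        return True
    except ValueError:
        return False


# Assumed external knowledge: the plausible range of each measured quantity.
PLAUSIBLE_RANGE = {"cpu_load_percent": (0.0, 100.0)}


def is_semantically_valid(quantity: str, raw: str) -> bool:
    """Semantic check: the parsed value lies within the known plausible range."""
    low, high = PLAUSIBLE_RANGE[quantity]
    return low <= float(raw) <= high
```

Here the string "250.0" passes the syntax check but fails the semantic one, because only the external range information reveals that a CPU load of 250% is impossible.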

Automation therefore requires that both the data and the related semantic information be represented in a standard form suitable for automated processing.

One alternative is to represent the semantic information in an ontology and to harmonize it with the database containing the measured data. This interconnection has a further advantage: the knowledge defined in the ontology can be mapped to the database by standard methods. Hierarchy and relations can then be mapped without information loss, and derived data can be handled as well.
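One possible way to picture such a mapping is sketched below, assuming a simple parent-child concept hierarchy stored in a relational table next to the measurements. The table names (`concept`, `measurement`), the concept names and the helper functions are illustrative assumptions, not the schema used in the thesis; SQLite is chosen only because it ships with Python.

```python
import sqlite3


def build_db() -> sqlite3.Connection:
    """Create an in-memory database holding a toy concept hierarchy and data."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE concept (
            name   TEXT PRIMARY KEY,
            parent TEXT REFERENCES concept(name)
        );
        CREATE TABLE measurement (
            id      INTEGER PRIMARY KEY,
            concept TEXT REFERENCES concept(name),
            value   REAL
        );
        """
    )
    # Hypothetical ontology fragment: each concept points to its parent.
    conn.executemany(
        "INSERT INTO concept VALUES (?, ?)",
        [
            ("Quantity", None),
            ("Temperature", "Quantity"),
            ("CpuTemperature", "Temperature"),
            ("Load", "Quantity"),
        ],
    )
    conn.executemany(
        "INSERT INTO measurement (concept, value) VALUES (?, ?)",
        [("CpuTemperature", 61.5), ("Temperature", 22.0), ("Load", 0.7)],
    )
    return conn


def values_under(conn: sqlite3.Connection, root: str) -> list:
    """Return the values of all measurements typed by root or any sub-concept."""
    rows = conn.execute(
        """
        WITH RECURSIVE sub(name) AS (
            SELECT ?
            UNION
            SELECT c.name FROM concept c JOIN sub s ON c.parent = s.name
        )
        SELECT m.value
        FROM measurement m JOIN sub ON m.concept = sub.name
        ORDER BY m.id
        """,
        (root,),
    ).fetchall()
    return [row[0] for row in rows]
```

Calling `values_under(build_db(), "Temperature")` follows the hierarchy downwards, so measurements typed as `CpuTemperature` are found together with those typed directly as `Temperature`.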

Merging data from the database with the data in the ontology requires implementing a variety of operations (search, select, update) controlled by the ontology content.

Semantics-driven cleaning of the measured data, controlled by the ontology content, requires composing a dedicated procedure from the database operations mentioned above. Mapping the semantics requires an additional database that stores a filter and control layer. The monitoring produces two results: a list of the erroneous data pre-diagnosed by the verification process, and the clean data after repair.
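The two outputs of such a cleaning step can be sketched as follows. This is only an illustration under stated assumptions: the `RULES` table stands in for constraints that would come from the ontology, and the Kelvin-to-Celsius repair is an invented example of a repair rule, not one described in the thesis.

```python
# Hypothetical rules: a valid range per quantity and an optional repair function.
RULES = {
    "cpu_load_percent": {"range": (0.0, 100.0), "repair": None},
    # Assumed repair: an out-of-range temperature may have been logged in Kelvin.
    "temperature_c": {"range": (-50.0, 150.0), "repair": lambda v: v - 273.15},
}


def clean(records):
    """Split records into pre-diagnosed suspects and clean (valid or repaired) data."""
    suspects, cleaned = [], []
    for rec in records:
        rule = RULES[rec["quantity"]]
        low, high = rule["range"]
        value = rec["value"]
        if low <= value <= high:
            cleaned.append(rec)
            continue
        # The record violates its rule: report it, then try to repair it.
        suspects.append(rec)
        repair = rule["repair"]
        if repair is not None:
            fixed = repair(value)
            if low <= fixed <= high:
                cleaned.append({**rec, "value": fixed})
    return suspects, cleaned
```

A record that violates its rule always appears in the suspect list; it also reaches the clean output only if a repair rule exists and the repaired value is valid.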

