Most current genotyping methods cannot produce sufficiently precise results. This is a common problem that many research institutions struggle with, among them the GenaGrid Konzorcium (operated jointly by Semmelweis University, the Budapest University of Technology and Economics, the KFKI Research Institute for Particle and Nuclear Physics of the HAS, Silicon Computers Számítógép Kereskedelmi Kft. and Csertex Kereskedelmi és Szolgáltató Kft.).
The consortium aims to reveal correlations between allergies, asthma and various psychological disorders on the one hand and genotypes on the other. The Department of Medical Chemistry, Molecular Biology and Pathobiochemistry of Semmelweis University, which carries out the measurements, uses TaqMan probes for genotyping. Some 20-30% of its results are ambiguous and therefore cannot be assigned definitively to a genotype. With this much data missing, reducing the ratio is a priority.
There are, however, several ways to assign ambiguous SNP measurements to the correct genotype. My thesis highlights two of these: (1) clustering the measurement results in software and (2) exploiting linkage information between SNPs to determine the correct genotype. The main goal of my work was to find the combination of these two methods that yields the best results.