DNA sequencing is the process of determining the order of the chemical bases in a particular DNA molecule. With advances in biochemistry, DNA sequencing methods have become orders of magnitude faster and cheaper. The technologies are evolving rapidly, and one of the near-term challenges is the development of new approaches to data analysis.
My thesis addresses one step in the pre-processing of sequence data, known as short-read assembly: a method that compares short pieces of DNA with a reference genome and determines the most likely position of each piece. The short sequence pieces, called reads, are typically produced with tenfold overlap, so that about ten reads cover each position of the genome; in a pure software solution, this volume of data results in long run times.
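To make the cost concrete, the following is a minimal sketch of the comparison step, not the method used in the thesis. It assumes reads in plain base space (it ignores SOLiD's color-space encoding), ungapped placement, and a brute-force scan of the reference. The function names `best_position` and `naive_comparisons` are hypothetical and chosen only for illustration.

```python
def best_position(reference: str, read: str) -> tuple[int, int]:
    """Return (position, mismatches) of the best ungapped placement of `read`.

    Brute force: try every offset in the reference and count mismatching bases.
    Ties are resolved in favor of the leftmost position.
    """
    best_pos, best_mm = -1, len(read) + 1
    for pos in range(len(reference) - len(read) + 1):
        window = reference[pos:pos + len(read)]
        mm = sum(a != b for a, b in zip(window, read))
        if mm < best_mm:
            best_pos, best_mm = pos, mm
    return best_pos, best_mm


def naive_comparisons(genome_len: int, read_len: int, coverage: int) -> int:
    """Estimate base comparisons needed by the brute-force scan above.

    Assumption: the number of reads is coverage * genome_len / read_len,
    and every read is compared at every possible offset.
    """
    n_reads = coverage * genome_len // read_len
    offsets = genome_len - read_len + 1
    return n_reads * offsets * read_len


if __name__ == "__main__":
    print(best_position("ACGTTGCAAGCT", "TGCA"))   # exact match
    print(best_position("ACGTTGCAAGCT", "TGGA"))   # one mismatch tolerated
    print(naive_comparisons(1000, 50, 10))
```

Under these assumptions the work grows roughly with the square of the genome length multiplied by the coverage, which suggests why a straightforward software implementation becomes slow on genome-scale inputs.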
In this work, after a short introduction to ABI’s SOLiD technology and to the software components of short-read sequencing, I analyze the bandwidth requirements of a PC-compatible, FPGA-based accelerator card, propose an optimal PC-FPGA connection, partition the system into software and hardware parts, and finally compare the results with those obtained from a pure software solution.