Algorithmic Bioinformatics

Sabrina Krakau

Overlap Module for NGS Pipeline

Academic Advisor: David Weese, Marcel Schulz (MPI)
Discipline: Bioinformatics
Degree: Bachelor of Science (B.Sc.)
Date: Oct 22, 2009
Status: finished




The demand for efficient algorithms in sequence analysis is boosted by the recent advent of next-generation sequencing (NGS) technologies, which usually create megabases of sequencing output in one day. We are developing a state-of-the-art pipeline in SeqAn that serves a number of different tasks. An important part of the pipeline, the overlap module, is the topic of this thesis. The overlap module merges the information obtained by mapping reads to a genome with annotation information (for example genes, transcripts, or known intervals of genomic aberrations such as inversions) and can be used to measure gene expression levels from RNA-Seq data (NGS of mRNAs) [1] or to produce visualisation files for sequencing coverage.
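
The core operation described above, combining mapped read positions with annotated intervals, can be illustrated with a small sketch. It is not the thesis implementation and does not use the SeqAn API. The function name `count_overlaps`, the tuple layouts, and the half-open coordinate convention are assumptions chosen for illustration only. The sketch counts, for each annotated interval, how many reads overlap it, which is one simple proxy for an expression level.

```python
from bisect import bisect_left, bisect_right
from collections import defaultdict


def count_overlaps(reads, annotations):
    """Count the mapped reads that overlap each annotated interval.

    Hypothetical layout (an assumption, not a SeqAn data structure):
      reads       -- iterable of (chrom, start, end)
      annotations -- iterable of (name, chrom, start, end)
    All intervals are half-open: [start, end).
    """
    starts = defaultdict(list)
    ends = defaultdict(list)
    for chrom, start, end in reads:
        starts[chrom].append(start)
        ends[chrom].append(end)
    for chrom in starts:
        starts[chrom].sort()
        ends[chrom].sort()

    counts = {}
    for name, chrom, a_start, a_end in annotations:
        s = starts.get(chrom, [])
        e = ends.get(chrom, [])
        # Reads that begin before the annotation ends ...
        begun = bisect_left(s, a_end)
        # ... minus reads that already ended at or before its start.
        finished = bisect_right(e, a_start)
        counts[name] = begun - finished
    return counts


if __name__ == "__main__":
    # Made-up toy data, purely for demonstration.
    reads = [
        ("chr1", 100, 136),
        ("chr1", 120, 156),
        ("chr1", 400, 436),
        ("chr2", 50, 86),
    ]
    annotations = [
        ("geneA", "chr1", 130, 300),
        ("geneB", "chr1", 436, 500),
        ("geneC", "chr2", 0, 60),
        ("geneD", "chr3", 0, 100),
    ]
    for name, n in count_overlaps(reads, annotations).items():
        print(name, n)
```

Sorting the read starts and ends once per chromosome lets each annotation be answered with two binary searches. A production module would more likely use an interval tree or a sweep over sorted inputs, and would also need to handle strands, spliced reads, and multi-mapped reads, none of which this sketch attempts.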



[1] Marc Sultan, Marcel H. Schulz, Hugues Richard, Alon Magen, Andreas Klingenhoff, Matthias Scherf, Martin Seifert, Tatjana Borodina, Aleksey Soldatov, Dmitri Parkhomchuk, Dominic Schmidt, Sean O'Keeffe, Stefan Haas, Martin Vingron, Hans Lehrach and Marie-Laure Yaspo. A global view of gene activity and alternative splicing by deep sequencing of the human transcriptome. Science (2008), published online July 3.