Page GenomeComparisonP4

This is the project page of the Genome-Comparison group.


Mail to all group members: AA2010SS-GenComp at

Name email
Svenja Specovius
Sabrina Krakau
Felix Heeger
Max Homilius
Ivan Kel
John Wiedenhoeft


Initial reading (this should be read by all)

You can start by reading a recent review article about genome comparison.

For the core lecture we will deal with the representation of large (collinear) alignments as graphs. For this refer to

Further reading

For the other two lecture topics you might find the further reading helpful:

The project


Joint presentation of the projects. Please see svn for updated versions of the lecture slides.


exercise on colinear alignment for Wednesday 12th May

corrected pseudo code of Segment Match Refinement Algorithm

exercise on assembly comparison (OSLay)

Discussion Forum

Please use this space to communicate with each other and with the instructors.

Proposal outline

  • Colinear Alignment
    • overview
    • match refinement

  • Non-Colinear Alignment

  • Assembly Layout and Comparison

Analysis/Programming Projects

Reannotation of the Genome of Carsonella ruddii (Ivan Kel).

C. ruddii is a γ-proteobacterium with the smallest genome known so far (approx. 214 genes) and a very low GC content of 16%. The first complete sequencing and annotation was done in 2006. The goal of this project is to reannotate the genome using MSA methods and to compare the results with the existing annotation. (original paper) and (Moya et al. paper from 2007) Here you can find the details.
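
As a small illustration of the GC content statistic quoted above, here is a minimal Python sketch. The function name `gc_content` and the toy sequence are hypothetical and only serve the example; they are not part of the project or of any annotation pipeline.

```python
def gc_content(seq: str) -> float:
    """Return the fraction of G and C among the unambiguous bases A, C, G, T.

    Ambiguity codes such as N are ignored; an empty sequence yields 0.0.
    """
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    total = gc + at
    return gc / total if total else 0.0


# Toy sequence (hypothetical, not taken from the C. ruddii genome):
# 2 of the 10 bases are G or C.
toy = "AAAATTTTGC"
print(f"GC content: {gc_content(toy):.2f}")
```

For a real genome the sequence would be read from a FASTA file first; that step is left out to keep the sketch self-contained.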

Glocal Alignment: SLAGAN and SuperMap (Felix Heeger & Svenja Specovius) (Partners in Instructor team: Manuel Holtgrewe, Birte Kehr)

The initial idea for this project was to reimplement the SuperMap algorithm presented in the Dubchak paper using SeqAn. Since this would be a lot of work for 60 hours and SuperMap requires the SLAGAN algorithm, we would like to reimplement the SLAGAN algorithm first and then, depending on how much time is left, use it for SuperMap. Here you find the details.

Studying sex determination systems using SuperMap (John Wiedenhoeft)

I would like to study the evolutionary relationships of chromosomal sex-determination systems using SuperMap. The transitions between systems (XY <-> ZW) are of particular interest, since gene mapping suggests that the Z chromosome in one snake species is not homologous to the bird Z chromosome. Here you find the details.

Assembly comparison (Sabrina Krakau) (David Weese and Knut Reinert)

In this project we assemble a set of reads with two different assemblers, compute a layout of the contigs, and compare the resulting linear sequences with genome comparison programs. Here you find the details.

Non-collinear alignments of viral genomes (Max Homilius)

I would like to study genome rearrangements, horizontal gene transfer, etc. in viruses using non-collinear alignment programs. This might be useful to infer phylogenies and to depict relations. More details will follow here soon.

Possible projects

Please note that if you are interested in one of these, we still have to work out the details. These are some ideas that Manuel, Birte, and I had.

  1. Local alignments: Local alignments are often the input for genome comparison algorithms. Different local aligners can be used to compute local matches between a number of comparable genomes (e.g. Drosophila chromosomes, or shorter sequences such as Adenoviruses). Examples of local aligners are BLAT, BlastZ, Blast, Local-Swift, lalign, and chaos. In the project we will vary the parameters to obtain local matches of different length and quality. We report on the run time, the coverage of each genome, and statistics regarding the length and error rates of the local matches.
  2. Analysis of precomputed alignments: Download precomputed alignments from VISTA, UCSC, and the Ensembl Genome Browser and analyze key statistics (length, number of identical positions, indel length, etc.).
  3. Test the ABA program: Use different inputs from project 1 and run ABA with them. Report on the differences in the output when different local alignments are used as input.
  4. Evolutionary conservation: Download the genomes of a number of species with a well-known evolutionary tree. Use different suitable genome aligners, as well as gencompress (Chen, Kwong, Li), to compute distances between the genomes. Construct phylogenetic trees from those distances and report the results.
  5. Repeats in genomes: Genomes contain many repeats. Tandem repeats are two or more contiguous, approximate copies of the same repetitive sequence. There are several programs available to compute tandem repeats. Compare their results on a human chromosome.
  6. Assembly comparison: Download the reads of a fairly well finished genome and assemble them using two or more standard assemblers. Compare the results using layout software and genome comparison programs.
  7. Difference between collinear and non-collinear aligners: Take two well-studied genome sequences and compare them with a collinear aligner and a non-collinear aligner. How much difference do you see? (Use similar statistics as in 1. and 2.)
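
The "coverage of each genome" statistic from project idea 1 can be sketched as follows. This is a minimal Python sketch under stated assumptions: local matches are given as half-open intervals `[start, end)` on one genome, and the name `coverage` and the toy intervals are hypothetical. Real aligners such as BLAT or Blast each have their own output format, which would need to be parsed first.

```python
from typing import Iterable, Tuple


def coverage(matches: Iterable[Tuple[int, int]], genome_length: int) -> float:
    """Fraction of genome positions covered by at least one local match.

    Each match is a half-open interval [start, end) on the genome.
    Overlapping or adjacent matches are merged so that no position
    is counted twice.
    """
    covered = 0
    cur_start = cur_end = None
    for start, end in sorted(matches):
        if cur_end is None or start > cur_end:
            # A gap before this match: close the current merged block.
            if cur_end is not None:
                covered += cur_end - cur_start
            cur_start, cur_end = start, end
        else:
            # Overlap or adjacency: extend the current merged block.
            cur_end = max(cur_end, end)
    if cur_end is not None:
        covered += cur_end - cur_start
    return covered / genome_length


# Hypothetical matches on a genome of length 100:
# [0, 10) and [5, 20) merge into [0, 20); [30, 40) stands alone.
toy_matches = [(0, 10), (5, 20), (30, 40)]
print(coverage(toy_matches, 100))
```

Sorting the intervals first keeps the merge step a single linear pass, so the whole computation is dominated by the sort. The same merged blocks could also be used to report match length statistics.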


Topic attachments
Attachment | Size | Date | Who | Comment
Annu_Rev_Genomics_Hum_Genet_2007_BlanchetteComputation_and_analysis_of_genomic.pdf | 150 K | 24 Mar 2010 - 15:50 | KnutReinert | Overview Genome Alignment
Genome_Research_2009_DubchakMultiple_whole-genome_alignments_without_a.pdf | 569 K | 24 Mar 2010 - 15:51 | KnutReinert | Multiple whole genome alignment (Dubchak)
2010_recombPOA.pdf | 248 K | 24 Mar 2010 - 15:52 | KnutReinert | Multiple POA
chaining.pdf | 313 K | 24 Mar 2010 - 15:52 | KnutReinert | Chaining Script
mga.pdf | 102 K | 24 Mar 2010 - 15:53 | KnutReinert | Multiple whole genome alignment (Kurtz)
MultipleMatchrefinement.pdf | 201 K | 24 Mar 2010 - 15:54 | KnutReinert | Script multiple segment match refinement and TCoffee heuristic
PairwiseMatchrefinement.pdf | 87 K | 24 Mar 2010 - 15:54 | KnutReinert | Script pairwise segment match refinement
Genome_Research_2004_RaphaelA_novel_method_for_multiple.pdf | 621 K | 12 Apr 2010 - 08:29 | KnutReinert | A-Bruijn Graph
Bioinformatics_2007_RichterOSLay_optimal_syntenic_layout_of.pdf | 325 K | 12 Apr 2010 - 08:39 | KnutReinert | Assembly layout
lecture1.pdf | 1 MB | 10 May 2010 - 16:53 | UnknownUser | slides of lecture 1 about colinear alignment
exercise.pdf | 64 K | 10 May 2010 - 20:47 | UnknownUser | exercise for lecture 1 (colinear alignment)
pseudoCode.pdf | 41 K | 11 May 2010 - 18:30 | UnknownUser |
genomecomparison_exercise_3.pdf | 102 K | 22 Jun 2010 - 14:48 | UnknownUser | Exercise for lecture 3 (assembly comparison: OSLay)
presentation.pdf | 6 MB | 19 Jul 2010 - 16:46 | UnknownUser | Joint project presentation, July 19, 2010.
Topic revision: r39 - 19 Jul 2010, homilius