Algorithmic Bioinformatics

Katharina Albers

Alignment and retention time correction of LC-MS/MS measurements based on peptide identifications

Academic Advisor: Clemens Gröpl
Degree: Master of Science (M.Sc.)
Status: finished


Develop an OpenMS/TOPP tool addressing the following tasks: (1) correction of instabilities of the retention time, (2) multiple alignment of the LC-MS/MS signals caused by peptides, and (3) computation and maintenance of a consensus over many experiments.





A standard experimental setup in proteomics research is liquid chromatography (LC) combined with tandem mass spectrometry (MS/MS). In this type of experiment, the protein content of the sample is cleaved (e.g. using trypsin) into smaller peptides ("shotgun proteomics"). These peptides are then separated with respect to their hydrophobicity (other physicochemical properties can also be used) using liquid chromatography and then transferred to a mass spectrometer. Inside the mass spectrometer, peptides are ionized and separated with respect to their mass-to-charge ratio. In tandem mass spectrometry, selected ions from the sample can be subjected to collision-induced fragmentation, and a secondary mass spectrum is acquired from the resulting fragments. Peptides often break at the peptide bonds, and from the resulting fragment spectra it is possible to reconstruct the amino acid sequence using database search or de novo prediction.

The goal of this Master's project is to develop a tool for creating a multiple LC-MS/MS map alignment based on peptide identifications.

  • Instabilities of the liquid chromatography lead to systematic distortions of the retention time, which have to be corrected. We will implement, apply and compare several algorithmic approaches for this step.
  • The signals in LC-MS/MS caused by an individual peptide need to be combined and grouped within and, most importantly, across the different experiments. We will also implement, apply and compare several algorithmic approaches for this step.
  • Sometimes an experimental setup is used on a routine basis for many related samples. The information from these experiments is then combined into a "master map", which can be used as a reference to annotate further experiments. We will implement data structures and methods for maintaining and working with such a consensus.
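
The three steps above can be illustrated with a minimal Python sketch. It is not the OpenMS implementation and not the prototype mentioned below; all function names (median_rt_per_peptide, fit_linear_rt_correction, apply_correction, build_consensus) are hypothetical, and the linear retention time model is a simplifying assumption chosen for brevity. Real distortions are often nonlinear and would call for a more flexible fit.

```python
from collections import defaultdict
from statistics import median


def median_rt_per_peptide(identifications):
    """Group (peptide sequence, retention time) pairs of one run and
    keep the median retention time for each peptide."""
    by_peptide = defaultdict(list)
    for peptide, rt in identifications:
        by_peptide[peptide].append(rt)
    return {peptide: median(rts) for peptide, rts in by_peptide.items()}


def fit_linear_rt_correction(run, reference):
    """Least-squares fit of rt_reference ~ slope * rt_run + intercept,
    using only peptides identified in both maps (assumed linear model)."""
    shared = sorted(set(run) & set(reference))
    if len(shared) < 2:
        raise ValueError("need at least two shared peptide identifications")
    xs = [run[p] for p in shared]
    ys = [reference[p] for p in shared]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("shared peptides all have the same retention time")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def apply_correction(run, slope, intercept):
    """Map every retention time of a run onto the reference time scale."""
    return {peptide: slope * rt + intercept for peptide, rt in run.items()}


def build_consensus(corrected_runs):
    """Combine corrected runs into a simple 'master map': the median
    retention time of each peptide over all runs that contain it."""
    by_peptide = defaultdict(list)
    for run in corrected_runs:
        for peptide, rt in run.items():
            by_peptide[peptide].append(rt)
    return {peptide: median(rts) for peptide, rts in by_peptide.items()}


if __name__ == "__main__":
    # Toy data: peptide sequences and retention times in seconds (made up).
    reference = median_rt_per_peptide(
        [("AAK", 100.0), ("CCR", 200.0), ("DDK", 300.0)]
    )
    run = median_rt_per_peptide(
        [("AAK", 54.0), ("AAK", 56.0), ("CCR", 105.0), ("DDK", 155.0), ("EEK", 80.0)]
    )
    slope, intercept = fit_linear_rt_correction(run, reference)
    corrected = apply_correction(run, slope, intercept)
    print(build_consensus([reference, corrected]))
```

Using the median per peptide keeps single outlier identifications from dominating either the fit or the consensus. A peptide seen only in the new run (here "EEK") still receives a position on the reference time scale, which is how a master map could grow as further experiments are added.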

A prototype implementation in Python is available from previous work by E. Lange, R. Tautenhahn, S. Neumann, and C. Gröpl (2008). The tool developed in this Master's thesis will be implemented in OpenMS.