Fast and Accurate Multiple Structural RNA Alignment by Mathematical Optimization



Financial support:

Deutsche Forschungsgemeinschaft (DFG)


Mar 01, 2007 — Sep 30, 2009


During the last years, the number of genes known for producing non-coding functional RNA had increased significantly, and it was assumed that many more of these ncRNA genes were still undiscovered. Yet, many functional classes of RNA showed little sequence conservation, but rather a conserved secondary structure. A promising avenue to detecting new functional RNAs was thus to look for common structural features shared by a set of related sequences. With this project we intended to advance our recently developed algorithmic theory to compute reliable multiple structural alignments of potentially long RNA molecules under a reasonable consumption of computational resources. We employed methods from mathematical programming such as Lagrangian relaxation and solved the problem as an integer linear program resulting from a graph-theoretical reformulation.

Our results for the case of pairwise structural alignment showed that our current software prototype was among the top programs in terms of speed and alignment quality. Based on these promising preliminary successes we planned to develop a multiple structural alignment tool with different levels of abstraction for RNAs of known or unknown structure with or without pseudoknots, special versions for microRNAs and ITS2 sequences (including clustering), an RNA gene finder based on our Lagrangian approach, and a linear-programming based structural alignment method for the regular pairwise case. The resulting software is freely available.