
News

Aug 2013
yoshiko version 2.0

We have released yoshiko version 2.0, a mature version of our cluster editing algorithm. It has been implemented mostly by Emanuel Laude and contains many helpful reduction rules from fixed-parameter algorithmics and a powerful heuristic for large data sets.

24 Aug 2011
natalie version 2.0

We have released natalie version 2.0, a mature version of our network alignment algorithm.

27 Jun 2011
side chain placement

scp is a package that contains our code for exact side chain placement as used in a recent paper in Optimization Letters.

5 Mar 2010
heinz and BioNet

heinz now works together with the BioNet package.

17 Dec 2008
yoshiko/charles

The yoshiko/charles code is now online. The tool solves the cluster editing problem as well as its directed "cousin", the transitivity editing problem, to provable optimality.

1 Sep 2008
planet lisa moves to Amsterdam

The planet lisa project is now based at the Centre for Mathematics and Computer Science (CWI) in Amsterdam, The Netherlands.

20 Jul 2008
ISMB 2008

Our paper Identifying Functional Modules in Protein-Protein Interaction Data: An Integrated Exact Approach has won the outstanding paper award at ISMB 2008 in Toronto, Canada. See the heinz section for more information on our software to discover optimal subnetworks with respect to our signal-based scoring scheme of p-values.

Jul 2008
ISMB 2008

We will give a presentation at ISMB 2008 on Identifying Functional Modules in Protein-Protein Interaction Data: An Integrated Exact Approach. Our software is available as the heinz package.

Oct 07
Algorithmic Operations Research

Our theoretical lara paper has been accepted for publication in Algorithmic Operations Research.

27 Jul 07
BMC Bioinformatics

The paper describing the lara program has been accepted for publication in BMC Bioinformatics.

28 Jun 07
4SALE support for lara

lara has been integrated into the RNA alignment and editing framework 4SALE. Get the latest lara version that is compatible with 4SALE.

lara

The planet LiSA library is now hosted in Amsterdam.

lara ("lagrangian relaxed structural alignment") is a tool for the sequence-structure alignment of RNA sequences. It employs methods from combinatorial optimization to compute feasible solutions for an integer linear program.

lara computes all pairwise sequence-structure alignments of the input sequences and passes this information on to T-Coffee, which computes a multiple sequence-structure alignment from the pairwise alignments. This is in contrast to plara, where we compute a multiple sequence-structure alignment in a progressive fashion.
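
As a rough illustration of what "all pairwise alignments" means for the amount of work, the following Python sketch enumerates the sequence pairs that would have to be aligned. It is not part of lara; the function name pairwise_jobs is hypothetical and used only for this example.

```python
from itertools import combinations


def pairwise_jobs(names):
    """Return every unordered pair of sequence names, in input order."""
    return list(combinations(names, 2))


if __name__ == "__main__":
    # n input sequences give n*(n-1)/2 pairwise alignments.
    for first, second in pairwise_jobs(["seq0", "seq1", "seq2"]):
        print(first, second)
```

The number of pairs grows quadratically with the number of input sequences, which is worth keeping in mind for large inputs.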

Our latest papers dealing with sequence-structure alignments are listed in the Download section; they give an extensive description of our approach from both a mathematical and a practical point of view.

lara is fully integrated into the 4SALE RNA alignment and editing framework, which means that you can use 4SALE to view and edit your RNA structures and let lara do the alignment work.


Quick Start

Download the current version of lara from the Download section. Then, running lara should be as simple as typing

$./lara -i sample.fasta

where the file sample.fasta contains the sequences that should be aligned.

Running lara without any arguments,

$./lara

prints a usage message that briefly describes each parameter.


Running lara

By simply typing

$./lara

the following usage message appears:

Usage: ./lara <-i fasta_file [-s]|-d instance_file> [-w output_file] [-o parameter_file]
-i fasta_file       : either fasta or extended fasta file
-s                  : use structure prediction or fixed structure from extended fasta file
-d <instance_file>  : use dotplots from instance_file
-w <output_file>    : name of output file (default: stdout)
-c                  : output a consensus structure
-o <parameter_file> : name of parameter file (default: lara.params)

Let lara create structural information (<-i fasta_file>)

A typical command to run lara would look like

$./lara -i sample.fasta

where sample.fasta contains the sequences that should be aligned. The input file has to be in FASTA format, for example:

>seq0
GCGUAUCGCAAGGGUUCCC
>seq1
AGCAAAGCGGGCCCGGGGG

lara computes the base pair probabilities for each sequence and uses this information as the structural annotation. To align only the minimum free energy structures of the sequences, add the "-s" switch. The command then looks like

$./lara -i sample.fasta -s

Read structural information from dotplots (<-d instance_file>)

If you have the base pair probability matrices in the dotplot format (e.g., generated by "RNAfold -p" of the Vienna package), then you can take these dotplots as lara's input. Create a file that contains the filenames of the dotplots that you want to align, and run lara by typing

$./lara -d sample.instances

where sample.instances could look like

seq0_dp.ps
seq1_dp.ps
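
Such an instance file can also be written by a small script. The following Python sketch assumes, based only on the example above, that the file holds one dotplot filename per line; the helper name write_instance_file is hypothetical.

```python
from pathlib import Path


def write_instance_file(dotplot_files, instance_path):
    """Write one dotplot filename per line and return the path written."""
    path = Path(instance_path)
    path.write_text("".join(f"{name}\n" for name in dotplot_files))
    return path


if __name__ == "__main__":
    # Creates sample.instances in the current directory.
    write_instance_file(["seq0_dp.ps", "seq1_dp.ps"], "sample.instances")
```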

Provide structural information

If you do not want to employ RNA folding routines to compute the structural annotation of your input sequences, you can provide your own structural information. To do so, add the structure of each sequence to your FASTA file in the Vienna (dot-bracket) format:

>seq0
GCGUAUCGCAAGGGUUCCC
(((...)))..((....))
>seq1
AGCAAAGCGGGCCCGGGGG
.......((....))....

The extended fasta file is automatically recognized by lara.
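
Before handing an extended FASTA file to lara, it can help to check that every structure line has the same length as its sequence and that the brackets are balanced. The Python sketch below does this for the format shown above. It is an independent check, not lara's own parser; the function names are hypothetical, and it assumes that any line consisting only of '(', ')' and '.' is a structure line.

```python
STRUCTURE_CHARS = set("().")


def base_pairs(structure):
    """Return the sorted 0-based (i, j) base pairs of a dot-bracket string."""
    stack, pairs = [], []
    for pos, char in enumerate(structure):
        if char == "(":
            stack.append(pos)
        elif char == ")":
            if not stack:
                raise ValueError(f"unmatched ')' at position {pos}")
            pairs.append((stack.pop(), pos))
        elif char != ".":
            raise ValueError(f"unexpected character {char!r} at position {pos}")
    if stack:
        raise ValueError(f"unmatched '(' at position {stack[-1]}")
    return sorted(pairs)


def parse_extended_fasta(text):
    """Return (name, sequence, structure) tuples; structure is '' if absent."""
    records = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            records.append([line[1:], "", ""])
        elif not records:
            raise ValueError("data found before the first '>' header")
        elif set(line) <= STRUCTURE_CHARS:
            records[-1][2] += line  # dot-bracket structure line
        else:
            records[-1][1] += line  # nucleotide sequence line
    for name, sequence, structure in records:
        if structure and len(structure) != len(sequence):
            raise ValueError(f"{name}: structure and sequence lengths differ")
        base_pairs(structure)  # raises ValueError on unbalanced brackets
    return [tuple(record) for record in records]


if __name__ == "__main__":
    example = ">seq0\nGCGUAUCGCAAGGGUUCCC\n(((...)))..((....))\n"
    for name, sequence, structure in parse_extended_fasta(example):
        print(name, len(sequence), base_pairs(structure))
```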

Read parameter file (-o)

lara.params is the default parameter file for running lara. If you want to provide a different parameter file, then you can do that by adding -o as a command line argument. Then, the lara call would look like:

$./lara -i sample.fasta -o different.params

Write output to file (-w)

By default, lara writes the alignment to standard output. To write the alignment to a file instead, use the "-w" switch. Writing the computed alignment into a file called "foo.aln" is done by

$./lara -i sample.fasta -w foo.aln

If you want to additionally write the consensus structures to the file, then you have to supply the "-c" switch, i.e.

$./lara -i sample.fasta -w foo.aln -c
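
When lara is called from a script, the switches described above can be assembled programmatically. The following Python sketch only builds the argument list from the options in the usage message (-i, -s, -d, -w, -c, -o) and does not run anything. The function name build_lara_command and its keyword names are hypothetical, and the rule that "-s" requires "-i" is read off the usage line rather than confirmed behavior.

```python
def build_lara_command(fasta=None, instances=None, fixed_structure=False,
                       output=None, consensus=False, params=None,
                       executable="./lara"):
    """Return an argument list for lara; exactly one input kind is required."""
    if (fasta is None) == (instances is None):
        raise ValueError("give exactly one of fasta (-i) or instances (-d)")
    if fixed_structure and fasta is None:
        raise ValueError("-s is only listed together with -i in the usage")
    command = [executable]
    if fasta is not None:
        command += ["-i", fasta]
        if fixed_structure:
            command.append("-s")
    else:
        command += ["-d", instances]
    if output is not None:
        command += ["-w", output]   # default is standard output
    if consensus:
        command.append("-c")        # also output a consensus structure
    if params is not None:
        command += ["-o", params]   # default is lara.params
    return command


if __name__ == "__main__":
    print(" ".join(build_lara_command(fasta="sample.fasta",
                                      output="foo.aln", consensus=True)))
```

The resulting list can be passed to subprocess.run from the Python standard library once lara is installed.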

Parameters

noofiterations

Number of iterations that should be performed. The higher the number, the longer the computation takes. The default value is 500.

noofnondecreasingiterations

Number of non-decreasing iterations: If the value of the Lagrangian dual does not decrease within the specified number of iterations, the optimization process is stopped and the best solution is written to the library file. The default value is 50.
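
The interplay of noofiterations and noofnondecreasingiterations can be sketched as follows. This Python example is one possible reading of the two parameters, not lara's actual solver loop: it assumes that the counter of non-decreasing iterations is reset whenever the dual value improves on the best value seen so far. The function name iterations_run is hypothetical.

```python
def iterations_run(dual_values, max_iterations=500, max_non_decreasing=50):
    """Return how many iterations are performed before the loop stops."""
    best = float("inf")
    stalled = 0   # consecutive iterations without a decrease
    done = 0
    for value in dual_values:
        if done >= max_iterations:
            break             # iteration limit (noofiterations) reached
        done += 1
        if value < best:
            best = value
            stalled = 0
        else:
            stalled += 1
            if stalled >= max_non_decreasing:
                break         # no decrease for too long
    return done


if __name__ == "__main__":
    # The dual value stops decreasing after the third iteration.
    print(iterations_run([5, 4, 3, 3, 3, 3], max_non_decreasing=2))
```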

my

The stepsize within the subgradient procedure is given by the following equation:

s = my*([upper bound]-[lower bound])/([no of subgradients])
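
For a feel of the magnitudes involved, the step size formula can be evaluated directly. The Python sketch below is a plain transcription of the equation; the argument names are hypothetical, and since the text does not spell out how [no of subgradients] is counted, it is simply treated as a positive number here.

```python
def step_size(my, upper_bound, lower_bound, no_of_subgradients):
    """Evaluate s = my * (upper bound - lower bound) / (no of subgradients)."""
    if no_of_subgradients <= 0:
        raise ValueError("no_of_subgradients must be positive")
    return my * (upper_bound - lower_bound) / no_of_subgradients


if __name__ == "__main__":
    # A gap of 4 between the bounds, spread over 4 subgradients, with my = 1.
    print(step_size(1.0, 10.0, 6.0, 4))
```

The step size shrinks as the gap between upper and lower bound closes, and a larger value of my scales every step proportionally.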

generatorscore

Use the specified scoring matrix (has to be in RIBOSUM format) for generating alignment edges. The default value is 'RIBOSUM65.mat'.

larascore

Use the specified scoring matrix (has to be in RIBOSUM format) for scoring the sequence part of the structural alignment. The default value is 'RIBOSUM65.mat'.

generatorsuboptimality

Parameter for the generation of alignment edges. The higher the value of 'generatorsuboptimality', the more alignment edges are created. The default value is 40.

sequencescale

Specifies the contribution of the sequence scores (specified by the larascore matrix) to the overall structural alignment. The default value is 0.05.

laraGapOpen,laraGapExtend

Specifies the affine gap costs for the structural alignment. The default values for the gap open and extension penalty are -6 and -2, respectively.

structurescoring

Either 'LOGARITHMIC' (then we have score(i,j)=log(p_{ij}/p_{min}), with p_{ij} being the base pair probability between nucleotide i and j) or 'SCALING' ( score(i,j) = *p_{ij} ).
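
The 'LOGARITHMIC' variant can be illustrated with a few lines of Python. This sketch assumes the natural logarithm, since the base is not stated above, and uses hypothetical function and argument names. The 'SCALING' variant is left out because its factor is not given.

```python
import math


def logarithmic_score(p_ij, p_min):
    """Evaluate score(i,j) = log(p_ij / p_min) with the natural logarithm."""
    if p_ij <= 0 or p_min <= 0:
        raise ValueError("probabilities must be positive")
    return math.log(p_ij / p_min)


if __name__ == "__main__":
    # A base pair exactly at p_min scores 0; more probable pairs score higher.
    for probability in (0.1, 0.5, 1.0):
        print(probability, logarithmic_score(probability, p_min=0.1))
```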

verbosesolver

A value of '1' creates verbose output of the subgradient solver, '0' disables the output.

tcoffee_location

If you want to use your own version of T-Coffee (instead of the one that is provided by LaRA in the tcoffee/ subdirectory), then specify the T-Coffee location by this parameter.

Acknowledgments

If you find lara useful for your own research, please cite the following paper:

Markus Bauer, Gunnar W. Klau, and Knut Reinert
Accurate Multiple Sequence-Structure Alignment of RNA Sequences Using Combinatorial Optimization.
BMC Bioinformatics 2007, 8:271

Further technical details can be found in

Markus Bauer, Gunnar W. Klau, and Knut Reinert
An Exact Mathematical Programming Approach to Multiple RNA Sequence-Structure Alignment.
Technical Report, Department of Mathematics and Computer Science, Free University Berlin. TR-B-07-07, March 2007.

Markus Bauer, Gunnar W. Klau, and Knut Reinert
Fast and Accurate Structural RNA Alignment by Progressive Lagrangian Relaxation.
In Proceedings of the 1st International Symposium on Computational Life Science (CompLife-05), pages 217-228, 2005.

Markus Bauer and Gunnar W. Klau
Structural Alignment of Two RNA Sequences with Lagrangian Relaxation.
In Proceedings of the 15th Annual International Symposium on Algorithms and Computation (ISAAC-04), pages 113-123, 2004.

lara uses the following libraries and programs: T-Coffee for the computation of a structure-consistent multiple alignment given the structural pairwise alignments, LEDA for the computation of a general matching of maximal weight, and the Vienna RNA package for the folding routines.

Download

| version | date | link | description | comments |
| 1.3.2a | 02 May 08 | lara 1.3.2a | current release, includes the computation of a consensus structure | the same version as 1.3.2, except that lara now can be started from any other directory |
| 1.3.2 | 23 Jan 08 | lara 1.3.2 |  | fixed several bugs and changed gap and sequence scale parameters |
| 1.3.1 | 28 Jun 07 | lara 1.3.1 |  | compatible with 4SALE |
| 1.3 | 04 Jun 07 | lara 1.3 |  |  |
| 1.1 | 22 Mar 07 | lara 1.1 | version used for computations in BMC Bioinformatics submission and AOR submission | journal version |
| 0.9 | 15 Sep 05 | lara 0.9 | version used for computations in CompLife paper | not supported anymore |
Topic revision: r21 - 28 Apr 2015, UnknownUser