AssignmentWeek2 < ABI

You are here: Foswiki>ABI Web>LectureWiki>AdvancedAlgorithms12>AssignmentWeek2 (20 May 2012, SandroAndreotti)Edit Attach

Exercise 2

Read mapping with QUASAR

Deadline: 23.05.2012 9:00 a.m.

Implement the QUASAR algorithm for to improve the efficiency of you read mapper in Exercise 1 So you need to:

Efficiently create a q-gram index of the genome.
Divide the genome into blocks
Search for matching q-grams and count matches for each block
For each block where the matches exceed the threshold (q-gram Lemma) use the semi-global aligner from Exercise 1 for verification.

Format specifications
The input and output file formats and outfile name are the same as in Exercise 1 The program usage should be:

./exercise2 <genome.fasta> <reads.fasta> <k> <q> <b> where:

k is the allowed maximal edit distance of a match.
q is the length of the q-grams.
b the block size for the genome.

Remark: The choice of q in testing will always respect the memory usage of a q-gram index without hashing (table size 4^q).
Your implementation should allow for a length of 12 at least on a normal desktop.
If the block size is smaller than the read length + k your program should write a warning "read length exceeds block size" into the result file and terminate.
If for the choice of k and q and the read length the calculated threshold is not greater than zero the program should write a warning "Bad choice of k and q for input read length" into the result file and terminate.

For the application note you should test different values of k, q and b and measure their effects on the runtime. Compare to the runtime in Exercise 1

I	Attachment	Action	Size	Date	Who	Comment
gz	DataExercise2.tar.gz	manage	6 MB	20 May 2012 - 16:20	UnknownUser	Data for Exercise 2

Topic revision: r5 - 20 May 2012, SandroAndreotti

ABI

Copyright © by the contributing authors. All material on this collaboration platform is the property of the contributing authors.
Ideas, requests, problems regarding Foswiki? Send feedback