- 28.09. (Weese): The second examination (Nachklausur) will take place on Wednesday,
**19.10.**, 2-4 pm, in room 005 (T9). - 12.07. (Weese): The examination results are online.
- 31.05. (Weese): The examination (Klausur) will take place on Wednesday,
**29.06.**, in room 006 (T9). - 07.04. (Weese): Please sign up for this lecture at https://www.mi.fu-berlin.de/kvv/course.htm?cid=9685&sid=22 (updated URL)
- 21.03. (Weese): The first lecture will be given on Monday, 11.04., at 2:15 pm, room SR FB (1.1.16) in Arnimallee 14.
- 21.03. (Weese): Important/urgent information will be distributed via our mailing list. Please sign up at https://lists.fu-berlin.de/listinfo/bioinf-P4-2011

Mat | Pts | Grade |
---|---|---|

xxx4940 | 62,5 | 2,0 |

xxx1192 | 27 | 5,0 |

xxx0760 | 0 | 5,0 |

xxx9983 | 0 | 5,0 |

xxx2042 | 45 | 3,3 |

xxx2152 | 49 | 3,0 |

xxx1288 | 0 | 5,0 |

xxx4523 | 0 | 5,0 |

xxx0284 | 39 | 3,7 |

xxx2000 | 71 | 1,3 |

xxx2779 | 0 | 5,0 |

xxx6890 | 62,5 | 2,0 |

xxx6034 | 39 | 3,7 |

xxx1235 | 65 | 1,7 |

xxx9546 | 68 | 1,3 |

xxx9672 | 57 | 2,3 |

xxx9469 | 43 | 3,3 |

xxx2200 | 43,5 | 3,3 |

xxx4248 | 43 | 3,3 |

Event | Day | Time | Address | Room |
---|---|---|---|---|

Lecture | Mon | 14-16 | Arnimallee 14 | SR FB (1.1.16) |

Lecture | Fri | 14-16 | Takustr. 9 | 006 |

Exercise | Wed | 14-16 | Takustr. 9 | 006 |

Date | Lecture Block | Lecturer |
---|---|---|

11.04.-06.05. | Online String Searching and Filtering | David Weese |

09.05.-30.05. | Index Data Structures | David Weese |

11.05. | Review 1 | |

03.06.-17.06. | Genome Comparison and RNA Analysis | Sandro Andreotti |

08.06. | Review 2 | |

29.06. | Klausur | |

18.06.-16.07. | Individual Projects |

- Student 1
- Read in a text from disk
- Construct the suffix array with Quicksort or an algorithm of this lecture
- Construct the lcp-table with the Kasai algorithm
- Write the 2 tables to disk

- Student 2
- Read in a text the corresponding suffix array and lcp table from disk
- Implement the suffix array search with mlr-heuristic
- Implement the O(m + log n) suffix array search and therefor construct an lcp-interval tree
- Compare the search times of both algorithms (use a genomic text and a large set of substrings)

- Student 1
- Read in a text from disk and construct its suffix array (you are allowed to use the results of Project A, you could also read it in after executing the tool of Project A)
- Construct the Burrows-Wheeler Transformation (BWT) using the suffix array from Project A
- Construct the Occ and c-table
- Provide a function for the L-to-F mapping (used by Student 2)

- Student 2
- Implement the exact match algorithm that uses BWT, Occ and c table to search a string in a text

- Student 3
- Use the exact match algorithm of Student 2 and backtracking to implement an approximate string search with up to k errors

- Student 1
- You are given the name of a text file, a pattern P and the number of edit-errors k on command line
- Read in the text from disk
- Construct the hierarchical tree as described in the PEX algorithm

- Student 2
- Write a semi-global DP-aligner to search a substring of P in a substring of T with edit-errors
- Implement the PEX search algorithm

- (Optional: Student 3)
- Implement a banded version of Myers' bit-vector algorithm as a replacement for the DP-aligner

- Student 1
- Read in a text and construct the q-gram index (assume a DNA alphabet and q=10)
- Read in a set of patterns (you can assume that all have the same size w)
- For each pattern P
- Count the occurrences of all overlapping q-grams of P in overlapping blocks of the text

- Student 2
- For each pattern P
- For each block that reaches the threshold t (from q-gram lemma)
- Verify if the block contains a semi-global alignment of P with up to k edit-errors
- If so, output the pattern P, its end position in the text and the number of errors

- For each block that reaches the threshold t (from q-gram lemma)

- For each pattern P

- Student 1
- Read in (from command line) a gapped shape, a text length m and a number of mismatches k
- Calculate the exact gapped q-gram threshold t using the DP algorithm described in [1]

- Student 2
- Calculate a lower bound for t using the q-gram lemma
- Calculate an upper bound using a greedy algorithm that tries to destroy as many q-grams as possible with k errors (see [2])
- For different gapped shapes and values for m and k compare the results of Student 1 with your bounds

- Student 1
- Sort (7 radix passes) and name the septuples
*T[i..i+6]*where*i*mod 7 equals 1, 2 or 4 - Create the text
*T'*=*t*_{1}*t*_{2}*t*of septuple-names_{4} - Create suffix array
*A'*of*T'*recursively

- Sort (7 radix passes) and name the septuples
- Student 2
- Derive suffix array
*A*from^{124}*A'* - Construct suffix arrays
*A*,^{0}*A*,^{3}*A*,^{5}*A*iteratively from^{6}*A*^{124}

- Derive suffix array
- Student 3
- Merge suffix arrays from Student 2 using a priority queue and appropriate comparisons

- Student 1
- Implement the Nussinov-SCFG for RNA-folding with the Nussinov CYK algorithm.
- Together with Student 2 compare the folds predicted by your algorithm to those predicted by mfold and RNAFold.

- Student 2
- Implement a training procedure that uses RFAM-alignments as training input to learn the SCFG's parameters that can be used by Student 1.

[2] http://en.wikipedia.org/wiki/Maximum_coverage_problem

This topic: ABI > WebHome > LectureWiki > AdvancedAlgorithms11

Topic revision: 01 May 2013, DavidWeese

Topic revision: 01 May 2013, DavidWeese