General info

In this part of the practical course on sequence analysis you will prototype NGS analysis programs using SeqAn, a C++ library of efficient data types and algorithms.

Date    Content                           Lecturer
05.06   Introduction to SeqAn             Knut Reinert, Jochen Singer
12.06   Programming NGS tools in SeqAn    Jochen Singer

Day 1 (05.06)

  • General introduction of goals of this unit
  • Assignment 1: Install SeqAn on your computer (follow instructions here)
  • Assignment 2: Work through the First steps tutorial with guidance.
  • Assignment 3: Program a first app with the command line parser
  • Assignment 4: Work through the sequence IO Tutorial
  • Assignment 5: Adapt your first app so that it can read FASTQ files.
  • Assignment 6: Program a simple quality trimmer (easy version: just cut a fixed number of bases at the end)
    • (optional): Make your functions template functions (so that they can be reused)
    • (optional): Adapt the trimming function so that all bases at the end of the read whose quality is below a given threshold are removed.
    • (optional): Adapt the trimming function so that a window of a specified length is shifted from the beginning to the end of the read and the average quality within the window is used to trim the read.
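
The three trimming variants of Assignment 6 could be sketched roughly as follows. This is a minimal illustration in plain standard C++ and not the course solution: the FastqRead struct and the function names are hypothetical, the reads are held in std::string rather than in SeqAn string types, and the qualities are assumed to be Phred+33 encoded. In the window variant the read is cut at the start of the first window whose average quality falls below the threshold, which is only one possible interpretation of the task.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical record type for this sketch (not a SeqAn type).
struct FastqRead
{
    std::string id;
    std::string seq;
    std::string qual;  // assumed Phred+33, same length as seq
};

// Decode one Phred+33 quality character into its numeric score.
inline int phredScore(char c) { return static_cast<int>(c) - 33; }

// Easy version: cut a fixed number of bases from the end of the read.
template <typename TRead>
void trimFixed(TRead & read, std::size_t numBases)
{
    assert(read.seq.size() == read.qual.size());
    std::size_t const n = read.seq.size();
    std::size_t const newLen = n > numBases ? n - numBases : 0;
    read.seq.resize(newLen);
    read.qual.resize(newLen);
}

// Remove bases from the end as long as their quality is below minQual.
template <typename TRead>
void trimByThreshold(TRead & read, int minQual)
{
    assert(read.seq.size() == read.qual.size());
    std::size_t newLen = read.seq.size();
    while (newLen > 0 && phredScore(read.qual[newLen - 1]) < minQual)
        --newLen;
    read.seq.resize(newLen);
    read.qual.resize(newLen);
}

// Shift a window from the beginning to the end of the read and cut at the
// start of the first window whose average quality is below minAvgQual.
template <typename TRead>
void trimByWindow(TRead & read, std::size_t window, double minAvgQual)
{
    assert(window > 0 && read.seq.size() == read.qual.size());
    std::size_t const n = read.seq.size();
    if (n < window)
        return;  // read shorter than one window: left unchanged in this sketch
    long sum = 0;
    for (std::size_t i = 0; i < window; ++i)
        sum += phredScore(read.qual[i]);
    std::size_t cut = n;
    for (std::size_t start = 0; ; ++start)
    {
        if (static_cast<double>(sum) < minAvgQual * static_cast<double>(window))
        {
            cut = start;
            break;
        }
        if (start + window >= n)
            break;  // last window reached without dropping below the threshold
        // Slide the window by one base: add the new score, drop the oldest.
        sum += phredScore(read.qual[start + window]) - phredScore(read.qual[start]);
    }
    read.seq.resize(cut);
    read.qual.resize(cut);
}
```

Because the functions are templates over the read type, they work with any type that offers seq and qual members with a resize() and an index operator; adapting them to SeqAn strings would require replacing these member accesses with the corresponding SeqAn calls.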

Day 2 (12.06)

  • Introduction to adapter removal, read mapping
  • Assignment 1: Program a de-multiplexer
    • Write a simple de-multiplexer that retrieves all reads with a certain barcode
    • The barcode has to be provided by a file
    • (optional) Use a file with multiple barcodes (only select a read if it is the best match to the specified adaptor)
    • (optional) Use a file with multiple barcodes and create several output files, one for each adaptor
  • Assignment 2: Program an adapter removal tool
    • Sometimes part of the adapter is contaminating a read and therefore has to be removed
    • Write an app (similar to the quality trimmer) that reads a read file, removes adapters from the reads and writes the result to a file.
    • The adapter sequence can either be read from file or taken from the command line.
    • (optional) Allow errors in the adapter sequence.
  • Assignment 3: Work through the index tutorial
  • Assignment 4: Program simple read mapper
    • The app should be based on a seed-and-extend approach
    • Create the seeds (pigeonhole principle)
    • Search for seed with the help of an index
    • Verify the seeds - write a simple function which compares two sequences (or use the globalAlignment() function)
    • (optional) Implement the verification with Myers verification
    • (optional) Adjust the range of the verification to take edit distance into consideration
    • (optional) Try different indices (the app should be based on templates)
    • (optional) Implement a strategy to optimize the number of verifications.
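
The seed-and-extend idea of Assignment 4 could look roughly like the sketch below. It rests on several assumptions that are not part of the assignment text: a simple hash-based q-gram table (std::unordered_map) stands in for a SeqAn index, errors are counted as mismatches only (Hamming distance, no indels), and all function names are hypothetical. The pigeonhole principle enters through the seeds: if a read matches with at most k mismatches and is cut into k+1 non-overlapping seeds, at least one seed must match exactly.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for an index: q-gram -> start positions in the reference.
using QGramIndex = std::unordered_map<std::string, std::vector<std::size_t>>;

QGramIndex buildQGramIndex(std::string const & ref, std::size_t q)
{
    QGramIndex index;
    for (std::size_t pos = 0; pos + q <= ref.size(); ++pos)
        index[ref.substr(pos, q)].push_back(pos);
    return index;
}

// Verification: count mismatches between the read and the reference window
// starting at refPos; stops early once maxErrors is exceeded.
std::size_t countMismatches(std::string const & ref, std::size_t refPos,
                            std::string const & read, std::size_t maxErrors)
{
    std::size_t errors = 0;
    for (std::size_t i = 0; i < read.size() && errors <= maxErrors; ++i)
        if (ref[refPos + i] != read[i])
            ++errors;
    return errors;
}

// Returns all reference positions where the read matches with at most
// maxErrors mismatches, in ascending order.
std::vector<std::size_t> mapRead(std::string const & ref, QGramIndex const & index,
                                 std::size_t q, std::string const & read,
                                 std::size_t maxErrors)
{
    // maxErrors + 1 non-overlapping seeds of length q must fit into the read.
    assert(q > 0 && (maxErrors + 1) * q <= read.size());
    std::set<std::size_t> hits;
    for (std::size_t s = 0; s <= maxErrors; ++s)
    {
        std::size_t const offset = s * q;  // seed start within the read
        auto it = index.find(read.substr(offset, q));
        if (it == index.end())
            continue;
        for (std::size_t seedPos : it->second)
        {
            if (seedPos < offset)
                continue;  // read would start before the reference
            std::size_t const start = seedPos - offset;
            if (start + read.size() > ref.size())
                continue;  // read would run past the end of the reference
            if (hits.count(start) != 0)
                continue;  // candidate already accepted via another seed
            if (countMismatches(ref, start, read, maxErrors) <= maxErrors)
                hits.insert(start);
        }
    }
    return std::vector<std::size_t>(hits.begin(), hits.end());
}
```

The index must be built with the same q that is later passed to mapRead. Supporting edit distance would require verifying a wider reference window around each candidate and replacing countMismatches with an alignment-based check.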

  • fastq with DNA and DNA5
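
For the de-multiplexing and adapter removal assignments of Day 2, the core matching steps might be sketched as follows. The sketch assumes that the barcode sits at the very beginning of the read and that adapter contamination appears at the end of the read, possibly only as a prefix of the adapter; neither assumption is stated in the assignments. It uses std::string instead of SeqAn types, counts mismatches only, and all names are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Count mismatches between text[pos, pos + len) and pattern[0, len).
std::size_t mismatchCount(std::string const & text, std::size_t pos,
                          std::string const & pattern, std::size_t len)
{
    std::size_t errors = 0;
    for (std::size_t i = 0; i < len; ++i)
        if (text[pos + i] != pattern[i])
            ++errors;
    return errors;
}

// De-multiplexing: return the index of the barcode that best matches the
// start of the read, or -1 if no barcode is within maxMismatches or if the
// best match is ambiguous (two barcodes with the same best distance).
int bestBarcode(std::string const & read, std::vector<std::string> const & barcodes,
                std::size_t maxMismatches)
{
    int best = -1;
    std::size_t bestDist = maxMismatches + 1;
    bool tie = false;
    for (std::size_t b = 0; b < barcodes.size(); ++b)
    {
        std::string const & bc = barcodes[b];
        if (bc.size() > read.size())
            continue;  // barcode longer than the read cannot match
        std::size_t const dist = mismatchCount(read, 0, bc, bc.size());
        if (dist < bestDist)
        {
            best = static_cast<int>(b);
            bestDist = dist;
            tie = false;
        }
        else if (dist == bestDist && best != -1)
        {
            tie = true;
        }
    }
    return tie ? -1 : best;
}

// Adapter removal: return the leftmost read position at which the rest of
// the read matches a prefix of the adapter with at least minOverlap bases
// and at most maxMismatches mismatches; returns read.size() if none is found.
std::size_t findAdapterStart(std::string const & read, std::string const & adapter,
                             std::size_t minOverlap, std::size_t maxMismatches)
{
    assert(minOverlap > 0);
    for (std::size_t pos = 0; pos + minOverlap <= read.size(); ++pos)
    {
        std::size_t const overlap = std::min(adapter.size(), read.size() - pos);
        if (overlap < minOverlap)
            break;  // adapter itself is shorter than the required overlap
        if (mismatchCount(read, pos, adapter, overlap) <= maxMismatches)
            return pos;
    }
    return read.size();
}
```

A read would then be trimmed with read.resize(findAdapterStart(read, adapter, minOverlap, maxMismatches)), together with its quality string. Allowing mismatches on very short overlaps produces many false positives, so minOverlap and maxMismatches need to be chosen with some care.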

Solutions

Sources

Topic revision: r8 - 12 Jun 2013, JochenSinger
 