You are here: Foswiki>ABI Web>ThesesHome>NewHMMLearningAlgorithms (14 Sep 2011, KnutReinert)Edit Attach

MSc Thesis: Faster HMM Learning with Indexing Structures

Motivation
Overview
Work Plan

Motivation

Hidden Markov Models (HMMs) are among the most prominent methods in Bioinformatics. They are routinely applied to DNA sequences, for example for CpG-island or coding sequence detection, and to protein sequences for the important problem of protein domain identification among others. The statistical flexible foundation of HMMs make it the primary choice for many tasks. Given the exponentially increasing amount of sequencing capacity by next and third generation sequencing approaches, their is a high demand for efficient HMM learning algorithms from sequence data. SeqAn offers fast implementation of indexing algorithms, like suffix trees and suffix arrays, which combined with the HMM framework, can satisfy the above mentioned needs.

Overview

This thesis has the following parts.

Implementation of an HMM learning algorithm based on indexing structures.
Comparison with standard learning algorithms which are implemented in SeqAn to a large extend already.

Part (1) should be based on (Lifshits et al. 2009). The following plan consists of 6 months of work.

Work Plan

week 1-2: Understanding the paper by Lifshits et al. 2009 and familiarization with HMMs
week 3-4: Exploration of the current HMM implementation in SeqAn
month2-3: Implementation of the new algorithm in SeqAn
month4 : Performance Experiments on standard application with HMMs and comparison to standard learning algorithms in SeqAn.
month5-6: Cleaning code, Documentation and Interpretation of results and writing the Master thesis.

Topic revision: r1 - 14 Sep 2011, KnutReinert

ABI

Copyright © by the contributing authors. All material on this collaboration platform is the property of the contributing authors.
Ideas, requests, problems regarding Foswiki? Send feedback