This is the progress report of Enrico Siragusa for Spring 2011.

Accomplishments up to Fall 2010

Literature

Seeds

The literature on seeds is vaster than I expected; a complete list of papers can be found here. The most important problems and approaches are summarized in my manuscript. I also read about related topics, namely Boolean functions and approximation algorithms.

Research

Modeling

  • Formal model, based on Boolean functions, of simple seeds and seed families.

Approximate String Matching Framework

  • APX-ratio for the minimum non-detected error.
  • APX-ratio for the complementary threshold.
  • FPRAS for seed sensitivity/specificity values.
  • Heuristic (APX?) for the optimal seed BDD construction.

DNA Homology Framework

  • FPRAS for the Hit-Probability/Expectation.
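
To make the notion of hit probability concrete, here is a minimal sketch that estimates it by plain Monte Carlo sampling. It is not the FPRAS listed above: a naive sampler like this does not by itself give the relative-error guarantee that defines an FPRAS. The sketch assumes a simple model in which each alignment position is a match independently with probability match_prob; that model and the names seed_hits and estimate_hit_probability are illustrative assumptions, not taken from the manuscript.

```python
import random


def seed_hits(seed, alignment):
    """Return True if the spaced seed hits the alignment at some offset.

    seed: string over {'1', '0'}; '1' marks a position that must match,
          '0' is a don't-care position.
    alignment: list of 0/1 values, where 1 is a match and 0 a mismatch.
    """
    required = [i for i, c in enumerate(seed) if c == "1"]
    span = len(seed)
    for start in range(len(alignment) - span + 1):
        if all(alignment[start + i] == 1 for i in required):
            return True
    return False


def estimate_hit_probability(seed, length, match_prob, samples, rng):
    """Estimate the probability that the seed hits a random alignment.

    Assumption (hypothetical model): positions are independent and each
    is a match with probability match_prob.
    """
    hits = 0
    for _ in range(samples):
        alignment = [1 if rng.random() < match_prob else 0
                     for _ in range(length)]
        if seed_hits(seed, alignment):
            hits += 1
    return hits / samples


if __name__ == "__main__":
    rng = random.Random(2011)  # fixed seed so runs are repeatable
    print(estimate_hit_probability("1101", 64, 0.7, 10_000, rng))
```

The estimate is the fraction of sampled alignments that the seed hits, so its accuracy improves with the number of samples; very small hit probabilities would need many more samples than this sketch uses.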

Goals for Spring 2011

Literature

Reading

Writing

  • Complete my manuscript.
  • Survey on seeds in sequence analysis? There is already a survey on seeds here, but it covers only homology search.

Research

  • Can we extend this formal framework to edit distance / indel seeds / subset seeds?
  • Can we improve the logic and engineering of indexing (cache-oblivious) / filtering (AND) / verification?
  • Can efficient linear/non-linear programs be formulated for seed design?
  • Can submodularity and monotonicity be used somehow for seed design?
  • Can we construct explicitly quasi-optimal classes of seeds?

Development

  • Benchmark for Approximate String Matching and DNA Homology Frameworks.
  • Sensitivity/Specificity, Hit-Probability/Expectation estimation via FPRAS.
  • DNA Homology Search using Hit-Expectation.
  • ILP for Exact Optimum Threshold Computation.
  • Heuristic Seed Design?
