Reannotation of the genome of Carsonella ruddii using non-collinear methods

Description will follow shortly

Layout of project

Project progress

Identification of the species set

  • I will only use genomes from Gammaproteobacteria. The set of species will be extended later.

  • For validation purposes I will analyze the species used in the 2007 paper by Moya et al.:
    • U00096.2 (E. coli)
    • Four different strains of Buchnera aphidicola:
      • BA000003
      • AE013218
      • AE016826
      • CP000263
    • In the original paper the authors do not consider any plasmids of the species.
  • Set of species I have so far (all sequences were downloaded on the 01.06.10):
    • Carsonella ruddii PV (160 kb genome, 213 genes) (gammaproteobacteria) (AP009180.1)
    • Buchnera aphidicola BCc (Cc) (+ a plasmid): 450 kb (gammaproteobacteria) (CP000263.1)
    • Candidatus Blochmannia floridanus: 705 kb (631 genes) (gammaproteobacteria) (BX248583.1)
    • Wigglesworthia glossinidia (+ a plasmid): 698 kb (651 genes) (gammaproteobacteria) (BA000021.3)
    • Baumannia cicadellinicola str. Hc: 686 kb (651 genes) (gammaproteobacteria) (CP000238.1)

  • To identify the phylogenetic tree of these species I intend to use the 16S-rRNA sequence.
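
The accessions in the species list above can be turned into download links so that the set is easy to re-fetch when it is extended. The following is a minimal sketch, assuming the records are pulled through the NCBI E-utilities `efetch` endpoint; how the sequences were originally downloaded is not recorded here, and the helper name `efetch_url` is made up for this example. It only builds URLs and does not access the network.

```python
from urllib.parse import urlencode

# Accessions copied from the species list above.
ACCESSIONS = {
    "Carsonella ruddii PV": "AP009180.1",
    "Buchnera aphidicola BCc": "CP000263.1",
    "Candidatus Blochmannia floridanus": "BX248583.1",
    "Wigglesworthia glossinidia": "BA000021.3",
    "Baumannia cicadellinicola str. Hc": "CP000238.1",
}

# Assumption: records are fetched via NCBI E-utilities (efetch).
EFETCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def efetch_url(accession, rettype="gbwithparts"):
    """Build an efetch URL for one nucleotide record (hypothetical helper)."""
    query = urlencode({
        "db": "nuccore",
        "id": accession,
        "rettype": rettype,   # GenBank flat file including the sequence
        "retmode": "text",
    })
    return f"{EFETCH}?{query}"


for species, accession in ACCESSIONS.items():
    print(species, efetch_url(accession), sep="\t")
```

Keeping the accession list in one place also makes it simple to add the plasmids later if they turn out to be needed.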

Results of 16S-rRNA comparison

  • I've extracted the 16S-rRNA sequence from all species. Then:
  • I will use the ML-tree to guide the building of a progressive alignment (e.g. in S-LAGAN)
REINERT: Yes. Use S-LAGAN or SuperMap. Use a working program; do not spend too much time making others work. What are the next intended steps? Think about one manageable outcome. You certainly cannot re-evaluate the complete annotation.
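
Before building the ML tree, a quick sanity check on the aligned 16S-rRNA sequences is to look at raw pairwise differences. The sketch below computes p-distances (fraction of differing sites) and is a stand-in for illustration, not the ML method itself. It assumes the sequences are already aligned to equal length with "-" as the gap character; the toy sequences are invented and are not real 16S data.

```python
from itertools import combinations


def p_distance(a, b):
    """Fraction of differing sites, skipping columns with a gap in either sequence."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = differing = 0
    for x, y in zip(a.upper(), b.upper()):
        if x == "-" or y == "-":
            continue
        compared += 1
        differing += x != y
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return differing / compared


def distance_table(aligned):
    """Return {(name1, name2): p-distance} for every unordered pair of names."""
    return {
        (m, n): p_distance(aligned[m], aligned[n])
        for m, n in combinations(sorted(aligned), 2)
    }


# Toy aligned fragments, invented for illustration only.
toy = {
    "seqA": "ACGTACGT-A",
    "seqB": "ACGTTCGTAA",
    "seqC": "ACCTTCGTAA",
}

for pair, dist in distance_table(toy).items():
    print(pair, round(dist, 3))
```

Pairs with unexpectedly large distances would be worth inspecting by eye before they go into the guide tree.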

Next steps

  • Parse the currently annotated version of the Carsonella genome and the ABA result.
  • Statistical evaluation of the result with R
    • Do I find the same genes? Do I find other/new genes? Is the number of genes found significantly different?
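
The "same genes / new genes" question above can be sketched as a coordinate comparison between two gene lists. This is a minimal sketch in Python, under several assumptions that are not fixed anywhere on this page: genes are given as 1-based inclusive coordinates with a strand, two genes count as "the same" when they overlap by at least 80% of the shorter one, and the gene names and coordinates below are hypothetical. The resulting counts could then be handed to R for the statistical evaluation.

```python
from collections import namedtuple

# 1-based, inclusive coordinates (assumption, as in GenBank feature tables).
Gene = namedtuple("Gene", "name start end strand")


def overlap_fraction(a, b):
    """Overlap length divided by the length of the shorter gene (same strand only)."""
    if a.strand != b.strand:
        return 0.0
    shared = min(a.end, b.end) - max(a.start, b.start) + 1
    if shared <= 0:
        return 0.0
    shorter = min(a.end - a.start, b.end - b.start) + 1
    return shared / shorter


def compare_gene_sets(reference, predicted, min_overlap=0.8):
    """Split genes into shared pairs, reference-only names, and predicted-only names."""
    shared, matched_pred = [], set()
    for ref in reference:
        best = max(predicted, key=lambda p: overlap_fraction(ref, p), default=None)
        if best is not None and overlap_fraction(ref, best) >= min_overlap:
            shared.append((ref.name, best.name))
            matched_pred.add(best.name)
    matched_ref = {ref_name for ref_name, _ in shared}
    ref_only = [g.name for g in reference if g.name not in matched_ref]
    pred_only = [g.name for g in predicted if g.name not in matched_pred]
    return shared, ref_only, pred_only


# Hypothetical genes, for illustration only.
reference = [
    Gene("refA", 100, 400, "+"),
    Gene("refB", 500, 900, "-"),
    Gene("refC", 1000, 1300, "+"),
]
predicted = [
    Gene("newA", 110, 400, "+"),
    Gene("newB", 1500, 1800, "+"),
    Gene("newC", 520, 880, "-"),
]

shared, ref_only, pred_only = compare_gene_sets(reference, predicted)
print("shared:", shared)
print("only in current annotation:", ref_only)
print("only in new result:", pred_only)
```

The overlap threshold is the main design choice here: a stricter value separates genes whose boundaries shifted, a looser one merges them into "shared".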

Problems and Questions

  • Identifying a proper set of species is not straightforward. C. ruddii is listed under "unclassified Gammaproteobacteria". I was told that "normally" one can build a phylogenetic tree from the 16S rRNA sequence, which must be present in all bacteria for them to exist. (todo: check for a publication)
  • Can anyone name an alignment program that uses ABA (A-Bruijn Alignment)? I discovered AliWABA, but the web service they provide is not available (http://aba.nbcr.net/)
BIRTE: The authors provide an implementation for download. The link is given on page 2 of the paper: http://nbcr.sdsc.edu/euler/

IVAN: Thx. I've downloaded and installed it successfully.

  • SuperMap:
    • What is the CHAOS format (CHAins Of Seeds)?
    • What format does the scoring file need?
    • SuperMap needs a GPDB (Genome Profile DataBase) config file… Their website is more or less down right now.

  • ABA Problems:
    • I understand the output now, but it still looks strange.
    • Here (out1.pdf) and here (out2.ps) is the output of the example run on two small sequences (chloroplasts of two plants). The nodes contain the positions in each of the relevant sequences. The problem is that I only have two sequences here, yet the output contains numbers ranging from 0 to 4…
    • I don't understand the color of some edges: why are some edges colored? Do they represent a strongly supported "path"?
