Title: Integrative analysis of next generation sequencing (NGS) data
Instructors: Alena van Bömmel, Robert Schöpflin
Maximum number of participants: 9
Location: Max Planck Institute for Molecular Genetics, Ihnestr. 63-73, PC-Pool
Short description of content:
One of the research aims in bioinformatics and regulatory genomics is to understand the principles of transcriptional regulation. Key players in gene regulation are transcription factors (TFs), proteins that bind to specific DNA sequences. Another important aspect is chromatin accessibility, which modulates the binding of transcription factors and RNA Polymerase II to the DNA. Several experimental techniques based on next-generation sequencing can determine the binding of regulatory proteins to DNA (ChIP-seq), probe DNA accessibility (ATAC-seq and DNase-seq), or measure gene expression (RNA-seq). In this course, we will investigate the main factors acting during the differentiation of mouse embryonic stem cells into fibroblasts. In particular, we are interested in the interplay between these factors and their impact on gene expression. To this end, we will analyze ChIP-seq, ATAC-seq and RNA-seq data from different stages of cell differentiation to study TF binding and histone modifications across the genome in relation to the expression of neighbouring genes. Afterwards, we will develop a statistical method to combine these data sets appropriately. We will use available tools and also implement our own scripts to perform the analyses. Students will work in groups of three and present their work at the end of the seminar.
Practical programming work: 60 %
Soft Skills: 40 %
Programming language(s) used: R, Python, Shell scripts
Difficulty (eight stars distributed across three areas):
Mandatory prerequisites: basic statistics, module "Algorithmische Bioinformatik"