UAB day 3: models and tools
Today I am blogging from day 3 of the UAB short course on sequencing.
Speaker 1: Xiangqin Cui – Experimental design
Additive model to correct for length and dinucleotide bias in RNA-seq: Zheng 2011
Poisson model when there are no biological replicates: for gene i, the count in condition 1 follows Poisson(λi1) and the count in condition 2 follows Poisson(λi2). Use a Wald test or a likelihood ratio test (LRT) to test whether λi1 = λi2.
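To make the two tests concrete, here is a minimal sketch for a single gene with one count per condition. It assumes equal sequencing depth in both conditions (the talk notes do not say how depth was handled), and the function names are my own, not from any package.

```python
import math

def poisson_wald(x1, x2):
    """Wald test of H0: lambda1 == lambda2 from one count per condition.

    Assumes equal sequencing depth. Under H0, x1 - x2 has variance
    lambda1 + lambda2, which is estimated here by x1 + x2.
    """
    if x1 + x2 == 0:
        return 0.0, 1.0
    z = (x1 - x2) / math.sqrt(x1 + x2)
    # Two-sided p-value from the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p

def poisson_lrt(x1, x2):
    """Likelihood ratio test of H0: lambda1 == lambda2 (equal depth assumed)."""
    pooled = (x1 + x2) / 2.0  # maximum likelihood estimate of the common rate

    def term(x):
        # x * log(x / pooled), with the convention 0 * log(0) = 0.
        return x * math.log(x / pooled) if x > 0 else 0.0

    stat = max(0.0, 2.0 * (term(x1) + term(x2)))
    # Survival function of a chi-square with 1 degree of freedom.
    p = math.erfc(math.sqrt(stat / 2.0))
    return stat, p
```

Calling `poisson_wald(100, 50)` or `poisson_lrt(100, 50)` on made-up counts gives a small p-value in both cases. With no replicates, this only captures counting noise, not biological variability.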
Because real counts are overdispersed relative to the Poisson (the variance exceeds the mean), a negative binomial model can be used instead (Robinson / Smyth).
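A rough sketch of what overdispersion means in practice, using the common negative binomial parameterization variance = μ + φμ². The dispersion estimate below is a simple method-of-moments illustration on made-up replicate counts; it is not the estimator used by Robinson and Smyth, and the function names are hypothetical.

```python
import statistics

def nb_variance(mu, phi):
    """Negative binomial variance for mean mu and dispersion phi.

    phi = 0 recovers the Poisson case, where variance equals the mean.
    """
    return mu + phi * mu ** 2

def moment_dispersion(counts):
    """Method-of-moments dispersion estimate from replicate counts.

    Solves var = mean + phi * mean**2 for phi and truncates at 0.
    Needs at least two replicates; illustration only.
    """
    mean = statistics.mean(counts)
    var = statistics.variance(counts)  # sample variance (n - 1 denominator)
    if mean == 0:
        return 0.0
    return max(0.0, (var - mean) / mean ** 2)
```

For example, `moment_dispersion([10, 30, 15, 45])` on hypothetical replicate counts gives a clearly positive dispersion, which a Poisson model would ignore.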
Speaker 2: David Crossman – Analysis of NGS data using Galaxy
Galaxy is a free GUI-based tool for analyzing next-generation sequencing data, aimed at people who don’t use the command line or write code.
Speaker 3: Shili Lin – 3D regulation of the genome
Hypothesis: proteins may promote a gene by binding at a site which is distant in 1D genomic sequence but near to the gene in the 3D organization of the DNA.
The imprinted Igf2/H19 locus is one well-characterized example [Court 2011].
Hi-C protocol by Lieberman-Aiden 2009
- Cross-link DNA with formaldehyde. Pieces of DNA near each other in 3D will get cross-linked.
- Cut with restriction enzymes
- Fill ends and mark with biotin
- Ligate strands to each other
- Purify and shear
- Sequence with paired-end to see which pieces of DNA got ligated to one another and therefore were near each other.
- Similar to Hi-C but adds an immunoprecipitation step
- Tumor/normal comparison – does chromatin structure change in cancer
- Global long-range regulation mechanisms
- IFCalculator
- HG - hypergeometric – too many false positives
- Bayesian analysis (BASIC) to filter out random collisions
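As a rough illustration of the computational side of the notes above, the sketch below first counts ligated read pairs between genomic bins (the paired-end step) and then computes a hypergeometric tail probability for an observed number of contacts. The bin size, the function names, and the way the hypergeometric parameters are chosen are all assumptions for illustration; this is not the method of IFCalculator or BASIC.

```python
from collections import Counter
from math import comb

def bin_contacts(read_pairs, bin_size):
    """Count ligation events between genomic bins.

    read_pairs: iterable of (pos_a, pos_b) mate positions on one chromosome.
    Returns a Counter keyed by (bin_low, bin_high).
    """
    counts = Counter()
    for pos_a, pos_b in read_pairs:
        low, high = sorted((pos_a // bin_size, pos_b // bin_size))
        counts[(low, high)] += 1
    return counts

def hypergeom_tail(k, population, successes, draws):
    """P(X >= k) for X ~ Hypergeometric(population, successes, draws)."""
    upper = min(successes, draws)
    total = comb(population, draws)
    favorable = sum(
        comb(successes, i) * comb(population - successes, draws - i)
        for i in range(max(k, 0), upper + 1)
    )
    return favorable / total

def interaction_pvalue(counts, bin_a, bin_b):
    """Toy null: read ends are paired at random (an assumption, not from the talk).

    population = all read ends, successes = ends in bin_a,
    draws = ends in bin_b, k = observed pairs linking the two bins.
    """
    ends = Counter()
    for (low, high), n in counts.items():
        ends[low] += n
        ends[high] += n
    total_ends = sum(ends.values())
    k = counts[tuple(sorted((bin_a, bin_b)))]
    return hypergeom_tail(k, total_ends, ends[bin_a], ends[bin_b])
```

A small p-value from `interaction_pvalue` suggests more contacts than random pairing would give, but as the notes say, this kind of test tends to produce too many false positives, which is the motivation for the Bayesian filtering.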
Speaker 4: Jonas Almeida – cloud computing and the semantic web
Resources for cloud computing
- imagejs.org - cloud-based ImageJ tools
- altjs.org - for those who want to run MapReduce in the browser but don’t want to write JavaScript.
Resources for the semantic web