Revision as of 15:52, 7 June 2013

General outline of pipeline

The basic goal of the pipeline is to take a large collection of reads generated from ChIP-seq or RNA-seq experiments associated with an individual and detect single nucleotide variants (SNVs) that show a significantly skewed number of reads between the two alleles. To do this, the pipeline runs a pre-processing step before the actual process.

1) Pre-processing - diploid genome construction using vcf2diploid

In the Rozowsky et al. (2011) paper, the pre-processing step separates (phases) the child's diploid genome into its parental haplotypes based on the sequences of the parents.

2) AlleleSeq pipeline - mapping and statistical testing using PIPELINE.mk package

(a) Reads from ChIP-seq and RNA-seq experiments are aligned and mapped to both haplotype genomes.

(b) Then, for each SNV position with mapped reads, we compare the allele frequencies observed in the two parental haplotypes.

@@ Line 9: / Line 9: @@
 actual process.
--Pre-processing - diploid genome construction using vcf2diploid
+) '''Pre-processing - diploid genome construction using ''vcf2diploid'''''
-Assuming that the individual is part of a trio (father-mother-child), the
+In the Rozowsky ''et al.'' (2011) paper, the
 pre-processing step separate (phase) the child's diploid genome into its parental
-haplotypes based on the sequences of the parents. The genotypes of the trio are
+haplotypes based on the sequences of the parents.
-then used in the subsequent AlleleSeq pipeline.
--AlleleSeq pipeline - mapping and statistical testing using PIPELINE.mk package
+) '''AlleleSeq pipeline - mapping and statistical testing using PIPELINE.mk package'''
   (a) Reads from ChIP-seq and RNA-seq experiments are aligned and mapped to both
-haplotype genomes, picking the best match for each read. This is done to eliminate
+haplotype genomes.
-the reference bias that would exist if we have mapped to the standard human
-reference genome.
+  (b) Then for each SNV position with mapped reads, we compare the allele
-  (d) Then for each SNV position with mapped reads, we compare the allele
+frequencies observed in the two parental haplotypes.
-frequencies observed in the two parental haplotypes. Candidate SNVs showing
-allele-specific effects are identified using a statistical framework and by
-assigning statistical significance to each SNV.
 =vcf2diploid=
-The AlleleSeq pipeline from the Rozowsky ''et al.'' paper requires a pre-processing step. This is the step in which a diploid genome is constructed from the parental sequences, using the PERL script '''''vcf2diploid'''''.
 =AlleleSeq pipeline=

AlleleSeq: Difference between revisions


Contents

General outline of pipeline

vcf2diploid

AlleleSeq pipeline
