AlleleSeq: Difference between revisions

From GersteinInfo
Public (talk | contribs)
No edit summary
Public (talk | contribs)
No edit summary
Line 9: Line 9:
actual process.
actual process.


-Pre-processing - diploid genome construction using vcf2diploid
1) '''Pre-processing - diploid genome construction using ''vcf2diploid'''''
Assuming that the individual is part of a trio (father-mother-child), the
In the Rozowsky ''et al.'' (2011) paper, the
pre-processing step separate (phase) the child's diploid genome into its parental  
pre-processing step separate (phase) the child's diploid genome into its parental  
haplotypes based on the sequences of the parents. The genotypes of the trio are
haplotypes based on the sequences of the parents.  
then used in the subsequent AlleleSeq pipeline.


-AlleleSeq pipeline - mapping and statistical testing using PIPELINE.mk package
2) '''AlleleSeq pipeline - mapping and statistical testing using PIPELINE.mk package'''
  (a) Reads from ChIP-seq and RNA-seq experiments are aligned and mapped to both  
  (a) Reads from ChIP-seq and RNA-seq experiments are aligned and mapped to both  
haplotype genomes, picking the best match for each read. This is done to eliminate
haplotype genomes.
the reference bias that would exist if we have mapped to the standard human
 
reference genome. 
  (b) Then for each SNV position with mapped reads, we compare the allele  
  (d) Then for each SNV position with mapped reads, we compare the allele  
frequencies observed in the two parental haplotypes.
frequencies observed in the two parental haplotypes. Candidate SNVs showing
allele-specific effects are identified using a statistical framework and by
assigning statistical significance to each SNV.




=vcf2diploid=
=vcf2diploid=
The AlleleSeq pipeline from the Rozowsky ''et al.'' paper requires a pre-processing step. This is the step in which a diploid genome is constructed from the parental sequences, using the PERL script '''''vcf2diploid'''''.






=AlleleSeq pipeline=
=AlleleSeq pipeline=

Revision as of 15:52, 7 June 2013

General outline of pipeline

The basic goal of the pipeline is to take a large collection of reads generated from ChIP-seq or RNA-seq experiments associated with an individual and detect single nucleotide variants (SNVs) at which the number of mapped reads is significantly skewed toward one allele. To do this, the pipeline starts with a pre-processing step, followed by the actual process.

1) Pre-processing - diploid genome construction using vcf2diploid

In the Rozowsky et al. (2011) paper, the pre-processing step separates (phases) the child's diploid genome into its parental haplotypes based on the sequences of the parents.

2) AlleleSeq pipeline - mapping and statistical testing using PIPELINE.mk package

(a) Reads from ChIP-seq and RNA-seq experiments are aligned and mapped to both haplotype genomes.

(b) Then, for each SNV position with mapped reads, we compare the allele frequencies observed in the two parental haplotypes.
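
The comparison in step (b) can be illustrated with a small sketch. It assumes the read counts for the two parental alleles at one SNV are tested with an exact two-sided binomial test against a null of equal expression (p = 0.5). That choice of test, and the function names binomial_two_sided_p and allelic_skew, are assumptions made for illustration and are not taken from the PIPELINE.mk package.

```python
from math import exp, lgamma, log


def _log_binom_pmf(i, n, p):
    # Log of the binomial probability of i successes in n trials.
    # Computed in log space so that large read counts do not overflow.
    return (lgamma(n + 1) - lgamma(i + 1) - lgamma(n - i + 1)
            + i * log(p) + (n - i) * log(1 - p))


def binomial_two_sided_p(k, n, p=0.5):
    """Exact two-sided binomial p-value for k successes in n trials.

    Sums the probabilities of all outcomes that are no more likely
    than the observed one. Requires 0 < p < 1.
    """
    if n == 0:
        return 1.0  # no reads, no evidence of skew
    probs = [exp(_log_binom_pmf(i, n, p)) for i in range(n + 1)]
    observed = probs[k]
    # A small relative tolerance keeps floating-point ties on both tails.
    total = sum(q for q in probs if q <= observed * (1 + 1e-7))
    return min(1.0, total)


def allelic_skew(paternal_reads, maternal_reads):
    """P-value for the hypothesis that both alleles are equally represented."""
    n = paternal_reads + maternal_reads
    return binomial_two_sided_p(paternal_reads, n)


if __name__ == "__main__":
    # Hypothetical counts: 9 reads carry the paternal allele, 1 the maternal.
    print(allelic_skew(9, 1))
```

In practice, p-values from many SNVs would also need a multiple-testing correction before any SNV is called allele-specific; that step is left out of this sketch.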


vcf2diploid
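
As a toy illustration of what constructing a diploid genome means in the pre-processing step, the sketch below applies phased SNVs to a reference sequence to produce one paternal and one maternal haplotype sequence. It is not the vcf2diploid implementation: the function name build_haplotypes and the input layout are hypothetical, positions are assumed to be 0-based, and only single-base substitutions are handled.

```python
def build_haplotypes(reference, phased_snvs):
    """Return (paternal, maternal) sequences derived from a reference.

    reference   -- reference sequence as a string
    phased_snvs -- iterable of (position, paternal_allele, maternal_allele),
                   with 0-based positions and single-base alleles
    """
    paternal = list(reference)
    maternal = list(reference)
    for pos, pat_allele, mat_allele in phased_snvs:
        # Substitute the phased allele on each haplotype copy.
        paternal[pos] = pat_allele
        maternal[pos] = mat_allele
    return "".join(paternal), "".join(maternal)


if __name__ == "__main__":
    # Hypothetical 8-base reference with two phased heterozygous SNVs.
    pat, mat = build_haplotypes("ACGTACGT", [(1, "C", "T"), (6, "A", "G")])
    print(pat)
    print(mat)
```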

AlleleSeq pipeline
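
Once reads have been aligned to both haplotype genomes, each read has to be assigned to one of them; the earlier revision of this page describes this as picking the best match for each read. The sketch below shows one way that selection could look. It assumes each alignment run yields a per-read score where higher is better, and that tied reads are kept as uninformative. The function name assign_reads, the score dictionaries and the tie handling are all hypothetical and are not taken from the PIPELINE.mk package.

```python
def assign_reads(paternal_scores, maternal_scores):
    """Assign each read to the haplotype on which it aligned best.

    Both arguments map read IDs to alignment scores (higher is better).
    A read missing from one dictionary did not align to that haplotype.
    Returns a dict mapping read ID to "paternal", "maternal" or "both".
    """
    assignment = {}
    for read_id in paternal_scores.keys() | maternal_scores.keys():
        pat = paternal_scores.get(read_id)
        mat = maternal_scores.get(read_id)
        if mat is None or (pat is not None and pat > mat):
            assignment[read_id] = "paternal"
        elif pat is None or mat > pat:
            assignment[read_id] = "maternal"
        else:
            # Equal scores: the read does not distinguish the haplotypes.
            assignment[read_id] = "both"
    return assignment


if __name__ == "__main__":
    # Hypothetical scores for three reads.
    result = assign_reads({"r1": 40, "r2": 35}, {"r2": 35, "r3": 38})
    print(sorted(result.items()))
```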