PeakSeq

From GersteinInfo

(Difference between revisions)
(Software download)
 
(17 intermediate revisions not shown)
Line 1: Line 1:
-
PeakSeq
 
-
PeakSeq: Systematic Scoring of ChIP-Seq Experiments Relative to Controls
+
PeakSeq is a program for identifying and ranking peak regions in ChIP-Seq experiments. It takes as input mapped reads from a ChIP-Seq experiment and mapped reads from a control experiment, and outputs a file of peak regions ranked by increasing Q-value.
-
Rozowsky J, Euskirchen G, Auerbach R, Zhang Z, Gibson T, Bjornson R, Carriero N, Snyder M, Gerstein M
+
-
Chromatin Immunoprecipitation followed by tag sequencing (ChIP-Seq) using high-throughput next-generation instrumentation is replacing ChIP-chip for genome-wide mapping of sites of transcription factor binding and chromatin modification, especially for mammalian genomes. Here we develop a methodology for identifying punctate binding sites in ChIP-Seq experiments based on their characteristics. In particular, we produce two deeply sequenced datasets for human RNA polymerase II and STAT1 with matching input DNA controls. In these sets, we observe that signal peaks, corresponding to sites of potential binding, are strongly correlated with peaks in input DNA that likely reveal features of open chromatin. Based on these observations we develop a two-pass approach for scoring ChIP-Seq data relative to controls. The first pass identifies putative binding sites and compensates for variation in the mappability of sequences across the genome. The second pass filters out sites that are not significantly enriched compared to the normalized input DNA and computes a precise enrichment and significance. Using our scoring approach we investigate the design of an optimal ChIP-Seq experiment. We examine the number of identified binding sites as a function of sequencing depth (i.e. saturation) and the value of multiple replicas. In particular, we find that little additional biological information is gained from more than two replicas.
+
== Citation ==
 +
PeakSeq is described in Rozowsky et al. <i>Nature Biotech</i> 27: 66 ([http://papers.gersteinlab.org/papers/PeakSeq more])
 +
== Software download ==
 +
[http://archive.gersteinlab.org/proj/PeakSeq/Scoring_ChIPSeq/Code/C/PeakSeq_1.31.zip PeakSeq version 1.31]: Bug fix for preprocessing.
-
Mappability Map
+
[http://archive.gersteinlab.org/proj/PeakSeq/Scoring_ChIPSeq/Code/C/PeakSeq.zip PeakSeq version 1.3]: Several bug fixes, cleanup of the command line while PeakSeq runs.
-
[http://archive.gersteinlab.org/proj/PeakSeq/Mappability_Map/Code/ Mappability Map Code]
+
[http://archive.gersteinlab.org/proj/PeakSeq/Scoring_ChIPSeq/Code/C/PeakSeq.v1.1.tar.bz2 PeakSeq version 1.1]: Re-coded from scratch; runs slightly faster, is easier to use, and supports multiple mapped-read file formats. Refer to the README file in the archive for help on running PeakSeq. In addition, [http://archive.gersteinlab.org/proj/PeakSeq/peakseq_v.1.1.ppt these slides] may be helpful.
-
Maps:
+
[http://archive.gersteinlab.org/proj/PeakSeq/Scoring_ChIPSeq/Code/C/PeakSeq_v1.01.tar.gz PeakSeq version 1.01]: Original PeakSeq code.
-
[http://archive.gersteinlab.org/proj/PeakSeq/Mappability_Map/C.elegans/ Map for C. elegans]
+
== Additional Resources Related to PeakSeq ==
-
[http://archive.gersteinlab.org/proj/PeakSeq/Mappability_Map/D.melanogaster/ Map for D. melanogaster]
+
Mappability maps for several organisms can be found [http://archive.gersteinlab.org/proj/PeakSeq/Mappability_Map/ here].
-
[http://archive.gersteinlab.org/proj/PeakSeq/Mappability_Map/H.sapiens/ Map for H. sapiens]
+
=== Datasets ===
-
[http://archive.gersteinlab.org/proj/PeakSeq/Mappability_Map/M.musculus/ Map for M. musculus]
+
Raw ChIP-Seq Sequence Data: [http://archive.gersteinlab.org/proj/PeakSeq/Sequence_Data/HeLa-S3 PolII], [http://archive.gersteinlab.org/proj/PeakSeq/Sequence_Data/Stimulated_HeLa-S3 STAT-1]
 +
=== Yale ChIP-Seq Data Pipeline ===
-
ChIP-Seq Scoring
+
[http://array.mbb.yale.edu/pipeline/illumina.html Illumina pipeline] (wrapper for Illumina pipeline)
-
[http://archive.gersteinlab.org/proj/PeakSeq/Scoring_ChIPSeq/Code/Preprocessing/ Preprocessing Code]
+
[http://array.mbb.yale.edu/pipeline/scoring.html ChIP-Seq scoring pipeline] (runs PeakSeq)
-
[http://archive.gersteinlab.org/proj/PeakSeq/Scoring_ChIPSeq/Code/C/ C Code for ChIP-Seq Scoring]
+
Also consider ChipSeqSim: http://papers.gersteinlab.org/papers/chip-seq-simu/
-
[http://archive.gersteinlab.org/proj/PeakSeq/Scoring_ChIPSeq/Code/Perl/ Perl Code for ChIP-Seq Scoring]
+
== Contact ==
-
 
+
Joel Rozowsky at  joel DOT rozowsky AT yale DOT edu or Arif Harmanci at  Arif DOT Harmanci AT yale DOT edu.
-
 
+
-
Scored results:
+
-
 
+
-
[http://archive.gersteinlab.org/proj/PeakSeq/Scoring_ChIPSeq/Results/PolII Pol II Results]
+
-
 
+
-
[http://archive.gersteinlab.org/proj/PeakSeq/Scoring_ChIPSeq/Results/STAT1 STAT1 Results]
+
-
 
+
-
 
+
-
Raw ChIP-Seq Sequence Data:
+
-
 
+
-
[http://archive.gersteinlab.org/proj/PeakSeq/Sequence_Data/HeLa-S3 Pol II Sequence Data]
+
-
 
+
-
[http://archive.gersteinlab.org/proj/PeakSeq/Sequence_Data/Stimulated_HeLa-S3 STAT1 Sequence Data]
+
-
 
+
-
 
+
-
Contact Joel Rozowsky at  joel DOT rozowsky AT yale DOT edu.
+

Latest revision as of 17:21, 2 March 2015

PeakSeq is a program for identifying and ranking peak regions in ChIP-Seq experiments. It takes as input mapped reads from a ChIP-Seq experiment and mapped reads from a control experiment, and outputs a file of peak regions ranked by increasing Q-value.
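
To illustrate what "ranked by increasing Q-value" can mean in practice, here is a minimal Python sketch. It is not PeakSeq code (PeakSeq is distributed as C source, see the downloads below). It assumes each candidate region already carries a p-value from the comparison against the control, and that Q-values come from a Benjamini-Hochberg correction; this page does not state either detail. The `Peak` type and both function names are hypothetical.

```python
from typing import List, NamedTuple, Tuple


class Peak(NamedTuple):
    """Hypothetical record for one candidate peak region."""
    chrom: str
    start: int
    end: int
    p_value: float  # assumed to come from a ChIP-vs-control test


def benjamini_hochberg(p_values: List[float]) -> List[float]:
    """Return Benjamini-Hochberg adjusted values, in the input order."""
    n = len(p_values)
    order = sorted(range(n), key=lambda i: p_values[i])
    q_values = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(n, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * n / rank)
        q_values[idx] = running_min
    return q_values


def rank_peaks(peaks: List[Peak]) -> List[Tuple[Peak, float]]:
    """Pair each peak with its Q-value and sort by increasing Q-value."""
    q_values = benjamini_hochberg([p.p_value for p in peaks])
    return sorted(zip(peaks, q_values), key=lambda pair: pair[1])


if __name__ == "__main__":
    # Made-up regions and p-values, for illustration only.
    example = [
        Peak("chr1", 100, 300, 0.01),
        Peak("chr1", 900, 1200, 0.04),
        Peak("chr2", 50, 250, 0.03),
        Peak("chr2", 700, 950, 0.005),
    ]
    for peak, q in rank_peaks(example):
        print(f"{peak.chrom}\t{peak.start}\t{peak.end}\t{q:.3g}")
```

The most significant regions appear first in the sorted output, which matches the ordering of the output file described above.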

Citation

PeakSeq is described in Rozowsky et al., Nature Biotech 27: 66 (more: http://papers.gersteinlab.org/papers/PeakSeq)

Software download

PeakSeq version 1.31: Bug fix for preprocessing.

PeakSeq version 1.3: Several bug fixes, cleanup of the command line while PeakSeq runs.

PeakSeq version 1.1: Re-coded from scratch; runs slightly faster, is easier to use, and supports multiple mapped-read file formats. Refer to the README file in the archive for help on running PeakSeq. In addition, these slides may be helpful: http://archive.gersteinlab.org/proj/PeakSeq/peakseq_v.1.1.ppt

PeakSeq version 1.01: Original PeakSeq code.

Additional Resources Related to PeakSeq

Mappability maps for several organisms can be found here: http://archive.gersteinlab.org/proj/PeakSeq/Mappability_Map/

Datasets

Raw ChIP-Seq Sequence Data: PolII, STAT-1

Yale ChIP-Seq Data Pipeline

Illumina pipeline (wrapper for Illumina pipeline)

ChIP-Seq scoring pipeline (runs PeakSeq)

Also consider ChipSeqSim: http://papers.gersteinlab.org/papers/chip-seq-simu/

Contact

Joel Rozowsky at joel DOT rozowsky AT yale DOT edu or Arif Harmanci at Arif DOT Harmanci AT yale DOT edu.
