ACT IntegratedExample

From GersteinInfo

Revision as of 19:35, 9 September 2010 by Justin.jee (Talk | contribs)
ACT Integrated Example: Whole-Genome ChIP-Seq experiments

Here we describe how to use ACT on a set of two large example files from ChIP-Seq experiments.

To begin, download the signal files (from Gerstein Lab's PeakSeq project) here:

These are zipped signal files for PolII and Stat1. After unzipping the files for each transcription factor, it is useful to concatenate (cat) the signal tracks for the individual chromosomes so that each factor's signal is in a single file, giving two files in total. In this walkthrough they will be referred to as PolII.sgr and Stat1.sgr.
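The concatenation step can be sketched in Python as follows. This is only an illustration: the function name concatenate_tracks and the directory layout in the usage comment are assumptions, not part of ACT or PeakSeq.

```python
from pathlib import Path


def concatenate_tracks(chrom_files, out_path):
    """Append each per-chromosome signal file to out_path, in the order given."""
    with open(out_path, "w") as out:
        for path in chrom_files:
            with open(path) as src:
                for line in src:
                    # Guard against a source file that lacks a final newline,
                    # which would otherwise fuse two records together.
                    out.write(line if line.endswith("\n") else line + "\n")


# Hypothetical usage, assuming the unzipped PolII tracks sit in ./PolII/*.sgr:
# concatenate_tracks(sorted(Path("PolII").glob("*.sgr")), "PolII.sgr")
```

The same call would be repeated for the Stat1 files to produce Stat1.sgr.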

  • Aggregation
  • Correlation

For this example, since we are dealing with signal tracks rather than SNP positions or a bed file of genomic locations, we will use the correlation tool found in corr-sat-bundle to do the correlation calculation. First, it is necessary to convert the signal tracks from sgr to wig format. A script which does this can be found here:
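The conversion script itself is not reproduced on this page. As a rough sketch of what such a conversion involves, the Python below assumes each sgr line holds three whitespace-separated fields (chromosome, position, value), that lines are grouped by chromosome, and that positions can be carried over unchanged into a variableStep wig track. The function name sgr_to_wig is hypothetical, and the coordinate convention should be checked against the actual data.

```python
def sgr_to_wig(sgr_path, wig_path):
    """Rewrite 'chrom position value' lines as a variableStep wig track."""
    current_chrom = None
    with open(sgr_path) as sgr, open(wig_path, "w") as wig:
        for line in sgr:
            fields = line.split()
            if len(fields) != 3:
                continue  # skip blank or malformed lines
            chrom, position, value = fields
            if chrom != current_chrom:
                # Start a new wig block whenever the chromosome changes.
                wig.write(f"variableStep chrom={chrom}\n")
                current_chrom = chrom
            wig.write(f"{position} {value}\n")


# Hypothetical usage:
# sgr_to_wig("PolII.sgr", "PolII.wig")
# sgr_to_wig("Stat1.sgr", "Stat1.wig")
```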

Once both the PolII and Stat1 sgr files have been converted to wig files, change the input-file parameters of the correlation tool so that the input files are "PolII.wig" and "Stat1.wig".

In addition, it is important to change config.txt so that the list of genomic regions includes all chromosomes, including mitochondrial sequences. (When the package is first downloaded, every line except the one for chromosome 22 is commented out.)

Once both the tool's input parameters and config.txt have been modified, we can run the correlation tool. This will produce an output file containing a correlation matrix of the correlation coefficients between the PolII and Stat1 signals.
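To make the idea of a correlation matrix concrete, here is a minimal Python sketch that computes Pearson correlation coefficients between signal vectors sampled at the same positions. It is not the corr-sat-bundle implementation, whose exact method is not described on this page; the function names and the toy values are assumptions for illustration only.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length value lists."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two equal-length lists with at least 2 values")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return float("nan")  # correlation is undefined for a constant track
    return cov / math.sqrt(var_x * var_y)


def correlation_matrix(tracks):
    """Pairwise Pearson matrix for a dict mapping track name -> values."""
    names = list(tracks)
    return [[pearson(tracks[a], tracks[b]) for b in names] for a in names]


# Toy values, not real ChIP-Seq signal:
matrix = correlation_matrix({"PolII": [1.0, 2.0, 3.0, 4.0],
                             "Stat1": [2.0, 4.0, 6.0, 8.0]})
```

With only two tracks the matrix is 2x2, and the single off-diagonal value is the quantity of interest.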

The resulting correlation matrix can be plotted in R using the heatmap function.


  • Saturation

It is necessary to convert the wig files to bed files without a "signal" component. The following script takes wig files and converts them into bed files, with coordinates representing the regions of the signal track which are above a certain threshold (in this case 0).
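The original script is not reproduced on this page. A minimal Python sketch of this kind of conversion is given below, under several assumptions: the input is a variableStep wig file, wig positions are 1-based while bed intervals are 0-based and half-open, and adjacent above-threshold positions are merged into one region. The function name wig_to_bed and its threshold parameter are hypothetical.

```python
def wig_to_bed(wig_path, bed_path, threshold=0.0):
    """Write bed intervals covering wig positions whose value exceeds threshold."""
    chrom, span = None, 1
    region = None  # currently open region as [chrom, start, end], or None
    with open(wig_path) as wig, open(bed_path, "w") as bed:
        for line in wig:
            fields = line.split()
            if not fields or fields[0] in ("track", "browser") or fields[0].startswith("#"):
                continue
            if fields[0] == "variableStep":
                options = dict(f.split("=", 1) for f in fields[1:])
                chrom = options["chrom"]
                span = int(options.get("span", 1))
                continue
            if fields[0] == "fixedStep":
                raise ValueError("fixedStep wig is not handled in this sketch")
            start = int(fields[0]) - 1  # 1-based wig position -> 0-based bed start
            end = start + span
            if float(fields[1]) <= threshold:
                continue
            if region and region[0] == chrom and start <= region[2]:
                region[2] = max(region[2], end)  # extend the open region
            else:
                if region:
                    bed.write("%s\t%d\t%d\n" % tuple(region))
                region = [chrom, start, end]
        if region:
            bed.write("%s\t%d\t%d\n" % tuple(region))


# Hypothetical usage:
# wig_to_bed("PolII.wig", "PolII.bed", threshold=0)
```

Raising the threshold above 0 would restrict the bed file to stronger signal regions.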

Since there are only two signal tracks, the Saturation and Correlation analyses provide overlapping information.
