CRIT/workflow

From GersteinInfo

Latest revision as of 15:50, 21 February 2011

Transcription Factor Example

Motivation and Problem Set Up

Cis-regulatory elements as a means of regulating gene expression have been extensively studied. However, beyond such motifs, are there inherent properties of the targets themselves that make them more or less likely to be regulated by a given class of transcription factors? As an example, do essential transcription factors preferentially regulate essential targets? Are there genome composition features, such as GC content or codon bias, that influence which targets are regulated by which TFs?

Input Data

Here, we use three different datasets as shown.

Figure (schema.png): Data Input Set Up

These objects are named as follows in the R dataset:

(1) T: Transcription factors and their associated properties

(2) C: Connector Matrix matching transcription factors to their associated targets

(3) G: Gene targets and their associated properties

T and G are both post-processed from:

Y. Xia, E. A. Franzosa, and M. B. Gerstein. Integrated assessment of genomic correlates of protein evolutionary rate. PLoS Comput Biol, 5(6):e1000413, 2009.

C is post-processed from:

C. T. Harbison, et al. Transcriptional regulatory code of a eukaryotic genome. Nature, 431(7004):99–104, 2004.

As in Harbison et al., a p-value below 0.001 was taken to indicate that a gene is a target of a TF. We binarized the matrix so that any TF-gene pair with p < 0.001 is set to 1 and all other pairs are set to 0.
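The binarization step can be sketched in a few lines of R. This is only an illustration under stated assumptions: the p-values and the TF and gene names are invented, and the orientation (TFs in rows, gene targets in columns) is assumed rather than taken from the dataset.

```r
# Toy p-value matrix; all values and names are invented for illustration.
# Assumed orientation: rows are TFs, columns are gene targets.
pvals <- matrix(
  c(0.0002, 0.2000, 0.0009,
    0.0500, 0.0001, 0.0010),
  nrow = 2, byrow = TRUE,
  dimnames = list(c("TF1", "TF2"), c("geneA", "geneB", "geneC"))
)

# 1 where p < 0.001, 0 otherwise.
# A p-value of exactly 0.001 is mapped to 0 under this strict comparison.
binarized <- (pvals < 0.001) * 1
print(binarized)
```

Multiplying the logical matrix by 1 keeps the row and column names, so the result can be indexed by TF and gene name in the same way as the input.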

Example Code


# Adjust the file paths below to point to the dataset and the CRIT package.
# Load data
load(file = "TFExample.RData")

# Load CRIT functions
source(file = "CRIT.R")

# Generate label for the feature of interest
# Set x to the column of T that holds the variable of interest before running this line
tLabel <- initializer(T[, x], type = "median")

# Determine the set of targets sensitive to this feature
# This step can be quite slow; be patient (about 2-3 minutes on average)
DC <- discriminator(C, tLabel, multCorrect = TRUE)

# Generate a new label based on the sensitivity identified in the previous step
gLabel <- labelSlicer(DC, .05)

# Identify features that seem to discriminate between sensitive and insensitive targets
DG <- discriminator(t(G), gLabel, multCorrect = TRUE)
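To make the labeling steps more concrete, here is a minimal toy sketch of two generic ideas the workflow appears to rely on: splitting a feature at its median, and thresholding multiple-testing-corrected p-values at 0.05. The helper `medianSplit`, the toy vectors, and the choice of the Benjamini-Hochberg correction are all assumptions made for illustration; they are not CRIT functions, and the actual behavior of `initializer`, `discriminator`, and `labelSlicer` may differ.

```r
# Hypothetical helper, NOT part of CRIT:
# label values above the median as 1 and all others as 0.
medianSplit <- function(values) {
  as.integer(values > median(values, na.rm = TRUE))
}

# Invented feature values for five transcription factors.
feature <- c(0.31, 0.45, 0.52, 0.38, 0.60)
tLabelToy <- medianSplit(feature)
print(tLabelToy)

# Invented raw p-values for four targets.
pToy <- c(0.001, 0.020, 0.300, 0.040)

# Assumed correction method: Benjamini-Hochberg via stats::p.adjust.
# Targets whose adjusted p-value is below 0.05 are labeled 1.
gLabelToy <- as.integer(p.adjust(pToy, method = "BH") < 0.05)
print(gLabelToy)
```

Note that a raw p-value below 0.05 does not guarantee a label of 1 here, because the comparison is made after correction.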

Output

Cross patterns have a natural "X relationship Y" form, which makes a network an ideal way to visualize the results.

Cross patterns can easily be formatted in .sif for loading into various network browsers, including tYNA (http://tyna.gersteinlab.org/tyna/) or Cytoscape (http://www.cytoscape.org/).
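As a rough sketch of the .sif export, each line of a simple interaction format file holds a source node, a relationship type, and a target node, separated by tabs. The edge list below is invented for illustration, and the column names, node names, and output file name are assumptions; how cross patterns are extracted from the CRIT results is not shown.

```r
# Invented cross-pattern edges, for illustration only.
edges <- data.frame(
  source       = c("TF_essentiality", "TF_essentiality"),
  relationship = c("associated_with", "associated_with"),
  target       = c("target_essentiality", "target_GC_content"),
  stringsAsFactors = FALSE
)

# Write one tab-separated "source relationship target" line per edge,
# with no header, row names, or quoting.
sifFile <- file.path(tempdir(), "crossPatterns.sif")
write.table(edges, file = sifFile, sep = "\t",
            quote = FALSE, row.names = FALSE, col.names = FALSE)

# Show the resulting file contents.
cat(readLines(sifFile), sep = "\n")
```

Node names are written without spaces so that the file is parsed the same way whether a reader splits on tabs or on whitespace.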

File:results.png
