CRIT/code

2011-01-30T22:44:59Z

Tara:

CRIT/code

2011-01-30T21:39:26Z

Tara: /* Source code */

CRIT/workflow

2011-01-30T21:21:15Z

Tara: /* Output */

==Transcription Factor Example==

===Motivation and Problem Set Up===

Cis-regulatory elements have been studied extensively as a means of
regulating gene expression. Beyond such motifs, however, are there
inherent properties of the targets themselves that make them more or
less likely to be regulated by a given class of transcription factors (TFs)?
For example, do essential transcription factors preferentially
regulate essential targets? Do genome composition features
such as GC content or codon bias influence which targets are regulated by
which TFs?

===Input Data===
Here, we use three different datasets as shown.

[[File:schema.png|200 px| thumb | left| Data Input Set up]]

These objects are named as follows in the R dataset:

(1) T: Transcription factors and their associated properties

(2) C: Connector Matrix matching transcription factors to their associated targets

(3) G: Gene targets and their associated properties

T and G are both post-processed from:

Y. Xia, E. A. Franzosa, and M. B. Gerstein. Integrated assessment of genomic correlates of protein evolutionary rate. PLoS Comput Biol, 5(6):e1000413, 2009.

C is post-processed from:

C. T. Harbison, et al. Transcriptional regulatory code of a eukaryotic genome. Nature, 431(7004):99–104, 2004.

As in Harbison et al., p < 0.001 was used to call a TF-gene target pair. We binarized the matrix so that any TF-gene pair with a p-value < 0.001 is assigned a 1 and every other pair is assigned a 0.
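
The binarization step can be sketched as follows. This is a minimal illustration, not the actual preprocessing script: the matrix <code>pvals</code>, its toy values, and its orientation (gene targets in rows, TFs in columns, inferred from how C is used in the example code) are all assumptions.

<pre>
# Hypothetical p-value matrix: rows are gene targets, columns are TFs
# (toy values, not taken from Harbison et al.)
pvals <- matrix(c(0.0002, 0.5,
                  0.03,   0.0009,
                  0.001,  0.2),
                nrow = 3, byrow = TRUE,
                dimnames = list(c("geneA", "geneB", "geneC"),
                                c("TF1", "TF2")))

# 1 where p < 0.001, 0 otherwise (a p-value of exactly 0.001 becomes 0)
connector <- (pvals < 0.001) * 1
connector
</pre>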

===Example Code===

<pre>

#Load Data
load(file="TFExample.RData")

#Load CRIT functions
source(file="CRIT.R")

#Generate label for feature of interest - set x for column variable
tLabel<-initializer(T[,x], type="median")

#Determine set of targets sensitive to this feature
DC<-discriminator(C, tLabel, multCorrect=TRUE)

#Generate new label based on sensitivity identified in previous step
gLabel<-labelSlicer(DC, .05)

#Identify features that seem to discriminate between sens/insens targets
DG<-discriminator(t(G), gLabel, multCorrect=TRUE)
</pre>

===Output===

Cross patterns have a natural "X relationship Y" form, which makes a network an ideal way to visualize the results.

Cross patterns can easily be written in .sif format for loading into network browsers such as [http://tyna.gersteinlab.org/tyna/ tYNA] or [http://www.cytoscape.org/ Cytoscape].
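
As a rough sketch of the export step, a table of cross patterns can be written as tab-separated "source, relationship, target" lines, which is the basic .sif layout. The data frame <code>patterns</code>, the node names, and the relationship label below are invented for illustration and are not results from this analysis.

<pre>
# Hypothetical cross patterns: each row is one "X relationship Y" edge
patterns <- data.frame(
  source       = c("TF_essentiality", "TF_essentiality"),
  relationship = c("discriminates", "discriminates"),
  target       = c("target_GC_content", "target_codon_bias"),
  stringsAsFactors = FALSE
)

# Write one tab-separated edge per line, with no header and no quoting
sif_file <- tempfile(fileext = ".sif")
write.table(patterns, file = sif_file, sep = "\t",
            quote = FALSE, row.names = FALSE, col.names = FALSE)

# Inspect the file contents
sif_lines <- readLines(sif_file)
sif_lines
</pre>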

[[File:results.png]]

2011-01-29T17:30:52Z

Tara: /* Input Data */

CRIT/workflow

2011-01-29T16:59:18Z

Tara: /* Input Data */

CRIT/workflow

2011-01-29T16:58:38Z

Tara:

CRIT

2011-01-29T16:38:40Z

Tara: /* Cross Pattern Identification Technique (CRIT) */

=Cross Pattern Identification Technique (CRIT)=

==Algorithm Overview==

Label, Slice, Discriminate, Repeat.

[[File:alg.png]]
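
One pass of the label, slice, discriminate, repeat cycle can be sketched on simulated data as below. This is an illustrative toy under stated assumptions, not the CRIT implementation itself: the objects <code>propA</code>, <code>B</code>, <code>labelA</code>, <code>pB</code> and <code>labelB</code> are hypothetical, and the 0.05 cutoff simply mirrors the one used in the workflow example.

<pre>
set.seed(1)

# Toy data (all hypothetical): 20 items in A with one numeric property,
# and a matrix B whose 20 columns line up with those items
propA <- rnorm(20)
B <- matrix(rnorm(5 * 20), nrow = 5,
            dimnames = list(paste0("row", 1:5), NULL))

# Label: split the items of A at the median of the property
labelA <- as.numeric(propA >= median(propA))

# Slice + discriminate: for each row of B, compare the values
# falling in the two label groups with a t-test
pB <- apply(B, 1, function(v) t.test(v ~ labelA)$p.value)
pB <- p.adjust(pB, method = "fdr")

# Repeat: threshold the p-values to obtain a label for the rows of B,
# which can then be carried to the next dataset
labelB <- as.numeric(pB >= 0.05)
</pre>

Because the data here are random noise, no particular row is expected to separate the two groups; the point is only the shape of the objects passed from one step to the next.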

==Core Functions and Their Parameters==

===Initializer===
Initializer: Run only once, at the start, to obtain an initial set of labels for the rows of A. In all remaining steps, the discriminator generates the new label for propagation. Alternatively, this step can be skipped and a label can be supplied directly to the labeler function.

Required Arguments: Column of A, type of partitioning

Output: Vector assigning a label to every ROW of the supplied column of A

<pre>
initializer<-function(A, type=c("median","mean")) {
#Resolve the partitioning type (defaults to "median")
type<-match.arg(type)
#Get value to threshold off of
t<-getThreshold(A, type=type)
#Create label: one entry per element of A
label<-labelSlicer(A, t)
return (label)
}

</pre>

===Labeler===
In the implementation, the labeler and slicer are combined into a single function (labelSlicer) because the label is a simple column vector. We describe them separately here to make the connection between the algorithm design and the implementation more transparent.

Labeler: Transfers labels on columns of previous datasets (A) to rows of new dataset (B)

Required Arguments: Matrix (B), vector assigning a label to every ROW of A

Output: Vector assigning a label to every COLUMN of B

===Slicer===
Slicer: Partitions rows of B into "slices" based on labels from A

Required Arguments: Matrix (B), Vector assigning a label to every COLUMN of B

Output: Set of slices where slice is defined as a set of rows of B all containing the same label (in practice this is just the grouping variable as opposed to actual data slices)

<pre>
labelSlicer<-function(values, t){
#Start with every entry labeled 0
label<-rep(0, length(values))
#Set those greater than or equal to the threshold to 1
index<-which(values>=t)
label[index]<-1
return (label)
}
</pre>
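
The Slicer output described above notes that a slice is, in practice, just a grouping variable. The sketch below makes that concrete on a toy matrix. It assumes, following the discriminator code on this page, that the label carries one entry per column of the matrix; the names <code>B</code>, <code>label</code>, <code>idx</code> and <code>slices</code> are hypothetical.

<pre>
# Hypothetical matrix with 3 rows and 4 columns, one label per column
B <- matrix(1:12, nrow = 3,
            dimnames = list(paste0("r", 1:3), paste0("c", 1:4)))
label <- c(0, 1, 0, 1)

# The grouping variable: column indices belonging to each label
idx <- split(seq_len(ncol(B)), label)

# The literal slices, materialized only for illustration
slices <- lapply(idx, function(j) B[, j, drop = FALSE])
slices
</pre>

Keeping only <code>label</code> (or <code>idx</code>) avoids copying the data, which is why the implementation never builds <code>slices</code> explicitly.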

===Discriminator===
Discriminator: Evaluates the discriminatory power of labels generated from B in the context of C. It operates on the set of slices, where a slice is defined as a set of rows of B all carrying the same label; a random label would show no difference between the slices. (In practice a slice is implemented as an index rather than a literal subset of the data.)

Required Arguments: Matrix (B), label

Optional Arguments: multCorrect, set to TRUE to apply an FDR correction to the p-values

Output: p-value of the test for every row of B

<pre>
#In practice the slices are implemented as an index
#Additional discriminator functions that use KS test, hypergeometric
#distribution, etc are possible
discriminator<-function(mat, label, multCorrect=FALSE){

#Create empty list
ttest_struct<-list()

#Compute value of test for each row, grouping its entries by label
for(i in seq_len(nrow(mat))){
temp<-t.test(as.numeric(mat[i,])~label)
ttest_struct[[i]]<-temp
}

#Extract pvalue from structure
all_pval<-getPvalfromStruct(ttest_struct)

#Test if we want to compute FDR
if (multCorrect) { all_pval<-p.adjust(all_pval, method="fdr") }

return (all_pval)
}

</pre>
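
The code comments above mention that other discriminators, for example one based on the KS test, are possible. A minimal sketch of such a variant is given below. The function <code>discriminatorKS</code> is hypothetical and not part of the CRIT code; it assumes a numeric matrix and a label with exactly two distinct values, one entry per column.

<pre>
# Hypothetical KS-based variant; not part of the CRIT code on this page
discriminatorKS <- function(mat, label, multCorrect = FALSE) {
  groups <- sort(unique(label))
  stopifnot(length(groups) == 2)

  # Two-sample KS test on each row, comparing the two label groups
  pvals <- apply(mat, 1, function(v) {
    v <- as.numeric(v)
    ks.test(v[label == groups[1]], v[label == groups[2]])$p.value
  })

  # Optional FDR correction, mirroring the multCorrect argument
  if (multCorrect) pvals <- p.adjust(pvals, method = "fdr")
  pvals
}

# Toy usage: only row 1 is shifted between the two groups
set.seed(2)
toy <- matrix(rnorm(4 * 30), nrow = 4)
toyLabel <- rep(c(0, 1), each = 15)
toy[1, toyLabel == 1] <- toy[1, toyLabel == 1] + 3
ksP <- discriminatorKS(toy, toyLabel, multCorrect = TRUE)
</pre>

The KS test compares whole distributions rather than means, so it may pick up differences in shape that a t-test would miss, at the cost of less power for pure mean shifts.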

==Auxiliary Functions==

===Checking against Threshold===
<pre>
#Returns a value to split on for the base case
getThreshold<-function(A, type=c("median","mean")) {
#Resolve the requested statistic (defaults to "median")
type <- match.arg(type)
threshold <- switch(type,
mean = mean(A),
median = median(A)
)
return (threshold)
}

</pre>

===Return t Statistic===
<pre>
#Returns only the t statistics from the list of t-test objects
getTfromStruct<-function(struct){
tstat<-unlist(lapply(struct, function(s) s$statistic))
return(tstat)
}
</pre>

===Return p Value Object===
<pre>
#Returns only the p-values from the list of t-test objects
getPvalfromStruct<-function(struct){
p<-unlist(lapply(struct, function(s) s$p.value))
return(p)
}
</pre>

CRIT

2011-01-29T16:38:13Z

Tara: /* Cross Pattern Identification Technique (CRIT) */


CRIT

2011-01-29T16:37:47Z

Tara: /* Cross Pattern Identification Technique (CRIT) */


File:Alg.png

2011-01-29T16:37:00Z

Tara:

2011-01-29T14:33:45Z

Tara:

CRIT/workflow

2011-01-28T21:10:15Z

Tara:

==Transcription Factor Example==

===Input Data===

2011-01-24T19:12:12Z

Tara: Created page with '==Coming Soon!=='

==Coming Soon!==

CRIT/code

2011-01-24T19:10:06Z

Tara: Created page with '<center>[http://archive.gersteinlab.org/proj/crit/ '''CRIT Main Page''']</center> __NOTOC__ == Code == === Required Software - External === # This is an R package. <br> ===…'