Data Requirements

From GersteinInfo

FusionSeq main web page
User documentation main

The following data are required to use the full set of FusionSeq tools.

External

The human genome must be indexed before bowtie can use it. Please see the bowtie documentation for how to build an index. As a rough guide, you would run something like:

$ bowtie-build -f hg18_nh.fa /path2bowtieIndex/hg18_nh/

where hg18_nh.fa is the concatenation of all human chromosomes from chromFa.zip, excluding the alternative haplotypes and the "random" sequences.
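The concatenation described above could be done along these lines. This is only a sketch, not part of FusionSeq: the function name build_nh_fasta is hypothetical, and it assumes that chromFa.zip unpacks to one chr*.fa file per sequence, with the excluded sequences recognizable by "_random" or "_hap" in the file name. Please check the unpacked file names before relying on it.

```shell
# Concatenate chromosome FASTA files into a single file, skipping
# haplotype and "random" sequences.
# Assumption: one chr*.fa file per sequence; excluded files contain
# "_random" or "_hap" in their names.
build_nh_fasta() {
    src_dir=$1   # directory holding the unpacked chr*.fa files
    out_fa=$2    # output FASTA file, e.g. hg18_nh.fa
    : > "$out_fa"                        # start from an empty output file
    for fa in "$src_dir"/chr*.fa; do
        [ -f "$fa" ] || continue         # nothing matched the pattern
        case "$(basename "$fa")" in
            *_random*|*_hap*) continue ;;   # skip excluded sequences
        esac
        cat "$fa" >> "$out_fa"
    done
}
```

After unpacking chromFa.zip into a directory (here called chromFa, also an assumption), the function would be called as build_nh_fasta chromFa hg18_nh.fa. The files are appended in the shell's glob order, so chr10 comes before chr2.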

Provided

The following data sets, bundled in a tarball, can be downloaded here.

  • knownGeneAnnotation.txt
  • knownGeneAnnotationTranscriptCompositeModel.txt
  • knownGeneAnnotationTranscriptCompositeModel.fa
  • kgXref.txt
  • knownToTreefam.txt

The composite model needs to be indexed by bowtie:

$ bowtie-build -f knownGeneAnnotationTranscriptCompositeModel.fa /path2bowtieIndex/hg18_knownGeneAnnotationTranscriptCompositeModel/hg18_knownGeneAnnotationTranscriptCompositeModel

Please make sure that the correct filenames are used.
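One way to check the index file names is sketched below. It is not part of FusionSeq, and the function name check_bowtie_index is hypothetical. It assumes a bowtie 1 index, which bowtie-build writes as six files next to the given basename, with the suffixes .1.ebwt, .2.ebwt, .3.ebwt, .4.ebwt, .rev.1.ebwt and .rev.2.ebwt.

```shell
# Report whether all six bowtie 1 index files exist for a given basename.
# Assumption: a standard (small) bowtie 1 index with .ebwt files.
check_bowtie_index() {
    base=$1      # index basename, as passed to bowtie-build
    status=0
    for suffix in 1 2 3 4 rev.1 rev.2; do
        if [ ! -f "$base.$suffix.ebwt" ]; then
            echo "missing: $base.$suffix.ebwt" >&2
            status=1
        fi
    done
    return $status   # 0 if every file is present, 1 otherwise
}
```

For the composite model it would be called with the same basename that was given to bowtie-build above, i.e. check_bowtie_index /path2bowtieIndex/hg18_knownGeneAnnotationTranscriptCompositeModel/hg18_knownGeneAnnotationTranscriptCompositeModel.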
