FusionSeq FAQ


Revision as of 18:38, 20 November 2010 by Asboner (Talk | contribs)
FusionSeq main web page
User documentation main


General Questions

Does FusionSeq work with my favorite alignment tool?

FusionSeq reads paired-end alignments in Mapped Read Format (MRF). We provide conversion tools that turn the output of most common alignment programs and formats, including SAM/BAM, into MRF. Please see RSEQtools for more information, specifically the section Format conversion utilities.

Does FusionSeq work with colorspace paired-end reads?

FusionSeq has been developed to be as independent as possible of the sequencing technology and the alignment tool. However, extensive testing has been conducted on the Illumina Genome Analyzer II platform only.

Where can I obtain the annotation data for hg19?

Annotation data for hg19 can be found here.

Can I use FusionSeq with my favorite species?

In principle, you can run FusionSeq on paired-end RNA-Seq data from any species. However, you would need to provide equivalents of the data sets currently supplied for human, i.e.:

  1. a genome sequence, in 2bit format
  2. a gene annotation set in interval format, including composite models of genes
  3. the sequences of the composite models in the gene annotation set
  4. a mapping between your gene annotation and TreeFam (optional, used by gfrLargeScaleHomologyFilter)
  5. a list of the repetitive regions, in interval format (optional, used by gfrRepeatMaskerFilter)
  6. a ribosomal sequence library in 2bit format (optional, used by gfrRibosomalFilter)
  7. the mapping between your gene annotation and other descriptive information, e.g. gene symbols, descriptions, etc. (optional, used by gfrAddInfo)

Where can I find some data sets to test FusionSeq?

Please find some test data sets here.

Is there a demo version of FusionSeq?

A demo version of the FusionSeq web interface is available here. You can access the results described in the paper by typing a sample ID (e.g. 106_T, 1700_D, etc.).

Where can I find more information?

The most up-to-date user documentation for FusionSeq is available here. If you are looking for the developer's documentation, you can find it here.

How can I cite FusionSeq?

Please cite this publication:

  • Sboner A, Habegger L, Pflueger D, Terry S, Chen DZ, Rozowsky JS, Tewari AK, Kitabayashi N, Moss BJ, Chee MS, Demichelis F, Rubin MA, Gerstein MB. FusionSeq: a modular framework for finding gene fusions by analyzing Paired-End RNA-Sequencing data. Genome Biol. 21 Oct 2010; 11:R104.

Compilation troubleshooting

TROOT.h: No such file or directory

This error occurs because the compiler cannot find the TROOT.h file. This file is part of ROOT, a framework for mathematical and statistical analysis. If you have installed ROOT, please make sure that you have defined ROOTSYS as the path to the ROOT folder and added its bin directory to your PATH:

$ export ROOTSYS=/path/to/ROOT/ 
$ export PATH=$ROOTSYS/bin:$PATH

Please also see Installing and configuring ROOT for more details.
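If the error persists after setting ROOTSYS, it can help to confirm that the header is actually where the compiler will look. The following is a minimal sketch, not part of FusionSeq: check_troot_header is a hypothetical helper, and it assumes the headers live in the include subdirectory of the ROOT folder, which may not hold for every installation (in that case, ROOT's own root-config --incdir reports the header directory).

```shell
# Hypothetical helper: report whether TROOT.h is present under a ROOT folder.
# Assumption: headers are installed in <ROOT folder>/include.
check_troot_header() {
    root_dir="${1%/}"    # drop a trailing slash, if any
    if [ -f "$root_dir/include/TROOT.h" ]; then
        echo "found: $root_dir/include/TROOT.h"
    else
        echo "missing: $root_dir/include/TROOT.h"
        return 1
    fi
}

# Check the folder ROOTSYS points to; /path/to/ROOT is only a placeholder.
check_troot_header "${ROOTSYS:-/path/to/ROOT}" || true
```

A "missing" result suggests that ROOTSYS points to the wrong folder or that ROOT is installed with a different layout.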

Running issues

FusionSeq does not find the annotation datasets. However, geneFusionConfig.h specifies their correct location and the files are present.

This error:

ls_createFromFile    '$HOME/path/to/data/annotation_data.txt'

occurs because environment variables, such as $HOME, are not expanded. Please use full path names in geneFusionConfig.h to specify directory locations.
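To obtain the full path name to write into geneFusionConfig.h, you can let the shell resolve the home directory for you. This is a small sketch under stated assumptions: expand_home is a hypothetical helper, not a FusionSeq tool, and it only handles a leading $HOME or ~ followed by a slash.

```shell
# Hypothetical helper: print a path with a leading literal $HOME or ~
# replaced by the value of the HOME environment variable.
expand_home() {
    case "$1" in
        '$HOME'/*|'~'/*)
            rest=${1#*/}                  # everything after the first slash
            printf '%s/%s\n' "$HOME" "$rest"
            ;;
        *)
            printf '%s\n' "$1"            # already a plain path: print as-is
            ;;
    esac
}

# Single quotes keep $HOME literal, as it would appear in the config file.
expand_home '$HOME/path/to/data/annotation_data.txt'
```

The printed path is the one to use in geneFusionConfig.h in place of the $HOME form.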

I followed the instructions, but I still get many WARNINGs. Is this expected?

Yes. Every program in FusionSeq provides some logging information. We recommend capturing the log data by redirecting STDERR (e.g. '2> fusionseq.log').
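The redirection keeps the log apart from the results. The sketch below illustrates this with fake_module, a hypothetical stand-in that is not a FusionSeq program and simply writes one line to each stream; the file names are likewise only examples.

```shell
# Hypothetical stand-in for a FusionSeq program:
# results go to STDOUT, log messages go to STDERR.
fake_module() {
    echo "result line"
    echo "WARNING: example log message" >&2
}

workdir=$(mktemp -d)

# '>' captures the results, '2>' captures the log in a separate file.
fake_module > "$workdir/results.txt" 2> "$workdir/fusionseq.log"

# Count the WARNING lines that ended up in the log.
grep -c 'WARNING' "$workdir/fusionseq.log"
```

With the log in its own file, the WARNING messages can be reviewed afterwards without mixing them into the program output.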

geneFusions: Segmentation Fault

There are a number of possible reasons for this error. One is that the MRF file lacks sequences. Although an MRF file is valid without sequences, geneFusions requires them. Please ensure that sequences are present in the MRF file.
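A quick way to check for sequences is to inspect the header of the MRF file. The sketch below is not part of FusionSeq: mrf_has_sequences is a hypothetical helper, and it assumes the usual MRF layout in which comment lines start with #, the first remaining line is a tab-delimited header of column names, and sequences are stored in a column named Sequence. The sample record is illustrative only.

```shell
# Hypothetical helper: succeed if the MRF header lists a "Sequence" column.
# Assumption: '#' starts a comment line and the first non-comment,
# non-empty line is the tab-delimited header.
mrf_has_sequences() {
    awk -F '\t' '
        /^#/ || NF == 0 { next }      # skip comments and blank lines
        {                             # first remaining line is the header
            for (i = 1; i <= NF; i++) if ($i == "Sequence") found = 1
            exit
        }
        END { exit found ? 0 : 1 }
    ' "$1"
}

# Build a tiny illustrative file and test it.
sample=$(mktemp)
printf '# example\nAlignmentBlocks\tSequence\nchr1:+:1:4:1:4\tACGT\n' > "$sample"

if mrf_has_sequences "$sample"; then
    echo "Sequence column present"
else
    echo "no Sequence column"
fi
rm -f "$sample"
```

If the column is missing, the MRF file should be regenerated with sequences included before running geneFusions.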
