FusionSeq Test Datasets

From GersteinInfo

Revision as of 22:11, 23 October 2010


Two datasets are available to test FusionSeq: the NCIH660 and GM12878 cell-line data. They are part of the FusionSeq dataset published in Genome Biology, 2010;11:R104. Please note that the full set, including cancer samples, will be available soon at dbGaP, which handles the associated confidentiality requirements. Here we provide the cell-line data in the following formats:

Mapped Read Format (MRF)

This is the format required by FusionSeq. RSEQtools provides several conversion tools to generate MRF files from the output of the most popular alignment tools.

Please read the 'How to execute FusionSeq' section for more details on how to use these files.

FASTQ

FASTQ is a text-based format for storing both a biological sequence and its corresponding quality scores. Each tarball includes two FASTQ files, one for each end of the paired-end reads.
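
To illustrate the layout described above, here is a minimal Python sketch that reads FASTQ records. It assumes the common four-lines-per-record layout (an '@' header, the sequence, a '+' separator, and the quality string) and has not been checked against the files in the tarballs. The names FastqRecord and read_fastq are hypothetical and are not part of FusionSeq or RSEQtools.

```python
import io
from typing import Iterator, NamedTuple, TextIO


class FastqRecord(NamedTuple):
    """One read: its name, its bases, and its per-base quality string."""
    name: str
    sequence: str
    quality: str


def read_fastq(handle: TextIO) -> Iterator[FastqRecord]:
    """Yield records from a FASTQ stream with four lines per record.

    Hypothetical helper for illustration; assumes sequences and
    qualities are not wrapped across several lines.
    """
    while True:
        header = handle.readline()
        if not header:
            return  # end of file
        if not header.strip():
            continue  # tolerate blank lines between records
        sequence = handle.readline().rstrip("\r\n")
        separator = handle.readline()
        quality = handle.readline().rstrip("\r\n")
        if not header.startswith("@") or not separator.startswith("+"):
            raise ValueError("malformed FASTQ record near: " + header.strip())
        if len(sequence) != len(quality):
            raise ValueError("sequence/quality length mismatch in: " + header.strip())
        yield FastqRecord(header[1:].strip(), sequence, quality)


if __name__ == "__main__":
    # Tiny in-memory example; real data would come from the extracted tarball.
    example = io.StringIO("@read1/1\nACGT\n+\nIIII\n")
    for record in read_fastq(example):
        print(record.name, record.sequence, record.quality)
```

Because each end is stored in its own file, the two files of a tarball can be read in parallel (for example with zip over two read_fastq iterators) to pair up the ends, assuming both files list the reads in the same order.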

BAM

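BAM is the compressed binary version of the SAM (Sequence Alignment/Map) format. We provide both BAM files and their corresponding index files (*.bai) so that they can be viewed with the Integrative Genomics Viewer (IGV), a high-performance visualization tool from the Broad Institute for interactive exploration of large, integrated datasets.

* http://rnaseq.gersteinlab.org/fusionseq/datasets/GM12878.bam
* http://rnaseq.gersteinlab.org/fusionseq/datasets/GM12878.bam.bai
* http://rnaseq.gersteinlab.org/fusionseq/datasets/NCIH660.bam
* http://rnaseq.gersteinlab.org/fusionseq/datasets/NCIH660.bam.bai

You can download the files locally or load them into IGV directly. See instructions at http://www.broadinstitute.org/igv/.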
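
If you download a BAM file locally, a quick sanity check before loading it into IGV can catch a truncated or mislabeled download. The sketch below relies on two properties of the format: BAM files are compressed with BGZF, which standard gzip readers can decompress, and the decompressed stream starts with the four magic bytes "BAM\1". The function name looks_like_bam is hypothetical, and the check does not validate the rest of the file or its index.

```python
import gzip
import zlib

# First four bytes of a decompressed BAM stream.
BAM_MAGIC = b"BAM\x01"


def looks_like_bam(path: str) -> bool:
    """Return True if the file decompresses as gzip and starts with the BAM magic.

    Hypothetical helper for illustration; a True result only means the
    beginning of the file looks right, not that the whole file is intact.
    """
    try:
        with gzip.open(path, "rb") as handle:
            return handle.read(len(BAM_MAGIC)) == BAM_MAGIC
    except (OSError, EOFError, zlib.error):
        # Not gzip-compressed, unreadable, or truncated.
        return False


if __name__ == "__main__":
    # Assumes GM12878.bam was downloaded into the current directory.
    print(looks_like_bam("GM12878.bam"))
```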
