VAT/dataSets

From GersteinInfo

(Difference between revisions)
(Main 1000 Genomes Project, Phase I, SNP calls)
 
(8 intermediate revisions not shown)
Line 4: Line 4:
== Data sets ==
== Data sets ==
-
 
-
=== 1000 Genomes Project ===
 
<center>[[#top|Top]]</center>
<center>[[#top|Top]]</center>
-
==== 1000 Genomes Pilot Project: Low coverage samples ====
+
=== 1000 Genomes Pilot Project: Low coverage samples ===
-
  - Data files:
+
  - Data files
     - Source: pilot_data, release: 2010_07, FTP:  ftp://ftp-trace.ncbi.nih.gov/1000genomes/ftp/pilot_data/release/2010_07/low_coverage/
     - Source: pilot_data, release: 2010_07, FTP:  ftp://ftp-trace.ncbi.nih.gov/1000genomes/ftp/pilot_data/release/2010_07/low_coverage/
     - Indels
     - Indels
Line 21: Line 19:
         - [ftp://ftp-trace.ncbi.nih.gov/1000genomes/ftp/pilot_data/release/2010_07/low_coverage/snps/CHBJPT.low_coverage.2010_07.genotypes.vcf.gz CHBJPT.low_coverage.2010_07.genotypes.vcf.gz]
         - [ftp://ftp-trace.ncbi.nih.gov/1000genomes/ftp/pilot_data/release/2010_07/low_coverage/snps/CHBJPT.low_coverage.2010_07.genotypes.vcf.gz CHBJPT.low_coverage.2010_07.genotypes.vcf.gz]
         - [ftp://ftp-trace.ncbi.nih.gov/1000genomes/ftp/pilot_data/release/2010_07/low_coverage/snps/YRI.low_coverage.2010_07.genotypes.vcf.gz YRI.low_coverage.2010_07.genotypes.vcf.gz]
         - [ftp://ftp-trace.ncbi.nih.gov/1000genomes/ftp/pilot_data/release/2010_07/low_coverage/snps/YRI.low_coverage.2010_07.genotypes.vcf.gz YRI.low_coverage.2010_07.genotypes.vcf.gz]
-
  - Annotation file
+
  - Annotation file: [ftp://ftp.sanger.ac.uk/pub/gencode/release_3b/gencode.v3b.annotation.NCBI36.gtf.gz GENCODE (version 3b, hg18)] using CDS elements where ''gene_type = protein_coding'' and ''transcript_type = protein_coding''
-
    - [ftp://ftp.sanger.ac.uk/pub/gencode/release_3b/gencode.v3b.annotation.NCBI36.gtf.gz GENCODE (version 3b, hg18)] using CDS elements where ''gene_type = protein_coding'' and ''transcript_type = protein_coding''
+
  - Results: [http://dynamic.gersteinlab.org/people/lh372/vat_cgi?mode=process&dataSet=1000genomes_lowCoverage&annotationSet=gencode3b&type=coding VAT]
-
  - Results
+
-
    - [http://dynamic.gersteinlab.org/people/lh372/dev/vat_cgi?mode=process&dataSet=1000genomes_lowCoverage VAT]
+
<br>
<br>
Line 30: Line 26:
<center>[[#top|Top]]</center>
<center>[[#top|Top]]</center>
-
==== 1000 Genomes Project, Phase I, SNP calls ====
+
=== 1000 Genomes Project, Phase I, chr22, SNP calls ===
-
  - Data files: release: 20100804, FTP: ftp://ftp-trace.ncbi.nih.gov/1000genomes/ftp/release/20100804/
+
  - Data files
-
     - SNPs
+
    - Source: release: 20100804, FTP: ftp://ftp-trace.ncbi.nih.gov/1000genomes/ftp/release/20100804/
-
        - [ftp://ftp-trace.ncbi.nih.gov/1000genomes/ftp/release/20100804/ALL.2of4intersection.20100804.genotypes.vcf.gz ALL.2of4intersection.20100804.genotypes.vcf.gz]
+
     - SNPs: [ftp://ftp-trace.ncbi.nih.gov/1000genomes/ftp/release/20100804/ALL.2of4intersection.20100804.genotypes.vcf.gz ALL.2of4intersection.20100804.genotypes.vcf.gz]
-
  - Annotation file
+
  - Annotation file: [ftp://ftp.sanger.ac.uk/pub/gencode/release_3c/gencode.v3c.annotation.GRCh37.gtf.gz GENCODE (version 3c, hg19)] using CDS elements where ''gene_type = protein_coding'' and ''transcript_type = protein_coding''
-
    - [ftp://ftp.sanger.ac.uk/pub/gencode/release_3c/gencode.v3c.annotation.GRCh37.gtf.gz GENCODE (version 3c, hg19)] using CDS elements where ''gene_type = protein_coding'' and ''transcript_type = protein_coding''
+
  - Results: [http://dynamic.gersteinlab.org/people/lh372/vat_cgi?mode=process&dataSet=ALL.2of4intersection.20100804.chr22&annotationSet=gencode3c&type=coding VAT]
-
  - Results
+
- [http://info.gersteinlab.org/VAT#Example_workflow Detailed workflow]
-
    - [http://dynamic.gersteinlab.org/people/lh372/dev/vat_cgi?mode=process&dataSet=ALL.2of4intersection.20100804 VAT]
+
 
 +
<br>
 +
 
 +
<center>[[#top|Top]]</center>
 +
 
 +
== Pre-processed GENCODE annotation sets ==
 +
 
 +
The pre-processed GENCODE annotation sets can be downloaded [http://info.gersteinlab.org/VAT/download#Download_of_pre-processed_annotation_sets here].

Latest revision as of 18:25, 14 June 2011

VAT Main Page


Data sets

Top

1000 Genomes Pilot Project: Low coverage samples

- Data files
    - Source: pilot_data, release: 2010_07, FTP:  ftp://ftp-trace.ncbi.nih.gov/1000genomes/ftp/pilot_data/release/2010_07/low_coverage/
    - Indels
        - CEU.low_coverage.2010_07.indel.genotypes.vcf.gz
        - JPTCHB.low_coverage.2010_07.indel.genotypes.vcf.gz
        - YRI.low_coverage.2010_07.indel.genotypes.vcf.gz
    - SNPs
        - CEU.low_coverage.2010_07.genotypes.vcf.gz
        - CHBJPT.low_coverage.2010_07.genotypes.vcf.gz
        - YRI.low_coverage.2010_07.genotypes.vcf.gz
- Annotation file: GENCODE (version 3b, hg18) using CDS elements where gene_type = protein_coding and transcript_type = protein_coding
- Results: VAT
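
The annotation filter described above (CDS elements where gene_type and transcript_type are both protein_coding) can be sketched as follows. This is a minimal illustration, not the VAT implementation: the function names are hypothetical, and it assumes the GENCODE GTF stores attributes in column 9 as `key "value";` pairs, with the feature type in column 3.

```python
import re

# Matches GTF attributes of the form: key "value"
ATTR_RE = re.compile(r'(\S+) "([^"]*)"')


def parse_attributes(field):
    """Turn the GTF attribute column into a dict (last value wins for repeated keys)."""
    return dict(ATTR_RE.findall(field))


def is_coding_cds(line):
    """True for CDS rows whose gene_type and transcript_type are both protein_coding."""
    if line.startswith("#"):
        return False  # comment or header line
    cols = line.rstrip("\n").split("\t")
    if len(cols) < 9 or cols[2] != "CDS":
        return False
    attrs = parse_attributes(cols[8])
    return (attrs.get("gene_type") == "protein_coding"
            and attrs.get("transcript_type") == "protein_coding")


def filter_coding_cds(lines):
    """Keep only the GTF lines that pass the protein-coding CDS filter."""
    return [line for line in lines if is_coding_cds(line)]
```

To apply it to the downloaded annotation file, the lines could be read with `gzip.open(path, "rt")` and passed to `filter_coding_cds`; the local path is up to the reader.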


Top

1000 Genomes Project, Phase I, chr22, SNP calls

- Data files
    - Source: release 20100804, FTP: ftp://ftp-trace.ncbi.nih.gov/1000genomes/ftp/release/20100804/
    - SNPs: ALL.2of4intersection.20100804.genotypes.vcf.gz
- Annotation file: GENCODE (version 3c, hg19) using CDS elements where gene_type = protein_coding and transcript_type = protein_coding
- Results: VAT
- Detailed workflow
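
The section title refers to chr22, while the listed SNP file covers all chromosomes. The page does not say how the chr22 subset was produced. One plausible way, shown here purely as an assumption, is to keep the VCF header lines plus every record whose CHROM column is 22. The function names and file paths below are hypothetical.

```python
import gzip


def keep_chromosome(lines, chrom="22"):
    """Yield VCF header lines plus records whose CHROM column matches chrom."""
    # Accept both "22" and "chr22" naming styles.
    wanted = {chrom, "chr" + chrom}
    for line in lines:
        if line.startswith("#"):
            yield line  # meta-information and column header lines
        elif line.split("\t", 1)[0] in wanted:
            yield line


def subset_vcf(in_path, out_path, chrom="22"):
    """Stream a gzipped VCF and write only the requested chromosome."""
    with gzip.open(in_path, "rt") as src, gzip.open(out_path, "wt") as dst:
        dst.writelines(keep_chromosome(src, chrom))
```

For example, `subset_vcf("ALL.2of4intersection.20100804.genotypes.vcf.gz", "chr22.vcf.gz")` would write the subset to a local file of the reader's choosing.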


Top

Pre-processed GENCODE annotation sets

The pre-processed GENCODE annotation sets can be downloaded here.
