VAT/dataSets: Difference between revisions

From GersteinInfo
No edit summary
No edit summary
Line 9:
  <center>[[#top|Top]]</center>

- ==== Low coverage samples from the 1000 Genomes Pilot Project ====
+ ==== 1000 Genomes Pilot Project: Low coverage samples ====

    - Data files:
Line 30:
  <center>[[#top|Top]]</center>

- ==== Low coverage samples from the 1000 Genomes Pilot Project ====
+ ==== Main 1000 Genomes Project, Phase I, SNP calls ====

-   - Data files:
+   - Data files: release: 20100804, FTP: ftp://ftp-trace.ncbi.nih.gov/1000genomes/ftp/release/20100804/
-     - Indels
-         - ftp://ftp-trace.ncbi.nih.gov/1000genomes/ftp/pilot_data/release/2010_07/low_coverage/indels/CEU.low_coverage.2010_07.indel.genotypes.vcf.gz
-         - ftp://ftp-trace.ncbi.nih.gov/1000genomes/ftp/pilot_data/release/2010_07/low_coverage/indels/JPTCHB.low_coverage.2010_07.indel.genotypes.vcf.gz
-         - ftp://ftp-trace.ncbi.nih.gov/1000genomes/ftp/pilot_data/release/2010_07/low_coverage/indels/YRI.low_coverage.2010_07.indel.genotypes.vcf.gz
       - SNPs
-          - ftp://ftp-trace.ncbi.nih.gov/1000genomes/ftp/pilot_data/release/2010_07/low_coverage/snps/CEU.low_coverage.2010_07.genotypes.vcf.gz
+          - [ftp://ftp-trace.ncbi.nih.gov/1000genomes/ftp/release/20100804/ALL.2of4intersection.20100804.genotypes.vcf.gz ALL.2of4intersection.20100804.genotypes.vcf.gz]
-         - ftp://ftp-trace.ncbi.nih.gov/1000genomes/ftp/pilot_data/release/2010_07/low_coverage/snps/CHBJPT.low_coverage.2010_07.genotypes.vcf.gz
-         - ftp://ftp-trace.ncbi.nih.gov/1000genomes/ftp/pilot_data/release/2010_07/low_coverage/snps/YRI.low_coverage.2010_07.genotypes.vcf.gz
    - Annotation file
-      - [ftp://ftp.sanger.ac.uk/pub/gencode/release_3b/gencode.v3b.annotation.NCBI36.gtf.gz GENCODE (version 3b, hg18)] using CDS elements where ''gene_type = protein_coding'' and ''transcript_type = protein_coding''
+      - [ftp://ftp.sanger.ac.uk/pub/gencode/release_3c/gencode.v3c.annotation.GRCh37.gtf.gz GENCODE (version 3c, hg19)] using CDS elements where ''gene_type = protein_coding'' and ''transcript_type = protein_coding''
    - Results
-      - [http://dynamic.gersteinlab.org/people/lh372/dev/vat_cgi?mode=process&dataSet=1000genomes_lowCoverage VAT]
+      - [http://dynamic.gersteinlab.org/people/lh372/dev/vat_cgi?mode=process&dataSet=ALL.2of4intersection.20100804 VAT]

Revision as of 18:24, 8 March 2011

VAT Main Page

Data sets

1000 Genomes Project

1000 Genomes Pilot Project: Low coverage samples

- Data files:
    - Source: pilot_data, release: 2010_07, FTP: ftp://ftp-trace.ncbi.nih.gov/1000genomes/ftp/pilot_data/release/2010_07/low_coverage/
    - Indels
        - CEU.low_coverage.2010_07.indel.genotypes.vcf.gz
        - JPTCHB.low_coverage.2010_07.indel.genotypes.vcf.gz
        - YRI.low_coverage.2010_07.indel.genotypes.vcf.gz
    - SNPs
        - CEU.low_coverage.2010_07.genotypes.vcf.gz
        - CHBJPT.low_coverage.2010_07.genotypes.vcf.gz
        - YRI.low_coverage.2010_07.genotypes.vcf.gz
- Annotation file
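    - GENCODE (version 3b, hg18) using CDS elements where gene_type = protein_coding and transcript_type = protein_coding; file: ftp://ftp.sanger.ac.uk/pub/gencode/release_3b/gencode.v3b.annotation.NCBI36.gtf.gz
- Results
    - VAT: http://dynamic.gersteinlab.org/people/lh372/dev/vat_cgi?mode=process&dataSet=1000genomes_lowCoverage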
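
The file names listed above are relative to the pilot FTP directory. As a minimal sketch, the full download URLs can be assembled as shown here. It assumes the indels/ and snps/ sub-directories that appear in the links of the earlier revision; the names PILOT_BASE, PILOT_FILES and pilot_urls are illustrative only and are not part of VAT.

```python
# Base directory as given under "Data files" for the pilot low coverage set.
PILOT_BASE = (
    "ftp://ftp-trace.ncbi.nih.gov/1000genomes/ftp/"
    "pilot_data/release/2010_07/low_coverage/"
)

# Sub-directory -> file names, copied from the list above.
# Assumption: files live under indels/ and snps/ below PILOT_BASE.
PILOT_FILES = {
    "indels": [
        "CEU.low_coverage.2010_07.indel.genotypes.vcf.gz",
        "JPTCHB.low_coverage.2010_07.indel.genotypes.vcf.gz",
        "YRI.low_coverage.2010_07.indel.genotypes.vcf.gz",
    ],
    "snps": [
        "CEU.low_coverage.2010_07.genotypes.vcf.gz",
        "CHBJPT.low_coverage.2010_07.genotypes.vcf.gz",
        "YRI.low_coverage.2010_07.genotypes.vcf.gz",
    ],
}


def pilot_urls(base=PILOT_BASE, files=PILOT_FILES):
    """Return the full URL of every listed file, indels first, then SNPs."""
    return [
        f"{base}{subdir}/{name}"
        for subdir, names in files.items()
        for name in names
    ]


if __name__ == "__main__":
    for url in pilot_urls():
        print(url)
```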


Main 1000 Genomes Project, Phase I, SNP calls

- Data files: release: 20100804, FTP: ftp://ftp-trace.ncbi.nih.gov/1000genomes/ftp/release/20100804/
    - SNPs
        - ALL.2of4intersection.20100804.genotypes.vcf.gz
- Annotation file
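    - GENCODE (version 3c, hg19) using CDS elements where gene_type = protein_coding and transcript_type = protein_coding; file: ftp://ftp.sanger.ac.uk/pub/gencode/release_3c/gencode.v3c.annotation.GRCh37.gtf.gz
- Results
    - VAT: http://dynamic.gersteinlab.org/people/lh372/dev/vat_cgi?mode=process&dataSet=ALL.2of4intersection.20100804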
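
Both data sets use the same annotation filter: keep only CDS elements where gene_type and transcript_type are both protein_coding. The sketch below shows one way such a filter could be written in Python. It is not the code VAT itself uses. It assumes the standard nine-column, tab-separated GTF layout with key "value"; pairs in the last column, and the function names (parse_attributes, is_protein_coding_cds, filter_gtf) are hypothetical.

```python
import gzip


def parse_attributes(field):
    """Parse the ninth GTF column (key "value"; pairs) into a dict.

    Assumption: attribute values contain no ';' characters.
    If a key repeats, the last value wins.
    """
    attrs = {}
    for part in field.strip().split(";"):
        part = part.strip()
        if not part:
            continue
        key, _, value = part.partition(" ")
        attrs[key] = value.strip().strip('"')
    return attrs


def is_protein_coding_cds(line):
    """True if the GTF line is a CDS element of a protein-coding transcript."""
    if line.startswith("#"):
        return False  # header or comment line
    cols = line.rstrip("\n").split("\t")
    if len(cols) != 9 or cols[2] != "CDS":
        return False
    attrs = parse_attributes(cols[8])
    return (
        attrs.get("gene_type") == "protein_coding"
        and attrs.get("transcript_type") == "protein_coding"
    )


def filter_gtf(path):
    """Yield the matching lines from a gzip-compressed GTF file."""
    with gzip.open(path, "rt") as handle:
        for line in handle:
            if is_protein_coding_cds(line):
                yield line
```

For example, calling filter_gtf("gencode.v3c.annotation.GRCh37.gtf.gz") on a locally downloaded copy of the annotation file would yield only the protein-coding CDS lines.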