VAT/dataSets
From GersteinInfo
__TOC__

== Data sets ==

<center>[[#top|Top]]</center>

=== 1000 Genomes Pilot Project: Low coverage samples ===

- Data files
  - Source: pilot_data, release: 2010_07, FTP: ftp://ftp-trace.ncbi.nih.gov/1000genomes/ftp/pilot_data/release/2010_07/low_coverage/
  - Indels
    - [ftp://ftp-trace.ncbi.nih.gov/1000genomes/ftp/pilot_data/release/2010_07/low_coverage/indels/CEU.low_coverage.2010_07.indel.genotypes.vcf.gz CEU.low_coverage.2010_07.indel.genotypes.vcf.gz]
    - [ftp://ftp-trace.ncbi.nih.gov/1000genomes/ftp/pilot_data/release/2010_07/low_coverage/indels/JPTCHB.low_coverage.2010_07.indel.genotypes.vcf.gz JPTCHB.low_coverage.2010_07.indel.genotypes.vcf.gz]
    - [ftp://ftp-trace.ncbi.nih.gov/1000genomes/ftp/pilot_data/release/2010_07/low_coverage/indels/YRI.low_coverage.2010_07.indel.genotypes.vcf.gz YRI.low_coverage.2010_07.indel.genotypes.vcf.gz]
  - SNPs
    - [ftp://ftp-trace.ncbi.nih.gov/1000genomes/ftp/pilot_data/release/2010_07/low_coverage/snps/CEU.low_coverage.2010_07.genotypes.vcf.gz CEU.low_coverage.2010_07.genotypes.vcf.gz]
    - [ftp://ftp-trace.ncbi.nih.gov/1000genomes/ftp/pilot_data/release/2010_07/low_coverage/snps/CHBJPT.low_coverage.2010_07.genotypes.vcf.gz CHBJPT.low_coverage.2010_07.genotypes.vcf.gz]
    - [ftp://ftp-trace.ncbi.nih.gov/1000genomes/ftp/pilot_data/release/2010_07/low_coverage/snps/YRI.low_coverage.2010_07.genotypes.vcf.gz YRI.low_coverage.2010_07.genotypes.vcf.gz]
- Annotation file: [ftp://ftp.sanger.ac.uk/pub/gencode/release_3b/gencode.v3b.annotation.NCBI36.gtf.gz GENCODE (version 3b, hg18)] using CDS elements where ''gene_type = protein_coding'' and ''transcript_type = protein_coding''
- Results: [http://dynamic.gersteinlab.org/people/lh372/vat_cgi?mode=process&dataSet=1000genomes_lowCoverage&annotationSet=gencode3b&type=coding VAT]

<br>

<center>[[#top|Top]]</center>

=== 1000 Genomes Project, Phase I, chr22, SNP calls ===

- Data files
  - Source: release: 20100804, FTP: ftp://ftp-trace.ncbi.nih.gov/1000genomes/ftp/release/20100804/
  - SNPs: [ftp://ftp-trace.ncbi.nih.gov/1000genomes/ftp/release/20100804/ALL.2of4intersection.20100804.genotypes.vcf.gz ALL.2of4intersection.20100804.genotypes.vcf.gz]
- Annotation file: [ftp://ftp.sanger.ac.uk/pub/gencode/release_3c/gencode.v3c.annotation.GRCh37.gtf.gz GENCODE (version 3c, hg19)] using CDS elements where ''gene_type = protein_coding'' and ''transcript_type = protein_coding''
- Results: [http://dynamic.gersteinlab.org/people/lh372/vat_cgi?mode=process&dataSet=ALL.2of4intersection.20100804.chr22&annotationSet=gencode3c&type=coding VAT]
- [http://info.gersteinlab.org/VAT#Example_workflow Detailed workflow]

<br>

<center>[[#top|Top]]</center>

== Pre-processed GENCODE annotation sets ==

The pre-processed GENCODE annotation sets can be downloaded [http://info.gersteinlab.org/VAT/download#Download_of_pre-processed_annotation_sets here].
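
The annotation filter named for the data sets on this page (CDS elements where ''gene_type = protein_coding'' and ''transcript_type = protein_coding'') can be illustrated with a short Python sketch. This is not the VAT pre-processing code; it is a minimal illustration that assumes the GENCODE GTF uses the usual nine tab-separated columns with attributes written as <code>key "value";</code> pairs. The function names (<code>parse_attributes</code>, <code>is_protein_coding_cds</code>, <code>filter_gtf_file</code>) are hypothetical and chosen only for this example.

```python
import gzip
import re

# Assumption: attributes in column 9 look like: gene_type "protein_coding";
ATTR_RE = re.compile(r'(\S+)\s+"([^"]*)"')


def parse_attributes(field):
    """Parse the GTF attribute column into a dict.

    Only quoted values are captured; if a key repeats, the last value wins.
    """
    return dict(ATTR_RE.findall(field))


def is_protein_coding_cds(line):
    """Return True for CDS records whose gene and transcript are protein coding."""
    if line.startswith("#") or not line.strip():
        return False  # skip header/comment and blank lines
    cols = line.rstrip("\n").split("\t")
    if len(cols) < 9 or cols[2] != "CDS":
        return False  # malformed record, or a feature other than CDS
    attrs = parse_attributes(cols[8])
    return (
        attrs.get("gene_type") == "protein_coding"
        and attrs.get("transcript_type") == "protein_coding"
    )


def filter_gtf(lines):
    """Keep only the lines that pass the filter, in their original order."""
    return [line for line in lines if is_protein_coding_cds(line)]


def filter_gtf_file(path):
    """Yield matching lines from a gzip-compressed GTF file (hypothetical helper)."""
    with gzip.open(path, "rt") as handle:
        for line in handle:
            if is_protein_coding_cds(line):
                yield line
```

Under these assumptions, <code>filter_gtf_file("gencode.v3c.annotation.GRCh37.gtf.gz")</code> would iterate over the CDS records of a locally downloaded copy of the annotation file. Streaming line by line keeps memory use low for a whole-genome GTF.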
Latest revision as of 18:25, 14 June 2011