CRAMTOOLS and common directory
Using CRAMtools
The latest version of cramtools is already installed on Grace. To use it, first load the required modules:
$ module load Langs/Java
$ module load Tools/cramtools/3.0-b38
cramtools runs on Java, so the Java module must be loaded as well. Running cramtools without arguments prints the version and usage. For example:
$ cramtools
Version 3.0-b38
Usage: cramtools [options] [command] [command options]
Options:
  -h, --help    Print help and quit (default: false)
Commands:
  bam          CRAM to BAM conversion.
  cram         BAM to CRAM converter.
  index        BAM/CRAM indexer.
  merge        Tool to merge CRAM or BAM files.
  fastq        CRAM to FastQ dump conversion.
  fixheader    A tool to fix CRAM header without re-writing the whole file.
  getref       Download reference sequences.
  qstat        Quality score statistics.
For a detailed description of cramtools, please refer to the README file on GitHub: https://github.com/enasequence/cramtools/commit/137c59bb92f8ddadeb700fa08b5dfa644a968444
NOTE: Running cramtools directly on the server, instead of on the cluster through a bsub script, may cause an error during initialization of the Java VM and space reservation.
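As a minimal sketch of the bsub route mentioned in the note, the snippet below writes a small LSF job script that loads the two modules and converts a BAM file to CRAM. The job name, the output file names, sample.bam and reference.fa are placeholders, no queue is specified, and the long option names are taken from the cramtools README, so confirm them against the version installed on Grace before use.

```shell
# Write a hypothetical LSF job script; sample.bam and reference.fa are placeholders.
cat > cram_convert.sh <<'EOF'
#!/bin/bash
#BSUB -J cram_convert
#BSUB -n 1
#BSUB -o cram_convert.%J.out
#BSUB -e cram_convert.%J.err

# Load Java and cramtools inside the job, as on the command line.
module load Langs/Java
module load Tools/cramtools/3.0-b38

# BAM to CRAM conversion (long option names as given in the cramtools README).
cramtools cram \
    --input-bam-file sample.bam \
    --reference-fasta-file reference.fa \
    --output-cram-file sample.cram
EOF

# Submit on the cluster with:
#   bsub < cram_convert.sh
```

Submitting with bsub < cram_convert.sh lets LSF read the #BSUB lines as job options, so the conversion runs on a compute node rather than on the server itself.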
Using CRAM for ‘gerstein/common’ directory
To reduce redundancy and to make big files easier to manage and monitor, the Gerstein Lab encourages placing big files that may be used by multiple lab members in the common directory. To ensure that a proper record is kept for each file, we have added a simple 2-step process.
Step 1) Readme file repository for Gerstein lab
To share data:
- Create a readme file under your data directory, with the exact file name readme.txt
- Run update.sh under your data directory:
  scratch/fas/gerstein/common/update.sh msg
  where msg is a short commit message. For now, msg must not contain blanks (spaces).
Example: scratch/fas/gerstein/common/update.sh fixedReadFilefor1kgphase3
Detailed information on the update.sh script can also be found at: https://github.com/gersteinlab/SharedData
NOTE: Running update.sh requires a GitHub account/repository that will be linked with https://github.com/gersteinlab/.
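Because update.sh does not accept blanks in msg, it can help to check the message before running the script. The following is a minimal sketch, not part of update.sh itself: check_msg is a hypothetical helper that rejects empty messages and messages containing whitespace, and the actual update.sh call is left commented out.

```shell
#!/bin/bash
# Hypothetical helper: reject commit messages that are empty or contain blanks.
check_msg() {
    local msg="$1"
    if [ -z "$msg" ]; then
        echo "msg must not be empty" >&2
        return 1
    fi
    case "$msg" in
        *[[:space:]]*)
            echo "msg must not contain blanks: '$msg'" >&2
            return 1
            ;;
    esac
    return 0
}

msg="fixedReadFilefor1kgphase3"
if check_msg "$msg"; then
    echo "OK: $msg"
    # Run from your data directory once readme.txt is in place:
    # scratch/fas/gerstein/common/update.sh "$msg"
fi
```

A message such as "fixed read file" would be rejected; joining the words, as in the example above, is one way to stay within the current restriction.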
Step 2) Fill in file information in lab document (to be implemented)
To keep a further record of the files in the common directory on Grace, the Gerstein Lab maintains an updatable file that must be completed with a limited amount of information for each data file. This information includes:
1) Name of the file
2) Download date
3) Published reference (DOI)
4) Gerstein Lab published reference (DOI if possible)
5) 150 character description
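Since this step is not yet implemented, the format of the lab document is not fixed. Purely as an illustration, the five fields could be collected as one tab-separated line per file. The make_record helper, the tab-separated layout, the sample values, and the reading of "150 character" as an upper limit are all assumptions.

```shell
#!/bin/bash
# Hypothetical helper: print one tab-separated record for the lab document.
# Fields: file name, download date, published DOI, lab DOI, description.
make_record() {
    local name="$1" date="$2" doi="$3" lab_doi="$4" desc="$5"
    # Assumption: the description may be at most 150 characters long.
    if [ "${#desc}" -gt 150 ]; then
        echo "description is ${#desc} characters; the limit is 150" >&2
        return 1
    fi
    printf '%s\t%s\t%s\t%s\t%s\n' "$name" "$date" "$doi" "$lab_doi" "$desc"
}

# Placeholder values only; NA marks a reference that is not available.
make_record "example.cram" "2016-01-15" "NA" "NA" "Placeholder description of the shared file"
```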