Tool: bed_to_tabix: Download the variants from 1,000 Genomes in the regions defined in one or more BED files.
21 months ago
Buenos Aires, Argentina

I wrote this tool to easily get variant genotypes from different populations, using the data from The 1,000 Genomes Project. You just provide one or more BED files and you get a VCF.

I hope it's useful!

--

bed_to_tabix

bed_to_tabix downloads a gzipped VCF file with the genotypes of the 2,504 samples from The 1,000 Genomes Project at the regions defined in one or more BED files. It sorts the BED files, merges them when you provide several, downloads the variants for the different chromosomes in parallel with tabix (you can use HTTP URLs in case your FTP traffic is blocked), and merges the resulting VCFs into a single gzipped VCF. Afterwards, it cleans up the temporary files, so you end up with a single results file.
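
To make the sorting and merging step concrete, here is a minimal Python sketch of how intervals from several BED files could be combined and turned into tabix-style region strings. It is an illustration only, not the tool's actual code: the function names merge_bed_intervals and to_tabix_regions and the sample intervals are hypothetical. The only conversion it relies on is that BED coordinates are 0-based and half-open, while tabix regions are 1-based and inclusive.

```python
from collections import defaultdict


def merge_bed_intervals(bed_lines):
    """Sort and merge overlapping or adjacent BED intervals, per chromosome.

    Hypothetical helper for illustration; not part of bed_to_tabix.
    """
    by_chrom = defaultdict(list)
    for line in bed_lines:
        line = line.strip()
        # Skip blank lines, comments, and optional BED header lines.
        if not line or line.startswith(("#", "track", "browser")):
            continue
        chrom, start, end = line.split()[:3]
        by_chrom[chrom].append((int(start), int(end)))

    merged = {}
    for chrom, intervals in by_chrom.items():
        intervals.sort()
        result = [intervals[0]]
        for start, end in intervals[1:]:
            last_start, last_end = result[-1]
            if start <= last_end:
                # Overlapping or touching: extend the previous interval.
                result[-1] = (last_start, max(last_end, end))
            else:
                result.append((start, end))
        merged[chrom] = result
    return merged


def to_tabix_regions(merged):
    """Convert 0-based half-open BED intervals to 1-based inclusive regions."""
    return {
        chrom: [f"{chrom}:{start + 1}-{end}" for start, end in intervals]
        for chrom, intervals in merged.items()
    }


# Made-up intervals standing in for the contents of two BED files.
bed_a = ["1\t100\t200", "1\t150\t300"]
bed_b = ["2\t10\t20", "1\t500\t600"]
regions = to_tabix_regions(merge_bed_intervals(bed_a + bed_b))
print(regions)
```

Grouping the merged regions by chromosome mirrors the per-chromosome downloads described above: each chromosome's list can be fetched independently, which is what makes it possible to run the downloads in parallel.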

bed_to_tabix is written in Python, but it can be used as a command line tool without any knowledge of the language.

Installation instructions here: https://github.com/biocodices/bed_to_tabix

Example usage:

# Download the regions in regions1.bed to regions1.vcf.gz
bed_to_tabix --in regions1.bed

# Download the regions in regions1.bed, 10 downloads at a time, to 1kg.vcf
bed_to_tabix --in regions1.bed --threads 10 --unzipped --out 1kg

# Download the regions in both bed files to regions1__regions2.vcf.gz
bed_to_tabix --in regions1.bed --in regions2.bed

# Download from the HTTP URLs in case your traffic to FTP is blocked
bed_to_tabix --in regions1.bed --http