Convert vcf files with phased genotypes to standard haplotype format
6.3 years ago
Mr Locuace ▴ 160

Hello, I have a set of .vcf files with phased genotypes that I would like to convert to standard haplotype format (.hap). Is there any tool that does this?

Here is a toy example of a .hap file of a population of 2 human individuals, with 12 SNPs each:

1  1 2 2 1 2 2 2 1 1 1 1 1
2  1 1 1 2 1 1 2 2 1 2 1 2
3  1 1 2 1 2 2 1 1 2 2 1 2 
4  2 2 2 2 2 2 1 1 1 1 1 1

Each line represents a chromosome, the first element being the haplotype ID. 1 and 2 represent ancestral and derived alleles, respectively.
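
To make the layout above concrete, here is a minimal Python sketch that writes one line per chromosome from phased VCF records. It is only an illustration under loud assumptions: every site is biallelic, every genotype is phased and diploid, and REF is simply treated as ancestral (1) and ALT as derived (2), which is not true in general. The function name vcf_to_hap and the demo records are made up for this example.

```python
def vcf_to_hap(vcf_lines):
    """Return .hap-style lines (one per chromosome) from phased VCF text lines.

    Assumptions (not guaranteed by any real data set): biallelic sites,
    diploid phased genotypes, REF = ancestral (coded 1), ALT = derived (coded 2).
    """
    haplotypes = None  # one list of allele codes per chromosome
    for line in vcf_lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip header and empty lines
        fields = line.split("\t")
        # GT is the first colon-separated entry of each sample column
        genotypes = [sample.split(":")[0] for sample in fields[9:]]
        if haplotypes is None:
            haplotypes = [[] for _ in range(2 * len(genotypes))]
        for i, gt in enumerate(genotypes):
            alleles = gt.split("|")
            if len(alleles) != 2 or any(a not in ("0", "1") for a in alleles):
                raise ValueError(
                    f"expected a phased biallelic genotype, got {gt!r} "
                    f"at {fields[0]}:{fields[1]}"
                )
            for j, allele in enumerate(alleles):
                haplotypes[2 * i + j].append("1" if allele == "0" else "2")
    if haplotypes is None:
        return []
    # The haplotype ID is just a running number, as in the toy example
    return [f"{n} {' '.join(codes)}" for n, codes in enumerate(haplotypes, start=1)]


if __name__ == "__main__":
    demo = [
        "##fileformat=VCFv4.2",
        "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tind1\tind2",
        "1\t100\trs1\tA\tG\t.\tPASS\t.\tGT\t0|1\t0|0",
        "1\t200\trs2\tC\tT\t.\tPASS\t.\tGT\t1|0\t0|1",
    ]
    print("\n".join(vcf_to_hap(demo)))
```

If REF is not the ancestral allele at some sites, the 1/2 codes at those sites would need to be flipped using an external ancestral-allele annotation.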

Many thanks

haplotype vcf .hap • 8.2k views
6.3 years ago

Assuming all of your genotypes are phased, this should be doable with recent plink 2.0 alpha builds (which you can download from https://www.cog-genomics.org/plink/2.0/ ):

plink2 --vcf phased.vcf --export haps --out new_filename_prefix

Thank you for your help, @chrchang523! The solution you provide outputs minor/major alleles (0/1) but not ancestral/derived alleles. Do you know how I can get the latter coded as 0/1? Thanks again.


plink2 codes the REF allele as 0 and the ALT allele as 1 in the .haps file. If these don't always correspond to the ancestral/derived allele distinction you want, you can use --ref-allele or --alt1-allele (https://www.cog-genomics.org/plink/2.0/data#ref_allele ) to swap them where necessary (you'll need to create a tab-delimited table with all the variant IDs and ancestral/derived alleles).
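
One possible way to build the tab-delimited table mentioned above is sketched below in Python. It assumes the VCF carries an ancestral-allele annotation in an INFO key named AA (as some public call sets do); that key, the function name ancestral_allele_table, and the demo records are assumptions for illustration, not something plink2 provides. Sites where AA is missing or matches neither REF nor ALT are skipped, and lower-case (lower-confidence) calls are accepted as-is.

```python
def ancestral_allele_table(vcf_lines):
    """Return 'variant ID<TAB>ancestral allele' rows from VCF text lines.

    Assumption: the ancestral allele is stored as INFO key 'AA', possibly
    followed by '|'-separated extra fields (e.g. 'AA=G|||').
    """
    rows = []
    for line in vcf_lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip header and empty lines
        fields = line.split("\t")
        variant_id, ref, alt, info = fields[2], fields[3], fields[4], fields[7]
        ancestral = None
        for entry in info.split(";"):
            if entry.startswith("AA="):
                ancestral = entry[3:].split("|")[0].upper()
        # Keep the site only if the ancestral allele is one of REF/ALT
        for allele in (ref, alt):
            if ancestral == allele.upper():
                rows.append(f"{variant_id}\t{allele}")
                break
    return rows


if __name__ == "__main__":
    demo = [
        "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tind1",
        "1\t100\trs1\tA\tG\t.\tPASS\tAA=G|||;AC=1\tGT\t0|1",
        "1\t200\trs2\tC\tT\t.\tPASS\tAC=2\tGT\t1|1",
    ]
    with open("ancestral_alleles.tsv", "w") as handle:  # hypothetical file name
        handle.write("\n".join(ancestral_allele_table(demo)) + "\n")
```

The resulting file could then be given to --ref-allele so that REF becomes the ancestral allele before exporting; check the plink2 documentation linked above for the exact column arguments the flag expects.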


Hi, I am running into the same problem creating .hap files with ancestral/derived alleles from my VCF files. I need them to run the rehh package. Was your problem solved?
