Why is mach1 not imputing SNPs?
6.0 years ago
moxu ▴ 510

The command lines for the two steps:

mach1 -p mysamples.22.merlin.ped -d mysamples.dat --vcfReference -h ~/xdata/reference/1kg/b37/ALL.chr22.phase3_shapeit2_mvncall_integrated_v5a.20130502.genotypes.vcf.gz --compact -r 30 --prefix mysamples.mach1step1

mach1 -p mysamples.22.merlin.ped -d mysamples.22.dat --vcfReference -h ~/xdata/reference/1kg/b37/ALL.chr22.phase3_shapeit2_mvncall_integrated_v5a.20130502.genotypes.vcf.gz  --crossover mysamples.22.mach1step1.rec --errormap  mysamples.22.mach1step1.erate --greedy --geno --quality --dosage --probs --mle --mldetails  --autoFlip  --mask 0.05 --prefix mysamples.22.mach1step2

The problem:

I have 6758 SNPs in mysamples.22.merlin.ped and mysamples.dat, the input files for step 1. But after step 2, the result files (.mlinfo, .mlgeno, .mldose) still contain exactly 6758 SNPs. No more, no less.
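A quick way to confirm the counts is sketched below. It assumes the .dat file follows the Merlin convention of one `M <name>` row per marker and that the .mlinfo file has a single header row followed by one row per SNP. Both function names are made up for this illustration.

```python
def count_dat_markers(lines):
    """Count marker rows (first field 'M') in a Merlin-style .dat file."""
    return sum(1 for line in lines if line.split()[:1] == ["M"])


def count_mlinfo_snps(lines):
    """Count SNP rows in a .mlinfo file, assuming one header row."""
    rows = [line for line in lines if line.strip()]
    return max(len(rows) - 1, 0)


# Example usage with the file names from the commands above:
# with open("mysamples.dat") as dat:
#     print("input markers:", count_dat_markers(dat))
# with open("mysamples.22.mach1step2.mlinfo") as info:
#     print("output SNPs:", count_mlinfo_snps(info))
```

If the two numbers match, step 2 only wrote out the genotyped markers and the reference panel was most likely never used.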

The question:

Where did the imputed SNPs go?

Thanks!

SNP software error • 925 views
6.0 years ago
moxu ▴ 510

I think it's because mach1 does not take the option "--vcfReference".

mach-admix does accept "--vcfReference", but the 1000 Genomes Phase 3 reference files contain duplicated SNPs (e.g. ALL.chr22.phase3_shapeit2_mvncall_integrated_v5a.20130502.genotypes.vcf.gz has two entries for rs7410429), and mach-admix aborts when it encounters a duplicated SNP.

How do you handle the duplicated SNPs?
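One possible workaround, as a minimal sketch rather than a tested fix: stream the reference VCF and keep only the first record for each variant ID before handing it to mach-admix. Keeping the first copy is an arbitrary choice made for illustration (dropping every copy of a duplicated ID is another option). The function names are hypothetical, and I have not verified that mach-admix accepts the filtered file.

```python
import gzip
import sys


def drop_duplicate_ids(lines):
    """Yield VCF lines, keeping only the first record for each variant ID."""
    seen = set()
    for line in lines:
        if line.startswith("#"):
            yield line  # header lines pass through unchanged
            continue
        # The ID is the third tab-separated column of a VCF record.
        variant_id = line.split("\t", 3)[2]
        # Records with a missing ID (".") are never treated as duplicates.
        if variant_id != "." and variant_id in seen:
            continue
        seen.add(variant_id)
        yield line


def main(in_path, out_path):
    # Reads and writes gzip-compressed text; the output is plain gzip, not bgzip.
    with gzip.open(in_path, "rt") as src, gzip.open(out_path, "wt") as dst:
        dst.writelines(drop_duplicate_ids(src))


if __name__ == "__main__" and len(sys.argv) == 3:
    main(sys.argv[1], sys.argv[2])
```

Saved under a name of your choice (say, dedup_vcf.py), it would be run with the input and output paths as its two arguments. Only the set of IDs seen so far is held in memory, so the per-chromosome reference files can be processed one line at a time.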
