filter IMPUTE2 genotype probabilities on .metrics INFO value?
9.2 years ago
Krisr ▴ 470

Hi,

I have a dataset from dbGaP that provides imputed genotype probability files (.gprobs) for each human chromosome, generated with IMPUTE2. A ".metrics" file is also included. The documentation says the INFO column in the .metrics file can be used to QC/filter low-quality SNPs (INFO < 0.3) from the .gprobs files.

I am having trouble finding a program that can take these two files and filter out the SNPs with an INFO score < 0.3. I've looked at QCTOOL, fcGENE, and GTOOL. However, .gprobs appears to be a Beagle format, and I cannot use it together with the .metrics file (which is from IMPUTE2) without each program throwing errors. I could probably write a Perl script to do this, but given how common this task seems to be, I figured there must be an existing method for it.
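For reference, the script-based fallback could look roughly like the sketch below (in Python rather than Perl). It is written under assumptions I have not verified against the dbGaP files: that the .metrics file is whitespace-delimited with a header row, that its SNP ID and INFO columns are named `rs_id` and `info` (hypothetical names, adjust to the real header), and that the SNP ID is the first field of each .gprobs line.

```python
def low_info_ids(metrics_lines, id_col="rs_id", info_col="info", threshold=0.3):
    """Return the set of SNP IDs whose INFO value is below the threshold.

    Assumes a whitespace-delimited table with a header row; the column
    names id_col and info_col are hypothetical and must match the file.
    """
    rows = iter(metrics_lines)
    header = next(rows).split()
    id_idx = header.index(id_col)
    info_idx = header.index(info_col)
    drop = set()
    for line in rows:
        fields = line.split()
        if fields and float(fields[info_idx]) < threshold:
            drop.add(fields[id_idx])
    return drop


def filter_gprobs(gprobs_lines, drop_ids, id_field=0):
    """Yield genotype-probability lines whose SNP ID is not in drop_ids.

    id_field is the (assumed) position of the SNP ID on each line.
    A header line passes through unless its first field is in drop_ids.
    """
    for line in gprobs_lines:
        fields = line.split()
        if fields and fields[id_field] not in drop_ids:
            yield line
```

With real files, both functions can be fed open file handles, for example `filter_gprobs(open("chr22.gprobs"), low_info_ids(open("chr22.metrics")))`, where the .metrics file name is only a guess at how the files are named.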

So, my question: is there a utility that can perform this task? I found a paper that did so, but the authors do not describe how:

We performed downstream analysis of the complete, imputed merged dataset to take into account the uncertainty of the imputed genotypes. We filtered data based on info score of 0.7 after looking at the distribution of markers at all possible info scores.

Any help would be greatly appreciated!

impute2 gprob GWAS dbgap imputation • 5.7k views
9.2 years ago
Krisr ▴ 470

Ah, figured this out.

The data needed to calculate the INFO metric are contained within the gprobs file itself. The command below computes the INFO metric for each SNP and then applies the filter, keeping SNPs whose INFO value lies between 0.3 and 1:

./qctool -g chr22.gprobs -og subsetted.gen -info .3 1
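
To illustrate why the genotype probabilities alone are enough, here is a minimal Python sketch of an IMPUTE-style info statistic and the corresponding filter. It is not QCTOOL's code, and QCTOOL's exact statistic may differ. The sketch assumes the file uses the IMPUTE2 .gen layout (five leading columns, then three probabilities per sample: AA, AB, BB); the function names and the `n_meta` parameter are made up for the example.

```python
def impute_info(probs):
    """IMPUTE-style info score for one SNP.

    probs is a flat list [p_AA, p_AB, p_BB, ...] with one triple per sample.
    """
    n = len(probs) // 3
    if n == 0:
        raise ValueError("no genotype probabilities on this line")
    sum_dosage = 0.0
    sum_var = 0.0
    for i in range(n):
        p_ab = probs[3 * i + 1]
        p_bb = probs[3 * i + 2]
        e = p_ab + 2.0 * p_bb      # expected B-allele dosage
        f = p_ab + 4.0 * p_bb      # expected squared dosage
        sum_dosage += e
        sum_var += f - e * e       # per-sample dosage variance
    theta = sum_dosage / (2.0 * n)  # estimated B-allele frequency
    if theta <= 0.0 or theta >= 1.0:
        return 1.0                  # monomorphic SNP: info is taken to be 1
    return 1.0 - sum_var / (2.0 * n * theta * (1.0 - theta))


def filter_gen_lines(lines, threshold=0.3, n_meta=5):
    """Yield lines whose info score is at least the threshold.

    n_meta is the assumed number of leading non-probability columns.
    """
    for line in lines:
        fields = line.split()
        if len(fields) <= n_meta:
            continue  # skip blank or malformed lines
        probs = [float(x) for x in fields[n_meta:]]
        if impute_info(probs) >= threshold:
            yield line
```

Confidently called genotypes give an info score near 1, while a SNP whose probabilities just mirror Hardy-Weinberg proportions gives a score near 0, which is why a lower cut-off such as 0.3 removes the poorly imputed SNPs.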