Why not a simple Gaussian model instead of a Gaussian mixture for VQSR?
5.9 years ago
CY ▴ 750

As far as I understand, VQSR selects the pool of SNPs that are present both in the test set and in a database of known, annotated SNPs. These SNPs are treated as true variants, and a Gaussian mixture model is fitted to their features and then used to classify the remaining SNPs.

These true SNPs will be clustered using a Gaussian model. However, using a Gaussian mixture model means we are also clustering "bad" SNPs. I imagine that these "bad" SNPs are of poor quality in different ways, so the mixture model will put them into multiple clusters (one true-SNP cluster and several bad-SNP clusters), right?

Then why can't we just use a single Gaussian to model the distribution of the true SNPs, so that any SNP far from this cluster is more likely to be false?
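
To make the comparison concrete, here is a minimal sketch in plain NumPy. It is not GATK's implementation: the single made-up "annotation", the two cluster positions, and all function names are assumptions chosen for illustration. It supposes that the true SNPs themselves fall into two clusters, fits both a single Gaussian and a two-component mixture to them, and compares how each model scores a candidate lying in the gap between the clusters.

```python
import numpy as np


def gaussian_logpdf(x, mean, var):
    """Log density of a 1-D Gaussian; broadcasts over its arguments."""
    return -0.5 * (np.log(2.0 * np.pi * var) + (x - mean) ** 2 / var)


def fit_single_gaussian(x):
    """Maximum-likelihood mean and variance of one Gaussian."""
    return x.mean(), x.var()


def fit_gmm_1d(x, k=2, n_iter=200):
    """Fit a k-component 1-D Gaussian mixture with plain EM."""
    # Hypothetical initialisation: spread the means over data quantiles.
    means = np.quantile(x, (np.arange(k) + 0.5) / k)
    variances = np.full(k, x.var())
    weights = np.full(k, 1.0 / k)
    for _ in range(n_iter):
        # E-step: responsibility of each component for each point.
        log_r = np.log(weights) + gaussian_logpdf(x[:, None], means, variances)
        log_r -= log_r.max(axis=1, keepdims=True)
        r = np.exp(log_r)
        r /= r.sum(axis=1, keepdims=True)
        # M-step: re-estimate weights, means and variances.
        nk = r.sum(axis=0)
        weights = nk / len(x)
        means = (r * x[:, None]).sum(axis=0) / nk
        variances = (r * (x[:, None] - means) ** 2).sum(axis=0) / nk + 1e-6
    return weights, means, variances


def gmm_logpdf(x, weights, means, variances):
    """Log density of the mixture, computed with log-sum-exp."""
    x = np.atleast_1d(np.asarray(x, dtype=float))
    comp = np.log(weights) + gaussian_logpdf(x[:, None], means, variances)
    m = comp.max(axis=1, keepdims=True)
    return (m + np.log(np.exp(comp - m).sum(axis=1, keepdims=True))).ravel()


# Simulated annotation values for "true" SNPs: two clusters (an assumption).
rng = np.random.default_rng(0)
true_snps = np.concatenate(
    [rng.normal(-3.0, 0.5, 500), rng.normal(3.0, 0.5, 500)]
)

mu, var = fit_single_gaussian(true_snps)
w, m, v = fit_gmm_1d(true_snps, k=2)

gap, mode = 0.0, 3.0  # a candidate between the clusters vs. one inside a cluster
single_at_gap = gaussian_logpdf(gap, mu, var)
single_at_mode = gaussian_logpdf(mode, mu, var)
mix_at_gap = gmm_logpdf(gap, w, m, v)[0]
mix_at_mode = gmm_logpdf(mode, w, m, v)[0]

print("single Gaussian: gap", single_at_gap, "mode", single_at_mode)
print("mixture:         gap", mix_at_gap, "mode", mix_at_mode)
```

In this toy setup the single Gaussian is centred on the gap, so it rates a candidate that resembles no training SNP above one sitting in a real cluster, while the mixture does the opposite. Whether real annotations behave this way depends on the data, so treat this only as an illustration of what a mixture can capture that a single Gaussian cannot.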

VQSR GATK • 1.2k views

Your question may receive a better response on the GATK Forum
