Finding SNPs among SNVs, and breed-specific SNPs
5.0 years ago
prasundutta87 ▴ 660

Hi, I am working with a multisample VCF file for a non-model organism that contains only biallelic single nucleotide variants (SNVs). The samples belong to several different breeds of the organism. I wish to find SNPs (single nucleotide polymorphisms), which is a population-based term: the SNV should occur at a minimum frequency (1 % or 5 %) in the population, which here means all the samples in my VCF file. If I want to extract SNPs from my multisample SNV file, is removing variants with MAF <= 5 % (or 1 %) the correct way to do it?
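To make the MAF filter concrete, here is a minimal sketch in plain Python. It assumes the genotypes of one biallelic site are available as VCF-style GT strings (`0/1`, `1|1`, `./.`), and that missing alleles are simply left out of the count. The function names `minor_allele_frequency` and `passes_maf` are hypothetical, not part of any VCF library.

```python
import re


def minor_allele_frequency(genotypes):
    """Return the MAF of one biallelic site, or None if no allele was called.

    `genotypes` is a list of GT strings such as "0/1", "1|1" or "./.".
    """
    ref = alt = 0
    for gt in genotypes:
        # Split on both unphased ("/") and phased ("|") separators.
        for allele in re.split(r"[/|]", gt):
            if allele == "0":
                ref += 1
            elif allele == "1":
                alt += 1
            # Anything else (e.g. ".") is treated as missing and skipped.
    called = ref + alt
    if called == 0:
        return None
    return min(ref, alt) / called


def passes_maf(genotypes, threshold=0.05):
    """Keep a site only if MAF is strictly above the threshold,
    i.e. sites with MAF <= threshold are removed."""
    maf = minor_allele_frequency(genotypes)
    return maf is not None and maf > threshold
```

With ten diploid samples, a single heterozygote gives MAF = 1/20 = 0.05, so such a site is removed at the 5 % threshold and kept at the 1 % threshold.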

My second question: if I want to find breed-specific SNPs, and variants shared by more than one breed, how do I get them? This is what I think could work; let me know if I am on the right track:

1) Split the base multisample VCF file into smaller VCF files, one per breed, each containing only the samples of that breed

2) Set a genotype quality threshold and set genotypes to missing if GQ is less than the threshold

3) Remove monomorphic loci (where all genotypes are the same, or only 1 allele is present)

4) Set a MAF threshold and remove variants with MAF <= that threshold, leaving the SNPs

5) Use Venny or any other tool to compare variant IDs across breeds and find those that are specific to one breed or shared among breeds.
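The five steps above can be sketched in plain Python as follows. This is only an illustration under stated assumptions: each variant is a dict with an `id` and a `samples` mapping from sample name to a `(GT, GQ)` tuple, `breed_of` maps sample names to breeds, and the set arithmetic in `private_and_shared` stands in for Venny. All function names and the default thresholds are hypothetical.

```python
MISSING = "./."


def mask_low_gq(calls, min_gq):
    """Step 2: set genotypes whose GQ is absent or below min_gq to missing."""
    return [gt if gq is not None and gq >= min_gq else MISSING
            for gt, gq in calls]


def allele_counts(genotypes):
    """Count called REF ("0") and ALT ("1") alleles, skipping missing ones."""
    ref = alt = 0
    for gt in genotypes:
        for allele in gt.replace("|", "/").split("/"):
            if allele == "0":
                ref += 1
            elif allele == "1":
                alt += 1
    return ref, alt


def snp_ids_by_breed(records, breed_of, min_gq=20, maf_threshold=0.05):
    """Steps 1-4: per breed, collect IDs of variants that survive filtering."""
    ids = {breed: set() for breed in sorted(set(breed_of.values()))}
    for rec in records:
        for breed in ids:
            # Step 1: restrict the site to the samples of this breed.
            calls = [call for sample, call in rec["samples"].items()
                     if breed_of.get(sample) == breed]
            ref, alt = allele_counts(mask_low_gq(calls, min_gq))
            # Step 3: drop monomorphic sites (and sites with no calls).
            if ref == 0 or alt == 0:
                continue
            # Step 4: drop sites with MAF <= threshold within the breed.
            if min(ref, alt) / (ref + alt) > maf_threshold:
                ids[breed].add(rec["id"])
    return ids


def private_and_shared(ids_by_breed):
    """Step 5: IDs private to each breed, and IDs seen in two or more breeds."""
    private, shared = {}, set()
    for breed, ids in ids_by_breed.items():
        others = set().union(
            *(v for b, v in ids_by_breed.items() if b != breed))
        private[breed] = ids - others
        shared |= ids & others
    return private, shared
```

One consequence of these steps as written: a site that is fixed for the alternate allele within one breed is dropped as monomorphic in step 3, so it will not be reported as specific to that breed.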

SNP SNV • 1.0k views