Alignment & Variant Calling: explanation of SNP and SNV
6.2 years ago

As a beginner, I have a basic question about a BAM alignment. After mapping my FASTQ reads (from a single individual) to a reference with bwa, I can see variations, which I guess include sequencing errors, misalignments, library preparation errors and real SNPs. In a haploid organism, I suppose there is only one possible correct result for each position, so there is only one correct consensus. After doing variant calling with samtools:

samtools mpileup -uf params bam | bcftools call -mv -Oz -o vcf

and with lofreq:

lofreq call -f ref -o outvcf mybam

I obtain two VCF files with SNPs and SNVs (the SNV file is bigger than the SNP file, as expected). How do these programs mark a variation as a SNP or an SNV if I am working with only one sample? The definition of SNP, from Wikipedia, is:

A variation in a single nucleotide that occurs at a specific position in the genome, where each variation is present to some appreciable degree within a population (e.g. > 1%)

Thanks!

map to reference vcf alignment ngs • 2.6k views

Why do you think you have one file with "SNP" and one with "SNV"?

"SNP" stands for "Single Nucleotide Polymorphism", and "SNV" for "Single Nucleotide Variation".

The term SNP is used more often in talks. The problem is that "polymorphism" implies that the change in sequence is quite common and has little or no impact on gene function. But people started to use this term for almost every change in sequence, even for those which do have an influence.

So, to avoid confusion about whether there is an impact or not, SNV is a much better term.

fin swimmer


Which organism are you working with? If you look at this page, http://csb5.github.io/lofreq/commands/#call, it seems they use dbSNP to call SNPs for human by default:

If you are dealing with human samples (or large genomes in general) we recommend the use of -s (source quality) in combination with -S dbsnp.vcf.gz


@finswimmer OK, so leaving aside the definition of SNP or SNV, they just extract the variations against a reference (identifying and discarding the possible errors, misalignments, etc.). I was confused by the definition of SNP. Thanks!
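To make that point concrete, here is a minimal sketch of how a single-sample VCF can be inspected. Each record only states that the sample differs from the reference at a position (REF vs. ALT); nothing in it says how frequent the allele is in a population. The toy VCF below is made up for illustration (hypothetical contig name, positions and qualities), and treating "REF and ALT are both one base long" as "single-nucleotide" is a simplifying assumption that lumps multi-allelic sites in with the rest.

```shell
# Build a tiny, made-up VCF (hypothetical contig and positions, illustration only).
vcf=$(mktemp)
printf '##fileformat=VCFv4.2\n' > "$vcf"
printf '#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n' >> "$vcf"
printf 'contig1\t100\t.\tA\tG\t50\tPASS\t.\n' >> "$vcf"   # single-base substitution
printf 'contig1\t250\t.\tAT\tA\t45\tPASS\t.\n' >> "$vcf"  # deletion
printf 'contig1\t300\t.\tC\tT\t60\tPASS\t.\n' >> "$vcf"   # single-base substitution
printf 'contig1\t410\t.\tG\tGA\t40\tPASS\t.\n' >> "$vcf"  # insertion

# Count records where REF (column 4) and ALT (column 5) are each exactly one base.
# Header lines start with '#' and are skipped.
snv_count=$(awk -F'\t' '!/^#/ && length($4) == 1 && length($5) == 1 && $5 != "." {n++} END {print n+0}' "$vcf")

# Everything else (indels, multi-allelic sites written as "G,T", etc.).
other_count=$(awk -F'\t' '!/^#/ && !(length($4) == 1 && length($5) == 1 && $5 != ".") {n++} END {print n+0}' "$vcf")

echo "single-nucleotide records: $snv_count"
echo "other records: $other_count"
rm -f "$vcf"
```

Whether any of those single-nucleotide records is a "SNP" in the population sense cannot be decided from one sample; that would need allele frequencies from many individuals.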

It is an insect but thanks anyway @Titus!


Cockroach? - Periplaneta spp.?

Just be wary of dbSNP - it is a grand mix of 'common' and 'rare' variants, many of which have clinical relevance and are even listed in ClinVar as pathogenic alleles.
