Increase in SNP frequency towards the 3' end of reads
8.6 years ago
c.gubili • 0

Hi all,

I am new to Stacks and would like to ask a general question. I have been trying to assess the increase in SNP frequency towards the 3' end of my reads and would like to hear your suggestions.

First, reads were demultiplexed by their individual barcodes (8 bp) and quality-filtered using process_radtags. I then ran preliminary analyses (denovo, default parameters) with no further trimming or filtering, and found an increase in SNP frequency towards the 3' end of the reads. I read that this probably corresponds to spurious polymorphisms caused by the rising sequencing error rate along the read. I decided to trim 9 bp from the 3' end, which left me with 83 bp reads.
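For anyone following along, the trimming step can be sketched in a few lines of Python. This is only an illustration: the function names `trim_fastq_record` and `trim_fastq_lines` are made up for this example, and it assumes plain four-line FASTQ records already loaded as a list of strings. It is not how Stacks itself trims reads.

```python
def trim_fastq_record(record, keep_len):
    """Keep the first keep_len bases (5' side) of one 4-line FASTQ record."""
    header, seq, plus, qual = record
    # Sequence and quality strings must be cut to the same length.
    return (header, seq[:keep_len], plus, qual[:keep_len])


def trim_fastq_lines(lines, keep_len):
    """Trim every record in a list of FASTQ lines (hypothetical helper)."""
    lines = [ln.rstrip("\n") for ln in lines]
    if len(lines) % 4 != 0:
        raise ValueError("FASTQ input must contain a multiple of 4 lines")
    out = []
    for i in range(0, len(lines), 4):
        out.extend(trim_fastq_record(tuple(lines[i:i + 4]), keep_len))
    return out
```

Calling `trim_fastq_lines(lines, 83)` on 92 bp reads drops the last 9 bases and their quality scores, matching the trim described above.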

I reran denovo, checked the number of SNPs, and produced a second graph for my new 83 bp dataset. However, instead of an even number of SNPs across positions, there is again an increase in SNPs at the 3' end (about 400 SNPs in the last 5 bp). I trimmed 5 more bp, down to 78 bp, but I get the same trend. Any suggestions?
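The per-position comparison described here can be made explicit with a small tally. The sketch below assumes the within-locus position of each SNP has already been extracted into a list of 0-based integers; how to extract it depends on the Stacks version and output files, so that step is left out. The names `snps_per_position` and `tail_ratio` are hypothetical.

```python
from collections import Counter


def snps_per_position(positions, read_len):
    """Count SNPs at each 0-based position of a read of length read_len."""
    for pos in positions:
        if not 0 <= pos < read_len:
            raise ValueError(f"position {pos} is outside a {read_len} bp read")
    tally = Counter(positions)
    return [tally.get(i, 0) for i in range(read_len)]


def tail_ratio(counts, tail=5):
    """Mean SNP count in the last `tail` positions divided by the mean elsewhere."""
    if not 0 < tail < len(counts):
        raise ValueError("tail must be between 1 and len(counts) - 1")
    head, end = counts[:-tail], counts[-tail:]
    head_mean = sum(head) / len(head)
    end_mean = sum(end) / len(end)
    return float("inf") if head_mean == 0 else end_mean / head_mean
```

Running `tail_ratio` on the 92 bp, 83 bp and 78 bp datasets gives one number per run. If the ratio stays well above 1 after each trim, the excess is following the read end rather than specific sequencing cycles.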

Thank you in advance,
Chrysa

next-gen SNP • 1.5k views

What was the initial quality threshold?

8.6 years ago

Variants near the ends of reads are not very reliable, even if the read is error-free. Indels will come out looking like SNPs, for example. It's best to give all variants within X bp of a read end (where X is ~10) a lower confidence, or just ignore them.
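A minimal sketch of that filter, assuming SNP positions are 0-based offsets within reads of a fixed length. The function names and the default `margin=10` are illustrative choices taken from the "X is ~10" suggestion, not part of any tool.

```python
def near_read_end(pos, read_len, margin=10):
    """True if a 0-based position lies within `margin` bp of either read end."""
    return pos < margin or pos >= read_len - margin


def split_by_confidence(positions, read_len, margin=10):
    """Split SNP positions into (trusted, suspect) lists by distance to read ends."""
    trusted, suspect = [], []
    for pos in positions:
        if near_read_end(pos, read_len, margin):
            suspect.append(pos)  # keep these, but treat them as lower confidence
        else:
            trusted.append(pos)
    return trusted, suspect
```

For 83 bp reads, `split_by_confidence(positions, 83)` marks positions 0 to 9 and 73 to 82 as suspect. If only the 3' end is a concern, the first condition in `near_read_end` can be dropped.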
