ddRADseq and segregating SNPs
21 months ago
biogirl • 170
European Union

Hi there,

I'm completely new (as of this morning) to ddRADseq, but am trying to get my head around the theory. If I have a 30 Mb genome and use ddRADseq with ~6500 digestion sites, roughly how many segregating SNPs can I expect to find? WGS shows that there are roughly 20,000 SNPs separating each isolate.

Does this depend on the amount of the reference genome covered by the ddRAD?

Any help is greatly appreciated - thanks.

19 months ago
SNPsaurus • 50
Eugene, OR

If you have 6500 digestion sites within your planned fragment size selection range and sequence with 100 bp reads, you will be sampling 6500 x 100 bp = 650 kb. If the SNPs are spaced every 1.5 kb on average (30 Mb / 20,000 SNPs), that is 650 kb / 1.5 kb ≈ 430, so you should end up assaying roughly 400 SNPs in total. If you sequence fragments that are 150-200 bp with 100 bp paired-end reads, you'll sample more of the genome and have more SNPs.
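The same back-of-the-envelope estimate as a few lines of Python, in case you want to plug in other numbers. This is only a sketch: the function name is made up for illustration, and it assumes SNPs are spread evenly along the genome and that every site yields the same number of sequenced bases.

```python
def expected_snps(n_sites, bp_per_site, genome_bp, genome_snps):
    """Rough number of SNPs falling inside the sequenced portion of the genome.

    Assumes SNPs are uniformly distributed and each site contributes
    bp_per_site sequenced bases (both are simplifications).
    """
    sampled_bp = n_sites * bp_per_site
    return sampled_bp * genome_snps / genome_bp


# Numbers from the question: 6500 sites, 100 bp reads, 30 Mb genome, 20,000 SNPs.
estimate = expected_snps(6500, 100, 30_000_000, 20_000)
print(round(estimate))  # 433
```

Doubling `bp_per_site` to 200 doubles the estimate, which is the paired-end case mentioned above.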

How did you arrive at 6500 digestion sites? Most ddRAD protocols cut with a 6-cutter and a 4-cutter enzyme, and 6500 is probably just the number of 6-cutter sites in your genome. With ddRAD, you have to find the subset of fragments that have the two different enzyme sites at their ends and fall in the exact size range desired. Just checking... maybe you did all that.
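As a rough sanity check on that 6500 figure, here is a small sketch of the expected number of recognition sites for a cutter of a given length. It assumes equal base frequencies and no sequence bias, which real genomes rarely satisfy, so treat the result as an order-of-magnitude guide only. The function name is hypothetical.

```python
def expected_cut_sites(genome_bp, site_len):
    """Expected count of a specific site_len-mer in a random genome.

    Assumes A, C, G and T are equally frequent, so a given site
    occurs about once every 4**site_len bases.
    """
    return genome_bp // 4 ** site_len


print(expected_cut_sites(30_000_000, 6))  # 7324 sites for a 6-cutter
print(expected_cut_sites(30_000_000, 4))  # 117187 sites for a 4-cutter
```

So about 7000 sites is what a single 6-cutter would give on a 30 Mb genome, in line with the 6500 quoted in the question.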


Shouldn't it be 6500 x 200? ddRAD-seq is generally done with paired-end sequencing.


Right, that's why I included the "If you sequence fragments that are 150-200 bp with 100 bp paired-end reads, you'll sample more of the genome and have more SNPs."

I thought it might be helpful to start with the simpler case of a fixed 100 bp per site to show how the estimate is done. Sequencing a size range (150-200 bp fragments with PE reads) gives a less exact number, since it depends on the distribution of fragment sizes.
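For the size-range case, a sketch of how the estimate could be adapted. With paired-end reads, the bases covered per fragment are capped by the fragment length (the two reads overlap on short fragments), so you sum over the fragment sizes instead of multiplying by a fixed read length. The fragment lengths below are made up (spread evenly over 150-200 bp); in practice you would use the sizes from an in silico digest.

```python
def covered_bp(fragment_lengths, read_len):
    """Bases covered by paired-end reads of length read_len.

    Each fragment contributes at most its own length, because the
    two reads overlap when the fragment is shorter than 2 * read_len.
    """
    return sum(min(length, 2 * read_len) for length in fragment_lengths)


# Hypothetical size distribution: 6500 fragments spread evenly over 150-200 bp.
fragment_lengths = [150 + (i % 51) for i in range(6500)]

sampled = covered_bp(fragment_lengths, read_len=100)
# Same uniform-SNP assumption as before: 20,000 SNPs in a 30 Mb genome.
snps = sampled * 20_000 / 30_000_000
print(sampled, round(snps))
```

With these made-up sizes the estimate lands between the 100 bp and 200 bp cases, as expected for fragments averaging about 175 bp.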


Ah, I see! That makes complete sense, thank you for going through that.

Regarding the 6500 digestion sites, I would actually have more because I'd use two cutters (as you mention). 6500 was just for the one cutter. But thank you for your reply, the theory makes sense now.

13 months ago
geek_y 9.7k
Barcelona/CRG/London/Imperial

Here is some code (adapted from Peterson et al.) to double digest your genome.

Usage: edit the restriction sites and give the full path to your genome file. Then run:

 RE_Digestion.py > Rest-sites.txt

If you would like to select fragments with specific size:

RE_Digestion.py | awk '{ if( ($3-$2 ) >=300 && ($3-$2) <= 500 ) print }' > 300_500_sites.txt
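For readers who just want to see the idea, below is a minimal sketch of an in silico double digest. It is not the RE_Digestion.py script referred to above and not the Peterson et al. code; the enzyme pair (EcoRI and MspI), the function names and the toy sequence are chosen only for illustration. It keeps fragments flanked by one cut from each enzyme and prints name, start and end columns, a layout assumed from the awk filter above (which uses $2 and $3).

```python
import re

# Illustrative enzyme pair: name -> (recognition site, cut offset within site).
ENZYMES = {"EcoRI": ("GAATTC", 1), "MspI": ("CCGG", 1)}


def cut_positions(seq, site, offset):
    """0-based cut coordinates for one enzyme (lookahead finds overlapping sites)."""
    return [m.start() + offset for m in re.finditer(f"(?={site})", seq)]


def double_digest(seq, enzymes=ENZYMES):
    """Return (start, end) of fragments bounded by cuts from two different enzymes."""
    seq = seq.upper()
    cuts = sorted(
        (pos, name)
        for name, (site, offset) in enzymes.items()
        for pos in cut_positions(seq, site, offset)
    )
    fragments = []
    # Walk adjacent cut sites; keep a fragment only if its two ends differ in enzyme.
    for (start, left), (end, right) in zip(cuts, cuts[1:]):
        if left != right:
            fragments.append((start, end))
    return fragments


# Toy sequence: one EcoRI site followed by two MspI sites.
toy = "AAGAATTC" + "T" * 20 + "CCGG" + "A" * 10 + "CCGG" + "TT"
for start, end in double_digest(toy):
    print("toy", start, end, sep="\t")  # prints: toy  3  29
```

On a real genome you would run `double_digest` on each chromosome or contig and then apply the size filter, as in the awk command above.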