I am confused about how to obtain strand information from mapping results. In a paper, the authors calculated the mutation rate on the sense and antisense strands using WGS data (to study somatic mutations). My understanding is that the strand information should be derived from the mapping pattern, e.g. if the paired-end reads map to the reference (antisense strand) in an FR orientation, it suggests the DNA came from the antisense strand, while RF suggests the DNA came from the sense strand. Please correct me if I am wrong.
If PCR was done on the library before adapter ligation, I think the strand information should be gone. My question is: if the PCR is done after adapter ligation, say using P7 and P5 primers, is the strand information retained or not?
ADD REPLY does not work for me now.
The paper studying strand-specific mutation patterns is http://www.ncbi.nlm.nih.gov/pubmed/25271376. They did not mention how this was done.
No worries, I moved it. You probably need some minimal reputation score to post comments.
That's annoying that they didn't write how they did library prep. Presumably they tweaked the protocol, but you'd have to email them to find out how.
Which paper?
It'll depend on exactly how they did the library prep. Normally people don't bother with strand-specific DNAseq, but the normal protocols could always be modified to allow this (cf. RNAseq and MethylC-seq).
To Devon Ryan,
I am planning to ask them. My understanding is that if no PCR was done during library preparation, the strand information should be kept. I have a library with PCR after hybridization enrichment (using P7 and P5 primers), and I don't know whether strand information is kept or not.
By the way, during cluster generation, are there only two types of probes on the flow cell, namely P7 complementary and P5? If so, only P7 could attach to the probes on the flow cell first, and P7 complementary (generated by PCR) should not be able to attach, am I right? If P5 complementary exists, does it also bind? Woo, totally confused.
I found this thread, which should be helpful: http://seqanswers.com/forums/showthread.php?t=66237 .
You can do PCR in a stranded way with multiple methods, so it's not the PCR that's really the issue. The issue is whether the data really retain the strand information, which you'll only know if you download it and see for yourself - https://www.ebi.ac.uk/ega/studies/EGAS00001000968
You'll have to send their DAC (in this case Sanger) a nice email asking for permission to download the data. They will make you pinky-promise not to share it. This is how modern science handles data sharing.
EDIT: I should clarify that in most methods, the PCR is standard but the ssDNA/RNA has been tagged already at an appropriate end. There are PCR reactions where extension is tagged though.
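If you do get the data, a rough first look is to tally which reference strand read 1 of each pair aligned to, using the SAM FLAG field. Below is a minimal sketch in plain Python; the function names `read1_strand` and `count_read1_strands` are made up for illustration, and only the flag bit values come from the SAM specification. Note the assumption: this only reports how the reads aligned. Whether read 1's orientation says anything about the original DNA strand depends entirely on the library prep.

```python
# SAM FLAG bits (values as defined in the SAM specification)
PAIRED = 0x1
UNMAPPED = 0x4
MATE_UNMAPPED = 0x8
READ_REVERSE = 0x10
MATE_REVERSE = 0x20
FIRST_IN_PAIR = 0x40
SECOND_IN_PAIR = 0x80


def read1_strand(flag):
    """Return '+' or '-' for the reference strand read 1 aligned to.

    Works from either mate's flag. Returns None if the record is not
    paired or if read 1 is unmapped. (Hypothetical helper, for illustration.)
    """
    if not flag & PAIRED:
        return None
    if flag & FIRST_IN_PAIR:
        # This record is read 1: use its own strand bit.
        if flag & UNMAPPED:
            return None
        return '-' if flag & READ_REVERSE else '+'
    if flag & SECOND_IN_PAIR:
        # This record is read 2: read 1 is the mate, so use the mate bits.
        if flag & MATE_UNMAPPED:
            return None
        return '-' if flag & MATE_REVERSE else '+'
    return None


def count_read1_strands(flags):
    """Tally read-1 strands, counting each pair once via its read-1 record."""
    counts = {'+': 0, '-': 0}
    for flag in flags:
        if flag & FIRST_IN_PAIR:
            strand = read1_strand(flag)
            if strand is not None:
                counts[strand] += 1
    return counts


# Flags 99/147 are a typical pair with read 1 forward; 83/163 have read 1 reverse.
example_counts = count_read1_strands([99, 147, 83, 163])
```

In practice you would feed this the second column of each alignment line (e.g. from `samtools view`). For plain genomic DNA both strands of every fragment exist in the sample, so a roughly even split is not by itself evidence for or against a stranded prep.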
Right, there are only two probes for things to hybridize to, so if strand information was kept during library prep it'll be maintained on the flow cell.
Trying to trace the path of strands during library prep and sequencing is a surefire way to get confused :(