Question

Mismatched frequencies of complementary bases after sequencing

7.8 years ago

L. A. Liggett ▴ 120

I am performing amplicon sequencing of the human genome, in which selected regions of the genome are amplified and then sequenced. Everything seems to work fine, but I observe a phenomenon that I am unable to explain.

If I stratify variants by base change (i.e., bin A -> T changes, A -> G changes, etc. separately), I find that the frequencies of complementary base changes do not always match. Since DNA is double-stranded, I would expect that roughly every time an A -> G change is observed, the complementary T -> C change should be observed as well.

This is almost always the case, but I do get repeatable, strong mismatches for certain base changes. For example, something like 1000 observed A -> G variants but only 10 observed T -> C variants.
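To make the comparison concrete, here is a minimal Python sketch of the tally described above. It assumes the variants are available as a list of (ref, alt) base pairs, all reported relative to the same reference strand; the function name `paired_counts` and this input format are hypothetical and chosen only for illustration, not taken from the actual pipeline.

```python
from collections import Counter

# Watson-Crick complement of each base
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def paired_counts(variants):
    """Pair each base change with its complementary change.

    `variants` is an iterable of (ref, alt) tuples, e.g. ("A", "G").
    Returns a dict mapping one change of each complementary pair to
    (count of that change, count of the complementary change).
    """
    counts = Counter((ref, alt) for ref, alt in variants)
    pairs = {}
    for ref in "ACGT":
        for alt in "ACGT":
            if ref == alt:
                continue
            change = (ref, alt)
            comp = (COMPLEMENT[ref], COMPLEMENT[alt])
            if comp in pairs:
                # Already recorded when the complementary change was visited
                continue
            pairs[change] = (counts[change], counts[comp])
    return pairs


# Toy data mirroring the numbers in the example above (not real results)
toy = [("A", "G")] * 1000 + [("T", "C")] * 10
for (ref, alt), (n, n_comp) in paired_counts(toy).items():
    comp_ref, comp_alt = COMPLEMENT[ref], COMPLEMENT[alt]
    print(f"{ref}->{alt}: {n}\t{comp_ref}->{comp_alt}: {n_comp}")
```

The twelve possible base changes collapse into six complementary pairs, so a large gap between the two counts within a pair is the mismatch in question.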

Biologically, this does not make sense to me. Is there something that can account for this phenomenon?

next-gen sequencing genome • 1.3k views

score 0 · Answer 1 · 2016-07-14

7.8 years ago

WouterDeCoster 47k

This depends on the chemical structure of the bases involved (transitions vs. transversions): https://www.mun.ca/biology/scarr/Transitions_vs_Transversions.html

Depending on whether your example was random or real, this is a partial answer.

In addition, the most common mutation is methylcytosine to T (spontaneous deamination of 5-methylcytosine).

@WouterDeCoster This explains why C -> T and its complementary change, G -> A, would be more prevalent in the data. But those two changes should be observed at more or less equal frequency. What I am seeing for some bases is that the complementary changes do not match in frequency. In this example, it would be as if G -> A were observed far more often than C -> T. Does that make sense?

Reply • 7.8 years ago by L. A. Liggett ▴ 120