Detecting inversions in psbA-trnH (plant)
7.2 years ago

What software tool would you recommend for detecting psbA-trnH inversions? I usually work in R, but I cannot find any suitable packages. My sequences are spread across multiple .fasta files, around 30,000 sequences in total.

Here is an example: http://journals.plos.org/plosone/article/figure/image?size=large&id=10.1371/journal.pone.0011533.g001

These guys developed a pipeline: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3498263/ However, their web page is down.

I want to identify each inversion and turn it around. I do not want to mask or delete it.

Thanks!

psbA alignment inversion • 1.6k views

You should be more precise about the technology you used; "fasta files" isn't very specific. The answer to your question would probably be very different depending on whether those files are derived from SOLiD, Illumina, Sanger or Oxford Nanopore sequencing data.


I have a .fasta file containing Sanger sequences: single reads, one direction only, 1x coverage.

I have no access to the sequencing data. The inversions are anywhere between 5 bp and an unknown upper length, so wham is too coarse (they say they can only detect >50 bp). Here is another example: https://www.researchgate.net/profile/Wojciech_Bieniek/publication/271952841/figure/fig1/AS:331540467863561@1456056815704/Fig-1-Multiple-alignment-of-the-highly-variable-part-of-the-trnH-psbA-region-from-the.png (dotted line). To make it even more difficult, the inversion is not at the same place in all sequences. There are groups of sequences that share the same inversion, but not all sequences have the same one.


So this boils down to comparing a fasta file with the reference sequence and detecting differences?
When you wrote "psbA inversion" I assumed you meant a whole-gene inversion, but I'm not really familiar with plant genetics.


Technically, psbA-trnH is an intergenic spacer. I have no reference sequence; I compare everything to everything and try to identify the inversion groups within the psbA sequences. I don't know where the inversions are or how long they are. Please take a look at the first example (it's a picture) to understand what I am trying to explain. To make it even worse, some sequences have a deletion within the inversion region.
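
To make the pairwise idea concrete, here is a minimal Python sketch of what "compare everything to everything" could look like for one pair of sequences. It is not the pipeline linked above. It assumes the two sequences are already aligned to equal length and contain no gaps inside the inverted segment, so it will not handle the deletion case described here. The function names (`reverse_complement`, `find_inversions`, `flip`) and the `min_len=5` default are illustrative choices, not part of any existing package.

```python
# Complement table for plain A/C/G/T (IUPAC ambiguity codes are left unchanged).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_inversions(a, b, min_len=5):
    """Find half-open intervals (start, end) where a[start:end] is the
    reverse complement of b[start:end].

    Assumes a and b are aligned, equal-length, ungapped strings.
    Both interval ends must be mismatching columns, so identical
    flanking sequence is never pulled into a reported inversion.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to equal length")
    a, b = a.upper(), b.upper()
    n = len(a)
    hits = []
    i = 0
    while i < n:
        if a[i] == b[i]:
            i += 1
            continue
        found = None
        # Try the longest candidate first, down to min_len.
        for j in range(n, i + min_len - 1, -1):
            if a[j - 1] != b[j - 1] and a[i:j] == reverse_complement(b[i:j]):
                found = j
                break
        if found is None:
            i += 1
        else:
            hits.append((i, found))
            i = found
    return hits


def flip(seq, start, end):
    """Turn the segment seq[start:end] around (reverse complement in place)."""
    return seq[:start] + reverse_complement(seq[start:end]) + seq[end:]


if __name__ == "__main__":
    left, core, right = "ACGTACGGA", "TTGACC", "CATTAGC"
    a = left + core + right
    b = left + reverse_complement(core) + right
    for start, end in find_inversions(a, b):
        print(start, end, flip(b, start, end) == a)
```

Sequences that report the same interval against a chosen representative could then be treated as one inversion group and flipped together. With 30,000 sequences a full all-against-all comparison is probably too slow with this naive scan, so comparing each sequence against a handful of representatives may be a more practical starting point.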
