Question

Working backwards - How do I retrieve transcripts from a primer list using sequence and amplicon length?

5.4 years ago

dacotahm ▴ 20

I'd like to search a transcriptome with a list of forward and reverse primers and use the amplicon length of each primer pair to identify the transcript that was used to design the primers.

I'm trying to salvage someone else's project and find the transcripts associated with a primer set. The transcriptome has a lot of duplication in it (for reasons not relevant to my question), so each primer has multiple hits. Most of the hits come from the duplicate transcripts, and some possibly from isoforms.

Using standalone blast (NCBI-BLAST+ in Ubuntu bash) I'm able to retrieve a hit list for all my primers:

blastn -task blastn-short -query PrimerList.fa -db Trin_duped.fasta -out PrimersXDB.txt -outfmt '10 qseqid sseqid qlen length nident evalue sstart send qseq sseq slen' -max_target_seqs 20

The problem is that there are multiple hits for each F/R primer, and each combination gives a different amplicon size. There are too many possible combinations to filter by hand. How do I identify only the transcripts that match an F/R primer combination with a specified amplicon size (or within a narrow range to allow for software quirks, e.g. ±10 bp)?

transcriptome primer blast alignment • 1.2k views

score 3 · Accepted Answer · 2018-11-13

5.4 years ago

h.mon 35k

Map the primer pairs as paired-end reads with bowtie (use -S to get SAM output) and use the TLEN field (column 9 of each SAM record) to keep the alignments whose template length matches the expected amplicon size.
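
The TLEN filtering step could be sketched roughly as below. This is only an illustration under a few assumptions, not a tested pipeline: the function name filter_sam_by_tlen and the expected dictionary (primer-pair name mapped to designed amplicon length) are hypothetical, the forward and reverse primers of a pair are assumed to share one read name, and the ±10 bp tolerance is taken from the question.

```python
def filter_sam_by_tlen(sam_lines, expected, tolerance=10):
    """Return (pair name, transcript, |TLEN|) for pairs near the expected amplicon size.

    expected is a hypothetical dict: primer-pair name -> designed amplicon length.
    """
    hits = []
    for line in sam_lines:
        if line.startswith("@"):  # skip SAM header lines
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:  # not a complete alignment record
            continue
        qname, flag, rname = fields[0], int(fields[1]), fields[2]
        rnext, tlen = fields[6], abs(int(fields[8]))
        # Assumption: mates may carry /1 and /2 suffixes; strip them if present.
        if qname.endswith(("/1", "/2")):
            qname = qname[:-2]
        # Keep first mates (0x40) only, with read (0x4) and mate (0x8) both mapped,
        # so each pair is counted once.
        if not flag & 0x40 or flag & 0x4 or flag & 0x8:
            continue
        # Both primers must hit the same transcript.
        if rnext not in ("=", rname):
            continue
        if qname in expected and abs(tlen - expected[qname]) <= tolerance:
            hits.append((qname, rname, tlen))
    return hits
```

Called as filter_sam_by_tlen(open("primers.sam"), {"pairA": 155}), where primers.sam is a placeholder file name, it returns one tuple per matching transcript. Since each primer has multiple hits here, bowtie probably needs to be told to report more than one alignment per pair (for example with -a), and the maximum insert size (-X) may need raising for longer amplicons.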

An additional solution I found is to join each primer pair into a single query, filling the missing bases up to the amplicon length with Ns, and align it as a single unit with "-task blastn-short". Your solution is faster, though, thanks.
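
For anyone wanting to try the N-padding route, building the padded query might look something like this. It is a sketch with assumptions: the helper names revcomp and padded_amplicon are made up for illustration, and the reverse primer is assumed to be written 5'→3' on the opposite strand, so it is reverse-complemented before being appended.

```python
# Complement table for plain DNA letters; N stays N.
_COMPLEMENT = str.maketrans("ACGTacgtNn", "TGCAtgcaNn")


def revcomp(seq):
    """Reverse-complement a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def padded_amplicon(fwd, rev, amplicon_len):
    """Forward primer + N filler + reverse-complemented reverse primer.

    The filler is sized so the whole query is amplicon_len bases long.
    """
    gap = amplicon_len - len(fwd) - len(rev)
    if gap < 0:
        raise ValueError("amplicon is shorter than the two primers combined")
    return fwd + "N" * gap + revcomp(rev)
```

Each padded sequence would then be written out as one FASTA record and used as the blastn query in place of the two separate primers.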

Reply • 5.4 years ago by dacotahm ▴ 20