How to retain transcripts from blastx output at a given coverage?

6.2 years ago

unawaz ▴ 60

I'm new to bioinformatics, so I'm really confused about this bit. I did a de novo assembly using Trinity, and now I'm trying to measure the number of transcripts that appear to be full-length or nearly full-length (as recommended by the Trinity pipeline). After running blastx, I get my BLAST output in tabular format (outfmt 6), and I've run the Trinity script analyze_blastPlus_topHit_coverage.pl. However, this script gives me only the single best-matching Trinity transcript for each top-matching database entry.

What I want to do is count the number of Trinity transcripts (rather than matching database entries) at a given coverage. For example, out of 300,000 contigs, 20,000 have 100% coverage, about 60,000 have coverage between 90% and 100%, and so on.

Would I just bin these according to the percentage identities? If not, how would I proceed to get my desired outcome?
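To make the desired outcome concrete, here is a minimal Python sketch of per-transcript binning. It is not the Trinity script's method, only an illustration under stated assumptions: the input is standard 12-column outfmt 6, coverage means the percentage of the database hit's length spanned by the alignment (which is not the same quantity as percentage identity), each transcript is represented by its single highest-bitscore row, and the hit lengths come from a separate lookup (`subject_lengths`, a hypothetical dict you would build from the database FASTA, since default outfmt 6 has no subject-length column). The function names and the demo rows are made up for illustration.

```python
import csv
import io
import math
from collections import Counter


def best_hit_coverage(blast_rows, subject_lengths):
    """Map each transcript (qseqid) to the % of its best hit's length covered.

    Assumes standard outfmt 6 column order:
    qseqid sseqid pident length mismatch gapopen qstart qend sstart send evalue bitscore
    """
    best = {}  # qseqid -> (bitscore, coverage)
    for row in blast_rows:
        if not row:
            continue  # skip blank lines
        qseqid, sseqid = row[0], row[1]
        sstart, send = int(row[8]), int(row[9])
        bitscore = float(row[11])
        # Subject coordinates may be reversed, so take the absolute span.
        aligned = abs(send - sstart) + 1
        coverage = min(100.0, 100.0 * aligned / subject_lengths[sseqid])
        # Keep only the highest-scoring row per transcript (a simplification:
        # multiple HSPs against the same hit are not merged here).
        if qseqid not in best or bitscore > best[qseqid][0]:
            best[qseqid] = (bitscore, coverage)
    return {q: cov for q, (_, cov) in best.items()}


def bin_coverage(coverages, bin_width=10):
    """Count transcripts per bin, labelled by the bin's upper bound.

    With bin_width=10, the bin labelled 100 holds coverage > 90 and <= 100.
    """
    counts = Counter()
    for cov in coverages.values():
        counts[math.ceil(cov / bin_width) * bin_width] += 1
    return dict(counts)


# Made-up demo data: three transcripts, two database entries.
demo = "\n".join(
    "\t".join(fields)
    for fields in [
        ["TR1", "hitA", "98.0", "100", "2", "0", "1", "300", "1", "100", "1e-50", "200"],
        ["TR1", "hitB", "60.0", "50", "20", "0", "1", "150", "1", "50", "1e-5", "50"],
        ["TR2", "hitB", "95.0", "180", "9", "0", "1", "540", "11", "190", "1e-80", "300"],
        ["TR3", "hitA", "90.0", "40", "4", "0", "120", "1", "60", "21", "1e-10", "70"],
    ]
)
subject_lengths = {"hitA": 100, "hitB": 200}  # hypothetical protein lengths

rows = csv.reader(io.StringIO(demo), delimiter="\t")
coverages = best_hit_coverage(rows, subject_lengths)
bins = bin_coverage(coverages)

for upper in sorted(bins, reverse=True):
    print(f"{upper}\t{bins[upper]}")
```

For a real run, the rows would come from `csv.reader(open(path), delimiter="\t")` instead of the in-memory demo string. The counts here are per transcript, so several transcripts hitting the same database entry are each counted.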

Any help would be great :)

assembly RNA-Seq BLASTX Trinity • 1.3k views
