More Efficient Blat Algorithm
9.8 years ago
blakeln • 0

Hello, I am using blat to match a set of sequences against the human genome. For my particular research, I don't care WHERE or HOW MANY TIMES each sequence matches the human genome (information I can get from the output .psl file). All I care about is IF a particular sequence matches anywhere, even just once.

If it does match at least once, I will deem that sequence "human" and will then separate the non-human and the human sequences (non-matched vs. matched).
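The separation step itself can be done after the fact from the .psl output, whatever blat reports. Below is a minimal sketch in Python, not a tested pipeline. It assumes the standard tab-separated PSL layout, where the query name (qName) is the 10th column, and it assumes the names in the .psl file equal the first word of each FASTA header. The function names `psl_query_names` and `split_fasta` are made up for illustration.

```python
def psl_query_names(psl_lines):
    """Return the set of query names (PSL column 10, qName) with at least one hit."""
    matched = set()
    for line in psl_lines:
        fields = line.rstrip("\n").split("\t")
        # Skip the optional psLayout header, the dashed separator and blank lines:
        # alignment rows have 21 tab-separated columns and start with an integer.
        if len(fields) < 21 or not fields[0].isdigit():
            continue
        matched.add(fields[9])
    return matched


def split_fasta(fasta_lines, matched):
    """Partition FASTA records into (human, non_human) lists of (header, sequence)."""
    human, non_human = [], []
    header, name, seq = None, None, []

    def flush():
        # Store the record collected so far, if there is one.
        if header is not None:
            target = human if name in matched else non_human
            target.append((header, "".join(seq)))

    for line in fasta_lines:
        line = line.strip()
        if line.startswith(">"):
            flush()
            header = line[1:]
            # Assumption: the aligner names a query by the first word of its header.
            name = header.split()[0] if header else ""
            seq = []
        elif line:
            seq.append(line)
    flush()
    return human, non_human
```

With real files this would be called as `split_fasta(open("reads.fa"), psl_query_names(open("out.psl")))`, where both file names are placeholders. A sequence counts as "human" as soon as its name appears in any PSL row, so multiple hits for the same sequence make no difference to the result.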

Since I am inputting close to a million sequences, it would probably save hours of computational time to change the blat code so that it STOPS searching as soon as it finds one match for a sequence, writes just that first match to the output .psl file, and moves on to the next sequence.

I've been searching the blat website to see if this capability exists and have found nothing. If I am confident it does not exist, then I will try to change the source code of blat to accommodate my needs, but I first wanted to see if anyone on this forum has heard of this being done already before I spend time on it.

Please respond if you know more about this.

Thanks!

blat efficiency source-code • 2.1k views

How long are the sequences? You might be able to just use bowtie2 or bwa, which would both require much less time.
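If one of these aligners is used instead, the matched/unmatched decision can be read from its SAM output rather than from a .psl file. A minimal sketch in Python, assuming the standard SAM layout (header lines start with `@`, QNAME is column 1, FLAG is column 2, and FLAG bit 0x4 marks an unmapped read). The function name `classify_sam` is hypothetical, and how sensitive the aligner is compared with blat is a separate question this sketch does not address.

```python
def classify_sam(sam_lines):
    """Split read names into (mapped, unmapped) sets based on the SAM FLAG field."""
    mapped, seen = set(), set()
    for line in sam_lines:
        # Header lines start with '@'; skip them and any blank lines.
        if line.startswith("@") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        qname, flag = fields[0], int(fields[1])
        seen.add(qname)
        # Bit 0x4 set means the read is unmapped; one mapped record is enough.
        if not flag & 0x4:
            mapped.add(qname)
    return mapped, seen - mapped
```

A read is treated as matched if any of its records is mapped, which mirrors the "at least once" criterion in the question.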


Hello blakeln!

It appears that your post has been cross-posted to another site: SEQanswers

This is typically not recommended as it runs the risk of annoying people in both communities.


Hi Devon-

Thank you very much for your recommendation(s). I will look into bowtie2 and bwa. I have never used bwa, so I am interested to see its capabilities.
