Aligning BAC to an assembled sequence
1
0
Entering edit mode
8.5 years ago
User000 ▴ 690

Hello,

I have several BAC sequences each of which contains 2-7 contigs. I need to align these BACs to a pseudomolecule and scaffolds. Basically, I need to identify a region were BAC maps on the pseudomolecule. At the end I need to get statistics such as alignment identity, aln lenth, coverage, start end. I can get all of these information from BLAST megablast. However, since megablast is local, it results in many HSP's obviously. Do you guys have an idea how can I calculate this and find the region on a pseudomolecule where each of my BAC aligns? if yes, then how? I can use bwa as an alternative, however, I dont know how to extract information from .bam file..

alignment blast • 3.3k views
ADD COMMENT
1
Entering edit mode

did you try BLAT instead of BLAST ?

ADD REPLY
0
Entering edit mode

I don't know if it solves my problem..the output is more or less similar to what I get with megablast...Do you think taking the best hit solves the problem?

ADD REPLY
0
Entering edit mode

in the ucsc, the BAC end sequences are "placed on the assembled sequence using BLAT".

ADD REPLY
0
Entering edit mode

Yes but they use only the ends not the full sequence and they have a scoring system to filter hits.

ADD REPLY
1
Entering edit mode
8.5 years ago
Darked89 4.6k

You may try to use LAST: http://last.cbrc.jp/

First create database from your assembly, then query it with your BAC fasta. You will get MAF output you can parse.

Caveat: repetitive contigs may be dropped from the output. But then it may be the correct behavior: Imagine that your BAC insert in reality is 100kb long and maps to scaffold/chromosome from assembly starting from position 1, with the last 10k (90-100kb BC & genomic) is full of repeats. On the top of it the next 50k of your assembly scaffold is also repetitive and full of stretches of Ns denoting gaps. You do not want to come to conclusion that your BAC is 150kb because something from BAC maps to these parts of scaffold.

ADD COMMENT
0
Entering edit mode

thank you very much. I have tried it, it gives me a lot of hits as well as BLAT, how can I decide which region to choose? the best hit?

ADD REPLY
0
Entering edit mode
There are two LAST utilities: maf-swap and maf-cull. What you can do is to swap top (scaffolds from db) sequence with a contigs seq from your BAC(s). Then cull all the hits except the top one. If you are into assembly checking/scaffold building, look also in LAST as a short read mapper (map reads to assembly). And in converting MAF output to SAM/BAM viewable in ie IGV. I will provide the link for this a bit later.
ADD REPLY
1
Entering edit mode

Here we go (Shameless self-plug mode on):

http://openwetware.org/wiki/Wikiomics:WinterSchool_day2#last

One can do this MAF to BAM coversion not just with mapped short reads but also with contigs/BAC sequences or other genomes. And verify some tricky spots by eye if necessary.

ADD REPLY

Login before adding your answer.

Traffic: 2419 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6