How to align given specific region in genome with manifest file in bwa?
1
1
Entering edit mode
5.1 years ago

Usually when I use alignment with bwa, I follow this script:

bwa aln reference.fa sample.fastq.gz > sample.sai

But this time I have manifest.txt and I want to target specific regions with this and I want to speed the alignment process with specific output. If It is possible in BWA, I would like to get your answers. Any help is much appreciated.

Thanks in advance.

bwa manifest genome alignment • 3.9k views
ADD COMMENT
2
Entering edit mode

what is manifest.txt ?

https://commons.wikimedia.org/wiki/Category:Crystal_balls#/media/File:Paulinefrederick_5_retouched.jpg

ADD REPLY
0
Entering edit mode

Manifest.txt is the file that specifies a reference genome and targeted reference regions to be used in the alignment step. This definition is from MiSeq System Guide. Manifest.txt is given to MiSeq, but I want this process done on computer. Example for manifest file and I would like to know If I can do same procedure with BWA.

[Header]                        
XT Manifest Version 1                   
ReferenceGenome Homo_sapiens\UCSC\hg19\Sequence\WholeGenomeFasta                    

[Regions]                       
Name    Chromosome  Amplicon Start  Amplicon End    Upstream Probe Length   Downstream Probe Length 
FMF         chr16           3293140                 3306587          50                                 50  
BRCA1   chr17           41194312           41279500          50                                 50
ADD REPLY
2
Entering edit mode

I don't think you can limit alignments to certain regions from genome using an intervals file with bwa. You will need to filter the alignments afterwards if you are only interested in specific regions.

You could create a reduced representation genome by extracting the areas you are interested in and creating a new index. This always carries the risk that reads may be forcibly aligned to regions where they did not come from. So the approach of post-filtering alignment will always be better.

Note: If you are doing this for an amplicon workflow then you can go with creating a reduced representation reference with areas of your interest.

ADD REPLY
0
Entering edit mode

If I can not do this with bwa then which tool can I use to do the alignments for certain region?

ADD REPLY
0
Entering edit mode

MiSeq reporter manual says this for custom amplicon workflow:

The Targeted RNA workflow utilizes the banded Smith-Waterman aligner. Improvements have been added to allow alignments across very small amplicon targets (often less than 10 bp).

PCR amplicon protocol uses bwa.

If you are looking to reproduce either workflows then you will need to take this into consideration. How this is exactly implemented in Illumina software is not stated in the manual.

If you are doing PCR amplicon, use the strategy noted by @Pierre and then use bwa.

ADD REPLY
3
Entering edit mode
5.1 years ago

you need to map everything first, convert this "manifest.txt" file as a bed file and extract the regions. Samtools And Region List

ADD COMMENT
0
Entering edit mode

Thanks you helped me a lot. The following script did the job: samtools view -b -L manifest.bed input.bam > manifest_output.bam

I have one more question. Script above outputs alignments overlapping with the manifest.bed file. Is it possible to get output with non-overlapping alignments?

ADD REPLY
0
Entering edit mode

Is it possible to get output with non-overlapping alignments?

  -U FILE  output reads not selected by filters to FILE [null]
ADD REPLY
0
Entering edit mode

I tried that but I couldn't get it to work. It just gives me the same bam file. I changed -L with -U.

ADD REPLY
0
Entering edit mode

you need to use both options

ADD REPLY
0
Entering edit mode

samtools view -b -L manifest.bed -U non-overlapping.bam input.bam > overlapping.bam

I used both options this way. But this time my overlapping.bam changed and It is not the same output with only -L. I am newbie so, I am sorry If I can not see the obvious.

ADD REPLY
0
Entering edit mode

ok I think I'm wrong with '-U' that do not work in conjonction of '-L'

another idea is to use 'bedtools' complement https://bedtools.readthedocs.io/en/latest/content/tools/complement.html to get the complement of the bed and run samtools view -L complement.bed

ADD REPLY

Login before adding your answer.

Traffic: 2471 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6