Ion Torrent alignment software
8.6 years ago
bioguy24 ▴ 230

I have genomic resequencing medical exome data (~4500) sequenced on an Ion Torrent. In terms of alignment, besides TMAP, any thoughts on samtools, bwa-mem, bowtie2, Novoalign, or other alignment software? Thank you :).

ngs alignment • 5.5k views

Bowtie2 or bwa-mem.


Thank you :)


My very first thought is that SAMtools sucks at performing alignments: it is not an aligner, it manipulates alignments produced by other tools.

OK, any other thoughts, publications, or user experiences? Thank you :)

8.6 years ago
h.mon 35k

I've never dealt with Ion data, but from what I've read, its main sequencing errors are indels (if not the main ones, they are at least common, unlike in Illumina data). So you have to use a mapper that allows for indels, maybe tweak the parameters to decrease the gap penalty, and most probably realign afterwards. There are some tools, such as PyroTools, and also protocols specific to 454 / Ion data. This paper compares mappers on Ion data, though against bacterial genomes (spoiler: there is no clear "overall best mapper"). Finally, BBMap seems like a good fit as well.
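
As a rough illustration of "tweak the parameters to decrease the gap penalty", here is a minimal Python sketch that only builds (and does not run) a bwa mem command line. The `-O` (gap open) and `-E` (gap extension) options are real bwa mem options with defaults of 6,6 and 1,1; the lowered value of 4, the function name `bwa_mem_command`, and the file names are assumptions made for this example, not recommended settings.

```python
import shlex


def bwa_mem_command(reference, reads, gap_open=4, gap_extend=1, threads=4):
    """Build a bwa mem argument list with a lowered gap-open penalty.

    gap_open=4 is an illustrative value only (bwa mem's default is 6);
    tune it on your own data before trusting the results.
    """
    return [
        "bwa", "mem",
        "-t", str(threads),
        # Same penalty for deletions and insertions: "-O del,ins"
        "-O", f"{gap_open},{gap_open}",
        "-E", f"{gap_extend},{gap_extend}",
        reference,
        reads,
    ]


# Hypothetical file names, used only to show the resulting command line.
cmd = bwa_mem_command("ref.fa", "reads.fastq")
print(shlex.join(cmd))
```

The list form can be passed to `subprocess.run` if bwa is installed and the reference has been indexed; whether a lower gap penalty actually helps exome variant calling is something to check empirically.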


Thank you :).

8.6 years ago

I have extensive experience with Ion Torrent data, and it is true that reads show a high rate of homopolymer errors, as h.mon suggested. For some samples (reference RNA-seq), 30% of reads require an indel to align against the reference genome. The Ion Proton system can be thought of as a fancy pH meter: it detects the release of protons and decides which nucleotide has been added based on the number of protons released. In regions with repeats of the same nucleotide (for example AAAAA), it is sometimes hard for the instrument to resolve the signal, so it over- or underestimates the real count. As a result, such reads need an insertion or a deletion to align, depending on whether the sequencer over- or underestimated the number of bases.

I would avoid changing the scoring scheme of the alignment. Instead, increase the allowed edit distance, both because of the homopolymer errors and because the reads are longer (around 150 bp). You should also increase the maximum number of insertions or deletions allowed in a read, and the length of the largest gap allowed. This has helped me.
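
To make the homopolymer point concrete, here is a small Python sketch (toy sequences, hypothetical function names) that run-length encodes a read and a reference and reports which homopolymer runs differ in length. A read that matches the reference except for run lengths is exactly the kind of read that needs an insertion or deletion to align.

```python
from itertools import groupby


def homopolymer_runs(seq):
    """Run-length encode a sequence: 'GAAAT' -> [('G', 1), ('A', 3), ('T', 1)]."""
    return [(base, sum(1 for _ in group)) for base, group in groupby(seq)]


def homopolymer_length_changes(read, reference):
    """Return (base, read_length - ref_length) for each run whose length differs.

    Returns None when the two sequences differ by more than run lengths
    (i.e. the order of bases differs once runs are collapsed).
    """
    read_runs = homopolymer_runs(read)
    ref_runs = homopolymer_runs(reference)
    if [base for base, _ in read_runs] != [base for base, _ in ref_runs]:
        return None
    return [
        (base, n_read - n_ref)
        for (base, n_read), (_, n_ref) in zip(read_runs, ref_runs)
        if n_read != n_ref
    ]


# Toy data: the reference has a run of five A's.
reference = "GATTAAAAACG"
undercalled = "GATTAAAACG"    # one A missing: aligns with a 1-base deletion
overcalled = "GATTAAAAAACG"   # one extra A: aligns with a 1-base insertion

print(homopolymer_length_changes(undercalled, reference))  # [('A', -1)]
print(homopolymer_length_changes(overcalled, reference))   # [('A', 1)]
```

This is only a teaching aid: real reads also carry substitutions and start at arbitrary positions, so an actual aligner is still required.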


Ashutosh Pandey, do you mind if I email you offline to discuss a bit more? The lab has been running Ion Torrent for about 2 years now and we are moving towards exome next. Thank you :).


Please feel free to email me at ashutoshmits at gmail. We have mostly used it for RNA-seq, so the tricks I talked about above worked because our goal was to increase the mapping efficiency. I am not sure how allowing more errors during alignment to increase the alignment rate would affect the downstream variant calling results. In our case we only use uniquely aligned reads for quantification of expression. The good part with longer reads is that even if you are a little liberal with the alignment, you may still be able to align reads uniquely. I think you will have to perform thorough filtering on your VCF files. I may or may not be making sense right now, but we can talk about it over email.
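
For the "only use uniquely aligned reads" step, here is a minimal Python sketch that filters SAM records in plain text. It assumes that a MAPQ threshold is an acceptable stand-in for "uniquely aligned"; the meaning of MAPQ differs between aligners, so the cutoff of 20 and the function names are illustrative assumptions, not the workflow described above.

```python
def is_confident_primary(sam_line, min_mapq=20):
    """True for a mapped, primary alignment with MAPQ >= min_mapq."""
    fields = sam_line.rstrip("\n").split("\t")
    flag = int(fields[1])   # column 2: bitwise FLAG
    mapq = int(fields[4])   # column 5: mapping quality
    if flag & 0x4:          # read is unmapped
        return False
    if flag & 0x100 or flag & 0x800:  # secondary or supplementary alignment
        return False
    return mapq >= min_mapq


def filter_sam(lines, min_mapq=20):
    """Yield header lines unchanged and only the confident primary alignments."""
    for line in lines:
        if line.startswith("@") or is_confident_primary(line, min_mapq):
            yield line


# Toy records, made up for this example.
records = [
    "@SQ\tSN:chr1\tLN:1000",
    "r1\t0\tchr1\t100\t60\t10M\t*\t0\t0\tACGTACGTAC\tIIIIIIIIII",
    "r2\t0\tchr1\t200\t3\t10M\t*\t0\t0\tACGTACGTAC\tIIIIIIIIII",    # low MAPQ
    "r3\t4\t*\t0\t0\t*\t*\t0\t0\tACGTACGTAC\tIIIIIIIIII",           # unmapped
    "r4\t256\tchr1\t50\t60\t10M\t*\t0\t0\tACGTACGTAC\tIIIIIIIIII",  # secondary
]
kept = list(filter_sam(records))
print([line.split("\t")[0] for line in kept])
```

In practice the same filter is usually applied with an existing tool on BAM files; the sketch only shows which SAM fields the decision depends on.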
