Question

TopHat deletions and insertions

0

7.8 years ago

ashjay ▴ 40

How are the TopHat deletions and insertions identified? How does this relate to the mismatch and other alignment options used? I'm a little confused and would appreciate an explanation. Thanks!

RNA-Seq alignment • 2.2k views

updated 7.8 years ago by Fabio Marroni ★ 3.0k • written 7.8 years ago by ashjay ▴ 40

0

Could you be more specific and add some information? At least to me your question is unclear...

7.8 years ago by WouterDeCoster 47k

0

Sure - TopHat produces a deletions.bed and an insertions.bed file.

1) How are these insertions and deletions identified?

2) When mapping reads to the genome, we allow for a certain number of mismatches and gaps - how do these settings relate to how TopHat identifies deletions and insertions?

7.8 years ago by ashjay ▴ 40

score 0 · Answer 1 · 2016-08-20

0

7.8 years ago

Fabio Marroni ★ 3.0k

Hi, I found an answer to a similar question, which also links to the relevant documentation.

Having a look there might help.

The relation to the gap length we allow is straightforward: insertions and deletions have a maximum length, which is set by the maximum gap length allowed in the alignment.
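To make this concrete, here is a minimal sketch of how gaps in a single alignment map onto insertions and deletions. It is not TopHat's actual code, just an illustration based on the CIGAR string of a SAM record, where `I` marks bases present in the read but not in the reference and `D` marks reference bases missing from the read. The function name `indels_from_cigar` and the `max_indel_len` cutoff are my own assumptions for the example; the cutoff simply stands in for whatever maximum gap length the aligner was run with.

```python
import re

# One CIGAR operation: a length followed by an operation letter (SAM format).
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")

def indels_from_cigar(pos, cigar, max_indel_len=3):
    """Return (kind, ref_pos, length) for each short gap in one alignment.

    pos is the 1-based leftmost reference position of the alignment.
    max_indel_len is a hypothetical cutoff standing in for the maximum
    gap length allowed during alignment.
    """
    events = []
    ref = pos  # reference position of the next aligned base
    for length, op in CIGAR_OP.findall(cigar):
        length = int(length)
        if op == "I":
            # Extra bases in the read: anchor the insertion at the last
            # reference base before it. The reference position does not move.
            if length <= max_indel_len:
                events.append(("insertion", ref - 1, length))
        elif op == "D":
            # Reference bases missing from the read: the deletion starts
            # at the first deleted reference base.
            if length <= max_indel_len:
                events.append(("deletion", ref, length))
            ref += length
        elif op in "MN=X":
            # Matches, mismatches and skipped regions (N, e.g. introns)
            # consume the reference but are not indels.
            ref += length
        # S, H and P do not consume the reference.
    return events

print(indels_from_cigar(100, "10M2D5M1I4M"))
# [('deletion', 110, 2), ('insertion', 116, 1)]
```

With the cutoff at 3, a CIGAR such as `10M5D10M` yields no event at all, which is the sense in which the allowed gap length bounds the indels that can be reported.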

0

1) So if a read aligns best with a gap in the read, we consider that gap to be a deletion, and if it aligns best with a gap in the reference, we consider it an insertion?

2) Maybe I'm missing something but this is all I found about insertions and deletions on the TopHat man page:

insertions.bed and deletions.bed. UCSC BED tracks of insertions and deletions reported by TopHat. Insertions.bed - chromLeft refers to the last genomic base before the insertion. Deletions.bed - chromLeft refers to the first genomic base of the deletion.

Maybe what you said is all there is to the insertions and deletions - that a high scoring alignment with a gap indicates the presence of a deletion in that region?

7.8 years ago by ashjay ▴ 40

0

I think so. I guess that some consistency check is also performed (i.e. several reads overlapping the indel must support it), but I think it is basically this. However, this is just my intuition, and I never paid much attention to this, so interpret my words with caution.
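Just to illustrate the kind of consistency check I have in mind, here is a small sketch. I do not know whether TopHat does exactly this; the function name `supported_indels`, the tuple layout and the `min_reads` threshold are all assumptions made up for the example.

```python
from collections import Counter

def supported_indels(events, min_reads=2):
    """Keep only indels observed in at least min_reads alignments.

    events is an iterable of (chrom, kind, ref_pos, length) tuples, one per
    gapped alignment. min_reads is a hypothetical threshold, not a
    documented TopHat setting.
    """
    counts = Counter(events)
    return {event: n for event, n in counts.items() if n >= min_reads}

# Three reads report the same deletion, one read reports a lone insertion.
observed = [
    ("chr1", "deletion", 110, 2),
    ("chr1", "deletion", 110, 2),
    ("chr1", "insertion", 116, 1),
    ("chr1", "deletion", 110, 2),
]
print(supported_indels(observed))
# {('chr1', 'deletion', 110, 2): 3}
```

An indel seen in a single read is more likely to be a sequencing or alignment artifact, so requiring several supporting reads is a common way to filter candidates.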

7.8 years ago by Fabio Marroni ★ 3.0k