TDF Coverage in Figure
11 months ago
Duarte, CA

I usually create .tdf files for .bam alignments, and I realize that the parameters used can have an effect on how coverage is visualized.

However, the middle panel of Figure 4C looks off to me (from Panneerdoss et al. 2018):

IGV screenshots

I don't remember ever seeing a scale reported as 0-1 when there are clearly more than two levels of coverage (as in the middle panel of Figure 4C). Has anybody else seen a situation where TDF normalization could explain this plot? I also know that you don't need a TDF file to see a scale in a zoomed-in view of read alignments, but somebody told me that the scale was due to TDF normalization (and I don't want to say this is incorrect without additional discussion).

NOTE: I also see that the resolution of the above image isn't great. So, here is another view where you can see what I am describing (in the upper-left corner of each track) a little better:

The one in the middle looks rather sparse (small counts: few fragments were sequenced, so there is no "smooth" peak), and the normalization is probably RPKM (or any per-million approach without binning), so a maximum of 1 is probably not too unexpected. IGV accepts multiple formats beyond TDF, such as BigWig, which can easily be scaled to RPKM or similar units.
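To make the per-million idea concrete, here is a minimal Python sketch. The depths and the library size are made-up numbers chosen to keep the arithmetic simple, and `scale_per_million` is a hypothetical helper, not something taken from igvtools or from the paper. It only illustrates that raw integer depths, once divided by the library size in millions, can give several distinct levels that all fit within a 0-1 axis.

```python
def scale_per_million(depths, total_mapped_reads):
    """Scale raw per-base read depths to reads per million mapped reads."""
    factor = 1_000_000 / total_mapped_reads
    return [d * factor for d in depths]

# Hypothetical sparse track: raw integer depths across a small window.
raw_depths = [0, 1, 2, 3, 2, 1, 0]

# Hypothetical library size (assumption, not from the paper).
total_mapped_reads = 4_000_000

scaled = scale_per_million(raw_depths, total_mapped_reads)
distinct_levels = sorted(set(scaled))

print(scaled)           # fractional depths after scaling
print(distinct_levels)  # more than two levels, all within 0-1
```

With unscaled integer counts, a 0-1 axis could only ever show two levels (0 and 1), which is why the scaling matters for reading that panel.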

If anyone has trouble getting a proper high-resolution image, simply open the pdf version of the linked paper.

[removed first FPKM response]

While I have certainly encountered terminology discrepancies, I can't think of a scenario where a TDF track would look like that middle panel (although not being able to think of an explanation is certainly different from saying there isn't one).

You can tell that some manual modifications were made to make the figure readable in a paper (for example, the size of "chr" varies in the top-left corner of the 3 screenshots). So, I thought something might have been changed by accident during that manual process (such as the maximum going from 10 to 1).

However, if I figure out a better explanation, then I will post it.

Also, when I thought about it more, FPKM couldn't be quite the right explanation either: you could create some sort of visualization with normalized exon counts, but otherwise you have just one FPKM value per gene (or per transcript, if you have transcript estimates). And if you created an exon visualization (which I think would be more like a bedGraph), it still wouldn't look like that plot.
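As a rough illustration of why FPKM gives one number per feature rather than a per-base profile, here is a small Python sketch using the usual definition (fragments per kilobase of feature per million mapped fragments). The counts, gene length, and library size are hypothetical values picked only for easy arithmetic.

```python
def fpkm(fragments, feature_length_bp, total_mapped_fragments):
    """Fragments per kilobase of feature per million mapped fragments."""
    return fragments * 1e9 / (feature_length_bp * total_mapped_fragments)

# Hypothetical gene: 200 fragments, 2 kb long, in a library of 10 million fragments.
gene_fpkm = fpkm(
    fragments=200,
    feature_length_bp=2_000,
    total_mapped_fragments=10_000_000,
)

print(gene_fpkm)  # a single value for the whole gene
```

Because the fragment count is summed over the whole feature before dividing by its length, plotting this along the genome would give a flat block per gene or exon, not the stepped coverage seen in the screenshot.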

So, I went back to edit the previous comment (so that people wouldn't think they could visualize FPKM with a BigWig to represent site/window coverage). I apologize for the confusion, but I left this comment to be transparent about my previous mistake.

Nevertheless, thank you very much for the contribution to the discussion!


