Visualizing hg19 and hg38 FPKMs in a single plot
7.5 years ago
komal.rathi ★ 4.1k

Hello everyone,

I am trying to plot NCAM1 gene expression from 5 disparate RNA sequencing datasets - all are processed in the same way (STAR -> RSEM) and quantified in terms of FPKM. The problem is that I have 4 datasets mapped to hg38 and one is mapped to hg19. These are the coordinates of NCAM1 in hg19 and in hg38 from UCSC Genome Browser:

hg19: chr11:112,831,969-113,149,158
hg38: chr11:112,961,436-113,275,489

Can I plot NCAM1 expression across these datasets (in one plot) even though they were mapped and quantified using different genome references and annotations?

RNA-Seq • 2.9k views
7.5 years ago

My guess is that from hg19 -> hg38, the only change will be in the coordinates, not in the gene structure and annotation per se. If I remember well, GENCODE only does a liftOver of coordinates from hg19 -> hg38. In that case you can safely plot the expression values in one plot. To be more sure, you can check whether the gene structure is the same in both annotations (it is possible that the two are quantifying different splice forms because different annotations were used).
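One rough way to do that check, as a sketch only: since the absolute coordinates differ between assemblies, compare the number of exon blocks and their lengths rather than their positions. The function names and the toy exon lists below are made up for illustration (they are not the real NCAM1 models); in practice the (start, end) pairs would come from the exon records of each annotation file.

```python
def merge_intervals(exons):
    """Merge overlapping or adjacent 1-based inclusive (start, end) intervals."""
    merged = []
    for start, end in sorted(exons):
        if merged and start <= merged[-1][1] + 1:
            # Overlaps or touches the previous block: extend it.
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return [(s, e) for s, e in merged]


def structure_summary(exons):
    """Return (number of merged exon blocks, total exonic bp, block lengths in genomic order)."""
    blocks = merge_intervals(exons)
    lengths = [e - s + 1 for s, e in blocks]
    return len(blocks), sum(lengths), lengths


# Made-up toy exons, NOT the real NCAM1 gene models.
annotation_a = [(100, 199), (300, 449), (400, 499), (700, 899)]
# Same structure at shifted coordinates, with the last exon 6 bp longer.
annotation_b = [(1100, 1199), (1300, 1499), (1700, 1905)]

summary_a = structure_summary(annotation_a)
summary_b = structure_summary(annotation_b)
print(summary_a)
print(summary_b)
```

If the block counts match and the lengths differ only by a few bp at the ends, the two gene models are probably close enough. A different number of blocks would suggest different isoforms are being quantified.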


Thanks - so the hg38-based data used GENCODE and the hg19 data used RefSeq. I don't know if there is much difference in the gene models - looking at the UCSC Genome Browser, they appear to be just slightly different.


GENCODE and RefSeq could differ by small amounts at the ends, probably because GENCODE is curated for accurate gene structure (both 5' and 3' ends). But if the overall gene structure is the same, a few bp here or there will not noticeably change the FPKM. (Note also that FPKM is normalized for transcript length.)
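To put a number on "a few bp here or there", here is a small sketch using the textbook FPKM definition (fragments per kilobase of transcript per million mapped fragments). This is an assumption for illustration: RSEM works with an effective length rather than the raw length, and the counts and lengths below are made-up values, not NCAM1 data.

```python
def fpkm(fragments, length_bp, total_mapped_fragments):
    """Textbook FPKM: fragments / (length in kb * mapped fragments in millions)."""
    return fragments * 1e9 / (length_bp * total_mapped_fragments)


# Made-up example: same fragment count, transcript length differs by 20 bp.
fpkm_short = fpkm(fragments=1000, length_bp=5000, total_mapped_fragments=20_000_000)
fpkm_long = fpkm(fragments=1000, length_bp=5020, total_mapped_fragments=20_000_000)

# Relative change caused only by the 20 bp difference in annotated length.
relative_change = (fpkm_short - fpkm_long) / fpkm_short
print(fpkm_short, fpkm_long, relative_change)
```

With the same fragment count, a 20 bp difference on a 5 kb transcript shifts the FPKM by well under 1%, which is small compared with typical sample-to-sample variation.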


Thanks for clarifying - can you move this to an answer so I can accept it?


can you move this to an answer so I can accept it?

How to do that?


I'm not sure if you can move it, but I can so I did :p


Appreciate that very much, thank you :)

7.5 years ago
seidel 11k

If you have FPKM values, then essentially the mapped reads have already been normalized to the appropriate gene structure, as Santosh notes in his comment "Note also that FPKM is normalized for transcript length". You can put them together in the same plot, but I would comment appropriately in the legend that one is from a different source. You still face the risk that the odd one is a different isoform, but unless you can figure this out explicitly the comment is your only safeguard.
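A minimal sketch of how the labelling could be prepared before plotting: each dataset carries its genome build and annotation into the group label, so whatever plotting tool is used shows the odd one out directly in the legend. The dataset names, sample IDs, and FPKM values are made-up placeholders, and the helper names are hypothetical.

```python
def legend_label(dataset, genome, annotation):
    """Build a legend entry that states the reference and annotation used."""
    return f"{dataset} ({genome}, {annotation})"


def tidy_records(datasets, gene="NCAM1"):
    """Flatten per-dataset FPKM values into one list of plot-ready records."""
    records = []
    for d in datasets:
        label = legend_label(d["name"], d["genome"], d["annotation"])
        for sample, value in d["fpkm"].items():
            records.append(
                {"gene": gene, "sample": sample, "fpkm": value, "group": label}
            )
    return records


# Placeholder datasets and values, for illustration only.
datasets = [
    {"name": "dataset_A", "genome": "hg38", "annotation": "GENCODE",
     "fpkm": {"A1": 12.3, "A2": 9.8}},
    {"name": "dataset_B", "genome": "hg19", "annotation": "RefSeq",
     "fpkm": {"B1": 11.1}},
]

records = tidy_records(datasets)
for r in records:
    print(r["group"], r["sample"], r["fpkm"])
```

The resulting records can be handed to any plotting library and grouped or coloured by the "group" field.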
