Hello everyone,
I am trying to plot NCAM1 gene expression from 5 disparate RNA sequencing datasets. All were processed in the same way (STAR -> RSEM) and quantified as FPKM. The problem is that 4 of the datasets were mapped to hg38 and one was mapped to hg19. These are the coordinates of NCAM1 in hg19 and in hg38 from the UCSC Genome Browser:
hg19: chr11:112,831,969-113,149,158
hg38: chr11:112,961,436-113,275,489
Can I plot NCAM1 expression across these datasets (in one plot) even though they were mapped and quantified using different genome references and annotations?
Thanks - so the hg38-based data used GENCODE and the hg19-based data used RefSeq. I don't know if there is much difference in the gene models - looking at the UCSC Genome Browser, they appear to be just slightly different.
GENCODE and RefSeq can differ by small amounts at the transcript ends, probably because GENCODE is curated for accurate gene structure (both 5' and 3' ends). But if the overall gene structure is the same, a few bp here or there will not meaningfully change the FPKM (note also that FPKM is normalized for transcript length).
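To put a rough number on "some bp here or there", here is a minimal Python sketch of the textbook FPKM formula. The read count, library size and transcript lengths are made-up illustration values, not taken from any of the datasets above, and the function names are hypothetical. RSEM itself works with an effective length and expected counts, so treat this as a simplification rather than what RSEM computes.

```python
def fpkm(fragment_count, length_bp, total_mapped_fragments):
    """Textbook FPKM: fragments per kilobase of transcript per million mapped fragments."""
    return fragment_count * 1e9 / (length_bp * total_mapped_fragments)


def relative_change(length_a_bp, length_b_bp,
                    fragment_count=1000, total_mapped_fragments=30_000_000):
    """Fractional FPKM difference when only the annotated length differs.

    The default count and library size are hypothetical; they cancel out
    of the ratio, so only the two lengths matter.
    """
    a = fpkm(fragment_count, length_a_bp, total_mapped_fragments)
    b = fpkm(fragment_count, length_b_bp, total_mapped_fragments)
    return abs(a - b) / a


if __name__ == "__main__":
    # Hypothetical case: the same transcript annotated as 5000 bp in one
    # gene model and 5020 bp in the other (20 bp difference at the ends).
    change = relative_change(5000, 5020)
    print(f"Relative FPKM change: {change:.2%}")
```

With these assumed numbers the difference comes out well under 1%, which is small next to the usual between-dataset variation. The same arithmetic also shows when it would matter: if the two annotations disagreed on whole exons or isoforms rather than a few bases at the ends, the length term would change substantially and so would the FPKM.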
Thanks for clarifying - can you move this to an answer so I can accept it?
How to do that?
I'm not sure if you can move it, but I can, so I did :p
Appreciate that very much, thank you :)