I want to work on an RNA-seq project, and I got my initial data from different projects. One data set uses genome build hg38; some other series use genome build hg19. The count files I obtained for these data sets had different numbers of reads.
Do the data have to have the same genome build?
Because when we continue the analysis in R and merge the data, they must have the same rows and columns. I'm a bit confused.
You should re-align both data sets to the most recent genome build, so that you know exactly what was done to the data. Inheriting data or results of unclear origin can cause problems later on if you simply carry on.
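To see why count tables from different builds do not line up row by row, a quick check of the gene identifiers can help before any merge. The sketch below is only an illustration in plain Python: the gene names, the counts, and the helper compare_gene_ids are all made up, and it assumes each count table has already been read into a dictionary keyed by gene identifier. It is not a substitute for re-aligning.

```python
def compare_gene_ids(counts_a, counts_b):
    """Report which gene identifiers two count tables share.

    Both arguments are assumed to be dicts mapping gene ID -> read count.
    (Hypothetical helper, for illustration only.)
    """
    ids_a = set(counts_a)
    ids_b = set(counts_b)
    return {
        "shared": sorted(ids_a & ids_b),   # rows present in both tables
        "only_a": sorted(ids_a - ids_b),   # rows missing from table B
        "only_b": sorted(ids_b - ids_a),   # rows missing from table A
    }


# Toy, invented counts standing in for an hg38-based and an hg19-based table.
hg38_counts = {"GENE_A": 120, "GENE_B": 0, "GENE_C": 45}
hg19_counts = {"GENE_A": 98, "GENE_C": 51, "GENE_D": 7}

report = compare_gene_ids(hg38_counts, hg19_counts)
for key, ids in report.items():
    print(key, ids)
```

If "only_a" or "only_b" is not empty, the rows of the two tables differ and a naive merge would either drop genes or introduce missing values. Even for shared identifiers, the counts were produced against different references and annotations, which is another reason to re-align everything to one build.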
hops , I deleted my answer because https://xkcd.com/745/
Thank you so much for your explanation. Got it.