Question

RNAseq

5 weeks ago

mrymrahimi70 • 0

I want to work on an RNA-seq project, and I got my initial data from different projects. One set of data has genome build hg38, some other series have genome build hg19. The count files I obtained for these datasets have different numbers of reads.

Do the data have to have the same genome build?

Because when we continue the analysis in R and merge the data, they must have the same rows and columns. I'm a bit confused.

Please consider that I'm new to RNAseq analysis.

Any help will be appreciated.

R RNA-seq Linux • 465 views

ADD COMMENT • link 5 weeks ago by mrymrahimi70 • 0

score 0 · Answer 1 · 2024-03-21

5 weeks ago

GenoMax 141k

One set of data has genome build hg38, some other series have genome build hg19.

No they don't. Genome build hg19 = GRCh37, whereas hg38 = GRCh38.

You can find the human genome build information here:

GRCh38 (current) : https://www.gencodegenes.org/human/
GRCh37 (previous build): https://www.gencodegenes.org/human/release_45lift37.html
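As a rough illustration of the naming, here is a minimal Python sketch that maps the UCSC-style names onto the GRC names so that dataset labels can be compared. The names `BUILD_ALIASES` and `normalize_build` are made up for this example, and the two naming schemes are treated as equivalent here for labelling purposes only.

```python
# Hypothetical lookup: UCSC-style build names mapped to GRC/NCBI names.
BUILD_ALIASES = {
    "hg18": "NCBI36",
    "hg19": "GRCh37",
    "hg38": "GRCh38",
}


def normalize_build(name):
    """Return the GRC/NCBI name for a build label, whichever style it uses."""
    key = name.strip().lower()
    if key in BUILD_ALIASES:
        return BUILD_ALIASES[key]
    # Accept a name that is already in GRC/NCBI style, ignoring case.
    for canonical in BUILD_ALIASES.values():
        if key == canonical.lower():
            return canonical
    raise ValueError(f"unknown genome build label: {name!r}")


# Two datasets labelled in different styles can now be compared directly.
same_build = normalize_build("hg19") == normalize_build("GRCh37")
different_build = normalize_build("hg19") != normalize_build("hg38")
print(same_build, different_build)
```

With labels normalized like this, it is easy to see that an hg19 series and an hg38 series really do come from two different builds.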

You should re-align the data in both cases to the most recent genome build, so that you know exactly what was done to the data. Inheriting data or results of unclear origin can cause problems later on if you simply carry on.
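To see why inherited count tables may not line up, a quick diagnostic can help before any re-alignment. The sketch below is only an illustration, not a substitute for re-aligning: it assumes each count file is a two-column, tab-separated table (gene ID, count) with a header row, and that gene IDs carry a version suffix after a dot. The function names, gene IDs, and counts are all made up for the example.

```python
import csv
import io


def read_counts(text):
    """Parse a tab-separated count table (gene_id, count) into a dict."""
    reader = csv.reader(io.StringIO(text), delimiter="\t")
    next(reader)  # skip the header row
    return {row[0]: int(row[1]) for row in reader}


def strip_version(gene_id):
    """Drop the version suffix, e.g. 'ENSG_TOY_1.3' -> 'ENSG_TOY_1'."""
    return gene_id.split(".")[0]


def compare_gene_sets(counts_a, counts_b):
    """Report which unversioned gene IDs are shared or unique to each table."""
    ids_a = {strip_version(g) for g in counts_a}
    ids_b = {strip_version(g) for g in counts_b}
    return {
        "shared": sorted(ids_a & ids_b),
        "only_a": sorted(ids_a - ids_b),
        "only_b": sorted(ids_b - ids_a),
    }


# Toy tables with made-up gene IDs and counts (not real data).
grch37_table = "gene_id\tcount\nENSG_TOY_1.3\t120\nENSG_TOY_2.1\t45\nENSG_TOY_3.2\t7\n"
grch38_table = "gene_id\tcount\nENSG_TOY_1.5\t98\nENSG_TOY_2.4\t51\nENSG_TOY_4.1\t12\n"

report = compare_gene_sets(read_counts(grch37_table), read_counts(grch38_table))
for key, ids in report.items():
    print(key, len(ids), ids)
```

If many IDs appear in only one table, the two count files were probably produced with different annotations or pipelines, which is one more reason to reprocess everything the same way.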

ADD COMMENT • link 5 weeks ago by GenoMax 141k

hops, I deleted my answer because https://xkcd.com/745/

ADD REPLY • link 5 weeks ago by Pierre Lindenbaum 161k

Thank you so much for your explanation. Got it.

ADD REPLY • link 5 weeks ago by mrymrahimi70 • 0