Biostar Beta. Not for public use.
HiSeq vs GA for RNASeqV2 on TCGA
4.4 years ago
XD • 10

I want to download RNASeq data from TCGA, but datasets are available for both the HiSeq and GA platforms. Can someone please advise which dataset would be better to use, and why? Thanks!


You are going to have to be way more specific. There are something like 20 different cancer types in TCGA, and the data were produced over 5 years at different centers. The platforms evolved along with the project, so the data span a wide variety of sequencing platforms. (Even early HiSeq data may be quite different from late HiSeq data in terms of read lengths, etc.)


Thanks, Chris. I am looking for colorectal adenocarcinoma, specifically.


These two options are essentially the same in terms of sequencing chemistry, but the HiSeq is the newer sequencer, so in theory I would go with that data.

https://wiki.nci.nih.gov/display/TCGA/RNASeq+Version+2

14 months ago
Washington University in St. Louis, MO

Looking at the COAD data set in the data matrix, it appears that there's no overlap in the v2 RNAseq data: each sample has either GA data or HiSeq data, never both. So what you use will depend on the questions you're asking. If you need all the samples, use both. If you're worried about sequencer-driven batch effects, the GA produced the vast majority of the data, so you'd probably want to exclude the HiSeq samples.

FWIW, I'd just use all of it. As long as the read lengths are similar, there's unlikely to be much of a difference, since the chemistry is largely the same.
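If you want to verify the no-overlap observation and the platform split on your own download, comparing the sample identifiers from the two platform folders is enough. Below is a minimal Python sketch; the barcodes are made-up placeholders, and `platform_summary` is a hypothetical helper, not part of any TCGA tool.

```python
def platform_summary(ga_samples, hiseq_samples):
    """Compare sample IDs from the GA and HiSeq RNASeqV2 downloads."""
    ga, hiseq = set(ga_samples), set(hiseq_samples)
    return {
        "ga_only": len(ga - hiseq),      # samples sequenced only on the GA
        "hiseq_only": len(hiseq - ga),   # samples sequenced only on the HiSeq
        "both": sorted(ga & hiseq),      # samples present on both platforms
    }

# Placeholder identifiers (assumption): replace with the IDs from your files.
ga_ids = ["TCGA-XX-0001-01", "TCGA-XX-0002-01", "TCGA-XX-0003-01"]
hiseq_ids = ["TCGA-XX-0004-01", "TCGA-XX-0005-01"]

summary = platform_summary(ga_ids, hiseq_ids)
print(summary)
```

An empty `both` list means there are no samples you could use to compare the two platforms directly, so any platform effect would have to be estimated from the two groups as a whole.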


If you use both, consider applying a batch-effect removal or correction approach such as RUVSeq.
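To make the idea of batch adjustment concrete, here is a rough Python sketch. It is not RUVSeq (which is an R/Bioconductor package) and is much cruder: it simply centers each gene's log2 expression within each platform. It assumes the platform labels are not confounded with the biology you care about; the function name, gene name, and values are all hypothetical.

```python
import math
from collections import defaultdict

def center_by_batch(expr, batches):
    """Per-gene, per-batch mean centering on the log2(x + 1) scale.

    expr:    {gene: [expression value per sample]}
    batches: [batch label per sample], same order as the values
    """
    # Group sample indices by batch label (e.g. "GA" or "HiSeq").
    members_by_batch = defaultdict(list)
    for i, label in enumerate(batches):
        members_by_batch[label].append(i)

    adjusted = {}
    for gene, values in expr.items():
        logged = [math.log2(v + 1.0) for v in values]
        centered = list(logged)
        for members in members_by_batch.values():
            batch_mean = sum(logged[i] for i in members) / len(members)
            for i in members:
                centered[i] = logged[i] - batch_mean
        adjusted[gene] = centered
    return adjusted

# Toy values (assumption): two GA samples followed by two HiSeq samples.
batches = ["GA", "GA", "HiSeq", "HiSeq"]
expr = {"GENE_A": [3.0, 7.0, 15.0, 31.0]}
centered = center_by_batch(expr, batches)
print(centered)
```

Centering like this removes any mean shift between platforms, but it also removes real biological differences if, say, tumor stage is unevenly distributed across the GA and HiSeq samples. Methods such as RUVSeq are designed to handle that more carefully.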
