Question

PanCanAtlas EBPlusPlus-corrected RNA-seq TCGA dataset

0

5.2 years ago

Ld_60 ▴ 70

Hi, I am wondering which normalisation format (RPKM, FPKM, TPM, etc.) the PanCanAtlas EBPlusPlus-corrected RNA-seq TCGA dataset (the EBPlusPlusAdjustPANCAN_IlluminaHiSeq_RNASeqV2.geneExp.tsv file available here) is in. I know it is batch-corrected, but I don't know which normalisation format the original data was in.

Thanks a lot for your help.

tcga RNA-Seq gdc ebplusplus • 5.9k views

score 0 · Answer 1 · 2019-02-17

5.2 years ago

rahul.kulshrestha97 • 0

Hi, I am building a deep-learning-based multi-category tumor classification project, for which I have downloaded the same file as my dataset. Where can I get the tumor type (one of the 33 TCGA tumor types) associated with each TCGA case? Any help would be useful.

score 0 · Answer 2 · 2019-03-13

TCGA uses the Fragments Per Kilobase of transcript per Million mapped reads (FPKM) and FPKM Upper Quartile (FPKM-UQ) methods for normalisation. Both normalise for sequencing depth and gene length. FPKM takes into account that two reads can map to one fragment (and so it doesn't count this fragment twice). TCGA describes its normalisation methods here, and I quote: "normalized using the Fragments Per Kilobase of transcript per Million mapped reads (FPKM) and FPKM Upper Quartile (FPKM-UQ) methods with custom scripts."
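To make the depth and length normalisation concrete, here is a minimal sketch of the standard FPKM formula in Python. The gene names, counts, and lengths are made-up toy values, and the helper `fpkm` is a hypothetical illustration, not TCGA's custom script.

```python
def fpkm(fragments, gene_length_bp, total_mapped_fragments):
    """FPKM = fragments * 1e9 / (total mapped fragments * gene length in bp).

    The factor 1e9 combines "per kilobase" (1e3) and "per million" (1e6).
    """
    return fragments * 1e9 / (total_mapped_fragments * gene_length_bp)


# Toy data (assumed values): fragment counts and gene lengths in base pairs.
counts = {"GENE_A": 500, "GENE_B": 1000, "GENE_C": 250}
lengths = {"GENE_A": 2000, "GENE_B": 4000, "GENE_C": 1000}

# Sequencing depth of this toy sample: all mapped fragments.
total = sum(counts.values())

fpkm_values = {g: fpkm(counts[g], lengths[g], total) for g in counts}

for gene, value in fpkm_values.items():
    print(f"{gene}\t{value:.2f}")
```

In this toy sample the counts are proportional to gene length, so all three genes end up with the same FPKM, which is exactly what the length correction is meant to achieve.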

score 0 · Answer 3 · 2019-03-21

5.1 years ago

user31888 ▴ 130

The matrix 'EBPlusPlusAdjustPANCAN_IlluminaHiSeq_RNASeqV2.geneExp.tsv' was generated following the Firehose pipeline: MapSplice + RSEM, then normalised by setting the upper-quartile to 1,000.

Pipeline details here and here.

This was discussed in another thread here.
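For illustration, the "upper quartile set to 1,000" step can be sketched in Python as below. This is a rough sketch under assumptions, not the Firehose code: the function name `upper_quartile_scale` is hypothetical, the input values are toy numbers, and excluding zeros before taking the 75th percentile is an assumption that should be checked against the pipeline documentation.

```python
import statistics


def upper_quartile_scale(values, target=1000.0, skip_zeros=True):
    """Rescale one sample so that its 75th percentile equals `target`.

    skip_zeros=True ignores zero values when computing the quartile
    (an assumption of this sketch, not a confirmed pipeline detail).
    """
    pool = [v for v in values if v > 0] if skip_zeros else list(values)
    # quantiles(..., n=4) returns the three quartile cut points; index 2 is Q3.
    upper_quartile = statistics.quantiles(pool, n=4, method="inclusive")[2]
    return [v * target / upper_quartile for v in values]


# Toy RSEM-like estimates for a single sample (assumed values).
sample = [0.0, 10.0, 20.0, 30.0, 40.0, 50.0]
scaled = upper_quartile_scale(sample)
print(scaled)
```

Note that this scaling is applied per sample and only adjusts for overall depth; the sketch itself does not divide by gene length.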

score 0 · Answer 4 · 2023-09-08

7 months ago

bstrs • 0

Are these RSEM values (from EBPlusPlusAdjustPANCAN_IlluminaHiSeq_RNASeqV2.geneExp.tsv) also corrected for gene length?
