Hi, I am wondering which normalisation format (RPKM, FPKM, TPM, etc.) the PanCanAtlas EBPlusPlus-corrected RNA-seq TCGA dataset (the EBPlusPlusAdjustPANCAN_IlluminaHiSeq_RNASeqV2.geneExp.tsv file available here) is in. I know it is batch-corrected, but I don't know which normalisation format the original data was in.
Hi, I am working on a deep-learning-based multi-category tumor classification project and have downloaded the same file as my dataset. Where can I get the tumor type (one of the 33) associated with each TCGA case? Any help would be useful.
TCGA uses the Fragments Per Kilobase of transcript per Million mapped reads (FPKM) and FPKM Upper Quartile (FPKM-UQ) methods for normalisation. Both normalise for sequencing depth and gene length. FPKM also takes into account that, with paired-end data, two reads can map to one fragment, so that fragment is counted only once.
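To make the depth and length normalisation concrete, here is a minimal Python sketch of FPKM and FPKM-UQ, assuming the commonly cited definitions: fragment count times 10^9, divided by gene length in bp times either the total fragment count (FPKM) or the 75th-percentile fragment count (FPKM-UQ). The function names and the toy numbers are hypothetical, and the sketch takes the total and the quartile over whatever genes are passed in, so it is an illustration and not the custom scripts TCGA used.

```python
from statistics import quantiles


def fpkm(fragment_counts, gene_lengths_bp):
    """FPKM per gene: count * 1e9 / (total fragments * gene length in bp)."""
    # Assumption: the total is taken over all genes supplied here.
    total = sum(fragment_counts)
    return [c * 1e9 / (total * length)
            for c, length in zip(fragment_counts, gene_lengths_bp)]


def fpkm_uq(fragment_counts, gene_lengths_bp):
    """FPKM-UQ per gene: the total is replaced by the upper-quartile count."""
    # 75th percentile of the counts, with linear interpolation between ranks.
    upper_quartile = quantiles(fragment_counts, n=4, method="inclusive")[2]
    return [c * 1e9 / (upper_quartile * length)
            for c, length in zip(fragment_counts, gene_lengths_bp)]


# Toy data (hypothetical): four genes, fragment counts and lengths in bp.
counts = [100, 200, 300, 400]
lengths = [1000, 2000, 1000, 4000]

print(fpkm(counts, lengths))
print(fpkm_uq(counts, lengths))
```

In the toy data, genes 1, 2 and 4 have the same count-to-length ratio, so they get the same FPKM even though their raw counts differ, which is the point of the length normalisation.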
TCGA describes its normalisation methods here, and I quote: "normalized using the Fragments Per Kilobase of transcript per Million mapped reads (FPKM) and FPKM Upper Quartile (FPKM-UQ) methods with custom scripts."