How do you normalize Transcripts per Million (TPM) to compare between samples?
7.3 years ago
ZheFrench ▴ 570

UPDATE:

The question was initially "TPM (Transcripts per Million): can gene length, transcript length, or both be used?"

I changed the title because I found answers to my first questions by myself (so proud ^^). The post got a lot of views in a short time but few answers, and I still don't have the last word on the subject, so I keep updating it. I think it can be useful to others.

My post was initially about:

Can we calculate TPM directly from raw read counts (from STAR output, for example)?

"Divide the read counts by the length of each gene in kilobases. This gives you reads per kilobase (RPK). Count up all the RPK values in a sample and divide this number by 1,000,000. This is your “per million” scaling factor. Divide the RPK values by the “per million” scaling factor. This gives you TPM."
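The recipe quoted above can be sketched in a few lines of Python. This is only a minimal illustration: the function name `counts_to_tpm`, the gene names, and the numbers are made up, and the lengths are assumed to be in base pairs.

```python
def counts_to_tpm(counts, lengths_bp):
    """Convert raw read counts to TPM.

    counts     : dict mapping feature ID -> raw read count
    lengths_bp : dict mapping feature ID -> length in base pairs
    (both dicts are hypothetical inputs for this sketch)
    """
    # Step 1: reads per kilobase (RPK) = count / (length in kb)
    rpk = {f: counts[f] / (lengths_bp[f] / 1000.0) for f in counts}

    # Step 2: the "per million" scaling factor = sum of all RPK / 1,000,000
    scale = sum(rpk.values()) / 1_000_000
    if scale == 0:
        # No reads at all: every TPM is zero
        return {f: 0.0 for f in rpk}

    # Step 3: TPM = RPK / scaling factor
    return {f: value / scale for f, value in rpk.items()}


# Toy data, not from any real experiment
example_counts = {"geneA": 100, "geneB": 200, "geneC": 300}
example_lengths = {"geneA": 1000, "geneB": 2000, "geneC": 1500}
example_tpm = counts_to_tpm(example_counts, example_lengths)
```

By construction the TPM values of a sample sum to one million, which is why the length normalization has to happen before the "per million" scaling.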

Do you use the total length of the gene or just the sum of the exon lengths?

UPDATE: the sum of the exon lengths
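One common way to get a "sum of exon lengths" per gene is to merge overlapping exons from all of its transcripts first, so that shared bases are counted only once. The sketch below assumes 1-based, inclusive coordinates (as in GTF files); the function name `union_exon_length` and the coordinates are made up for illustration, and other conventions exist (for example the length of one chosen transcript).

```python
def union_exon_length(exons):
    """Length in bp of the union of exon intervals for one gene.

    exons : iterable of (start, end) tuples, 1-based and inclusive,
            all on the same chromosome (assumption of this sketch).
    """
    total = 0
    cur_start, cur_end = None, None
    for start, end in sorted(exons):
        if cur_end is None or start > cur_end:
            # No overlap with the current block: close it, start a new one
            if cur_end is not None:
                total += cur_end - cur_start + 1
            cur_start, cur_end = start, end
        else:
            # Overlapping exon: extend the current block if needed
            cur_end = max(cur_end, end)
    if cur_end is not None:
        total += cur_end - cur_start + 1
    return total


# Two overlapping exons (100-199, 150-249) plus a separate one (400-499)
example_exons = [(100, 199), (150, 249), (400, 499)]
example_length = union_exon_length(example_exons)
```

Here the first two exons collapse into one 150 bp block and the third adds 100 bp, so the gene length used for the RPK step would be 250 bp.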

I remember that I sometimes saw transcript length being used instead (here), but only when you align to the transcriptome. Is that true?

For example, Salmon and kallisto give TPM values using pseudo-alignment methods...

I don't know if it's correct to compute TPM from a genome alignment. The "Transcripts per Million" unit makes more sense when you (pseudo-)align to a transcriptome, no?

Said differently, TPM values from pseudo-alignment (kallisto, Salmon) can't be compared with ones computed from a genome alignment. We need to know how someone produced their TPM values before comparing.

UPDATE: The transcriptome looks nicer. OK, I'll use salmon and not try to recalculate it myself (by the way, the -g [ --geneMap ] argument will do the work for me).
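As a rough illustration of what gene-level aggregation amounts to (not salmon's actual implementation): TPM values are additive, so a gene-level TPM can be obtained by summing the TPMs of the gene's transcripts. The function name `gene_level_tpm`, the `tx2gene` mapping, and the IDs below are hypothetical.

```python
from collections import defaultdict


def gene_level_tpm(transcript_tpm, tx2gene):
    """Sum transcript-level TPM values into gene-level TPM values.

    transcript_tpm : dict mapping transcript ID -> TPM
    tx2gene        : dict mapping transcript ID -> gene ID
    (both are made-up inputs for this sketch)
    """
    gene_tpm = defaultdict(float)
    for tx, value in transcript_tpm.items():
        gene_tpm[tx2gene[tx]] += value
    return dict(gene_tpm)


# Toy values, not real quantification output
example_tx_tpm = {"tx1": 10.0, "tx2": 5.0, "tx3": 2.5}
example_tx2gene = {"tx1": "g1", "tx2": "g1", "tx3": "g2"}
example_gene_tpm = gene_level_tpm(example_tx_tpm, example_tx2gene)
```

Because nothing is rescaled, the gene-level values still sum to the same total as the transcript-level ones.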

Ref :

https://groups.google.com/forum/#!topic/rsem-users/jJaeaSRG1eo

http://www.rna-seqblog.com/rpkm-fpkm-and-tpm-clearly-explained/

http://www.arrayserver.com/wiki/index.php?title=Omicsoft_RPKM/FPKM/Count_values

Calculating TPM from featureCounts output

https://gist.github.com/slowkow/c6ab0348747f86e2748b

https://haroldpimentel.wordpress.com/2014/05/08/what-the-fpkm-a-review-rna-seq-expression-units/

UPDATE - TPM normalization quest: how do I do that properly without using Sleuth!? :/

https://www.biostars.org/p/143458/#157303

https://groups.google.com/forum/#!topic/sailfish-users/jBf9SGiH1AM

https://f1000research.com/articles/4-1521/v1

RNA-Seq TPM STAR salmon quantification • 15k views

When I was confused by the different quantification measures, I used to read the explanations on this site:

https://haroldpimentel.wordpress.com/2014/05/08/what-the-fpkm-a-review-rna-seq-expression-units/

I hope it helps you a bit with your thinking!

Whoops, just saw that you already read it... Sorry :(
