Seq. depth for DE analysis
6.6 years ago

Hi Biostars,

I have RNA-seq data from a time-course experiment with 5 time points and three replicates at each.

Some replicates have about half the sequencing depth of the others. I want to check whether these low-depth replicates will affect the DE analysis. To do this, I normalized the raw counts in each replicate by library size (count / library size × 10^6, i.e. counts per million) and ran plotPCA. On the PCA plot the samples cluster well by time point, and the replicates with lower sequencing depth don't appear to be outliers. The RLE plots show the same trend.
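
For readers who want to reproduce this kind of check outside R, here is a minimal sketch of the per-million scaling followed by a PCA of the samples. It is not what plotPCA does internally (plotPCA works on DESeq2-transformed data); it only illustrates the idea with NumPy. The function names `cpm` and `pca_scores`, the pseudocount, and the toy count matrix are all assumptions made for the example.

```python
import numpy as np

def cpm(counts):
    """Counts per million: scale each sample (column) by its library size."""
    counts = np.asarray(counts, dtype=float)
    lib_sizes = counts.sum(axis=0)          # total reads per sample
    return counts / lib_sizes * 1e6

def pca_scores(counts, n_components=2, pseudocount=1.0):
    """PCA of samples on log2(CPM + pseudocount), via SVD of the centered matrix."""
    logged = np.log2(cpm(counts) + pseudocount).T   # samples x genes
    centered = logged - logged.mean(axis=0)         # center each gene
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    scores = u[:, :n_components] * s[:n_components]
    explained = (s ** 2 / np.sum(s ** 2))[:n_components]
    return scores, explained

# Toy matrix (genes x samples), purely illustrative:
# sample 2 is sample 1 at half depth, sample 4 is sample 3 at half depth.
counts = np.array([
    [200,  100,   20,  10],
    [400,  200,  800, 400],
    [1000, 500, 1000, 500],
    [400,  200,  180,  90],
])

norm = cpm(counts)
scores, explained = pca_scores(counts)
```

In this toy case the half-depth samples land on exactly the same PCA coordinates as their full-depth counterparts, because the scaling removes a pure depth difference. Real low-depth libraries also carry more sampling noise for weakly expressed genes, which this scaling cannot remove.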

Is this enough to assume that differences in sequencing depth will not affect the DE analysis? I will use DESeq2 for DE.

Thanks

DE-analysis RNA-Seq • 1.8k views
6.6 years ago
h.mon 35k

I think PCA alone is not enough, although it is a good diagnostic. Why did some samples have such low depth? The cause may point to potential problems you will have with your analysis.

Another good diagnostic is a saturation plot, to check if all libraries have appropriate depth of sequencing.
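
One way to approximate a saturation curve from a count table alone is to thin the counts to a series of fractions and count how many genes are still detected at each fraction. The sketch below does that with binomial thinning in NumPy. The function name `saturation_curve`, the `min_count` threshold, and the toy library are assumptions for illustration, and dedicated tools that subsample the aligned reads directly may give somewhat different curves.

```python
import numpy as np

def saturation_curve(gene_counts, fractions, min_count=1, seed=0):
    """Number of genes detected after thinning a library to each fraction of its depth."""
    rng = np.random.default_rng(seed)
    gene_counts = np.asarray(gene_counts, dtype=np.int64)
    detected = []
    for frac in fractions:
        # Binomial thinning: keep each read independently with probability `frac`
        sub = rng.binomial(gene_counts, frac)
        detected.append(int(np.sum(sub >= min_count)))
    return detected

# Toy library (one count per gene), purely illustrative
library = np.array([0, 1, 2, 5, 10, 50, 200, 1000, 5000, 0])
fractions = [0.0, 0.1, 0.25, 0.5, 0.75, 1.0]
curve = saturation_curve(library, fractions)
```

If the curve has flattened well before the full depth, extra reads are adding few new genes and the library is probably deep enough for gene-level detection. If it is still climbing at fraction 1.0, the library is not saturated.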


Thanks for the suggestion! Some samples have lower depth because this is a dual RNA-seq experiment: in our case human RNA was mixed with yeast RNA before library prep, and it is sometimes hard to control the proportion of human to yeast RNA in the pooled sample. In all replicates, though, the mapping rate, Phred scores, and other QC metrics are high.

6.6 years ago
Renesh ★ 2.2k

Sequencing depth will surely affect your DE analysis and will give you inflated statistical significance. A library with higher sequencing depth will obviously produce more reads for equally expressed mRNAs than a library with lower depth.

To avoid running your DE analysis on libraries with unequal sequencing depth, you must normalize your read counts to RPKM/FPKM/TPM units. These units scale your data by a per-million-mapped-reads factor, which corrects for the difference in sequencing depth among libraries. PCA alone will not solve your issue.
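
To make the per-million scaling concrete, here is a small sketch of how RPKM and TPM can be computed from a raw count matrix and gene lengths. It only shows what the units compute; whether they are appropriate input for a given DE tool is a separate question. The function names, the gene lengths, and the toy counts are assumptions for the example.

```python
import numpy as np

def rpkm(counts, lengths_bp):
    """Reads per kilobase of gene per million mapped reads (genes x samples)."""
    counts = np.asarray(counts, dtype=float)
    per_million = counts.sum(axis=0) / 1e6                       # depth scaling per sample
    length_kb = np.asarray(lengths_bp, dtype=float)[:, None] / 1e3
    return counts / per_million / length_kb

def tpm(counts, lengths_bp):
    """Transcripts per million: length-normalize first, then scale each sample to 1e6."""
    counts = np.asarray(counts, dtype=float)
    length_kb = np.asarray(lengths_bp, dtype=float)[:, None] / 1e3
    rate = counts / length_kb
    return rate / rate.sum(axis=0) * 1e6

# Toy data, purely illustrative: sample 2 has exactly twice the depth of sample 1
counts = np.array([[100,  200],
                   [300,  600],
                   [600, 1200]])
lengths_bp = np.array([1000, 2000, 3000])

rpkm_values = rpkm(counts, lengths_bp)
tpm_values = tpm(counts, lengths_bp)
```

Both units give identical values for the two toy samples, since the only difference between them is depth. TPM additionally guarantees that every sample sums to one million.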

http://bioinfogeek.over-blog.com/2017/09/gene-expression-units-explained-rpm-rpkm-fpkm-and-tpm.html


Thanks for the suggestions, but I don't think performing DE on normalized units is a good idea (especially with DESeq2 or similar software), since these tools require count data and perform their own internal normalization. My concern was whether, even after normalization, samples with low depth will be biased somehow.


If your libraries have different sequencing depths, RPKM/FPKM/TPM normalization is recommended. These normalized units will be uniform across libraries and will give you a reliable analysis. You can use cuffdiff for expression analysis. You can also use R packages for DE without normalizing to RPKM.

DESeq2 is designed to account for different library sizes (http://www.bioconductor.org/help/workflows/rnaseqGene/).
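
DESeq2 accounts for library size by estimating a size factor per sample with the median-of-ratios method and using it inside the model, so the input stays as raw counts. The sketch below re-implements that idea in NumPy for illustration only. It is not DESeq2's code or API, and the function name `size_factors` and the toy matrix are assumptions.

```python
import numpy as np

def size_factors(counts):
    """Median-of-ratios size factors for a genes x samples count matrix."""
    counts = np.asarray(counts, dtype=float)
    # Genes with a zero in any sample have no finite log geometric mean; skip them
    expressed = np.all(counts > 0, axis=1)
    log_counts = np.log(counts[expressed])
    # Per-gene log geometric mean across samples acts as a pseudo-reference sample
    log_geo_mean = log_counts.mean(axis=1, keepdims=True)
    # Size factor = median ratio of each sample to the reference
    return np.exp(np.median(log_counts - log_geo_mean, axis=0))

# Toy data, purely illustrative: sample 2 is sample 1 at half depth
counts = np.array([[100,  50],
                   [200, 100],
                   [  0,   3],
                   [400, 200],
                   [ 50,  25]])

sf = size_factors(counts)
normalized = counts / sf
```

Here the half-depth sample gets a size factor half that of the other, and dividing by the size factors brings the expressed genes onto the same scale. This corrects the scale difference between libraries, but lower-depth samples still have noisier estimates for weakly expressed genes.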
