RNA-Seq Normalization and Batch Correction
6.8 years ago

Hello,

I have a few questions on the topic of batch correction.

The pipeline for this data is currently: TopHat2 (alignment to genome reference) > htseq-count (count reads per gene) > DESeq2 in R

I would really appreciate any thoughts you can offer, or tips on how your lab does things!

My questions are:

  1. How can you perform batch correction with sequencing day as a variable when not all samples were re-sequenced on both days?

Backstory: We had two batches of sequencing (Day 1 and Day 2). A few, but not all, of the libraries sequenced on Day 1 were re-sequenced on Day 2. Any suggestions for how I should perform Day 1/Day 2 batch correction on this data? Is including "SequencingDay" in the design formula of my DESeq2 object sufficient, or should I rely on SVA, or something else?
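To make the layout above concrete: a library sequenced on both days is a technical replicate of itself, and one commonly used option is to sum its counts per library before modelling (DESeq2's `collapseReplicates` does this kind of summing). The sketch below only illustrates that bookkeeping in plain Python. The library names, gene names, and counts are hypothetical, and it does not by itself settle whether SequencingDay belongs in the design.

```python
from collections import defaultdict

# Hypothetical raw counts per sequencing run: (library, day) -> {gene: count}
runs = {
    ("libA", "Day1"): {"gene1": 10, "gene2": 0, "gene3": 5},
    ("libA", "Day2"): {"gene1": 12, "gene2": 1, "gene3": 4},  # libA was re-sequenced
    ("libB", "Day1"): {"gene1": 7, "gene2": 3, "gene3": 9},
    ("libC", "Day2"): {"gene1": 20, "gene2": 2, "gene3": 1},
}


def days_per_library(runs):
    """Return {library: sorted list of days it was sequenced on}."""
    days = defaultdict(set)
    for library, day in runs:
        days[library].add(day)
    return {library: sorted(d) for library, d in days.items()}


def collapse_technical_replicates(runs):
    """Sum counts over all runs of the same library (one column per library)."""
    collapsed = defaultdict(lambda: defaultdict(int))
    for (library, _day), counts in runs.items():
        for gene, n in counts.items():
            collapsed[library][gene] += n
    return {library: dict(c) for library, c in collapsed.items()}


layout = days_per_library(runs)
collapsed = collapse_technical_replicates(runs)

for library in sorted(layout):
    print(library, layout[library], collapsed[library])
```

Libraries that appear on both days (here only `libA`) are the ones that carry direct information about the day-to-day difference, since the biology is held fixed within them.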

  2. Related to the above: how can you perform batch effect correction when you have multiple batch effects and not all possible combinations of variables are represented in the samples?

Backstory: We have some batch effects that unfortunately aren't distributed evenly across all samples - e.g., of 10 samples, say we have suspected effects of Genotype, Sex, and Treatment - but don't have an example of a (Genotype1 + Male + Treatment2) sample. Is the solution basically 'pick your favorite batch effect' and only correct for that? So, in the example above, make your design matrix Genotype + Treatment, and ignore the effect of Sex? Is there a better way?
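One way to examine the missing-combination worry is to check whether the design matrix is full rank. The sketch below assumes a hypothetical layout with two levels per factor and exactly one sample for every combination except (Genotype1, Male, Treatment2). It is plain Python rather than DESeq2, and the level names are made up. Under those assumptions, the additive design Genotype + Sex + Treatment is still full rank, while a design with all interaction terms is not.

```python
from fractions import Fraction
from itertools import product


def matrix_rank(rows):
    """Rank via Gauss-Jordan elimination with exact rational arithmetic."""
    m = [[Fraction(x) for x in row] for row in rows]
    n_cols = len(m[0]) if m else 0
    rank = 0
    for col in range(n_cols):
        # Find a row at or below the current pivot position with a nonzero entry.
        pivot = next((r for r in range(rank, len(m)) if m[r][col] != 0), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        # Clear this column in every other row.
        for r in range(len(m)):
            if r != rank and m[r][col] != 0:
                factor = m[r][col] / m[rank][col]
                m[r] = [a - factor * b for a, b in zip(m[r], m[rank])]
        rank += 1
    return rank


# Hypothetical samples: every combination except (G1, Male, T2).
samples = [
    s for s in product(("G1", "G2"), ("Female", "Male"), ("T1", "T2"))
    if s != ("G1", "Male", "T2")
]


def additive_row(genotype, sex, treatment):
    """Row for ~ Genotype + Sex + Treatment (intercept plus 3 indicators)."""
    return [1, int(genotype == "G2"), int(sex == "Male"), int(treatment == "T2")]


def interaction_row(genotype, sex, treatment):
    """Row for ~ Genotype * Sex * Treatment (all interaction terms)."""
    g, s, t = int(genotype == "G2"), int(sex == "Male"), int(treatment == "T2")
    return [1, g, s, t, g * s, g * t, s * t, g * s * t]


additive_rank = matrix_rank([additive_row(*s) for s in samples])
interaction_rank = matrix_rank([interaction_row(*s) for s in samples])

print("additive: rank", additive_rank, "of 4 columns")
print("full interaction: rank", interaction_rank, "of 8 columns")
```

In this toy layout the missing cell only rules out the model that needs an estimate for every combination. Whether the real sample table behaves the same way depends on which combinations are actually present, so the same rank check would need to be run on the actual sample table.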

  3. How can I integrate spike-ins into my analysis pipeline?

In the alignment step, how do I make sure spike-ins are represented in the output file (i.e. gene counts), if there isn't a special version of the genome reference file? Does merely including spike-ins in our input data boost the accuracy of DESeq2's normalization algorithm enough? Or is there some other layer of normalization I should do when using spike-ins?
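For the normalization part of this question, the sketch below shows the median-of-ratios idea behind DESeq2-style size factors, restricted to spike-in rows only. DESeq2's `estimateSizeFactors` accepts a `controlGenes` argument for this kind of restriction. The code here is a plain Python illustration rather than DESeq2 itself, and the gene names, spike-in names, and counts are hypothetical. It assumes at least one spike-in has nonzero counts in every sample.

```python
import math
from statistics import median


def size_factors(counts, control_genes):
    """Median-of-ratios size factors using only the given control genes.

    counts: {gene: [count in sample 0, count in sample 1, ...]}
    """
    n_samples = len(next(iter(counts.values())))
    log_ratios = [[] for _ in range(n_samples)]
    for gene in control_genes:
        row = counts[gene]
        if any(c == 0 for c in row):
            continue  # geometric mean would be zero, so skip this gene
        log_geo_mean = sum(math.log(c) for c in row) / n_samples
        for j, c in enumerate(row):
            log_ratios[j].append(math.log(c) - log_geo_mean)
    # Per sample: median ratio of its counts to the per-gene geometric mean.
    return [math.exp(median(r)) for r in log_ratios]


# Hypothetical counts for two samples; sample 1 has twice the spike-in signal.
counts = {
    "spike_1": [100, 200],
    "spike_2": [50, 100],
    "spike_3": [10, 20],
    "geneA": [500, 300],
    "geneB": [40, 0],
}
spike_ins = ["spike_1", "spike_2", "spike_3"]

sf = size_factors(counts, spike_ins)
normalized = {g: [c / f for c, f in zip(row, sf)] for g, row in counts.items()}

print("size factors:", sf)
print("normalized geneA:", normalized["geneA"])
```

Normalizing to spike-ins in this way assumes the same amount of spike-in was added to every sample, which is a different assumption from the default that most endogenous genes are unchanged.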

RNA-Seq R • 3.3k views