Question

RNA-seq biological replicates

0

6.5 years ago

amy16 ▴ 40

I've got paired-end (PE) Illumina HiSeq RNA-seq data. I trimmed the adapters and then aligned the reads to the reference genome. Before proceeding to transcript assembly and quantification, I would like to know how to decide which of the three biological replicates I should take forward for further analysis.

RNA-Seq • 1.9k views

ADD COMMENT • link updated 6.5 years ago by Kevin Blighe 87k • written 6.5 years ago by amy16 ▴ 40

2

6.5 years ago

GenoMax 141k

All three. If you have enough reads in all three then you could assemble them independently and then merge the data to create a more comprehensive representation of the transcriptome.

1

Got there while I was writing the answer. You're the master at comments!

We more or less give the same advice.

ADD REPLY • link 6.5 years ago by Kevin Blighe 87k

score 1 · Accepted Answer · 2017-11-08

Hey,

The question is somewhat strange because it implies that you decided to use replicates expecting that one or more of them would fail(?). I'm not sure that I would spend a couple of hundred pounds sterling (GBP), or ~10,000 rupees, if I was later going to decide to ditch one or more of the samples.

If you've used a HiSeq and the laboratory personnel are experienced, then I imagine that you can use any of the replicates.

Procedures that most people do with replicates:

- process them as separate samples and then, after normalisation, check how they line up on PC1 vs. PC2 via principal components analysis
- average counts over the replicates post normalisation (this was more common in cDNA microarray analysis)
- concatenate the raw data FASTQ files together (zcat piped into gzip) and then process them as a single sample
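To make the first option a bit more concrete, here is a minimal sketch of the PCA check in Python with NumPy. Everything in it is an assumption for illustration: the `pca_scores` helper, the sample names and the toy count matrix are all made up, and the log2(x + 1) transform is just one common choice. In practice you would load your own normalised genes-by-samples matrix instead.

```python
import numpy as np


def pca_scores(counts, n_components=2):
    """Return per-sample PCA scores and explained-variance fractions.

    counts: genes x samples array of already-normalised counts.
    """
    # log2(x + 1) damps the influence of very highly expressed genes
    logged = np.log2(np.asarray(counts, dtype=float) + 1.0)
    # Put samples in rows, then centre each gene (column) at zero
    data = logged.T
    centred = data - data.mean(axis=0)
    # SVD of the centred matrix gives the principal components
    u, s, _ = np.linalg.svd(centred, full_matrices=False)
    scores = u[:, :n_components] * s[:n_components]
    explained = (s ** 2) / np.sum(s ** 2)
    return scores, explained[:n_components]


# Hypothetical toy data: 6 genes x 6 samples (3 replicates per condition)
sample_names = ["A_rep1", "A_rep2", "A_rep3", "B_rep1", "B_rep2", "B_rep3"]
toy_counts = np.array([
    [100, 110,  95, 400, 420, 390],
    [ 50,  45,  55,  10,  12,   9],
    [200, 210, 190, 205, 195, 200],
    [  5,   7,   6,  80,  75,  90],
    [300, 280, 310,  60,  70,  65],
    [ 20,  22,  19,  21,  18,  23],
])

scores, explained = pca_scores(toy_counts)
for name, (pc1, pc2) in zip(sample_names, scores):
    print(f"{name}\tPC1={pc1:+.2f}\tPC2={pc2:+.2f}")
```

Replicates of the same condition should sit close together on PC1 vs. PC2. A replicate that lands far away from its siblings is the one to look at more closely before deciding anything.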

You mention assembly, so I would concatenate your samples together and then do the de novo transcriptome assembly on the concatenated sample. That said, all transcriptome assemblers that I've used allow you to specify multiple samples at the command line and then merge them together anyway.
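If you do go the concatenation route, a rough sketch in Python could look like the one below. Note that this is not the zcat-into-gzip approach: it appends the compressed files byte-for-byte, which works because the gzip format allows several compressed members in one file. The helper names and the tiny demo FASTQ files are hypothetical. For paired-end data, concatenate the R1 files and the R2 files in the same replicate order so that the mates stay in sync.

```python
import gzip
import shutil
import tempfile
from pathlib import Path


def concatenate_gzip(inputs, output):
    """Append gzip files byte-for-byte into one multi-member gzip file."""
    with open(output, "wb") as out_handle:
        for path in inputs:
            with open(path, "rb") as in_handle:
                shutil.copyfileobj(in_handle, out_handle)


def count_fastq_records(path):
    """Count records in a gzipped FASTQ file (4 lines per record)."""
    with gzip.open(path, "rt") as handle:
        return sum(1 for _ in handle) // 4


# Demo on tiny made-up files; replace these with your real replicate FASTQs
with tempfile.TemporaryDirectory() as tmp:
    tmp_dir = Path(tmp)
    replicate_files = []
    for i in (1, 2, 3):
        path = tmp_dir / f"rep{i}_R1.fastq.gz"
        with gzip.open(path, "wt") as handle:
            handle.write(f"@read{i}\nACGT\n+\nIIII\n")
        replicate_files.append(path)

    merged = tmp_dir / "merged_R1.fastq.gz"
    concatenate_gzip(replicate_files, merged)
    merged_records = count_fastq_records(merged)

print(f"records in merged file: {merged_records}")
```

Counting records before and after is a cheap sanity check that nothing was dropped or truncated during the merge.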

If any of the samples 'failed', I doubt that you'd have the data in hand right now. You should be able to confirm the basic quality of the samples by contacting the lab that did the sequencing, or just check the reports that they sent.

Good luck, Kevin