Question

How To Best Get Up To Speed In Dealing With RNA-seq Data?

8

11.9 years ago

Wayne ★ 1.0k

Hello all, I have some experience with next-generation DNA exome sequencing, but in the next few weeks I will be receiving RNA-seq data, which I have no experience with. The data will already be mapped. I want to hit the ground running when the data gets here, so I want to prepare by getting familiar with the programs I'll need to use. The goals are to: 1. Validate mutations identified in DNA from exome sequencing 2. Check expression levels 3. Check for fusions 4. I'm open to suggestions for other things to do.

I've really tried to find reviews and tutorials to get me up to speed but haven't had much luck. Any reviews, tutorials, or software recommendations of things I should definitely study or practice with would be extremely appreciated! Thanks so much for your time.

rna-seq rna sequencing expression • 6.0k views

ADD COMMENT • link updated 8.2 years ago by dnaseiseq ▴ 220 • written 11.9 years ago by Wayne ★ 1.0k

score 8 · Answer 1 · 2012-05-16

8

11.9 years ago

User 59 13k

I'll go for the obvious recent paper:

http://www.nature.com/nprot/journal/v7/n3/full/nprot.2012.016.html

0

Nice find, I haven't seen this one. I will post it in the tutorial section as well.

ADD REPLY • link 11.9 years ago by Istvan Albert 100k

score 7 · Answer 2 · 2012-05-16

This is how I would explain RNA-seq to someone who is new to the area.

Step 0: You have a hypothesis. You have decided that RNA-seq will be an ideal/novel experiment to investigate your hypothesis.

Step 1: Get your samples (case/control, tumor/normal, time series, ...), extract your RNA, and make sure you do all the QC.

Library preparation: key experimental step of RNA-seq. This determines the outcome of your experiment.

Step 2: Deep sequencing (read up on next-generation sequencing. You may use one of the recent NGS platforms for your sequencing. Read about them here). Make sure that you understand the lingua franca of NGS (for example: single-end vs. paired-end, coverage, etc.).

Step 3: Analysis pipeline. The typical output from an RNA-seq experiment is a .fastq file of sequence reads (two files for a paired-end experiment). The downstream analysis can then be designed around the biological question.

Here is a highly simplified conceptual framework for understanding RNA-seq analysis:

Primary analysis:

QC: Quality control and removal of poor-quality reads, adapters and linkers
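To make the QC step a bit more concrete, here is a minimal Python sketch that drops reads whose mean base quality is too low. It is only an illustration, not a replacement for a dedicated QC or trimming tool. The assumptions are mine: qualities are Phred+33 encoded, each record is exactly four lines, and the cutoff of 20 and the function names are arbitrary choices. Adapter and linker removal is not covered here.

```python
from typing import Iterable, Iterator, List, Tuple

# (header, sequence, quality string); a hypothetical record type for this sketch
FastqRecord = Tuple[str, str, str]


def parse_fastq(lines: Iterable[str]) -> Iterator[FastqRecord]:
    """Yield (header, sequence, quality) from well-formed 4-line FASTQ records."""
    it = iter(lines)
    for header in it:
        header = header.rstrip("\n")
        if not header:
            continue  # tolerate blank lines between records
        seq = next(it).rstrip("\n")
        next(it)  # the '+' separator line, ignored
        qual = next(it).rstrip("\n")
        yield header, seq, qual


def mean_phred(qual: str, offset: int = 33) -> float:
    """Mean Phred score of a quality string (assumes Phred+33 by default)."""
    return sum(ord(c) - offset for c in qual) / len(qual)


def filter_reads(records: Iterable[FastqRecord],
                 min_mean_quality: float = 20.0) -> List[FastqRecord]:
    """Keep reads whose mean quality reaches the (arbitrary) cutoff."""
    return [r for r in records if r[2] and mean_phred(r[2]) >= min_mean_quality]


# Toy input: read1 has quality 40 at every base, read2 has quality 0.
example = [
    "@read1", "ACGT", "+", "IIII",
    "@read2", "ACGT", "+", "!!!!",
]
kept = filter_reads(parse_fastq(example))
print(kept)
```

On a real file you would pass an open file handle to `parse_fastq` instead of the toy list.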

Secondary analysis

Mapping: Find the location where each short read best matches the reference sequence. It is ideal to progressively increase the complexity of the mapping strategy to handle the unaligned reads from your experiment. This will help to turn millions of short reads into a quantification of expression.

Summarization: Aggregate sequence reads over biological units (exons, transcripts, genes). This is where you bring biological context to your sequencing reads.
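As a rough illustration of summarization, the sketch below counts aligned reads per gene by simple interval overlap. Everything in it is a toy assumption: the gene coordinates, the reads, the function name, and the rule of skipping reads that overlap zero or several genes. Real counting tools work from alignment and annotation files and handle strand, splicing, and ambiguity more carefully.

```python
from typing import Dict, List, Tuple

Interval = Tuple[str, int, int]  # (chromosome, start, end), 1-based inclusive

# Hypothetical annotation and alignments, for illustration only.
genes: Dict[str, Interval] = {
    "geneA": ("chr1", 100, 500),
    "geneB": ("chr1", 800, 1200),
}
reads: List[Interval] = [
    ("chr1", 120, 170),   # inside geneA
    ("chr1", 480, 530),   # overlaps the end of geneA
    ("chr1", 900, 950),   # inside geneB
    ("chr2", 100, 150),   # no annotated gene on chr2
]


def count_reads_per_gene(reads: List[Interval],
                         genes: Dict[str, Interval]) -> Dict[str, int]:
    """Count reads overlapping exactly one gene; other reads are skipped."""
    counts = {name: 0 for name in genes}
    for chrom, start, end in reads:
        hits = [
            name
            for name, (g_chrom, g_start, g_end) in genes.items()
            if chrom == g_chrom and start <= g_end and end >= g_start
        ]
        if len(hits) == 1:  # unassigned and ambiguous reads are not counted
            counts[hits[0]] += 1
    return counts


gene_counts = count_reads_per_gene(reads, genes)
print(gene_counts)
```

The resulting table of counts per gene is the kind of input the later normalization and testing steps start from.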

Normalization: This is the step that helps you compare expression levels between samples (for example, cases vs. controls) and within your samples (biological vs. technical replicates). Several statistical approaches are available; see RPKM (single-read), FPKM (paired-end), quantile normalization, housekeeping-gene normalization, etc.
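For RPKM specifically, the calculation is simple enough to write out. This sketch uses the usual definition (reads per kilobase of gene per million mapped reads). The function name and the example numbers are made up for illustration.

```python
def rpkm(read_count: int, gene_length_bp: int, total_mapped_reads: int) -> float:
    """Reads Per Kilobase of gene per Million mapped reads.

    RPKM = read_count / (gene_length_bp / 1e3) / (total_mapped_reads / 1e6)
         = read_count * 1e9 / (gene_length_bp * total_mapped_reads)
    """
    if gene_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("gene length and total mapped reads must be positive")
    return read_count * 1e9 / (gene_length_bp * total_mapped_reads)


# Made-up example: 500 reads on a 2,000 bp gene in a library of 10 million mapped reads.
value = rpkm(read_count=500, gene_length_bp=2000, total_mapped_reads=10_000_000)
print(value)
```

FPKM follows the same formula with fragments (read pairs) counted instead of individual reads.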

Differential expression testing: This step helps identify genes whose expression has changed significantly. Here you take the table of summarized count data and perform statistical tests between the samples of interest (pairwise or multiple-group comparisons). You can use statistical techniques based on empirical Bayes estimation, the negative binomial distribution, etc. for this.
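A full negative binomial or empirical Bayes test is beyond a short snippet, but the effect size those tests usually report alongside a p-value, the log2 fold change, is easy to sketch. Note that this is only an effect size and not a significance test: it says nothing about variability between replicates. The function name, the pseudocount of 1, and the numbers are my own assumptions.

```python
import math
from typing import Sequence


def log2_fold_change(case_values: Sequence[float],
                     control_values: Sequence[float],
                     pseudocount: float = 1.0) -> float:
    """log2 of mean(case) over mean(control) on normalized values.

    The pseudocount (an arbitrary choice here) avoids division by zero
    for genes with no reads in one group.
    """
    case_mean = sum(case_values) / len(case_values)
    control_mean = sum(control_values) / len(control_values)
    return math.log2((case_mean + pseudocount) / (control_mean + pseudocount))


# Made-up normalized expression values for one gene, two replicates per group.
lfc = log2_fold_change(case_values=[31.0, 31.0], control_values=[7.0, 7.0])
print(lfc)  # positive means higher in cases, negative means lower
```

For real data you would hand the raw count table to a package that models the count variance properly, and use fold changes like this only to judge the size of a change that the test has already called significant.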

Tertiary analysis

Downstream analysis: Creating lists of DE genes gives you an estimate of expression trends. You can now use the list(s) for functional, pathway-centric, or network analysis. Remember that most of the existing downstream analysis tools were designed for gene expression data from microarray experiments, so you should use tools that are designed for RNA-seq data (for example, fusion transcript detection tools, enrichment tools designed to use RNA-seq output, etc.). The other option is to use only the gene lists for such analysis.

Step 4: Interpretation of your results: Use the results to assess your hypothesis.

Step 5: Validation using alternate techniques (resequencing of your gene of interest, quantifying transcript levels, functional studies, etc.)

PS. This answer is based on references in my citeulike library (See rnaseq)

score 2 · Answer 3 · 2012-05-16

2

11.9 years ago

Arun 2.4k

I love this wiki on RNA-seq from the SEQanswers community; had to mention it!

score 1 · Answer 4 · 2013-01-18

1

11.3 years ago

boczniak767 ▴ 850

You could also check the material from the Bioconductor course (with exercises) here

Ram · Answer 5 · 2016-01-31

1

8.2 years ago

dnaseiseq ▴ 220

Hi

Just published: A survey of best practices for RNA-seq data analysis Genome Biology 2016, 17:13 doi:10.1186/s13059-016-0881-8

ADD COMMENT • link updated 4.3 years ago by Ram 43k • written 8.2 years ago by dnaseiseq ▴ 220