Question

What do these MA plots signify?

0

3.1 years ago

fufor94 ▴ 10

Hello, I am fairly new to DGE and exploratory analysis of RNA-seq data. I am looking at differentially expressed genes in 116 strains of E. coli. I used Kallisto to align and quantify my genes (using E. coli K12 as my reference genome). I have successfully run the DESeq2 pipeline and generated some MA plots, but I have some trouble interpreting them. Some look quite odd in my opinion, and I would appreciate any and all insights. Thanks.

strain 1 vs ref strain

strain 2 vs ref strain

strain 3 vs ref strain

seq rna maplot sequencing deseq2 • 6.7k views

updated 3.1 years ago by sim.j.baum ▴ 140 • written 3.1 years ago by fufor94 ▴ 10

3

An MA plot shows the expression change between the conditions (the log fold change) on the Y-axis and how strongly each gene is expressed (its average expression) on the X-axis. Lowly expressed genes normally have higher variability, which is in part what the methods implemented in tools like DESeq2, edgeR or limma-voom correct for. The blue dots probably mark genes that are significantly different between the conditions (below a certain threshold set in your code).
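To make the two axes concrete, here is a minimal sketch of how MA coordinates can be computed by hand. It uses the classic definition (M = difference of log2 expression, A = mean of log2 expression); DESeq2's own MA plot puts the mean of normalized counts on the X-axis instead, so treat this as an illustration only. The gene names, counts, and the `ma_coordinates` helper are all made up for the example.

```python
import math

# Hypothetical normalized counts (treated, control) for four genes; not from the thread.
counts = {
    "geneA": (8.0, 2.0),
    "geneB": (1000.0, 1100.0),
    "geneC": (50.0, 200.0),
    "geneD": (4000.0, 1000.0),
}

def ma_coordinates(treated, control, pseudocount=1.0):
    """Return (A, M): mean log2 expression and log2 fold change."""
    log_treated = math.log2(treated + pseudocount)
    log_control = math.log2(control + pseudocount)
    a_value = (log_treated + log_control) / 2.0  # X-axis: how strongly the gene is expressed
    m_value = log_treated - log_control          # Y-axis: change between conditions
    return a_value, m_value

ma = {gene: ma_coordinates(t, c) for gene, (t, c) in counts.items()}
for gene, (a_value, m_value) in ma.items():
    print(f"{gene}: A = {a_value:.2f}, M = {m_value:.2f}")
```

In this toy data, geneA and geneD have a similar fold change, but geneA sits far to the left of the plot because its counts are low, which is where fold changes tend to be noisiest.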

3.1 years ago by sim.j.baum ▴ 140

0

Thank you so much for the explanation. So it is kind of similar to what I would visualize in a volcano plot.

3.1 years ago by fufor94 ▴ 10

2

A volcano plot shows logFC vs -log10(p); it does not include information about average expression. The MA plot, in turn, does not provide information about the magnitude of the p-value. I usually plot both, as they simply provide different information.
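The difference between the two plots can be sketched by deriving both sets of coordinates from the same results table. The rows below are hypothetical values invented for illustration, and `ma_point` and `volcano_point` are hypothetical helper names, not functions from any package.

```python
import math

# Hypothetical per-gene results (mean expression, log2 fold change, p-value); not from the thread.
results = [
    {"gene": "geneA", "base_mean": 5.0, "log2fc": 3.0, "pvalue": 0.20},
    {"gene": "geneB", "base_mean": 900.0, "log2fc": 1.5, "pvalue": 1e-8},
    {"gene": "geneC", "base_mean": 120.0, "log2fc": -2.0, "pvalue": 1e-4},
]

def ma_point(row):
    # MA plot: x = average expression (log scale), y = log fold change. No p-value involved.
    return math.log10(row["base_mean"]), row["log2fc"]

def volcano_point(row):
    # Volcano plot: x = log fold change, y = -log10(p-value). No average expression involved.
    return row["log2fc"], -math.log10(row["pvalue"])

for row in results:
    print(row["gene"], "MA:", ma_point(row), "volcano:", volcano_point(row))
```

Here geneA has the largest fold change, but the MA view shows it is barely expressed and the volcano view shows it is the least significant, which is why looking at both plots is useful.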

3.1 years ago by ATpoint 81k

1

You could plot logFC vs -log10(p) as ATpoint suggested, and pass the size of each gene's dot to the plotting function as log10(expression) (or log2(expression)), adjusted as needed. You will have to see how it looks ...
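One way to derive such dot sizes is sketched below. The `dot_size` helper, its `min_size` and `scale` parameters, and the example expression values are all assumptions made for illustration; the resulting numbers would still need tuning for whichever plotting function is used (for instance as the marker-size argument of a scatter plot).

```python
import math

def dot_size(base_mean, min_size=5.0, scale=10.0):
    """Map mean expression to a marker size via log10 (hypothetical scaling).

    The +1 keeps genes with zero expression at the minimum size instead of
    failing on log10(0).
    """
    return min_size + scale * math.log10(base_mean + 1.0)

# Hypothetical mean expression values spanning three orders of magnitude.
for base_mean in (0.0, 9.0, 99.0, 999.0):
    print(f"mean expression {base_mean:>6}: size {dot_size(base_mean):.1f}")
```

The log transform keeps a handful of very highly expressed genes from dwarfing everything else on the plot.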

3.1 years ago by sim.j.baum ▴ 140

score 6 · Answer 1 · 2021-03-22

The first two are perfectly normal if you ask me, you simply have few DEGs. The third one indicates an unbalanced DEG profile, with many genes going down in one condition. Nothing "odd" as far as I am concerned.

You can use lfcShrink to shrink the logFCs, which might correct some of the larger FCs at the bottom left of the plot, see https://bioconductor.org/packages/release/bioc/vignettes/DESeq2/inst/doc/DESeq2.html#log-fold-change-shrinkage-for-visualization-and-ranking

Shrinkage is basically a moderation of the fold changes. Simplified: if counts are decently high and/or the standard errors of the lfcs between replicates are small, then the lfcs are probably trustworthy and will stand as-is. If not (large standard errors), they get shrunken towards zero, as there is little or no evidence that the large, biased lfcs are in fact real and not artifacts due to low counts or large variation between replicates. This is why shrunken lfcs are good for both visualization and ranking by effect size (=lfc).
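The intuition behind shrinkage can be sketched with a deliberately simplified model: a zero-centered normal prior on the true fold change, which pulls each estimate toward zero in proportion to its standard error. This is not the estimator lfcShrink uses (its apeglm, ashr and normal types are more involved); the `shrink_lfc` helper, the `prior_sd` value, and the two example genes are assumptions made purely for illustration.

```python
def shrink_lfc(lfc, se, prior_sd=1.0):
    """Shrink a log fold change toward zero under a zero-centered normal prior.

    Simplified illustration only: the weight approaches 1 when the standard
    error is small (estimate kept as-is) and approaches 0 when it is large.
    """
    weight = prior_sd**2 / (prior_sd**2 + se**2)
    return weight * lfc

# Hypothetical genes: (raw log2 fold change, standard error of that estimate).
genes = {
    "high_count_gene": (2.0, 0.1),   # precise estimate, should barely move
    "low_count_gene": (-6.0, 3.0),   # noisy estimate, should be pulled toward zero
}

for name, (lfc, se) in genes.items():
    print(f"{name}: raw = {lfc:.2f}, shrunken = {shrink_lfc(lfc, se):.2f}")
```

The noisy, low-count gene loses most of its large negative fold change while the well-supported gene is almost unchanged, which is the behaviour that cleans up the low-expression end of an MA plot.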