Question

How Many Genes Differentially Expressed In Microarray Can Be Seen As Normal?

4


13.9 years ago

Cheng Zhongshan ▴ 400

Hi, I have a time-course microarray dataset (0h, 24h, 48h, 72h, 96h and 144h after sexual stage induction) for Gibberella zeae, a plant pathogen with about 14,000 protein-coding sequences. After analyzing the arrays with the SAS PROC MIXED procedure, I find about 5,000 genes differentially expressed in total across the time course. Is this normal? I really hope somebody can give me some suggestions. Thanks.

microarray • 4.5k views

updated 13.3 years ago by Stefano Berri 4.4k • written 13.9 years ago by Cheng Zhongshan ▴ 400

0


When considering this question, keep in mind that many microarray normalization methods assume that only a small fraction of genes are differentially expressed. So even if a large fraction of genes really are differentially expressed, your analysis pipeline might not handle the data well, and you might get unpredictable or nonsensical results.

13.9 years ago by Ryan Thompson ★ 3.6k
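To make the normalization concern concrete, here is a minimal sketch using median centering as a stand-in for methods that assume most genes are unchanged. The log2 ratios are made up for illustration, and the thread does not say which normalization was actually applied to this dataset.

```python
from statistics import median


def median_center(log_ratios):
    """Subtract the array-wide median from every log2 ratio."""
    m = median(log_ratios)
    return [x - m for x in log_ratios]


# Hypothetical log2(treated/control) ratios for 10 genes.
few_changed = [0.0] * 8 + [2.0, 2.0]   # 2 of 10 genes truly induced
many_changed = [0.0] * 3 + [2.0] * 7   # 7 of 10 genes truly induced

# With few changed genes the median sits on the unchanged genes,
# so the true signal survives normalization.
print(median_center(few_changed))

# With many changed genes the median sits on the induced genes:
# they are shifted to 0 and the unchanged genes look repressed.
print(median_center(many_changed))
```

In the second case the three unchanged genes end up at -2 and the seven induced genes at 0, which is the kind of nonsensical result the comment warns about.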

score 6 · Answer 1 · 2010-05-25

Last year a paper suggested that nearly all genes are transcriptionally regulated during plant infection.

I think this might actually be the case for all organisms. When something happens, the whole transcriptome is slightly regulated. Some genes change dramatically; the others simply "adjust" to the new "state".

The fact is that, usually, you can show that only a few genes are regulated, because to pass a statistical test you need either a big shift in mean expression value or a very large number of replicates. Given the cost of microarrays, the latter is rarely possible, so you end up "seeing" only the genes that have big swings in expression. Furthermore, you need to correct for multiple testing, and to make sure you don't have too many false positives, you end up with many false negatives.

The above-mentioned paper had 72 (!) biological replicates because it pooled all the "controls" of a massive experiment, so its statistics are very powerful.

If you have many replicates and/or the biological replicates are very homogeneous, you might find that many genes appear regulated.
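The multiple-testing trade-off mentioned above can be illustrated with a small sketch of the Benjamini-Hochberg adjustment, one common correction. Which correction, if any, was used in the original SAS analysis is not stated in the thread, and the p-values below are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    n = len(pvalues)
    # Indices of the p-values, sorted from smallest to largest.
    order = sorted(range(n), key=lambda i: pvalues[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down so that adjusted values
    # never increase as the raw p-value decreases.
    for rank in range(n, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * n / rank)
        adjusted[i] = running_min
    return adjusted


pvals = [0.001, 0.008, 0.039, 0.041, 0.6]  # hypothetical per-gene p-values
adj = benjamini_hochberg(pvals)

significant_raw = sum(p < 0.05 for p in pvals)  # before correction
significant_fdr = sum(q < 0.05 for q in adj)    # after correction
print(significant_raw, significant_fdr)
```

Here four genes pass an uncorrected 0.05 threshold but only two survive the adjustment; the two that drop out are the modest changes that would need more replicates to be detected.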

score 4 · Answer 2 · 2010-05-25

4


13.9 years ago

User 59 13k

There's no metric for a 'normal' number of differentially expressed genes in a microarray experiment; this is going to vary massively depending on your experimental conditions. I've seen very well replicated experiments that have thousands of differentially expressed genes detectable in a very robust fashion, and other very targeted experiments (siRNA knockdowns) in which only a handful of genes are perturbed.

Given that you're reporting the number of genes differentially expressed 'in total of the time course', maybe you should be looking at changes between the timepoints as well as across the whole experiment?

The real issue is that dissecting a gene list 5000 genes long to get any more meaningful information is a bit more of a challenge than dissecting one 500 genes long.

13.9 years ago by User 59 13k

0


Actually, I use 0h as the control and compare each of the other time points with it. I then use a Perl script to find the intersections between these comparisons. For example, with 24h as A, 48h as B, and so on, I get subsets like the following: A, AB, AC, AD, AE, ABC, ABD, ABE, CDE, ..... ABCDE. Does this make sense?
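As a rough sketch of the subsetting described above, the following assigns each gene to the exact combination of comparisons in which it was called differentially expressed. It assumes that a label such as AB means "differentially expressed in A and B and in no other comparison"; the gene IDs and set contents are hypothetical, and this is not the original Perl script.

```python
from collections import defaultdict


def exclusive_regions(gene_sets):
    """Map each label combination (e.g. 'AB') to the genes found in exactly those sets."""
    membership = defaultdict(str)
    # Visit labels in sorted order so combinations read 'AB', never 'BA'.
    for label in sorted(gene_sets):
        for gene in gene_sets[label]:
            membership[gene] += label
    regions = defaultdict(set)
    for gene, combo in membership.items():
        regions[combo].add(gene)
    return dict(regions)


# Hypothetical differentially expressed gene lists, each versus 0h.
de_genes = {
    "A": {"g1", "g2", "g3"},  # 24h vs 0h
    "B": {"g2", "g3", "g4"},  # 48h vs 0h
    "C": {"g3", "g5"},        # 72h vs 0h
}

regions = exclusive_regions(de_genes)
for combo in sorted(regions):
    print(combo, sorted(regions[combo]))
```

Each gene lands in exactly one region, so the region sizes add up to the total number of differentially expressed genes.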

13.9 years ago by Cheng Zhongshan ▴ 400

0


I'd be more tempted to use something other than a Perl script for doing Venn diagrams. I'd seriously consider using something that allows you to set up meaningful contrasts (limma in Bioconductor, for instance) to analyse this data. There are also plenty of packages written specifically for time-course data. maSigPro comes to mind as well: http://bioinformatics.oxfordjournals.org/cgi/content/full/22/9/1096 (that reference should prove interesting regardless of whether or not you use the methodology)