How to remove spike-ins from Illumina BeadChip data
4.4 years ago
maria2019 ▴ 250

Hi,

I am very new to microarray analysis. I have some cancer and control samples (idat files) from an Illumina BeadChip to analyze. I was following the limma tutorial ( https://www.bioconductor.org/packages/release/bioc/vignettes/limma/inst/doc/usersguide.pdf ), page 107.

When I check the EListRaw object, I see many ERCC probes as well, which I did not expect (code and output are below).

  1. Why do I have spike-ins?
  2. How can I remove them? (Should I actually remove them?)

I ignored this and went through the downstream analysis, but the FDR that I get is very high (0.5 and above), which I think might be the result of not removing the ERCC probes.

    library(limma)

    # "path" and "my.bgx" are placeholders for the idat directory and the manifest file
    idatfiles <- dir("path", pattern = "idat", full.names = TRUE)
    bgxfile <- "my.bgx"
    x <- read.idat(idatfiles, bgxfile)
    x$other$Detection <- detectionPValues(x)
    table(x$genes$Status)

                    biotin            cy3_hyb      ERCC-00002-02      ERCC-00003-01 
                         2                  6                  1                  1 
             ERCC-00004-01      ERCC-00009-01      ERCC-00012-01      ERCC-00013-01 
                         1                  1                  1                  1 
             ERCC-00014-02      ERCC-00016-01      ERCC-00017-02      ERCC-00019-01 
                         1                  1                  1                  1 
             ERCC-00022-02      ERCC-00024-02      ERCC-00025-01      ERCC-00028-02 
                         1                  1                  1                  1 
             ERCC-00031-02      ERCC-00033-01      ERCC-00034-02      ERCC-00035-02 
                         1                  1                  1                  1 
             ERCC-00039-01      ERCC-00040-01      ERCC-00041-01      ERCC-00042-01 
                         1                  1                  1                  1 
             ERCC-00043-01      ERCC-00044-02      ERCC-00046-01      ERCC-00048-01 
                         1                  1                  1                  1 
             ERCC-00051-01      ERCC-00053-01      ERCC-00054-01      ERCC-00057-01 
                         1                  1                  1                  1 
             ERCC-00058-02      ERCC-00059-01      ERCC-00060-01      ERCC-00061-02 
                         1                  1                  1                  1 
             ERCC-00062-01      ERCC-00067-02      ERCC-00069-02      ERCC-00071-01 
                         1                  1                  1                  1 
             ERCC-00073-01      ERCC-00074-01      ERCC-00075-01      ERCC-00076-02 
                         1                  1                  1                  1 
             ERCC-00077-01      ERCC-00078-01      ERCC-00079-01      ERCC-00081-02 
                         1                  1                  1                  1 
             ERCC-00083-01      ERCC-00084-01      ERCC-00085-01      ERCC-00086-01 
                         1                  1                  1                  1 
             ERCC-00092-02      ERCC-00095-01      ERCC-00096-02      ERCC-00097-01 
                         1                  1                  1                  1 
             ERCC-00098-02      ERCC-00099-01      ERCC-00104-01      ERCC-00108-02 
                         1                  1                  1                  1 
             ERCC-00109-02      ERCC-00111-01      ERCC-00112-02      ERCC-00113-01 
                         1                  1                  1                  1 
             ERCC-00116-02      ERCC-00117-02      ERCC-00120-01      ERCC-00123-01 
                         1                  1                  1                  1 
             ERCC-00126-02      ERCC-00130-01      ERCC-00131-02      ERCC-00134-01 
                         1                  1                  1                  1 
             ERCC-00136-01      ERCC-00137-02      ERCC-00138-01      ERCC-00142-02 
                         1                  1                  1                  1 
             ERCC-00143-01      ERCC-00144-02      ERCC-00145-01      ERCC-00147-01 
                         1                  1                  1                  1 
             ERCC-00148-01      ERCC-00150-01      ERCC-00154-02      ERCC-00156-01 
                         1                  1                  1                  1 
             ERCC-00157-02      ERCC-00158-01      ERCC-00160-02      ERCC-00162-01 
                         1                  1                  1                  1 
             ERCC-00163-01      ERCC-00164-01      ERCC-00165-01      ERCC-00168-01 
                         1                  1                  1                  1 
             ERCC-00170-01      ERCC-00171-01       housekeeping           labeling 
                         1                  1                  7                  2 
        low_stringency_hyb           negative            regular 
                         8                770              47231
microarray limma beadchip illumina ERCC • 1.1k views
4.4 years ago

You need to leave the control probes in the data, because they are used for background correction and normalisation. If you then perform background correction and normalisation via neqc(), these control probes should be removed automatically.
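To see that behaviour without needing idat files, here is a minimal sketch on simulated data. The probe counts, intensities, and the names `n_regular`, `n_negative`, `n_ercc`, and `status` are made up for illustration; only `neqc()` and the `Status` column convention come from limma. It is a toy, not a substitute for running the function on real arrays.

```r
library(limma)

set.seed(1)
n_regular  <- 200   # hypothetical number of regular probes
n_negative <- 50    # hypothetical number of negative control probes
n_ercc     <- 5     # hypothetical number of ERCC spike-in probes
n_samples  <- 4

# Status labels in the same style as x$genes$Status from read.idat()
status <- c(rep("regular", n_regular),
            rep("negative", n_negative),
            paste0("ERCC-", seq_len(n_ercc)))

# Simulated raw intensities: negatives sit near background,
# regular and ERCC probes carry signal on top of it
E <- rbind(
  matrix(rexp(n_regular * n_samples, rate = 1 / 500) + 100, ncol = n_samples),
  matrix(rnorm(n_negative * n_samples, mean = 100, sd = 10), ncol = n_samples),
  matrix(rexp(n_ercc * n_samples, rate = 1 / 500) + 100, ncol = n_samples)
)
rownames(E) <- paste0("probe", seq_len(nrow(E)))
colnames(E) <- paste0("sample", seq_len(n_samples))

x <- new("EListRaw", list(
  E = E,
  genes = data.frame(ProbeID = rownames(E), Status = status,
                     stringsAsFactors = FALSE)
))

table(x$genes$Status)   # controls are present before normalisation

# neqc() uses the negative controls for background correction,
# quantile normalises, log2 transforms, and keeps only "regular" probes
y <- neqc(x)

dim(x)   # all probes
dim(y)   # regular probes only
```

Comparing `dim(x)` with `dim(y)` on your own data is a quick way to confirm that the spike-ins and other controls are gone after normalisation.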

After normalisation, you can filter further based on the detection p-values. Any control probes that still remain in the data may be identified via x$genes$Source. Probes with no gene symbol (x$genes$Symbol == "") can also be filtered out.

Thus, filtering that I perform post-normalisation is like this:

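    # project.bgcorrect.norm is the background-corrected, normalised object (e.g. the output of neqc())
    Control <- project.bgcorrect.norm$genes$Source == "ILMN_Controls"
    NoSymbol <- project.bgcorrect.norm$genes$Symbol == ""
    # keep probes detected (p <= 0.05) in at least 3 samples
    isexpr <- rowSums(project.bgcorrect.norm$other$Detection <= 0.05) >= 3

    project.bgcorrect.norm.filt <- project.bgcorrect.norm[!Control & !NoSymbol & isexpr, ]

    # compare the number of probes before and after filtering
    dim(project.bgcorrect.norm)
    dim(project.bgcorrect.norm.filt)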
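
To make the logic of the detection and symbol filters concrete, here is a small base-R sketch on a toy table. The p-values, probe names, and placeholder symbols (`det`, `symbol`, "GENE_A" and so on) are invented for illustration and do not come from any real array.

```r
# Toy detection p-values: 5 probes (rows) x 4 samples (columns)
det <- matrix(c(
  0.01, 0.02, 0.00, 0.03,
  0.01, 0.20, 0.04, 0.60,
  0.50, 0.70, 0.90, 0.40,
  0.00, 0.00, 0.01, 0.02,
  0.03, 0.04, 0.05, 0.90
), nrow = 5, byrow = TRUE,
dimnames = list(paste0("probe", 1:5), paste0("sample", 1:4)))

# Placeholder gene symbols; an empty string means the probe has no symbol
symbol <- c("GENE_A", "GENE_B", "GENE_C", "", "GENE_E")

# A probe counts as expressed if it is detected (p <= 0.05) in at least 3 samples
isexpr <- rowSums(det <= 0.05) >= 3
NoSymbol <- symbol == ""

# Keep probes that are expressed and annotated
keep <- isexpr & !NoSymbol
det[keep, , drop = FALSE]
```

Here probe2 and probe3 fail the detection threshold and probe4 is dropped for lacking a symbol, so only probe1 and probe5 are kept. The cut-off of 3 samples is a choice; a common alternative is the size of the smallest experimental group.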

Kevin


Hi Kevin,

Thank you very much for your answer. The code above worked for me, and no control probes remain after normalization.

Maryam
