Removing contaminants from 16S data
5.6 years ago
agata88 ▴ 870

Hi all!

I have one case sample and one negative control (water), both with 16S FASTQ results. At the end of the 16S analysis I have a table of read counts per sample for each identified OTU.

Should I remove all OTUs that appear in the water control from the analysis, or subtract the control reads?

Here is an example:

        water    sample
OTU_1   95691   21841
OTU_2   11852   18378

Should I remove OTU_1 and OTU_2 from further analysis, or just subtract the control reads, like this:

        water    sample
OTU_1   95691   0 (21841-95691)
OTU_2   11852   6526 (18378-11852)

Which approach is better?

Many thanks, Agata


This depends on what your question is:

Are the water OTUs pure contamination in your sample, and is the abundance difference an artifact of, for example, the sample solvent (water)? Then I'd remove them (or consider a presence/absence analysis).

If the abundance difference might instead indicate a microbiome adaptation, keep them.

Keep in mind that abundances can be skewed by library size, so normalize. For many generic 16S tasks there are excellent tutorials on the web; the qiime2 tutorials are one example.
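To illustrate the library-size point, here is a minimal Python sketch that converts raw counts to relative abundances (one simple way to normalize, not the only one). It assumes the two OTUs from the question are the whole library, which is certainly not true for real data; the function name `relative_abundance` is made up for this example.

```python
# Counts copied from the example table in the question.
# Assumption: these two OTUs are the entire library (for illustration only).
counts = {
    "water": {"OTU_1": 95691, "OTU_2": 11852},
    "sample": {"OTU_1": 21841, "OTU_2": 18378},
}


def relative_abundance(otu_counts):
    """Divide each OTU count by the total number of reads in that library."""
    total = sum(otu_counts.values())
    if total == 0:
        # An empty library has no meaningful proportions; report zeros.
        return {otu: 0.0 for otu in otu_counts}
    return {otu: n / total for otu, n in otu_counts.items()}


normalized = {name: relative_abundance(c) for name, c in counts.items()}

for name, fractions in normalized.items():
    print(name, {otu: round(f, 3) for otu, f in fractions.items()})
```

After this step the two libraries are on the same scale (fractions summing to 1), so differences in sequencing depth no longer dominate the comparison.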

5.6 years ago
gb ★ 2.2k

As Carambakaracho says, if your negative control should be empty, remove them.

I don't know if this example is real data; if so, the number of reads in the negative control seems very high. You can expect some reads in your negative control, but I would never expect more than in your actual sample. Also check what kind of bacteria it is: if this organism can be found everywhere in the lab, it may actually have been present in your negative control.
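A quick way to spot the situation described above is to list the OTUs where the control has more reads than the sample. This is only a rough sketch on raw counts from the question's table; the helper name `otus_exceeding_sample` is hypothetical, and in practice you would probably compare normalized values.

```python
# Counts copied from the example table in the question.
water = {"OTU_1": 95691, "OTU_2": 11852}
sample = {"OTU_1": 21841, "OTU_2": 18378}


def otus_exceeding_sample(sample_counts, control_counts):
    """Return OTUs whose control read count is higher than the sample's."""
    return [
        otu
        for otu, n in sample_counts.items()
        if control_counts.get(otu, 0) > n
    ]


suspicious = otus_exceeding_sample(sample, water)
print(suspicious)  # ['OTU_1']
```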


Thanks. The data is real, and it does have more reads in the negative control, probably because of nested PCR. I've decided to subtract the reads.
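For reference, the two options discussed in this thread could be sketched in Python roughly as follows. The counts come from the example table; the function names are invented for illustration, and clipping negative results to zero is an assumption that matches the `0 (21841-95691)` entry in the question.

```python
# Counts copied from the example table in the question.
water = {"OTU_1": 95691, "OTU_2": 11852}
sample = {"OTU_1": 21841, "OTU_2": 18378}


def subtract_control(sample_counts, control_counts):
    """Subtract control reads per OTU; negative results are clipped to 0."""
    return {
        otu: max(n - control_counts.get(otu, 0), 0)
        for otu, n in sample_counts.items()
    }


def remove_control_otus(sample_counts, control_counts):
    """Drop every OTU that has at least one read in the control."""
    return {
        otu: n
        for otu, n in sample_counts.items()
        if control_counts.get(otu, 0) == 0
    }


subtracted = subtract_control(sample, water)
removed = remove_control_otus(sample, water)

print(subtracted)  # {'OTU_1': 0, 'OTU_2': 6526}
print(removed)     # {}
```

Note that subtracting raw counts ignores the difference in library size between the control and the sample, and that strict removal leaves nothing at all in this two-OTU example.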


It still matters what your goal is. If OTU_1 has a good match with, say, BLAST, it is still something that is definitely present in one of your samples. 21841 reads is not just noise or junk, and you are now removing it. So it is present, but because of the high number of reads in the negative control it could be contamination from another sample.
