RNASeq reads from two bacterial species
7.2 years ago

Dear All

I am trying to analyze co-transcriptome data from two enteric pathogens. These are new clinical isolates (X and Y). I have RNASeq reads from each species grown individually (X or Y) and from the co-culture (X+Y). In in vitro cultures, X suppresses the growth rate of Y, and we are trying to find a mechanistic explanation for this pattern.

For this, I used Trinity for de novo transcriptome assembly and then quantified with RSEM (as an example of an alignment-based method) or kallisto (as an example of an alignment-free method). I then ran DESeq2 on the read counts from both RSEM and kallisto and compared the differentially expressed genes from each.

I get contradictory results from the two methods. With RSEM, > 2000 genes are significantly up-regulated in the X+Y co-culture relative to individually grown X, and < 100 are down-regulated. With kallisto, > 1500 genes are significantly down-regulated in the X+Y co-culture relative to individually grown X, and ~300 are up-regulated.

Also, the skew in the number of DEGs towards up- or down-regulation is a bit suspicious.
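One way to see whether the two pipelines disagree gene by gene (and not just in the totals) is to cross-tabulate the direction of change each one reports. Below is a minimal sketch, assuming the DESeq2 results from each pipeline have already been parsed into dictionaries mapping a gene ID to (log2 fold change, adjusted p-value). The function names, the 0.05 cutoff and the toy values are all hypothetical and only there to make the example runnable.

```python
# Hypothetical helper: compare the direction of change reported by two pipelines.
# Each input maps gene ID -> (log2FoldChange, padj), e.g. parsed from exported
# DESeq2 result tables. Names, cutoff and values below are illustrative only.

def direction(lfc, padj, alpha=0.05):
    """Return 'up', 'down' or 'ns' (not significant) for one gene."""
    if padj is None or padj >= alpha:
        return "ns"
    return "up" if lfc > 0 else "down"


def compare_calls(rsem, kallisto, alpha=0.05):
    """Count (RSEM call, kallisto call) pairs for genes present in both sets."""
    table = {}
    for gene in sorted(set(rsem) & set(kallisto)):
        key = (
            direction(*rsem[gene], alpha),
            direction(*kallisto[gene], alpha),
        )
        table[key] = table.get(key, 0) + 1
    return table


# Made-up toy values, not real results.
toy_rsem = {"g1": (2.1, 0.001), "g2": (1.5, 0.01), "g3": (-0.2, 0.8)}
toy_kallisto = {"g1": (-1.9, 0.002), "g2": (1.2, 0.03), "g3": (0.1, 0.9)}

table = compare_calls(toy_rsem, toy_kallisto)
for (rsem_call, kallisto_call), n in sorted(table.items()):
    print(rsem_call, kallisto_call, n)
```

If most significant genes were to land in the ("up", "down") cell, that might point to something like a swapped contrast or sample label order in one of the DESeq2 runs rather than a real difference between the quantifiers, but that is only a guess to check against the actual setup.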

My question is: which method should I follow in this case? What is a good approach for analyzing RNASeq data from such an experimental setup?

Any insights or help will be highly appreciated. Thanks.

RNA-Seq kallisto RSEM trinity co-culture • 2.4k views

You do not say how many biological replicates you have per condition. Different (correct) methods will arrive at different results, and even more so if the experimental design is insufficient.

Anyway, read the literature, decide on the method, and go ahead. The danger of trying too many methods is later cherry-picking the one with results suiting your expectations about the outcome.

Good advice. kallisto (and similar tools) has its applications, but where the reference itself is not very solid it may not be the right tool.

That's basically why I thought it would be better to rely on a de novo transcriptome assembly rather than a fragmented genome assembly (I also have Illumina DNA sequences for the same isolates). Is this assumption correct?

Also, neither method seems to make sense at this point, since the number of differentially expressed genes is unrealistic.

If the organisms are very similar, then this experimental approach is not likely to answer the question being posed.

The explanation in this case may turn out to be mundane, e.g. organism X simply grows faster than Y in those culture conditions and out-competes it for nutrients. Have you looked at their growth rates independently in the same conditions? Can you provide some additional details about what the organisms in question are and what experimental conditions are being used?

The organisms are clinical isolates of Vibrio cholerae (VCH) and enterotoxigenic E. coli (ETEC). The growth rate of ETEC is faster than that of VCH when grown on M9+glucose or LB.

If that is the case then it is going to be difficult to use RNAseq data to find an explanation. Perhaps that is being reflected in the results you are seeing.

Since you have already done the experiment, you could try the solution suggested by @h.mon below and see if that produces any useful results.

Thanks. I have three biological replicates from each single culture and from the co-culture. Unfortunately, most of the literature is about handling co-transcriptomes from two different domains (eukaryote/prokaryote), which is a bit easier since you can treat the reads of one organism as contaminants when quantifying the other.

Have you checked the Trinity assembly to see if it looks reasonable? Since you are working with bacteria, you don't expect splicing (Trinity is designed for eukaryotic transcriptomes). I wonder if you may be better off doing a normal assembly with SPAdes (or rnaSPAdes) instead.

7.1 years ago
h.mon 35k

What I would do:

  1. Assemble the X and Y genomes separately, using both RNAseq and DNAseq reads. Probably do diginorm (digital normalization) first to reduce the RNAseq coverage bias.

  2. Annotate using Prokka.

  3. Use BBSplit to find reads that map uniquely to X or to Y in all three RNAseq datasets (X, Y and X+Y).

  4. Use only this subset for the DGE analysis.
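The idea behind steps 3 and 4 can be sketched in a few lines. This is a conceptual illustration only and not how BBSplit is implemented: it assumes you already have, for each read, the best alignment score against the X assembly and against the Y assembly (`None` meaning no alignment). The function names, read IDs and scores are hypothetical.

```python
# Conceptual sketch only: this is NOT BBSplit's algorithm or interface.
# Input: read ID -> (best score vs X, best score vs Y); None = no alignment.
# All names and values are made up for illustration.

def assign_read(score_x, score_y):
    """Return 'X', 'Y', 'ambiguous' or 'unmapped' for one read."""
    if score_x is None and score_y is None:
        return "unmapped"
    if score_y is None:
        return "X"
    if score_x is None:
        return "Y"
    # The read aligns to both references, so it is not unique to either.
    return "ambiguous"


def split_reads(scores):
    """Group read IDs by assignment; only the 'X' and 'Y' bins go to DGE."""
    bins = {"X": [], "Y": [], "ambiguous": [], "unmapped": []}
    for read_id, (score_x, score_y) in scores.items():
        bins[assign_read(score_x, score_y)].append(read_id)
    return bins


toy_scores = {
    "read1": (60, None),    # aligns only to X
    "read2": (None, 55),    # aligns only to Y
    "read3": (58, 58),      # aligns to both, dropped from the DGE subset
    "read4": (None, None),  # aligns to neither
}
bins = split_reads(toy_scores)
```

Applying the same filter to the X-only, Y-only and X+Y libraries keeps the comparison consistent, since reads from conserved regions are then excluded from every condition and not just from the co-culture.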

Moreover, keep in mind the difference in phenotype may be due to presence / absence of some gene(s), or some genetic variant.

#3 is going to be difficult, especially if the isolates are organisms that are very similar.
