Extracted shotgun metagenomic reads that mapped to a reference genome. SPAdes or metaSPAdes for de novo assembly?
5.5 years ago
O.rka ▴ 710

I have a reference strain and mapped all of my shotgun metagenomic reads to the reference strain using BBMap.

I extracted the mapped reads and want to create an assembly from this.

I usually use metaSPAdes for this but would SPAdes be better suited for this task? My PI prefers using SPAdes and metaSPAdes but I'm wondering which one would be better for this task in particular since it's only a (semi-)supervised subset of a metagenome.

[Bonus] if there is another assembler that is better suited for this exact task please let me know.

assembly metagenomics de-novo • 1.8k views
5.5 years ago
h.mon 35k

To answer your question directly: I think you should use SPAdes, and then probably filter out contigs whose coverage diverges too much from that of the longest contigs. This assumes the longest contigs belong to the strain of interest, which they should, since the reads used for assembly have been enriched for that strain. Also remember that contigs containing rRNA reads generally have abnormally high coverage. However, if the strain of interest is rare in your metagenomic sample, the resulting assembly will be very fragmented due to low coverage.
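As a rough illustration of the coverage filter, here is a minimal Python sketch. It assumes contig names follow the SPAdes convention `NODE_<id>_length_<len>_cov_<cov>` (where `cov` is k-mer coverage, not read coverage). The function names, the choice of the median coverage of the longest contigs as the reference value, and the `n_longest` and `max_fold` thresholds are all illustrative assumptions to be tuned for your data, not recommended settings.

```python
import re
from statistics import median

# SPAdes names contigs like "NODE_1_length_250000_cov_31.7".
HEADER_RE = re.compile(r"NODE_(\d+)_length_(\d+)_cov_([\d.]+)")


def parse_header(name):
    """Return (length, coverage) parsed from a SPAdes-style contig name."""
    match = HEADER_RE.search(name)
    if match is None:
        raise ValueError(f"not a SPAdes-style contig name: {name!r}")
    return int(match.group(2)), float(match.group(3))


def filter_by_coverage(names, n_longest=10, max_fold=3.0):
    """Keep contigs whose coverage is within max_fold of a reference coverage.

    The reference coverage is the median coverage of the n_longest contigs,
    which are assumed (hypothetically) to belong to the strain of interest.
    """
    stats = {name: parse_header(name) for name in names}
    # Sort (length, coverage) pairs by length, longest first.
    longest = sorted(stats.values(), key=lambda lc: lc[0], reverse=True)[:n_longest]
    ref_cov = median(cov for _, cov in longest)
    low, high = ref_cov / max_fold, ref_cov * max_fold
    return [name for name in names if low <= stats[name][1] <= high]


# Made-up contig names for illustration only.
example = [
    "NODE_1_length_250000_cov_30.5",
    "NODE_2_length_180000_cov_28.1",
    "NODE_3_length_90000_cov_33.0",
    "NODE_4_length_5000_cov_410.2",  # e.g. an rRNA-containing contig
    "NODE_5_length_1200_cov_2.3",    # e.g. a low-abundance contaminant
]
kept = filter_by_coverage(example, n_longest=3)
```

With these made-up names, the two short contigs with outlying coverage are dropped and the three long ones are kept. On real data you would read the names from the `contigs.fasta` headers and inspect the coverage distribution before settling on a cutoff.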

More philosophically:

Looking at some of your recent threads, it seems you have been struggling with the same issue for some days now. From oldest to most recent:

How to extract reads that match k-mer profiles from a collection of sequences?

How to interpret sam file generated from BBMap?

This thread

Can you assemble with merged paired end reads and unmatched reads as "single ended" reads?

So it seems you have shotgun metagenomic sequencing data, but are interested in only one particular strain. It would be helpful if you described the problem in more detail, along with your motivation for taking this approach. This would help us evaluate whether your approach is sound, or whether a completely different approach would be better.

The approach you have chosen seems to be mapping to a reference strain (a published genome?), and then assembling the genome using just the mapped reads. I wonder if just mapping to the reference strain and examining the differences (calling SNPs / indels and structural variants) would be good enough for your purposes? Or would it be better to assemble the whole metagenome, and then recover the contigs belonging to the strain of interest?


Yes, that's exactly what I'm doing: I have downloaded all of the reference genomes for a species, and I'm mapping my reads to them with a wide net (BBMap default = 76% identity), extracting the mapped reads, and assembling these. I'm not necessarily looking for closed genomes, but mostly de novo assemblies of the organism in the samples I have. Is this method appropriate? Would it be better to use k-mer profiles instead of mapping? I planned on manually binning the contigs afterwards to exclude any false positives.
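For concreteness, the map-extract-assemble workflow described above could be scripted roughly as follows. This is only a sketch under assumptions: the file names and helper functions are hypothetical, and it assumes that giving BBMap a single `outm=` file for paired input writes the mapped pairs interleaved, so that SPAdes can read them with `--12`. Check `bbmap.sh` and `spades.py --help` for the versions you have installed before relying on any of these flags.

```python
import shlex


def bbmap_command(ref, reads1, reads2, mapped_out, minid=0.76):
    """Build a BBMap call that writes only mapped reads to mapped_out.

    minid=0.76 mirrors the default identity mentioned in the thread.
    """
    return [
        "bbmap.sh",
        f"ref={ref}",
        f"in={reads1}",
        f"in2={reads2}",
        f"outm={mapped_out}",  # assumed to be interleaved for paired input
        f"minid={minid}",
    ]


def spades_command(interleaved_reads, outdir, meta=False):
    """Build a SPAdes call; meta=True switches to metaSPAdes mode."""
    cmd = ["spades.py", "--12", interleaved_reads, "-o", outdir]
    if meta:
        cmd.insert(1, "--meta")
    return cmd


# Hypothetical file names; print the commands rather than running them.
print(shlex.join(bbmap_command("refs.fa", "R1.fq.gz", "R2.fq.gz", "mapped.fq")))
print(shlex.join(spades_command("mapped.fq", "spades_out")))
print(shlex.join(spades_command("mapped.fq", "metaspades_out", meta=True)))
```

Building the commands as lists makes it easy to run both the SPAdes and metaSPAdes variants on the same extracted reads (for example with `subprocess.run(cmd, check=True)`) and compare the resulting assemblies.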
