metagenomic assembly coverage, for multiple samples
6.0 years ago
willnotburn ▴ 50

I have multiple samples (interleaved reads) that were co-assembled into a single assembly, final.contigs.fa. The downstream goal is to analyze gene distribution among the samples (multivariate stats, etc.). The first step is to map the reads from each sample back onto final.contigs.fa with bowtie2. I did that and converted the resulting SAM files to sorted BAM files. Now I am trying to determine coverage. Questions:

Q1: Assembly coverage. My friend asks: what's your coverage? He means it as an assembly quality measure, a single easy number like 30X. This post explores tools to get such a number from mpileup results.

So, do I just concatenate all my BAM files and run samtools mpileup concatenated.bam, or maybe samtools mpileup *.bam? Please help me out.
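For concreteness, here is a minimal sketch of the kind of single number I am after. It assumes samtools depth instead of mpileup (a swapped-in tool, not something the linked post uses), since samtools depth prints one depth column per input BAM. The function name pooled_mean_depth and the file names are hypothetical. The -a flag matters here: without it, zero-coverage positions are omitted and the mean is inflated.

```shell
# Mean depth over all positions, pooled across every depth column.
# Input on stdin: samtools depth-style TSV
#   (contig, position, then one depth column per BAM file).
pooled_mean_depth() {
    awk -F'\t' '
        # Sum every depth column on the line; count one position per line.
        { for (i = 3; i <= NF; i++) total += $i; n++ }
        # Guard against empty input so we never divide by zero.
        END { if (n > 0) printf "%.2f\n", total / n; else print "NA" }
    '
}

# Hypothetical usage (file names are placeholders):
#   samtools depth -a sample1.bam sample2.bam sample3.bam | pooled_mean_depth
```

This treats the assembly-wide coverage as the sum of the per-sample depths at each position, averaged over all positions, which is only one of several reasonable definitions.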

Q2: Per-sample coverage. Following up on this old post, is there a difference between

samtools mpileup (options) sample1.bam sample2.bam sample3.bam

and

samtools mpileup (options) sample1.bam
samtools mpileup (options) sample2.bam
samtools mpileup (options) sample3.bam

in the way coverage is calculated? (The linked OP asked about variant calling.)
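To make the per-sample question concrete, here is a rough sketch that assumes samtools depth rather than mpileup. When given several BAM files it writes one depth column per file, in command-line order, so a per-sample mean can be taken column by column from one run. The function name per_sample_mean_depth, the sampleN labels, and the file names are all hypothetical.

```shell
# Mean depth per sample, one output line per depth column.
# Input on stdin: samtools depth-style TSV
#   (contig, position, then one depth column per BAM file).
per_sample_mean_depth() {
    awk -F'\t' '
        {
            # Accumulate each depth column separately.
            for (i = 3; i <= NF; i++) sum[i] += $i
            if (NF > cols) cols = NF
            n++
        }
        END {
            # Column 3 is the first BAM on the command line, and so on.
            for (i = 3; i <= cols; i++)
                printf "sample%d\t%.2f\n", i - 2, sum[i] / n
        }
    '
}

# Hypothetical usage (file names are placeholders):
#   samtools depth -a sample1.bam sample2.bam sample3.bam | per_sample_mean_depth
```

Running the same function on the output of three separate single-BAM runs would give three one-line results, which can be compared against the joint run to check whether the two approaches agree.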

Lastly, any opinions on what is "good coverage"? For example, if each sample has 5X-10X coverage, is that good enough?

metagenomics coverage bowtie2 samtools mpileup • 1.3k views