I have a question about mtDNA. If I extract the whole cellular genome and sequence it with Illumina or another NGS technique, can I use a bioinformatics tool to extract just the mitochondrial DNA from the output (such as a FASTA file)? Which tools should I use?
Hi, I'm sorry for the ambiguity. It is a hypothetical question. I was wondering whether it is possible to extract mtDNA sequences from WGS data, for example from an Illumina run. The output would be the sequences of the whole cellular genome (nuclear and mitochondrial), but what if I want to extract just the mitochondrial data? Is that possible? Thank you for the help.
Yes, technically this is possible from WGS data, but you will not get much output. To put it in numbers, an average WGS file (say 30-50x, with 500-1000 million fragments sequenced) gives you a few hundred thousand reads mapping to chrM. The question is what you want to do with it. If you want high-coverage chrM reads, download any ATAC-seq experiment: because of how the library is prepared (especially with the original protocol from 2013), you will have plenty of chrM fragments in the library.
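To relate a read count like the one above to sequencing depth over chrM, here is a minimal sketch. The function name `mt_depth` is hypothetical, and the inputs are illustrative assumptions: 300,000 reads, 150 bp reads, and a 16,569 bp mitochondrial genome (the length of the human reference mtDNA). Plug in the values for your own organism and run.

```shell
# Rough expected depth = reads * read_length / genome_length.
# All three inputs below are illustrative assumptions, not measurements.
mt_depth() {
    # $1 = reads mapped to chrM
    # $2 = read length in bp
    # $3 = mitochondrial genome length in bp
    awk -v n="$1" -v len="$2" -v g="$3" \
        'BEGIN { printf "%.0f\n", n * len / g }'
}

# Example call with the assumed values
mt_depth 300000 150 16569
```

This ignores duplicates, soft-clipping and uneven coverage, so treat the result as an order-of-magnitude estimate only.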
Given an indexed BAM file, extraction is simply:
samtools view -bo chrM.bam total.bam chrM
Of course, chrM must be in the reference genome FASTA file that you align against. What exactly do you want to do?
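Before extracting, it can help to check how many reads actually landed on the mitochondrial contig. The sketch below parses text in the `samtools idxstats` format (tab-separated: reference name, reference length, mapped reads, unmapped reads) and reports the count and fraction for one contig. The function name `chrM_fraction` is hypothetical, and the contig name is an assumption: it is `chrM` in UCSC-style references but `MT` in some others.

```shell
# Input on stdin: `samtools idxstats` output (tab-separated columns:
#   reference name, reference length, mapped reads, unmapped reads).
# $1 = name of the mitochondrial contig (assumed chrM here)
chrM_fraction() {
    awk -F '\t' -v mt="$1" '
        { total += $3 }            # sum mapped reads over all contigs
        $1 == mt { mito = $3 }     # remember the mitochondrial count
        END {
            if (total == 0) { print "no mapped reads"; exit 1 }
            printf "%s\t%d\t%.4f\n", mt, mito, mito / total
        }'
}

# Usage (assumes total.bam is coordinate-sorted and indexed):
#   samtools idxstats total.bam | chrM_fraction chrM
```

If the contig is missing from the reference, the reported count is 0, which is a quick way to spot a reference without chrM.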
You mean in WGS? The kits used to extract genomic DNA typically do not capture the small, circular, plasmid-like mitochondrial genome. Edit: This should raise the question of whether the alignments to chrM are indeed mtDNA or false positives originating from mitochondrial homologues in the genome.
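One common heuristic for the false-positive concern is to keep only alignments with a high mapping quality, since reads that fit a nuclear homologue about as well as chrM tend to receive a low MAPQ from the aligner. This reduces the ambiguity but does not eliminate it. The sketch below works on SAM text; the function name `filter_mt_sam` is hypothetical and the cutoff of 30 is an arbitrary illustrative value, not a recommendation from this thread.

```shell
# Reads SAM text on stdin. Keeps header lines (starting with @) and
# alignments whose reference name (column 3) is the mitochondrial
# contig and whose MAPQ (column 5) is at least the given threshold.
# $1 = contig name, $2 = minimum MAPQ (illustrative cutoff)
filter_mt_sam() {
    awk -F '\t' -v mt="$1" -v minq="$2" '
        /^@/ { print; next }
        $3 == mt && $5 + 0 >= minq + 0 { print }'
}

# Usage:
#   samtools view -h total.bam chrM | filter_mt_sam chrM 30
```

In practice the same MAPQ filter is available directly through the `-q` option of `samtools view`; the awk version is only meant to make the logic explicit.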
There are many programs that aim to fish out mitochondrial reads (particularly from shotgun whole-genome sequencing) and assemble them into a single contig, e.g. MITObim and Norgal. This is complicated by the fact that there are "nuclear mitochondrial DNA segments", or NUMTs, spread over eukaryotic genomes; the Wikipedia page provides lots of references.
Hi, your question is ambiguous. Please explain if it is just hypothetical or if you are dealing with real data and which.
What do you mean by: