Hi people,
I have being working on a script for removing mtdna reads from ddrad dataset. However, the only mtdna genome available is from a species of another family (too far?). I don`t know if it is because of that, but the mitochondria_stat says nothing. But maybe you could see some errors in my script.
- build index for mitochondria genome
bwa index xenopus_tropicalis_mt.fasta
- align reads to mitochondria
bwa mem xenopus_tropicalis_mt.fasta lib1_index04_TGACCA_R1.fastq.gz > reads_lib1i04.sam
- get statistics of mitochondria alignment
samtools flagstat reads_lib1i04.sam > mitochondria_stat.txt
- make a new file without mitochondria, a new file clean_reads.bam will be created
samtools view -b -f 4 -o clean_reads.bam reads$i.sam
- convert bam file to fastq, a new file clean_reads.fastq will be created
bamToFastq -i clean_reads.bam -fq clean_reads.fastq
And here it is the mitochondria stats:
0 + 0 in total (QC-passed reads + QC-failed reads)
0 + 0 secondary
0 + 0 supplementary
0 + 0 duplicates
0 + 0 mapped (N/A : N/A)
0 + 0 paired in sequencing
0 + 0 read1
0 + 0 read2
0 + 0 properly paired (N/A : N/A)
0 + 0 with itself and mate mapped
0 + 0 singletons (N/A : N/A)
0 + 0 with mate mapped to a different chr
0 + 0 with mate mapped to a different chr (mapQ>=5)