6.7 years ago
arta
Hey,
I am analyzing RNA-seq data and I am interested in duplicated reads. I know I can count the overall number of duplicated reads by marking them with Picard MarkDuplicates and then running:
samtools view -f 1024 dedup_reads.bam | wc -l
But I am interested in how these duplicates are distributed across genes. I have both SAM and BAM files.
Here is a simple example of what I would like to have:
Gene     Total reads   Duplicated reads
Gene1    30            10
Gene2    100           20
Gene3    20            0
I googled but couldn't find any tools. Are there any tools, software, or packages for this? Or should I implement it myself?
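To make the desired output concrete, here is a minimal sketch of what a do-it-yourself version might look like. It assumes the duplicates have already been marked (FLAG bit 0x400, i.e. 1024) and that the SAM file is read as plain text. The function name `count_per_gene` and the `genes` dictionary layout are hypothetical, made up for illustration. It also simplifies heavily: a read is assigned to a gene by its leftmost mapped position only, so spliced alignments, strand, and overlapping genes are not handled properly.

```python
DUP_FLAG = 0x400       # SAM FLAG bit for "PCR or optical duplicate" (decimal 1024)
UNMAPPED_FLAG = 0x4    # SAM FLAG bit for "read unmapped"


def count_per_gene(sam_lines, genes):
    """Count total and duplicate-flagged reads per gene.

    sam_lines: iterable of SAM text lines (header lines are skipped).
    genes: dict of gene name -> (chrom, start, end), 1-based inclusive.
           This layout is an assumption made for this sketch.
    Returns a dict of gene name -> [total_reads, duplicated_reads].
    """
    counts = {name: [0, 0] for name in genes}
    for line in sam_lines:
        if line.startswith("@"):
            continue  # header line
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:
            continue  # not a complete alignment record
        flag = int(fields[1])
        chrom = fields[2]
        pos = int(fields[3])  # leftmost mapped position, 1-based
        if flag & UNMAPPED_FLAG:
            continue
        # Simplification: assign the read by its leftmost position only.
        for name, (g_chrom, g_start, g_end) in genes.items():
            if chrom == g_chrom and g_start <= pos <= g_end:
                counts[name][0] += 1
                if flag & DUP_FLAG:
                    counts[name][1] += 1
    return counts


if __name__ == "__main__":
    # Toy gene coordinates and alignments, invented for illustration only.
    genes = {"Gene1": ("chr1", 100, 200), "Gene2": ("chr1", 500, 600)}
    sam = [
        "@HD\tVN:1.6\tSO:coordinate",
        "r1\t0\tchr1\t110\t60\t50M\t*\t0\t0\t*\t*",
        "r2\t1024\tchr1\t110\t60\t50M\t*\t0\t0\t*\t*",
        "r3\t16\tchr1\t520\t60\t50M\t*\t0\t0\t*\t*",
        "r4\t4\t*\t0\t0\t*\t*\t0\t0\t*\t*",
    ]
    for name, (total, dups) in count_per_gene(sam, genes).items():
        print(f"{name}\t{total}\t{dups}")
```

On a real file the lines could come from `open("dedup_reads.sam")`, and the gene coordinates would have to be taken from an annotation file. The linear scan over all genes for every read is only reasonable for small gene lists.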
Great, thanks Pierre !!!