5.2 years ago
zizigolu
Sorry,
I have a list of BAM files, for instance:
HUMAN_1000Genomes_hs37d5_RNA_seq_WTSI-COLIVM_005_1pre.***dupmarked***.bam
By "dupmarked" I assume that likely duplicates have been marked. I want to extract raw read counts from these files with featureCounts. Do you suggest I remove the duplicates? How do I know whether I should do that before extracting raw read counts?
Thank you for any help
See "RNA-seq: Should remove duplicates in all samples of same experiment although some do not have technical duplicates?" and the links therein.
Sorry, the owner of the data says that duplicates have not been removed, only marked in the BAM files. So should I remove them now?
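For context on what "marked but not removed" means in practice: a marked duplicate is still present in the BAM file, it just has bit 0x400 set in its SAM FLAG field. As far as I know, featureCounts counts such reads by default and only skips them when `--ignoreDup` is given, so nothing has to be physically removed either way. Below is a minimal Python sketch of how that bit can be checked. The function names and the three example records are made up for illustration; only the 0x400 flag value comes from the SAM specification.

```python
# Minimal sketch: detect reads marked as duplicates via the SAM FLAG field.
# The records below are invented examples, not taken from the poster's data.

DUP_FLAG = 0x400  # SAM spec: "PCR or optical duplicate"


def is_duplicate(flag: int) -> bool:
    """Return True if the duplicate bit is set in a SAM FLAG value."""
    return bool(flag & DUP_FLAG)


def count_duplicates(sam_lines):
    """Return (total_reads, marked_duplicates) for SAM-formatted text lines."""
    total = 0
    dups = 0
    for line in sam_lines:
        # Skip blank lines and header lines (headers start with '@').
        if not line.strip() or line.startswith("@"):
            continue
        flag = int(line.split("\t")[1])  # FLAG is the second column
        total += 1
        if is_duplicate(flag):
            dups += 1
    return total, dups


example = [
    "@HD\tVN:1.6\tSO:coordinate",
    "read1\t99\t1\t100\t60\t50M\t=\t300\t250\t*\t*",
    "read2\t1123\t1\t100\t60\t50M\t=\t300\t250\t*\t*",  # 99 + 1024: marked duplicate
    "read3\t0\t1\t500\t60\t50M\t*\t0\t0\t*\t*",
]

total, dups = count_duplicates(example)
print(f"{dups} of {total} reads are marked as duplicates")
```

On real data the same check could be fed from `samtools view file.bam`, and `samtools flagstat` also reports the number of marked duplicates directly.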
I should mention I also have a bam.bai for each sample like
Is this RNAseq data or something else? File name seems to indicate it is RNAseq but just to be absolutely certain.
If it is RNAseq then the post linked above has all the info you need. I will also explicitly link this: mRNA-seq quality report (fastQC): Does it mean samples have adapters and should remove duplicates?
Thank you, yes this is RNA-seq definitely
Sorry, is it possible that these are not normal RNA-seq files, and that I should instead be doing variant calling ("Calling variants in RNAseq"), because in this paper they did that?