Biostars beta testing.
Question: Differential expression analysis on miRNA count files

Dear all,

I want to perform differential expression analysis between two miRNA-seq samples.

This is not my own project. I received BAM files in the following format. However, they look more like count files than alignment files:

@HD VN:1.0 SO:coordinate
@SQ SN:hsa-let-7a-1 LN:80
@SQ SN:hsa-let-7a-2 LN:72
@SQ SN:hsa-let-7a-3 LN:74
@SQ SN:hsa-let-7b LN:83
@SQ SN:hsa-let-7c LN:84
@SQ SN:hsa-let-7d LN:87
@SQ SN:hsa-let-7e LN:79
@SQ SN:hsa-let-7f-1 LN:87
@SQ SN:hsa-let-7f-2 LN:83
@SQ SN:hsa-mir-15a LN:83
@SQ SN:hsa-mir-16-1 LN:89
@SQ SN:hsa-mir-17 LN:84
@SQ SN:hsa-mir-18a LN:71
@SQ SN:hsa-mir-19a LN:82
@SQ SN:hsa-mir-19b-1 LN:87
@SQ SN:hsa-mir-19b-2 LN:96
@SQ SN:hsa-mir-20a LN:71
@SQ SN:hsa-mir-21 LN:72
@SQ SN:hsa-mir-22 LN:85
@SQ SN:hsa-mir-23a LN:73

Can you tell me the best way to get differentially expressed miRNAs from these files?

DESeq2 does not recognize these files as count files.

I look forward to your comments.


ADD COMMENTlink 15 months ago nazaninhoseinkhan • 360 • updated 13 months ago ahmad mousavi • 430

Is this really a BAM file? It looks like one, although the lines you posted are only the header. What do these two commands print: samtools view your.bam | head and samtools view -H your.bam | head?
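
To make that check concrete: in SAM text, header lines start with @ and alignment records follow them. Below is a minimal Python sketch, assuming the output of samtools view -h your.bam has been captured as a string. The function name split_sam_text and the example record are illustrative and not taken from the poster's files.

```python
def split_sam_text(sam_text):
    """Split SAM text into @SQ reference lengths and alignment records."""
    references = {}
    records = []
    for line in sam_text.splitlines():
        if not line.strip():
            continue
        if line.startswith("@"):
            fields = line.split("\t")
            if fields[0] == "@SQ":
                # Header tags look like SN:name and LN:length.
                tags = dict(field.split(":", 1) for field in fields[1:])
                references[tags["SN"]] = int(tags["LN"])
        else:
            records.append(line.split("\t"))
    return references, records


# Hypothetical input: header lines as in the question plus one made-up record.
example = (
    "@HD\tVN:1.0\tSO:coordinate\n"
    "@SQ\tSN:hsa-let-7a-1\tLN:80\n"
    "@SQ\tSN:hsa-mir-21\tLN:72\n"
    "read1\t0\thsa-mir-21\t8\t255\t22M\t*\t0\t0\tTAGCTTATCAGACTGATGTTGA\t*\n"
)
refs, recs = split_sam_text(example)
print(refs)
print(len(recs), "alignment record(s)")
```

If the file yields @SQ lines and alignment records, it is an ordinary alignment file whose references happen to be miRNA hairpins, not a count table.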

ADD REPLYlink 15 months ago

Can you elaborate on how you quantified the aligned file and which tools were used? This does not look like a count matrix.

RNA-seq pipeline using featureCounts and DESeq2.

ADD REPLYlink 15 months ago
♦ 1.3k

It doesn't matter which aligner was used; it looks like a BAM file. Can you run featureCounts on your BAM file and extract the read counts? One more thing: if you want to use DESeq2 for DE, you need replicates, but you said you have only two samples, so it would be better to go with NOISeq.
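
To illustrate what can be said from two unreplicated samples: without replicates there is no variance estimate, so the most one can compute directly is a descriptive fold change. The Python sketch below scales made-up counts to counts per million (CPM) and takes a log2 ratio with a pseudocount. It is neither NOISeq nor DESeq2 and yields no p-values; the function names, the pseudocount of 1 and the counts are all assumptions for illustration.

```python
import math


def counts_per_million(counts):
    """Scale raw counts so that each sample sums to one million."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sample has no counts")
    return {name: value * 1_000_000 / total for name, value in counts.items()}


def log2_fold_changes(counts_a, counts_b, pseudocount=1.0):
    """log2(CPM_b / CPM_a) for miRNAs present in both samples."""
    cpm_a = counts_per_million(counts_a)
    cpm_b = counts_per_million(counts_b)
    return {
        name: math.log2((cpm_b[name] + pseudocount) / (cpm_a[name] + pseudocount))
        for name in counts_a
        if name in counts_b
    }


# Made-up counts for two samples.
sample_a = {"hsa-mir-21": 100, "hsa-let-7b": 900}
sample_b = {"hsa-mir-21": 400, "hsa-let-7b": 600}
for name, lfc in sorted(log2_fold_changes(sample_a, sample_b).items()):
    print(f"{name}\t{lfc:.2f}")
```

A ranking like this only suggests candidates; any claim of significance still needs replicates or a method designed for unreplicated data.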

ADD REPLYlink 15 months ago
• 190

The simplest way is this online tool.

ADD REPLYlink 13 months ago
• 0


As I said in my post, I do not know which program generated these files.

Some friends sent these files to me and asked me to run a differential expression analysis on them. However, the format is not familiar to me. The files are in BAM format, but as you said they look like count files.

We will try to contact the sequencing center and ask for the raw data.

Thank you anyway.

ADD COMMENTlink 15 months ago nazaninhoseinkhan • 360

That does not look like a count file; it looks like the header of a BAM file made by aligning a single sample to a FASTA of short targets. It is not at all clear to me that this is the best way to align data for an experiment of this kind. But if you want counts, you could probably use samtools idxstats to quickly total up how many reads hit each reference sequence.
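
As a sketch of how the idxstats route could work: samtools idxstats prints one tab-separated line per reference (name, sequence length, mapped read-segments, unmapped read-segments), followed by a final * line for reads with no reference. The Python below parses that text into per-hairpin counts. The numbers are made up, and the function name parse_idxstats is an illustrative assumption.

```python
def parse_idxstats(text):
    """Parse `samtools idxstats` text.

    Returns (counts, unplaced): mapped read-segments per reference,
    and the unmapped count reported on the final '*' line.
    """
    counts = {}
    unplaced = 0
    for line in text.splitlines():
        if not line.strip():
            continue
        name, _length, mapped, unmapped = line.split("\t")[:4]
        if name == "*":
            unplaced = int(unmapped)
        else:
            counts[name] = int(mapped)
    return counts, unplaced


# Made-up idxstats output for one sample.
example = (
    "hsa-let-7a-1\t80\t120\t0\n"
    "hsa-mir-21\t72\t950\t0\n"
    "*\t0\t0\t35\n"
)
counts, unplaced = parse_idxstats(example)
for name, value in sorted(counts.items(), key=lambda item: -item[1]):
    print(f"{name}\t{value}")
print("reads without a reference:", unplaced)
```

Because the references here are precursor hairpins, these totals are per hairpin, not per mature miRNA strand.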

ADD COMMENTlink 15 months ago swbarnes2 5.7k

Not sure whether this is the case for these files, but running the bcbio small RNA-seq pipeline produces BAM files with a header like that, perhaps written by miraligner/seqbuster.

If they did use this pipeline, which uses seqbuster for quantification, there should also be a counts file per sample with more obvious count-type data, which also quantifies the various types of isomiRs. bcbio calls it $sample_name-mirbase-ready.counts.

bcbio also generates an overall counts table, with counts per miRNA strand per sample, called counts.tsv in the final/YYYY-MM-DD_$run_name folder. That could be easier to work with if you are only interested in the miRNA strands and not all the isomiRs.

Anyway, hopefully that helps you figure out the files.

ADD COMMENTlink 15 months ago msBinf • 0


First run htseq-count with the human miRNA annotation file (GFF3, available from the miRBase website) to get per-sample counts from the BAM/SAM files and combine them into a count matrix.

Then you can use DESeq2 or edgeR for DE analysis of the miRNAs.
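
As a rough sketch of the combining step: htseq-count writes one tab-separated file per sample (feature ID, count) and appends summary rows whose names start with two underscores, such as __no_feature and __ambiguous. The Python below drops those rows and merges samples into a CSV that could be read into R. The sample names, feature IDs and counts are made up, and the function names are illustrative assumptions.

```python
def parse_htseq_counts(text):
    """Return {feature_id: count}, skipping htseq-count's '__' summary rows."""
    counts = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        feature, value = line.split("\t")[:2]
        if feature.startswith("__"):  # e.g. __no_feature, __ambiguous
            continue
        counts[feature] = int(value)
    return counts


def to_csv(per_sample):
    """Merge {sample: {feature: count}} into CSV text, one column per sample."""
    samples = sorted(per_sample)
    features = sorted(set().union(*(per_sample[s] for s in samples)))
    lines = [",".join(["feature"] + samples)]
    for feature in features:
        row = [str(per_sample[s].get(feature, 0)) for s in samples]
        lines.append(",".join([feature] + row))
    return "\n".join(lines) + "\n"


# Made-up htseq-count output for two hypothetical samples.
control = "hsa-let-7a-5p\t150\nhsa-miR-21-5p\t900\n__no_feature\t12\n__ambiguous\t3\n"
treated = "hsa-let-7a-5p\t130\nhsa-miR-21-5p\t1800\n__no_feature\t9\n__ambiguous\t1\n"
print(to_csv({
    "control": parse_htseq_counts(control),
    "treated": parse_htseq_counts(treated),
}), end="")
```

Features missing from one sample are filled with 0 so that every row has a value for every sample.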

ADD COMMENTlink 13 months ago ahmad mousavi • 430

Thanks. My problem was solved a few weeks ago

ADD REPLYlink 13 months ago
• 360
