Question

Using MiRanda for miRNA - PCG interactions

0

5.1 years ago

glady ▴ 320

Hello everyone, I am working on some human samples, for which I have mRNA and miRNA datasets in triplicate. I'm interested in studying the miRNA - Protein Coding Gene (PCG) interactions using these datasets, so I used MiRanda, but the results I get are hard to interpret.

In total, I have 10300 expressed PCGs and 600 expressed miRNAs, and when I run MiRanda on these data, I get a target for every miRNA (i.e. in the output file all 600 miRNAs have targets, covering 10290 PCGs). This shouldn't be true.

The MiRanda command which I used:

/media/tools/bin/miranda expressed_mirna.fasta expressed_protein.fasta -out miranda_output-1.fasta -quiet -strict

Am I missing something in the MiRanda command?
Why are the results so unclear? There seem to be many false positives in there.

If anyone can help me understand this. Thank you.

RNA-Seq • 2.1k views

ADD COMMENT • link updated 5.1 years ago by i.sudbery 19k • written 5.1 years ago by glady ▴ 320

0

Can you clarify? Are you saying that every miRNA targets every gene (i.e. you have 600 * 10,290 = 6,174,000 interactions) or that every miRNA targets at least one gene and every gene is targeted by at least one miRNA?

The first is clearly not (biologically) correct, but the second would not surprise me.

Furthermore, miRanda is known to be fairly liberal in calling interactions and to have a high false positive rate.
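To tell the two scenarios apart in practice, it helps to count distinct pairs rather than just distinct miRNAs and genes. Below is a minimal sketch, assuming the predictions have already been reduced to (miRNA, gene) tuples; the function name and the toy identifiers are made up for illustration.

```python
from collections import defaultdict


def summarize_pairs(pairs):
    """Summarize a collection of predicted (miRNA, gene) interactions."""
    unique_pairs = set(pairs)  # the same pair may be reported for several sites
    targets_by_mirna = defaultdict(set)
    mirnas_by_gene = defaultdict(set)
    for mirna, gene in unique_pairs:
        targets_by_mirna[mirna].add(gene)
        mirnas_by_gene[gene].add(mirna)
    n_mirnas = len(targets_by_mirna)
    n_genes = len(mirnas_by_gene)
    return {
        "pairs": len(unique_pairs),
        "mirnas_with_target": n_mirnas,
        "genes_targeted": n_genes,
        # Fraction of all possible miRNA x gene combinations that were predicted.
        "density": len(unique_pairs) / (n_mirnas * n_genes) if unique_pairs else 0.0,
    }


# Toy data with hypothetical identifiers, not real predictions.
toy = [("miR-a", "G1"), ("miR-a", "G2"), ("miR-b", "G2"), ("miR-a", "G1")]
summary = summarize_pairs(toy)
print(summary)
```

A density close to 1 would correspond to the first (implausible) scenario; a low density with full coverage of miRNAs and genes corresponds to the second.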

ADD REPLY • link 5.1 years ago by i.sudbery 19k

0

In my analysis, I observed two things: 1) every miRNA targets at least one gene (all 600 expressed miRNAs have at least one target gene); 2) almost every gene (10290 of the 10300 PCGs) is targeted by at least one miRNA.

I hope that's clear. Yes, you are right that MiRanda has a high false positive rate. But in my case, I think I'm going wrong somewhere in the MiRanda command/syntax. I don't know what I'm missing. Your comments would help. Thank you.

ADD REPLY • link 5.1 years ago by glady ▴ 320

0

No, I don't think there is anything wrong here. The results are more or less as I would expect.

ADD REPLY • link 5.1 years ago by i.sudbery 19k

0

If the command is right, then why are there so many false positives? Biologically, it's very difficult to conclude that every miRNA will have at least one target gene and vice versa. The results I'm getting are very complicated and confusing.

ADD REPLY • link 5.1 years ago by glady ▴ 320

2

Why would it be hard to conclude that every miRNA has at least one target? What would be the point of an miRNA that had no targets? The inverse is not so obviously true, but I see no reason to think it couldn't be true.

However, it is very likely that a high proportion of the predicted interactions are false positives. There are so many false positives because miRanda produces a lot of false positives; that's just the way the tool is. Other tools have a high false negative rate. But even if you go to TargetScan (which is a much more conservative tool), you'll probably still find at least one prediction for each gene.

miRNA target prediction algorithms are just not that reliable, and their results really should be confirmed by experimental evidence if any major conclusion is going to be based on them.

ADD REPLY • link 5.1 years ago by i.sudbery 19k

0

Sorry, that's not very helpful. You could filter the results to only include those that pass a stricter (more negative) binding energy threshold with -en, or a higher alignment score threshold with -sc.
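The same kind of filtering can also be applied after the run, which makes it easy to try several cutoffs without rerunning. The sketch below assumes that each individual hit in the output is reported on a tab-separated line of the form `>miRNA<TAB>target<TAB>score<TAB>energy<TAB>...`, and that lines starting with `>>` are per-pair summaries; check this against your own output file before relying on it. The function names, cutoffs, and example lines are made up for illustration.

```python
def parse_hit_line(line):
    """Return a dict for a per-hit line, or None for any other line."""
    # Assumed layout: '>miRNA<TAB>target<TAB>score<TAB>energy<TAB>...'
    # Lines starting with '>>' are assumed to be per-pair summaries and are skipped.
    if not line.startswith(">") or line.startswith(">>"):
        return None
    fields = line[1:].rstrip("\n").split("\t")
    if len(fields) < 4:
        return None
    try:
        score, energy = float(fields[2]), float(fields[3])
    except ValueError:
        return None
    return {"mirna": fields[0], "target": fields[1], "score": score, "energy": energy}


def filter_hits(lines, min_score, max_energy):
    """Keep hits with score >= min_score and energy <= max_energy (kcal/mol)."""
    kept = []
    for line in lines:
        hit = parse_hit_line(line)
        if hit and hit["score"] >= min_score and hit["energy"] <= max_energy:
            kept.append(hit)
    return kept


# Made-up example lines, not real miRanda results.
example_lines = [
    ">miR-a\tGENE1\t172.00\t-28.10\t2 21\t105 127\t20\t80.00%\t85.00%\n",
    ">miR-a\tGENE2\t145.00\t-15.30\t2 19\t40 60\t18\t70.00%\t75.00%\n",
    ">>miR-a\tGENE1\t172.00\t-28.10\t172.00\t-28.10\t1\t22\t900\t 105\n",
    "Scores for this hit:\n",
]
strict_hits = filter_hits(example_lines, min_score=160.0, max_energy=-25.0)
print([(h["mirna"], h["target"]) for h in strict_hits])
```

With a real file, the lines would come from `open(path)` instead of the example list. Note that a more negative energy means a more stable predicted duplex, so the energy cutoff is an upper bound.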

ADD REPLY • link 5.1 years ago by i.sudbery 19k

0

I tried to filter them for a higher alignment score (>140) and a lower binding energy (<= -20), but there is hardly any change. The output file still has a target gene for every miRNA (all 600).

I think conservation is more important than alignment score or binding energy for predicting "biologically true" miRNA-target interactions. I don't know whether MiRanda takes this into consideration.

ADD REPLY • link 5.1 years ago by glady ▴ 320

0

No, miRanda uses only the alignment. The primary tool that considers conservation is TargetScan, but the requirement for a multispecies alignment makes it difficult to run TargetScan yourself (if you want to include conservation). You can, however, download their predictions.

Even with TargetScan, I think you'll struggle to find a gene that is not targeted by any miRNA.
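If you do work from downloaded predictions, a first step is usually to restrict them to the miRNAs and genes expressed in your samples. Here is a rough sketch, assuming the predictions have been exported to a tab-separated table; the column names `mirna` and `gene` and all identifiers are hypothetical, and real files will need their identifiers mapped to the ones used in your expression data.

```python
import csv
import io


def restrict_to_expressed(rows, expressed_mirnas, expressed_genes):
    """Keep predicted pairs where both the miRNA and the gene are expressed."""
    return [
        (row["mirna"], row["gene"])
        for row in rows
        if row["mirna"] in expressed_mirnas and row["gene"] in expressed_genes
    ]


# Made-up table standing in for a downloaded prediction file.
toy_table = io.StringIO(
    "mirna\tgene\n"
    "miR-a\tGENE1\n"
    "miR-a\tGENE9\n"
    "miR-z\tGENE1\n"
)
rows = list(csv.DictReader(toy_table, delimiter="\t"))
kept_pairs = restrict_to_expressed(rows, {"miR-a", "miR-b"}, {"GENE1", "GENE2"})
print(kept_pairs)
```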

Another approach would be to look at experimental evidence: starBase contains records of a large number of Ago-CLIP experiments, where Ago has been pulled down and the mRNAs attached to it have been sequenced. Obviously your cell type might not be there, but looking at something similar, or combining all cell types, can help to distinguish real from false predictions.
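Once you have a list of experimentally supported pairs, the comparison itself is a set intersection. A minimal sketch, assuming both the predictions and the CLIP-supported interactions are available as (miRNA, gene) tuples with matching identifiers; the names and toy data below are made up.

```python
def split_by_support(predicted_pairs, supported_pairs):
    """Split predictions into those with and without experimental support."""
    predicted = set(predicted_pairs)
    supported = set(supported_pairs)
    return sorted(predicted & supported), sorted(predicted - supported)


# Toy data with hypothetical identifiers.
predicted = [("miR-a", "GENE1"), ("miR-a", "GENE2"), ("miR-b", "GENE2")]
clip_supported = [("miR-a", "GENE1"), ("miR-c", "GENE3")]

with_support, without_support = split_by_support(predicted, clip_supported)
print(len(with_support), "supported;", len(without_support), "prediction only")
```

Predictions without support are not necessarily false, especially when the CLIP data come from a different cell type, so this is better treated as a way to prioritise than as a hard filter.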

ADD REPLY • link 5.1 years ago by i.sudbery 19k

0

Thank you very much for your replies. With MiRanda, it's very difficult to find "biologically true" targets which follow the conservation parameter.

With TargetScan I did get some decent results: out of the total 600 miRNAs and 10300 PCGs, TargetScan predicted 9302 interactions in total (215 miRNAs targeting 7867 PCGs). I do think that TargetScan is quite reliable. But in my mRNA data I also want to find some miRNAs that target lncRNAs, and I can't use TargetScan for this. That's why I was trying MiRanda.

Regarding starBase, I had a look at it, but the cell line I'm working on (HUVEC) is not there. Do you have any other suggestions?

ADD REPLY • link 5.1 years ago by glady ▴ 320