CD-HIT-EST and CAP3
7.9 years ago
Rahul ▴ 30

Hello, I have assembled an RNA-seq library for a non-model plant using a multi-k-mer approach (SOAPdenovo-Trans). Now I want to merge all the k-mer assemblies and remove the redundancy. Could anyone please let me know which tool would be more suitable for this: CAP3 or CD-HIT-EST?
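Before either tool can remove redundancy, the per-k-mer assemblies have to be pooled into one FASTA file, and contig names from different k-mer runs may collide. A minimal sketch of that pooling step follows, assuming one FASTA file per k-mer; the function name, the labels such as "k25", and the file names are hypothetical and not part of SOAPdenovo-Trans.

```python
def merge_assemblies(assemblies, merged_path):
    """Concatenate FASTA files, prefixing each header with a k-mer label.

    `assemblies` maps a label such as "k25" (hypothetical) to a FASTA path.
    Returns the number of sequences written.
    """
    count = 0
    with open(merged_path, "w") as out:
        for label, fasta in assemblies.items():
            with open(fasta) as handle:
                for line in handle:
                    if not line.strip():
                        continue  # skip blank lines
                    if line.startswith(">"):
                        # ">scaffold1" becomes ">k25_scaffold1"
                        out.write(f">{label}_{line[1:].rstrip()}\n")
                        count += 1
                    else:
                        # normalise line endings so files join cleanly
                        out.write(line.rstrip("\r\n") + "\n")
    return count
```

For example, `merge_assemblies({"k25": "k25.fa", "k31": "k31.fa"}, "merged.fa")` would give a single input file for CD-HIT-EST or CAP3, with every header still traceable to its k-mer run.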

RNA-Seq Assembly alignment • 3.5k views
7.7 years ago
Sej Modha 5.3k

I've used CD-HIT and think it does the job reasonably well.

7.7 years ago

A study found that the best strategy is to first do a de novo assembly with Trinity, then run CAP3 to eliminate redundancy:

Optimizing de novo assembly of short-read RNA-seq data for phylogenomics BMC Genomics 2013, 14:328 http://www.ncbi.nlm.nih.gov/pubmed/23672450
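A rough sketch of the CAP3 step, not taken from the paper: CAP3 is run on a FASTA file and writes `<input>.cap.contigs` and `<input>.cap.singlets` next to it, so the reduced set is the two files combined. The function names and the `.cap.log` file name are assumptions for illustration, and CAP3 is run with its default settings rather than any tuned ones.

```python
import subprocess


def cap3_command(fasta_path):
    """Command line for CAP3 with default settings."""
    return ["cap3", fasta_path]


def combine_cap3_output(fasta_path, combined_path):
    """Join CAP3 contigs and singlets into one FASTA file."""
    with open(combined_path, "w") as out:
        for suffix in (".cap.contigs", ".cap.singlets"):
            with open(fasta_path + suffix) as handle:
                for line in handle:
                    if line.strip():
                        out.write(line.rstrip("\r\n") + "\n")


def run_cap3(fasta_path, combined_path):
    """Run CAP3 (assumed to be on PATH) and collect its output."""
    # CAP3 prints its assembly report to stdout; keep it in a log file
    # (the ".cap.log" name is a hypothetical choice).
    with open(fasta_path + ".cap.log", "w") as log:
        subprocess.run(cap3_command(fasta_path), stdout=log, check=True)
    combine_cap3_output(fasta_path, combined_path)
```

Singlets are kept on purpose: they are sequences CAP3 could not merge with anything, so dropping them would discard transcripts rather than redundancy.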

7.7 years ago
Biogeek ▴ 470

Only use these programs if you know what you are doing. I usually run CD-HIT-EST straight after assembly with a 95% similarity threshold to remove highly similar sequences, then follow up with CAP3. You should read the literature to get a feel for the settings used and try to understand why they were chosen. You can also filter by FPKM thresholds, but have a good reason for it. I think Trinity has accompanying scripts for that. Usually the reads with lower support are the ones that are fine to remove, but depending on your study aim, you may not want to remove such low-count reads.
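As a sketch of the two filters mentioned here: the first function builds a CD-HIT-EST command at 95% identity using the `-i`, `-o`, `-c` and `-n` options, and the second keeps only sequences at or above an FPKM cutoff. The word size of 10, the function names, and any cutoff value are assumptions for illustration, not recommended settings; the FPKM table is assumed to come from whatever quantification tool you already use.

```python
def cd_hit_est_command(in_fasta, out_fasta, identity=0.95, word_size=10):
    """Build a CD-HIT-EST command line (cd-hit-est assumed to be on PATH)."""
    return [
        "cd-hit-est",
        "-i", in_fasta,        # input FASTA
        "-o", out_fasta,       # output FASTA of cluster representatives
        "-c", str(identity),   # sequence identity threshold
        "-n", str(word_size),  # word length, assumed suitable for 0.95
    ]


def filter_by_fpkm(fpkm_by_id, min_fpkm):
    """Return the IDs whose FPKM is at least `min_fpkm` (hypothetical cutoff)."""
    return {seq_id for seq_id, fpkm in fpkm_by_id.items() if fpkm >= min_fpkm}
```

The command list can be passed to `subprocess.run(..., check=True)`, and its output file then used as the CAP3 input. Whether to apply the FPKM filter at all depends on the study aim, since low-count sequences may be exactly what you are looking for.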
