Jellyfish for transcriptome assembly
7.8 years ago
SJ Basu ▴ 50

Hello,

I have 2x150 bp reads from a plant transcriptome and would like to assemble them with the Oases/Velvet pipeline, but I need to provide a k-mer length, which I was estimating with Jellyfish. My question is: how do I estimate an "appropriate" value for the -m option in jellyfish count?

PS: I used -m 21 to estimate the k-mer size for 2x250 bp genomic data from a bacterium and used that value for the Velvet assembly. It worked wonders there, but it is not working in this case.

RNA-Seq Assembly jellyfish K-mer velvet

KmerGenie

KmerGenie estimates the best k-mer length for genome de novo assembly. Given a set of reads, KmerGenie first computes the k-mer abundance histogram for many values of k. Then, for each value of k, it predicts the number of distinct genomic k-mers in the dataset, and returns the k-mer length which maximizes this number. Experiments show that KmerGenie's choices lead to assemblies that are close to the best possible over all k-mer lengths. KmerGenie predictions can be applied to single-k genome assemblers (e.g. Velvet, SOAPdenovo 2, ABySS, Minia). However, multi-k genome assemblers (e.g. SPAdes, IDBA) generally perform better with default parameters (using multiple k values), rather than the single best k predicted by KmerGenie.
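To make the idea of a "k-mer abundance histogram for many values of k" concrete, here is a minimal Python sketch on toy reads. It is not KmerGenie's actual method: KmerGenie fits a statistical model to each histogram, whereas this sketch only counts distinct k-mers seen at least twice, which is a crude stand-in chosen for illustration. The function names, the toy reads, and the `min_abundance` cutoff are all assumptions, and only the forward strand is counted.

```python
from collections import Counter


def kmer_counts(reads, k):
    """Count every substring of length k across all reads (forward strand only)."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def abundance_histogram(reads, k):
    """Map abundance -> number of distinct k-mers observed that many times."""
    return Counter(kmer_counts(reads, k).values())


def solid_kmers(reads, k, min_abundance=2):
    """Crude proxy for 'genomic' k-mers: distinct k-mers seen at least min_abundance times."""
    return sum(1 for c in kmer_counts(reads, k).values() if c >= min_abundance)


if __name__ == "__main__":
    # Toy overlapping reads, purely illustrative.
    reads = ["ACGTACGTGA", "CGTACGTGAC", "GTACGTGACC"]
    for k in (3, 5, 7):
        print(k, dict(abundance_histogram(reads, k)), solid_kmers(reads, k))
```

On real data you would get the histogram from a k-mer counter such as Jellyfish rather than from pure Python, which is far too slow for a full read set.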

7.8 years ago

For 2x150 bp reads, depending on your coverage, I suggest you try a few values of k between 60 and 100 and see which gives the best assembly. Methods for estimating the best k-mer length for genomes do not work well on transcriptomes because of the highly variable coverage.
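One way to set up that kind of sweep is sketched below in Python. It only generates the candidate k values and does not run any assembler. The function name, the default step of 10, and the `asm_k...` directory naming are assumptions for illustration. Values are nudged to odd numbers because Velvet expects an odd k; also note that, as far as I know, Velvet has to be compiled with a large enough MAXKMERLENGTH (the default is 31) before it will accept k in this range.

```python
def candidate_ks(k_min=60, k_max=100, step=10):
    """Return odd k values covering [k_min, k_max] at roughly `step` spacing."""
    ks = []
    for k in range(k_min, k_max + 1, step):
        if k % 2 == 0:
            k += 1  # nudge even values up by one to get an odd k
        if k <= k_max and k not in ks:
            ks.append(k)
    return ks


if __name__ == "__main__":
    for k in candidate_ks():
        # Hypothetical output directory name, one assembly per k.
        print(f"assemble with k={k} into asm_k{k}/")
```

Each k then gets its own assembly run, and the results are compared on whatever criteria matter for your transcriptome.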
