Problem: does kmergenie have a maximum number of input reads?
9.1 years ago

Hi all, I'm having the following problem with kmergenie:

Warning! using max number of read files (2000)
error opening file: #
fitting model to histograms to estimate best k
could not predict a best k value
Execution of decide failed (return code 0)

How can I increase the number of input reads?

Another question: how much memory do I need to run kmergenie?

Thanks,
Leandro

K-mers Kmergenie • 4.1k views

Hi!

What command line did you use? Also, what operating system?


Dear Rayan, I used: ./kmergenie *.csfasta

Operating system: Biolinux (latest version)

I have installed R and Python.


Thanks. How many *.csfasta files do you have? Kmergenie indeed has a limit on the number of input files (2000), as mentioned in the error. You could try merging them, like this:

cat *.csfasta > all.fasta

then run

./kmergenie all.fasta
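
The merge suggested above can be sketched as a small helper that first counts the files, so you can compare the count against the 2000 limit reported in the warning. This is only a sketch: the function name `merge_reads`, the output name and the demo files are hypothetical, and it only prepares the merged file without calling kmergenie.

```shell
#!/bin/sh
# Hypothetical helper: count the files with a given extension in a
# directory and concatenate them into one file.
# Usage: merge_reads DIR EXT OUTPUT
merge_reads() {
    dir=$1; ext=$2; out=$3
    # Count matching files (-maxdepth is supported by GNU and BSD find).
    n=$(find "$dir" -maxdepth 1 -type f -name "*.$ext" | wc -l)
    n=$((n))   # normalise any padding added by wc
    echo "found $n .$ext file(s) in $dir" >&2
    if [ "$n" -eq 0 ]; then
        echo "nothing to merge" >&2
        return 1
    fi
    # find -exec avoids "argument list too long" with thousands of files.
    # OUTPUT must not itself match *.EXT inside DIR.
    find "$dir" -maxdepth 1 -type f -name "*.$ext" -exec cat {} + > "$out"
}

# Demo on two tiny made-up files in a temporary directory.
demo=$(mktemp -d)
printf '>r1\nT0123\n' > "$demo/a.csfasta"
printf '>r2\nT3210\n' > "$demo/b.csfasta"
merge_reads "$demo" csfasta "$demo/all.fasta"
```

The files are concatenated in whatever order find returns them.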

Ah, also: Kmergenie doesn't (yet) work with an input like *.fasta. (It might in the future; the current version is 1.6950.)

If you have several FASTA files, please do the following:

ls -1 *.fasta > reads_list.txt
./kmergenie reads_list.txt

Hi, I used this command but it doesn't work. Instead, it shows:

wp@debian:~/Downloads/kmergenie-1.6950$ ./kmergenie ~/data/list
running histogram estimation
File /home/wp/data/list starts with character "R", hence is interpreted as a list of file names
Reading 4 read files
error opening file: R1_001.fastq
fitting model to histograms to estimate best k
could not predict a best k value
Execution of decide failed (return code 0)

here is my list file:

R1_001.fastq
R1_002.fastq
R2_001.fastq
R2_002.fastq

This looks like a working-directory problem. The ~/data/list file does not seem to contain absolute paths, so the file names are resolved relative to the directory you launch kmergenie from; you need to run kmergenie from inside the ~/data/ folder.
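
An alternative to changing directory is to write absolute paths into the list file. Here is a minimal sketch, assuming all reads live in one directory. The helper name `make_read_list` and the demo file names are hypothetical, and the script only builds the list; it does not run kmergenie.

```shell
#!/bin/sh
# Hypothetical helper: write the absolute paths of DIR/*.EXT to LIST.
# Usage: make_read_list DIR EXT LIST
make_read_list() {
    # Resolve DIR to an absolute path; give up if it does not exist.
    abs=$(cd "$1" && pwd) || return 1
    : > "$3"   # start with an empty list file
    for f in "$abs"/*."$2"; do
        # Skips unreadable files, and the literal pattern if nothing matched.
        [ -r "$f" ] || { echo "skipping unreadable: $f" >&2; continue; }
        printf '%s\n' "$f" >> "$3"
    done
}

# Demo with empty placeholder files in a temporary directory.
demo=$(mktemp -d)
: > "$demo/R1_001.fastq"
: > "$demo/R2_001.fastq"
make_read_list "$demo" fastq "$demo/list"
cat "$demo/list"
```

Every line of the resulting list starts with `/`, so it should no longer matter which directory kmergenie is started from.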


Thanks a lot :)


Hi Rayan,

I have a side question; could you please clarify it for me?
If I run kmergenie on a paired-end read set (Read 1 and Read 2 fastq files), do I need to convert Read 2 into its complementary sequence before combining it with Read 1 for the kmergenie run (since all the sequence information in Read 2 is complementary to Read 1)? If not, would that double the number of distinct kmers in kmergenie's statistical calculation? After all, what we want to know is just one single strand of DNA, isn't it?

Sorry, I am very new to this field. Thank you very much in advance!
Phuong


Hi Phuong,

No need. Kmergenie does not care whether a read is in forward or reverse orientation, nor whether reads are paired-end, single-end or mate-pairs. Just input all the fastq files that you would give to an assembler, in any order.

It won't double the number of kmers, because kmergenie considers a kmer and its reverse complement to be the same object.
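
To illustrate the idea of treating a kmer and its reverse complement as one object, here is a small sketch using the common "canonical kmer" convention (keep whichever of the two strings sorts first). This is an assumption for illustration only, not necessarily how kmergenie implements it; the function names are hypothetical, and `rev` is a util-linux/BSD tool rather than a POSIX one.

```shell
#!/bin/sh
# Reverse complement of a DNA string (A<->T, C<->G).
revcomp() {
    printf '%s\n' "$1" | rev | tr 'ACGTacgt' 'TGCAtgca'
}

# Canonical form: the lexicographically smaller of a kmer and its
# reverse complement (a common convention, assumed here).
canonical() {
    rc=$(revcomp "$1")
    printf '%s\n%s\n' "$1" "$rc" | LC_ALL=C sort | head -n 1
}

canonical ACGGT   # kmer as seen on one strand
canonical ACCGT   # the same kmer as seen on the other strand
```

Both calls print ACCGT, so the two orientations are counted as a single kmer.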


Thank you very much, this really enlightens me, especially the fact that kmergenie considers a kmer and its reverse complement to be the same object.


Hi Hian, but I used only one input; I have just one file.


I see. Can you please paste the output of the following commands?

ls -1 *.csfasta
head *.csfasta
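
A log quoted elsewhere in this thread shows kmergenie looking at the first character of its input to decide whether it is a list of file names. The sketch below prints the first character of each .csfasta file in the current directory. The helper `first_char` is hypothetical, and the idea that a leading `#` comment line is what produces `error opening file: #` is an unconfirmed guess.

```shell
#!/bin/sh
# Hypothetical helper: print the first character of a file.
# (head -c is not strictly POSIX, but GNU and BSD head support it.)
first_char() {
    head -c 1 "$1"
}

# '>' normally opens a FASTA record and '@' a FASTQ record; a file that
# begins with anything else may be interpreted differently.
for f in *.csfasta; do
    [ -e "$f" ] || continue   # the pattern matched nothing
    printf '%s starts with "%s"\n' "$f" "$(first_char "$f")"
done
```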

(By the way, Biostars encourages you to respond in a reply rather than in a separate answer, which is reserved for when a solution to the original problem is found.)
