Kraken reads don't seem to add up
5.5 years ago
pvishwa2 • 0

The current study I'm working on requires Kraken classification of paired NGS reads.

To test the data, my supervisor ran a sample on Galaxy and found that >90% of the reads fell into the classification he expected. I tried to replicate those results on a local server using a local installation of Kraken (with downloaded and built databases). Once I had my report file, I used this code:

cat test_kraken_file.report | grep "$species_name" | awk '{sum+=$1} END {print sum}'

However, the output suggested that only about 20% of the reads were assigned to that species. On examining the report file, I found that although 90% of the reads were assigned to the node directly above the "$species_name" I was looking at (its parent node), far fewer were assigned to "$species_name" itself. Further, the read percentages of the 90% node's child nodes do not sum to 90%.

I want to know how to find out if there's an issue with the database, my usage of kraken, or my understanding of the data. What is good practice for someone in my situation?

next-gen genome metagenome kraken • 1.7k views

The databases can be built with different parameters and different initial genomes. What databases were used for each run? What command-line options?

You don't provide sufficient details for troubleshooting the differences in results.


Thanks for the response, I didn't realize I'd forgotten to provide that. For the locally run Kraken test, I used a pre-built version of the Standard Kraken database. My supervisor used the "Bacteria" database on the Kraken web portal. My kraken call was:

kraken --threads 48 -db $DBNAME --fastq-input --gzip-compressed --paired --check-names foo_R1.fastq.gz foo_R2.fastq.gz > foo.kraken

The kraken-translate and kraken-report commands were run as is (no additional flags beyond the inputs and the database).
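
For reference, a kraken-report file has six tab-separated columns: percentage of reads in the clade rooted at the taxon, number of reads in that clade, number of reads assigned directly to the taxon, rank code, NCBI taxonomy ID, and the indented scientific name. Because Kraken can assign a read to the lowest common ancestor rather than to a species, a parent's clade count may be larger than the sum of its children, with the difference sitting in the parent's "direct" column. A `grep` on the name can also match strain or subspecies rows whose reads are already counted in the species row. The sketch below uses a made-up toy report (the numbers are illustrative, not real results) and matches the name column exactly:

```shell
# Toy report in the kraken-report column layout (all numbers are made up):
# pct_clade  reads_clade  reads_direct  rank  taxid  indented_name
report=$(mktemp)
printf '%s\t%s\t%s\t%s\t%s\t%s\n' \
  ' 10.00' 100 100 U 0   'unclassified' \
  ' 90.00' 900 0   - 1   'root' \
  ' 90.00' 900 650 G 561 '      Escherichia' \
  ' 20.00' 200 200 S 562 '        Escherichia coli' \
  '  5.00' 50  50  S 564 '        Escherichia fergusonii' > "$report"

species_name='Escherichia coli'

# Exact match on the name column after stripping its indentation,
# so strain or subspecies rows are not summed on top of the species row.
result=$(awk -F'\t' -v name="$species_name" '
  { n = $6; sub(/^ +/, "", n) }
  n == name { printf "%s clade_pct=%s clade_reads=%d direct_reads=%d\n", n, $1 + 0, $2, $3 }
' "$report")
echo "$result"

# For the parent (genus taxid 561 in this toy file): reads held at the
# parent itself versus reads passed down to its child clades.
at_parent=$(awk -F'\t' '$5 == 561 { print $3 }' "$report")
below_parent=$(awk -F'\t' '$5 == 561 { print $2 - $3 }' "$report")
echo "at parent: $at_parent, in child clades: $below_parent"

rm -f "$report"
```

In this toy case the genus holds 650 reads directly, which is why the two species rows (20% and 5%) do not add up to the genus's 90%. Checking the third column of the parent row in a real report is a quick way to see whether the same thing is happening.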
