How to retrieve protein sequences in Diamond?
16 months ago
United States

My DIAMOND (https://github.com/bbuchfink/diamond) output looks like this:

I326_1_FC30VYFAAXX:4:1:73:1672/2        BGC0000803_GG-exopolysaccharide_Saccharide_Glf_Glf_ACN94849.1   77.3    22      5       0       2       67      82      103     4.2e-06 45.8

However, unlike with BLAST, the resulting .m8 (tabular) file contains no sequences for the hits. Is there a DIAMOND option to add the subject or query sequence to the tabular file? I checked the DIAMOND GitHub page and did not see anything.

Thanks for any suggestions.


I would suggest looking at their README file.

https://github.com/bbuchfink/diamond/blob/master/README.rst

Here is a paragraph from the file. Is it not enough for your needs?

"We assume to have a protein database file in FASTA format named nr.faa and a file of DNA reads that we want to align named reads.fna.

In order to set up a reference database for DIAMOND, the makedb command needs to be executed with the following command line:

$ diamond makedb --in nr.faa -d nr

This will create a binary DIAMOND database file with the specified name (nr.dmnd). The alignment task may then be initiated using the blastx command like this:

$ diamond blastx -d nr -q reads.fna -a matches -t <temporary directory>"

Good luck!


One also needs to specify SAM output when viewing the output file, e.g. in your example:

diamond view -a matches -f sam

3.3 years ago
Stockholm

If you choose to view the output in SAM format, the sequence will be included as one of the output fields. Where the SAM output is specified depends on the DIAMOND version: either as part of "diamond blastx" or in "diamond view" (after the blastx step).
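
As a rough sketch of pulling the sequence out of such SAM output: in the SAM format the first tab-separated column (QNAME) is the query name, the third (RNAME) is the reference name and the tenth (SEQ) is the sequence. The function name `sam_sequences` and the sample record below are made up for illustration, and exactly what DIAMOND writes into SEQ may depend on the version, so check it against your own output.

```python
import io

def sam_sequences(handle):
    """Yield (query_name, subject_name, sequence) for each SAM alignment line."""
    for line in handle:
        # Skip blank lines and header lines, which start with "@".
        if not line.strip() or line.startswith("@"):
            continue
        fields = line.rstrip("\n").split("\t")
        # A SAM alignment line has at least 11 mandatory columns.
        if len(fields) < 11:
            continue
        yield fields[0], fields[2], fields[9]

# Made-up record for illustration only (not real DIAMOND output).
example = io.StringIO(
    "@HD\tVN:1.5\n"
    "read1\t0\tprotA\t82\t255\t22M\t*\t0\t0\tMKTAYIAKQRQISFVKSHFSRQ\t*\n"
)
records = list(sam_sequences(example))
print(records)
```

For a real run you would pass an open file handle for the SAM file instead of the in-memory example.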

Regarding the protein database, it's not clear to me which one you are referring to. I typically build my own (with "diamond makedb") in which case I usually have the original file to parse out sequences and names from, if the need should arise.
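
If you do have the original FASTA file, a minimal sketch of that parsing could look like the following. It assumes that the second column of the tabular output (the subject ID) matches the first whitespace-delimited word of the FASTA header; `read_fasta` and `subject_sequences` are hypothetical helper names, and the sample data are made up.

```python
import io

def read_fasta(handle):
    """Return a dict mapping each sequence ID (first word of the header) to its sequence."""
    sequences = {}
    current_id = None
    chunks = []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Store the previous record before starting a new one.
            if current_id is not None:
                sequences[current_id] = "".join(chunks)
            header = line[1:].split()
            current_id = header[0] if header else ""
            chunks = []
        elif current_id is not None:
            chunks.append(line)
    if current_id is not None:
        sequences[current_id] = "".join(chunks)
    return sequences

def subject_sequences(m8_handle, fasta_sequences):
    """Yield (query_id, subject_id, subject_sequence) for each tabular hit."""
    for line in m8_handle:
        fields = line.split()  # columns are whitespace-separated
        if len(fields) < 2:
            continue
        query_id, subject_id = fields[0], fields[1]
        # The sequence is None if the subject ID is not in the FASTA file.
        yield query_id, subject_id, fasta_sequences.get(subject_id)

# Made-up data for illustration only.
fasta = io.StringIO(">protA some description\nMKTAYIAK\nQRQISFVK\n>protB\nMSDNE\n")
m8 = io.StringIO("read1\tprotA\t77.3\t22\t5\t0\t2\t67\t82\t103\t4.2e-06\t45.8\n")
hits = list(subject_sequences(m8, read_fasta(fasta)))
print(hits)
```

This keeps the whole FASTA file in memory, which is fine for a small custom database; for a very large one you would probably want to collect the hit IDs first and keep only those sequences.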


Sequences were not included when I ran "diamond view" (BLAST tabular format). I will try the SAM format. Thank you!


Yes, you need to specify the SAM output format. Good luck!
