Makeblastdb error when nucleotide sequence contains an 'X' character
5.5 years ago

Hi, I get an error when I use the makeblastdb command to build a BLAST database. The output is below:

New DB title:  nucl_patent_01
Sequence type: Nucleotide
Keep MBits: T
Maximum file size: 1000000000B
FASTA-Reader: Ignoring invalid residues at position(s): On line 34764876: 2

When I printed line 34764876 with sed, I found an 'X' character at the reported position. The nucleotide sequence on that line is GXACCTGATGTAGCAGACAGTCTC. What should I do if I want to include this nucleotide sequence in my BLAST database? The BLAST version is 2.7.1, and the makeblastdb command is:

makeblastdb -in part-r-00000 -dbtype nucl -title nucl_patent_01 -out /blast_db/nucl_patent_01
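
Before changing anything, it can help to see how many sequence lines are affected. The following is a minimal Python sketch, not part of BLAST: the function name `find_x_positions` and the example header are hypothetical, and it only looks for X/x on non-header lines (other invalid residues are out of scope here). For the sequence above it reports line 2 of the toy input and position 2, matching the position shown in the FASTA-Reader message.

```python
def find_x_positions(lines):
    """Yield (line_number, [1-based positions]) for sequence lines containing X or x.

    Header lines (starting with '>') are skipped so that an X in a
    sequence identifier is never reported.
    """
    for line_number, line in enumerate(lines, start=1):
        if line.startswith(">"):
            continue
        sequence = line.rstrip("\n")
        positions = [i for i, ch in enumerate(sequence, start=1) if ch in "Xx"]
        if positions:
            yield line_number, positions


# Toy input: the header text is a made-up placeholder.
example = [
    ">patent_seq_1 hypothetical header\n",
    "GXACCTGATGTAGCAGACAGTCTC\n",
]
hits = list(find_x_positions(example))
print(hits)  # [(2, [2])]
```

On a real file the same function can be fed an open file handle instead of a list, since it only iterates over lines.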

Replace X with N. This should work:

seqkit replace -i -s -p X -r N in.fa.gz -o out.fa.gz
5.5 years ago
gb ★ 2.2k

You can change the X to an N with something like this:

sed -i '/^>/! s/X/N/g' inputfasta.fa
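
If sed is not available, or you would rather not edit the file in place, the same idea can be sketched in Python. This is only an illustration under a few assumptions: the function name `mask_x` is made up, the example header is a placeholder, and unlike the sed command above it also converts lowercase x to n. Header lines are passed through unchanged.

```python
def mask_x(lines):
    """Yield FASTA lines with X/x replaced by N/n on sequence lines only.

    Lines starting with '>' are headers and are yielded untouched,
    so an X inside a sequence identifier is preserved.
    """
    for line in lines:
        if line.startswith(">"):
            yield line
        else:
            yield line.replace("X", "N").replace("x", "n")


# Toy input: the header text is a made-up placeholder.
example = [
    ">patent_seq_X hypothetical header\n",
    "GXACCTGATGTAGCAGACAGTCTC\n",
]
masked = list(mask_x(example))
print("".join(masked), end="")
```

To apply it to a file, iterate over the opened input file and write each yielded line to a new output file, then run makeblastdb on the new file.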