I am facing an error when I run orthomclBlastParser:
bin/orthomclBlastParser ortholog/out.tab my_orthomcl_dir/compliantFasta/Blast/ >> similarSequences.txt
acquiring genes from arab.fasta
couldn't find taxon for gene 'TRINITY_DN10001_c0_g1_i2.p1' at /home/mobashirm/Documents/orthomclSoftware-v2.0.9/bin/orthomclBlastParser line 106, <f> line 1.
I ran BLAST between the following two files:
1) The database file, arab.fasta, looks like:
arab|NP_001030613.1
MLLSALLTSVGINLGLCFLFFTLYSILRKQPSNVTVYGPRLVKKDGKSQQSNEFNLERLLPTAGWVKRALEPTNDEILSN
arab|NP_001030614.1
MEMEEGASGVGEKIKIGVCVMEKKVFSAPMGEILDRLQSFGEFEILHFGDKVILEDPIESWPICDCLIAFHSSGYPLEKA
2) My raw query file, which was used for the BLAST search against that database:
TRINITY_DN10001_c0_g1_i1.p1 TRINITY_DN10001_c0_g1 TRINITY_DN10001_c0_g1_i1.p1 ORF type:3prime_partial len:377 (-),score=66.02 TRINITY_DN10001_c0_g1_i1:1-1128(-)
MGIRSCQLIACLSALSIADAKRPTVDVAMSQAALEPPETIGGSASTQFRRSLLQAGAKSG
TRINITY_DN10001_c0_g1_i2.p1 TRINITY_DN10001_c0_g1TRINITY_DN10001_c0_g1_i2.p1 ORF type:complete len:154 (-),score=0.19 TRINITY_DN10001_c0_g1_i2:112-573(-)
MGIRSCQLIACLSALSIADAKRPTVDVAMSQAALEPPETIGGSASTQFRRSLLQAGAKSG
TSGCKWAGAAAGCIADGSFFQSKGGFEPMDEFLACLNATTSGADLSCSPGETCCTPYLHY
SSLHKQYIHSTIVKKCTFPRHIMSAVVLVYSTW*
The BLAST output file is:
TRINITY_DN10001_c0_g1_i2.p1 arab|NP_180470.2 27.08 96 63 2 22 110 29 124 3e-06 29.6
TRINITY_DN10001_c0_g1_i2.p1 arab|NP_191320.1 31.58 57 31 1 20 76 38 86 7e-06 28.9
TRINITY_DN10002_c0_g2_i1.p1 arab|NP_198034.2 31.43 70 45 1 47 116 328 394 3e-08 35.8
Please help me sort this out.
Yes, Philipp Bayer, I generated a new file after adjusting the Trinity file accordingly, but I am still facing the same error.
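But looking at the example BLAST output above, the 'abc|TRINITY..' prefix is not there; it's still just 'TRINITY..'. The BLAST file still contains the old IDs, so the parser cannot match them to a taxon. You can rename the IDs in the BLAST file using sed, or rerun BLAST with the renamed sequences.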
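If sed is awkward for this, the renaming can also be sketched in Python. This is only a minimal sketch under a few assumptions: the BLAST output is tab-separated (tabular format), 'abc' is a placeholder taxon code that must match whatever prefix and file name you used for the Trinity proteins in the compliantFasta directory, and the function names and the output file name are made up for illustration.

```python
def add_taxon_prefix(blast_line, taxon="abc"):
    """Prefix bare TRINITY IDs in the query/subject columns with 'taxon|'.

    'abc' is a placeholder taxon code (assumption); use the code that
    matches your compliant FASTA file for the Trinity proteins.
    """
    cols = blast_line.rstrip("\n").split("\t")
    for i in (0, 1):  # column 0 = query ID, column 1 = subject ID
        # IDs that already carry a prefix (e.g. 'arab|...') are left alone
        if i < len(cols) and cols[i].startswith("TRINITY"):
            cols[i] = f"{taxon}|{cols[i]}"
    return "\t".join(cols)


def prefix_blast_file(src, dst, taxon="abc"):
    """Write a copy of the tabular BLAST file 'src' to 'dst' with prefixed IDs."""
    with open(src) as fin, open(dst, "w") as fout:
        for line in fin:
            fout.write(add_taxon_prefix(line, taxon) + "\n")
```

For example, prefix_blast_file("ortholog/out.tab", "ortholog/out.prefixed.tab") would write the renamed copy (the second file name is hypothetical), which you would then pass to orthomclBlastParser instead of the original. The IDs in the BLAST file have to match the IDs in the FASTA headers of the compliantFasta directory exactly, so the same prefix must be used in both places.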