Biostar Beta. Not for public use.
Question: Anybody Managed To Make InParanoid Work?

I'm trying to use InParanoid (v4.1) to detect orthologs in two de novo transcriptomes I assembled. They were 'translated' to protein sequences using TransDecoder. The resulting FASTA file I'm trying to use in InParanoid looks like this (~200,000 sequences):

>comp100291_c0_seq1
LPKKILLPIQQVLGHLLLALSYRGKVMQVKALKSKHEHNGPETLDAFLSSKLVVVKQPRE
QAGFPLSIVFIPGEGRQERFLLHGEYNQSFCKEPVMELPRQ
>comp102162_c0_seq1
PNMTLHFLKSSPGSWRLSGLVLIPYVTETISGSCETLTRLQMPAHIQQSRWKAKHGPRIL
LLGLLQNLRSLFPLKVLPPGANSQLKRNCSFTSVCLIGTFYVESS
>comp102206_c0_seq1
CQEQKWQKGNREEKGWAGVTVWGAYFPYLLIRCPNHQTSTPLSIHSQQHFMLCIIICPFS
WLKPPVKTTQMFKGFFFKSGLKKFLALFLISWAAFATDRPLLGKQQSR

I tried the example FASTA files supplied with the program (called SC and EC) and they work, but with my own files the run gets stuck at the first step: even after days it creates no files and uses no additional disk space. Here is what I get with my FASTA files:

Loading module bio/ncbi-blast-2.2.22.
Formatting BLAST databases
Done formatting
Starting BLAST searches...

Starting first BLAST pass for bf - bf on [blastall] WARNING: the -C 3 argument is currently experimental

It then stays like this forever.

I also tried supplying my own inter-sample BLAST results, parsed with the parser they supply, but the run still hangs forever at the following step, again without generating any file:

Done BLAST searches. Starting ortholog detection...

I tried with and without bootstrapping, with and without multithreading (the -a16 option), and, as I said, with and without supplied BLAST results. I also cleaned my FASTA files of any odd characters (removed annotations, all '*', spaces, empty lines and dots). Now I'm running out of ideas... I'm using a Unix cluster, and I ran these jobs with up to 16 CPUs and 256G of memory.
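For reference, the cleanup described above could be sketched roughly like this. It is a minimal Python sketch under stated assumptions, not the script that was actually used: the function name `clean_fasta` is made up for illustration, and "removing annotations" is assumed to mean keeping only the first whitespace-delimited token of each header line.

```python
def clean_fasta(lines):
    """Yield cleaned FASTA lines from an iterable of raw text lines.

    Hypothetical helper: header lines keep only their identifier,
    sequence lines lose '*', '.', and spaces, and empty lines are dropped.
    """
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # drop empty lines
        if line.startswith(">"):
            # Assumption: the annotation is everything after the first token.
            yield line.split()[0]
        else:
            # Remove stop-codon asterisks, dots, and internal spaces.
            cleaned = line.replace("*", "").replace(".", "").replace(" ", "")
            if cleaned:
                yield cleaned
```

It can be applied to an open file handle, for example `for line in clean_fasta(open("raw.fasta")): print(line)`, where `raw.fasta` is a placeholder file name.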

Has anybody managed to make that program work?

6.0 years ago Birdman • 20 • updated 2.3 years ago huangxiaoyun1 • 0

EDIT: I was able to make it work with a small subset of my sequences (a few thousand). It seems that InParanoid has problems with large datasets (hundreds of thousands of sequences)... My question now becomes: has anybody managed to make that program work with large datasets?
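One way to build such a small test subset is to take the first N records of the FASTA file. The following is only a sketch of that idea in Python; `read_records` and `first_n_records` are hypothetical helper names, and taking the first records rather than a random sample is an assumption made for simplicity.

```python
from itertools import islice


def read_records(lines):
    """Yield (header, sequence) tuples from an iterable of FASTA lines."""
    header, chunks = None, []
    for raw in lines:
        line = raw.strip()
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line, []
        elif line and header is not None:
            chunks.append(line)  # sequence may span several lines
    if header is not None:
        yield header, "".join(chunks)


def first_n_records(lines, n):
    """Return the first n records as a list (fewer if the input is shorter)."""
    return list(islice(read_records(lines), n))
```

Writing the returned records back out as header and sequence lines gives a file small enough to check whether the pipeline runs at all before trying the full dataset.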

6.0 years ago Birdman • 20

I think what you're running into is an issue with InParanoid running legacy BLAST instead of BLAST+. According to this NCBI page, the legacy executables have a cap at ~65K sequences and run into other issues with large data. This is fixed in BLAST+, but InParanoid runs legacy BLAST by default. The workaround for this is to update the InParanoid source code. I am working on that now, and if I can get it all to work I will update this post with a link to a GitHub page.
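To check quickly whether a given FASTA file is above that size, the number of header lines can be counted. This is a small Python sketch; the constant `LEGACY_BLAST_SEQ_CAP` simply encodes the approximate ~65K figure quoted above and is not a verified limit, and both function names are made up for illustration.

```python
# Approximate figure taken from the post above; an assumption, not a verified limit.
LEGACY_BLAST_SEQ_CAP = 65_000


def count_sequences(lines):
    """Count FASTA records by counting header lines that start with '>'."""
    return sum(1 for line in lines if line.startswith(">"))


def may_exceed_legacy_cap(lines, cap=LEGACY_BLAST_SEQ_CAP):
    """Return True if the input has more records than the assumed cap."""
    return count_sequences(lines) > cap
```

For a file of ~200,000 sequences, as in the original question, `may_exceed_legacy_cap(open("proteins.fasta"))` would return True under this assumed cap (`proteins.fasta` is a placeholder name).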

5.9 years ago kristenbekc527 • 20

Did you succeed in making InParanoid work with BLAST+?

5.3 years ago cvlas076 • 0

please please:)

5.2 years ago Adrian Pelin ♦ 2.3k

I ran into the same problem. It also said "Blast output file A->B is missing". Have you fixed this problem?

5.2 years ago 695624096 • 0

I am facing similar issues. Any update?

4.4 years ago kudzu • 0

I ran into the same problem. It also said "Blast output file A->B is missing". Have you fixed this problem?

2.3 years ago huangxiaoyun1 • 0

I find OrthoMCL does the job better, which is why I gave up on InParanoid.

2.3 years ago Adrian Pelin ♦ 2.3k
