Domain Search - Whole Genomes
7.2 years ago
moranr ▴ 290

Hi,

I have ~60 eukaryotic genomes, and I want to search for homology to all domains within the genomes.

Does anyone have experience with domain searching?

I am not sure whether Pfam is OK, as there are redundancies in the domains (e.g. some domains differ only very slightly).

Do I need to define my own HMMs using HMMER and then use them as my database?

Thanks, R

domains genome homology • 2.2k views
7.2 years ago
Asaf 10k

You can use InterProScan; it bundles several databases (including Pfam), and you can choose which ones to run or run them all. The running time is very long, though.
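As a rough illustration of choosing which databases to run, here is a minimal sketch that only builds the InterProScan command line for each proteome and prints it; it does not launch anything. The file names, the `ipr_out` directory, and the `interproscan_command` helper are made up for the example, and it assumes `interproscan.sh` is on your PATH and that each input is a FASTA file of predicted proteins.

```python
from pathlib import Path


def interproscan_command(fasta, out_dir, applications=("Pfam",), cpus=4):
    """Build the argument list for one InterProScan run on a protein FASTA file.

    `applications` is the subset of member databases to run; the names must
    match what your InterProScan installation provides.
    """
    fasta = Path(fasta)
    out_file = Path(out_dir) / (fasta.stem + ".tsv")
    return [
        "interproscan.sh",
        "-i", str(fasta),                 # input protein sequences
        "-f", "tsv",                      # tab-separated output
        "-appl", ",".join(applications),  # restrict to the chosen databases
        "-cpu", str(cpus),
        "-o", str(out_file),              # explicit output file
    ]


if __name__ == "__main__":
    # Hypothetical proteome files, one per genome.
    for fasta in ["proteomes/species_01.faa", "proteomes/species_02.faa"]:
        print(" ".join(interproscan_command(fasta, "ipr_out")))
```

Printing the commands first makes it easy to hand them to whatever job scheduler your cluster uses, which matters given the running time.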


Looking into it now, thanks

7.2 years ago
Michael 54k

That depends on what "I have...genomes" and "to all domains" mean. If "I have genomes" means you were given 60 species with sequenced genomes that are already annotated (e.g. "search all Ensembl genomes beginning with A"), then they have most likely already been searched and annotated with InterProScan, as long as the domain is a known one. If you instead have 60 genomes in the drawer, you could run InterProScan yourself on each of them. You had better have predicted protein sequences for them, and you will still need to run that on a cluster for months. Or, better, download the one HMMER model and run HMMER yourself; again, you had better have predicted proteins.
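For the "download the model and run HMMER" route, a minimal sketch could look like the following. It builds an `hmmsearch` command that writes a per-domain table (`--domtblout`) and parses that table into simple records. The helper names and file paths are hypothetical, and `--cut_ga` assumes the model carries gathering thresholds (as Pfam models do); swap in an E-value cutoff otherwise.

```python
import os
from typing import Iterable, Iterator, NamedTuple


class DomainHit(NamedTuple):
    protein: str    # target sequence name
    domain: str     # query HMM name
    i_evalue: float # independent E-value of this domain
    score: float    # bit score of this domain
    env_from: int   # envelope start on the protein (1-based)
    env_to: int     # envelope end on the protein (inclusive)


def hmmsearch_command(hmm_file, proteome_fasta, domtblout, cpus=4):
    """Argument list for searching one proteome with one HMM file."""
    return [
        "hmmsearch",
        "--cut_ga",                # use the model's gathering thresholds
        "--cpu", str(cpus),
        "--domtblout", domtblout,  # per-domain tabular output
        "-o", os.devnull,          # discard the human-readable report
        hmm_file,
        proteome_fasta,
    ]


def parse_domtblout(lines: Iterable[str]) -> Iterator[DomainHit]:
    """Yield one DomainHit per data line of an hmmsearch --domtblout file."""
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment/header lines
        f = line.split(maxsplit=22)  # 22 fixed columns, then free-text description
        yield DomainHit(
            protein=f[0],
            domain=f[3],
            i_evalue=float(f[12]),
            score=float(f[13]),
            env_from=int(f[19]),
            env_to=int(f[20]),
        )
```

Note that the column meaning of target and query flips if you use `hmmscan` instead of `hmmsearch`, so the parser above is only valid for `hmmsearch` output.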

Do I need to define my own HMMs using HMMER and then use them as my database?

Only if you want to search for a domain previously unknown to humanity (i.e. one without an IPR, TIGRFAMs, Pfam, or ... ID) ;)

I am not sure whether Pfam is OK, as there are redundancies in the domains (e.g. some domains differ only very slightly).

Don't run only Pfam. It doesn't matter if domains differ slightly; a HMMER search is more sensitive than BLAST, which compensates for that.
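If several near-identical families still end up hitting the same stretch of a protein and that bothers you downstream, one possible post-filter (my own sketch, not something HMMER or InterProScan does in exactly this way) is to keep only the best-scoring hit among overlapping ones. The tuple layout and the function name are assumptions for the example.

```python
def best_nonoverlapping(hits):
    """Greedily keep the highest-scoring, mutually non-overlapping hits.

    `hits` is an iterable of (domain, start, end, score) tuples for a single
    protein, with 1-based inclusive coordinates. Hits are visited from best
    to worst score, and a hit is dropped if it overlaps one already kept.
    """
    kept = []
    for hit in sorted(hits, key=lambda h: h[3], reverse=True):
        _, start, end, _ = hit
        if all(end < k_start or start > k_end for _, k_start, k_end, _ in kept):
            kept.append(hit)
    # Report the survivors in order along the protein.
    return sorted(kept, key=lambda h: h[1])
```

This treats any overlap as a conflict, which is deliberately strict; allowing a few residues of overlap would be an easy variation.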


This is very helpful, thank you :)
