How can I compare, as a first general analysis, the proteomes of two species? In particular, I want to see which proteins (or protein families) in vertebrates DO NOT have a homologous counterpart in Drosophila (with a very high homology cutoff). Which programs should I use to identify and list such proteins? If I use BLAST, which general strategy should I follow? I would be very glad to get some pointers on which programs or tutorials to study. Thank you very much.
This is a super vague answer, but in my lab people have used Scaffold and Perseus in the past for comparing enriched peptides between one sample and another.
If the species you are interested in already have annotated genomes available, there is a good chance this analysis has already been done for you: check the databases OrthoDB, OMA, and others (search for "ortholog database").
If you want to find orthologs in your own data (a transcriptome assembly, or a protein set derived from a genome annotation), you have to predict them. Again there are several choices: OrthoMCL, OMA, ProteinOrtho, and many others. Several of them use BLAST (or another similarity search) as a starting point, but add various methods to filter and refine the groups found.
See this review for an introduction (and source of databases and programs): New Tools in Orthology Analysis: A Brief Review of Promising Perspectives.
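To make the "BLAST as a starting point" idea concrete, here is a rough Python sketch. It assumes you have already run blastp with the vertebrate proteins as queries against the Drosophila proteome and saved the results in tabular format (-outfmt 6, whose default columns are qseqid, sseqid, pident, length, mismatch, gapopen, qstart, qend, sstart, send, evalue, bitscore). The function names, the example IDs, and the thresholds (E-value 1e-5, 30% identity) are made up for illustration and should be tuned to your question. Note that this only lists queries with no hit passing the cutoffs; it is not orthology prediction, which is what the dedicated tools above add on top.

```python
import csv
from typing import Iterable, List, Set


def queries_with_hits(blast_rows: Iterable[str],
                      max_evalue: float = 1e-5,      # assumed cutoff, adjust as needed
                      min_identity: float = 30.0     # assumed cutoff, adjust as needed
                      ) -> Set[str]:
    """Return query IDs that have at least one hit passing both cutoffs.

    blast_rows: lines of BLAST tabular output (-outfmt 6, default columns).
    """
    hits = set()
    for row in csv.reader(blast_rows, delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        qseqid = row[0]
        pident = float(row[2])    # column 3: percent identity
        evalue = float(row[10])   # column 11: E-value
        if evalue <= max_evalue and pident >= min_identity:
            hits.add(qseqid)
    return hits


def proteins_without_homolog(all_query_ids: Iterable[str],
                             blast_rows: Iterable[str],
                             **thresholds: float) -> List[str]:
    """List query proteins with no hit passing the cutoffs."""
    matched = queries_with_hits(blast_rows, **thresholds)
    return sorted(set(all_query_ids) - matched)


if __name__ == "__main__":
    # Hypothetical rows, only to show the expected input shape.
    example_rows = [
        "humanA\tflyX\t62.5\t200\t75\t0\t1\t200\t1\t200\t1e-80\t250",
        "humanB\tflyY\t24.0\t90\t68\t2\t5\t94\t10\t99\t0.5\t30.1",
    ]
    # humanC has no row at all, humanB only has a weak hit.
    print(proteins_without_homolog(["humanA", "humanB", "humanC"], example_rows))
```

In practice you would pass an open file handle for the BLAST output as blast_rows, and take all_query_ids from the headers of your query FASTA file, since proteins with no hit at all never appear in the BLAST table.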
Thank you very much for the clear explanation. I tried playing with BioMart. I soon realized that if you don't set the gene source to Ensembl in the filters, it reports basically all genes as having no orthologues. For example, I tried human versus chimpanzee and it found 60000 human genes with no orthologue. With the gene source set to Ensembl in the filter, the number was just 26. But when I try to download the file with the results, it always lists around 60000 entries. What am I missing?