Question: Algorithms Predicting Effects Of SNPs / AA Substitutions On Protein
13

There are many algorithms available online that let you assess the potential effect of a SNP or an amino acid substitution on protein function (phenotype). Some are based on sequence homology alone, some also use protein structure, and others use machine learning to combine many different variables.

Which ones do you prefer in terms of scientific basis, ease of use, scale of data handled, etc.?

The ones I am aware of are:
1. SIFT
2. PolyPhen
3. MAPP
4. Align-GVGD
5. PANTHER
6. PMut
7. SNPs3D

9.2 years ago Prateek ♦ 1.0k • updated 3.9 years ago alexisdereeper • 30
5

As a rule of thumb I prefer methods that are based on conservation. One advantage is that they can also be applied to non-coding but highly conserved regions, whereas protein-structure-based methods only work for coding sequences. With more and more whole-genome sequences on the horizon, I suspect these conservation-based methods will be more helpful with intronic and intergenic sequences as well. SIFT is the classic example. I also know that protein-structure-based tools like PolyPhen are adopting the conservation-based methodology (i.e. PolyPhen-2) to improve their results.
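
To make the conservation idea concrete, here is a minimal sketch of scoring alignment columns by Shannon entropy. It is only a toy illustration of the general principle, not how SIFT or PolyPhen-2 actually compute their scores; the function names (`column_conservation`, `substitution_seen`) and the four-sequence alignment are made up for this example.

```python
import math
from collections import Counter


def column_conservation(column):
    """Score one alignment column from 0 (maximally variable) to 1 (fully conserved).

    Uses Shannon entropy over the observed residues, normalised by the
    maximum entropy for a 20-letter amino acid alphabet. Gaps ("-") are ignored.
    """
    residues = [aa for aa in column if aa != "-"]
    if not residues:
        return 0.0
    total = len(residues)
    counts = Counter(residues)
    entropy = -sum((n / total) * math.log2(n / total) for n in counts.values())
    return 1.0 - entropy / math.log2(20)


def substitution_seen(column, new_residue):
    """Return True if the substituted residue already occurs in a homolog at this position."""
    return new_residue in column


# Hypothetical toy alignment of four homologous sequences (assumption, not real data).
alignment = ["MKTAY", "MKTSY", "MKTAY", "MRTGY"]
columns = ["".join(seq[i] for seq in alignment) for i in range(len(alignment[0]))]
scores = [column_conservation(col) for col in columns]

for position, (col, score) in enumerate(zip(columns, scores), start=1):
    print(position, col, round(score, 2))
```

A substitution at a highly conserved position, to a residue never seen in any homolog, would be the most suspicious under this simple view. Real tools add sequence weighting, substitution matrices and a much larger set of homologs.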

9.2 years ago Biomed 4.5k
0

I agree. I also think that methods using structure-based prediction use it in addition to, not instead of, sequence conservation, but I am not sure. By the way, are you sure SIFT works with non-coding sequences? I think SIFT bases its predictions on amino acid conservation rather than nucleotide conservation.

9.2 years ago Prateek ♦ 1.0k
0

No, SIFT still uses precalculated data for coding regions, and these tools are also trained using SNPs that are known to be disease-causing. So as we get more conservation data from other species, and as we collect more disease causation/correlation data for non-coding human variations, tools like SIFT will be better able to predict what other non-coding variations may mean.

9.2 years ago Biomed 4.5k
5

The SNPEffect database is very useful to me for assessing various aspects of non-synonymous SNPs. SNPEffect is an integrated resource that runs individual sequence- and structure-based prediction algorithms to predict whether a SNP affects various features.

Sequence features: Subcellular localisation, Turnover rate, Myristoylation, PTS1, Type I Geranylation, Type II Geranylation, Farnesylation, GPI Anchor, Acetylation, O-glycosylation, N-glycosylation, Phosphorylation, Propeptide cleavage sites, Signal peptides, Nuclear export signal

Structural features: Aggregation, Stability, Amyloidogenic regions, Transmembrane regions, Active Sites, Hsp70 Binding

IMHO, this resource can give you more insight into the effect of a SNP on a protein, beyond the limited information (for example, a deleterious effect) provided by tools like SIFT, PolyPhen or PMut.

References: SNPEffect V1, SNPEffect V2

9.2 years ago Khader Shameer 18k
0

Khader - a bit late on this one, but is there a way to use MySQL to access the SNPEffect database?

5.2 years ago arronslacey • 240
0

I remember they had a database dump. You should check with the authors.

5.2 years ago Khader Shameer 18k
5

I created the snpEff program (http://snpeff.sourceforge.net/) some time ago. Please take a look at it and let me know what you think. Also, let me know if you need new functionality (I'm actively developing it). To answer your data scalability question: it can predict effects for all the SNPs from the 1000 Genomes Project in 18 minutes on my desktop computer. I think that should be fast enough.

9.2 years ago Pablo ♦ 1.9k
0

Pablo, Welcome to BioStar. SnpEff looks very promising.

9.2 years ago Khader Shameer 18k
0

Pablo, welcome to BioStar. SnpEff looks very promising. Have you written a manuscript about snpEff?

9.2 years ago Khader Shameer 18k
0

Thank you, Khader. I'll be writing a manuscript soon. Let me know if you have any feedback on snpEff :-)

9.2 years ago Pablo ♦ 1.9k
0

In the 1000 Genomes SNP summary page, check the order of the charts at the end of the page.

9.2 years ago Giovanni M Dall'Olio 26k
0

This tool is exactly what I am looking for! Thank you.

9.0 years ago Frédéric Bigey • 280
3

Since you are interested, have a look at this question. Earlier this morning I was writing exactly about that, maybe you can make some good contribution to the paper.

In any case, I recommend that you read this paper, in which the authors predicted the effect on phenotype of a large number of non-synonymous SNPs:

Burke DF, Worth CL, Priego EM, Cheng T, Smink LJ, Todd JA, Blundell TL. Genome bioinformatic analysis of nonsynonymous SNPs. BMC Bioinformatics. 2007 Aug 20;8:301. PubMed PMID: 17708757; PubMed Central PMCID: PMC1978506.

0

Giovanni: That's a very interesting paper!

9.2 years ago Khader Shameer 18k
0

Thanks for sharing. It's interesting to see the paper validate the fact that most diseases in OMIM are associated with rare SNPs with MAF < 0.05, even though I see a lot of GWAS studies that for some reason exclude those.

9.2 years ago Prateek ♦ 1.0k
2

You might want to check SNAP too. See http://rostlab.org/services/snap/

Chris

9.2 years ago Chris ♦ 1.6k
0

SnpEff has been integrated into the online SNP pipeline SNiPlay: http://sniplay.southgreen.fr/cgi-bin/analysis_v3.cgi

3.9 years ago alexisdereeper • 30
