Map genetic mutations to protein domain/structure
7
3
9.3 years ago
mittjohns ▴ 30

I am trying to map genetic mutations to protein domains and structures. Ideally, I want to visualize the variants on a linear protein domain diagram and on a 3D protein structure, as in the attached images. I have searched, but I can't find good tools or databases for this kind of work.

I know similar questions have been asked here, such as "How To Create Mutation Diagram In R Or In Any Tools?". But that one covers only the protein domain diagram (no 3D structure), and the protein domain annotation there seems limited.

Thank you all in advance!

mutation protein domain structure • 12k views
0

Just to clarify: I don't need to predict the effect of the mutation on the protein structure (i.e. potential structural changes). I just need to map and visualize the mutations on the known 3D structure of the native protein, with the affected residues labelled. This may reveal whether and how the mutations affect the protein's function.

Another question: what is the best or most complete protein domain database (UniProt, Pfam, PDB or others)? Where can I download the protein domain data?

0

Try to bring this up so that people can still see it...

2
9.3 years ago
Sam ★ 4.7k

If you want to find out whether a variant falls within a protein domain, you can always check UniProt or, if you have many SNPs, use KGGSeq, which will search UniProt automatically for you.

The 3D part, however, relies mainly on luck. If someone has already studied the structure of your protein, you can generate the pretty figures (B, C) you show above. You can try the Protein Model Portal and see whether it gives you a good enough prediction, then use PyMOL to view the structure at the molecular level and produce the plots. But be warned: having a studied structure is the rare case. Most of the time you won't be that lucky, and you won't get anything reasonable out of the structure prediction algorithms either.
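
If you do have a structure, the PyMOL step can be scripted. Below is a minimal sketch, not a definitive recipe: it builds a `.pml` script that colours and labels the mutated residues. The function name, the PDB ID `1abc`, the chain and the residue numbers are placeholders for illustration. It also assumes the residue numbers already match the numbering in the structure file, which often differs from UniProt numbering.

```python
def pymol_highlight_script(pdb_id, chain, residues, selection="mutations"):
    """Return PyMOL commands (.pml text) that highlight the given residues.

    Assumption: `residues` uses the numbering of the structure file,
    not necessarily the UniProt sequence numbering.
    """
    # PyMOL accepts several residue numbers joined with "+", e.g. "13+42"
    resi = "+".join(str(r) for r in sorted(set(residues)))
    lines = [
        f"fetch {pdb_id}, async=0",   # download the structure from the PDB
        "hide everything",
        "show cartoon",
        "color grey80",
        f"select {selection}, chain {chain} and resi {resi}",
        f"show sticks, {selection}",
        f"color red, {selection}",
        # Label each mutated residue once, at its C-alpha atom
        f'label {selection} and name CA, "%s%s" % (resn, resi)',
        f"zoom {selection}",
    ]
    return "\n".join(lines) + "\n"


# Placeholder inputs: replace with a real PDB ID, chain and residue numbers
script = pymol_highlight_script("1abc", "A", [42, 13, 42])
print(script)
```

Save the printed text to a file such as `highlight.pml` and open it with PyMOL (`pymol highlight.pml`).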

0

Thank you, Sam. So UniProt provides protein domain data. Is their data more comprehensive or better than the data from Pfam or PDB? By the way, where can their domain data be downloaded?

0

Yes, sometimes the protein is well studied and UniProt will tell you the location of its domains. For example, for the desmin gene, under Family & Domains, you can find detailed domain locations. However, I have never compared UniProt, Pfam and PDB, so I'm sorry that I can't answer that question.

The UniProt website has changed substantially, so I am not exactly sure how you can download the specific domain data now. But you can definitely try the instructions here, and maybe also the thread "Finding Gff File For All Uniprot Protein Isoform Domain Annotations".
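
Once you have a UniProt GFF file, checking whether a residue falls in an annotated domain is a simple interval lookup. Here is a rough sketch, assuming the usual tab-separated GFF layout (accession, source, feature type, start, end, ..., attributes with a `Note=` field). The accession `P00000` and all coordinates below are made up for illustration and are not real desmin annotations.

```python
from typing import List, NamedTuple


class Feature(NamedTuple):
    accession: str
    kind: str
    start: int
    end: int
    note: str


def parse_gff_features(gff_text: str, kinds=("Domain",)) -> List[Feature]:
    """Collect features of the requested types from UniProt-style GFF text."""
    features = []
    for line in gff_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and header/comment lines
        cols = line.split("\t")
        if len(cols) < 9 or cols[2] not in kinds:
            continue
        if not (cols[3].isdigit() and cols[4].isdigit()):
            continue  # skip features with unknown boundaries
        note = ""
        for field in cols[8].split(";"):
            if field.startswith("Note="):
                note = field[len("Note="):]
        features.append(Feature(cols[0], cols[2], int(cols[3]), int(cols[4]), note))
    return features


def features_at(features: List[Feature], position: int) -> List[Feature]:
    """Return every feature whose interval contains the 1-based residue position."""
    return [f for f in features if f.start <= position <= f.end]


# Hypothetical GFF content (made-up accession and coordinates)
EXAMPLE_GFF = (
    "##gff-version 3\n"
    "P00000\tUniProtKB\tDomain\t10\t120\t.\t.\t.\tNote=Example domain A\n"
    "P00000\tUniProtKB\tRegion\t1\t9\t.\t.\t.\tNote=Head\n"
    "P00000\tUniProtKB\tDomain\t200\t310\t.\t.\t.\tNote=Example domain B\n"
)

domains = parse_gff_features(EXAMPLE_GFF)
for pos in (50, 150):
    names = [f.note for f in features_at(domains, pos)]
    print(pos, names or "no annotated domain")
```

Pass other feature types via `kinds` (for example `("Domain", "Region")`) if the domain-level annotation alone is too sparse for your protein.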

2
9.3 years ago
poisonAlien ★ 3.2k

There is a tool called MutationMapper on cBioPortal that can do what you are asking. It can even plot the 3D structure with the mutations highlighted on it.

1
9.3 years ago
Siva ★ 1.9k

Have you checked MutDB? It seems to do something close to what you want. Here is a sample image from their paper.

For your second question, you can also include NCBI's CDD which is mostly manually curated. You can download the domain data here.

InterPro is also a great resource which combines domain data from several databases. I believe you should be able to download the combined data from their FTP site.

0

Thank you for all the info. I tried MutDB. It requires identifiers for the mutations (dbSNP etc.). Most of the time we only have the genomic positions and nucleotide substitutions, so it doesn't work for us.

It is easy to locate the InterPro file to download, but not the ones for NCBI's CDD. Which file(s) should I look at?

If I understand correctly, NCBI's CDD data is used to identify conserved domains in a query protein sequence and infer its putative function. What I need is the annotation data on functional domains for each protein.

1

You are correct. CDD provides PSSMs of protein domains. You need to search your protein sequences against these PSSMs using RPS-BLAST to get the domain annotation for your protein sequences.
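
As a rough sketch of what to do with the search results: if you run RPS-BLAST with tabular output (for example `rpsblast -query proteins.fasta -db Cdd -evalue 0.01 -outfmt 6 -out hits.tsv`, where the database name `Cdd` depends on your local setup), you can intersect the hit intervals with your mutated residue positions. The function name, the protein name `protA`, the CDD identifiers and all numbers below are invented for illustration.

```python
import csv
import io


def domains_hit_by_variants(tabular_text, variants, max_evalue=0.01):
    """Match variant positions against domain hits in BLAST tabular output.

    Assumes the default -outfmt 6 columns:
    qseqid sseqid pident length mismatch gapopen qstart qend sstart send evalue bitscore
    `variants` maps a query protein ID to a list of 1-based residue positions.
    """
    matches = []
    for row in csv.reader(io.StringIO(tabular_text), delimiter="\t"):
        if len(row) < 12:
            continue  # skip malformed or empty lines
        query, subject = row[0], row[1]
        qstart, qend = sorted((int(row[6]), int(row[7])))
        if float(row[10]) > max_evalue:
            continue  # ignore weak hits
        for pos in variants.get(query, []):
            if qstart <= pos <= qend:
                matches.append((query, pos, subject, qstart, qend))
    return matches


# Invented example hits: one strong hit and one weak hit for the same protein
EXAMPLE_HITS = (
    "protA\tgnl|CDD|000001\t35.0\t100\t60\t2\t20\t119\t1\t100\t1e-30\t110\n"
    "protA\tgnl|CDD|000002\t25.0\t50\t35\t1\t300\t349\t5\t54\t5.0\t30\n"
)

for match in domains_hit_by_variants(EXAMPLE_HITS, {"protA": [42, 310]}):
    print(match)
```

Position 310 lies inside the second hit's interval, but that hit is dropped by the e-value cutoff, so only position 42 is reported.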

1
7.9 years ago

Dear,

I worked on this idea a while ago. You can have a look at http://i-pv.org/. Here are some examples:

I also made a couple of videos: http://i-pv.org/intro_ipv_alt5.html

Here is the github: https://github.com/IbrahimTanyalcin/I-PV

Here is the paper if you are more interested in the technical implementation: http://bioinformatics.oxfordjournals.org/content/32/3/447.abstract

I hope it will be helpful in your project,

Note: the jpg you attached is not visible anymore.

Regards

0
7.9 years ago

You can try ProSat+

You can search by PDB ID, UniProt ID or sequence, and the features from UniProt are then displayed on the structure in a 3D viewer. You can also add your own annotations and send the link to other people: http://prosat.h-its.org/

0
7.9 years ago
H.Hasani ▴ 990

From Bioconductor: library(GenVisR), see lolliplot for mutation diagrams.

0
7.9 years ago
Collin ▴ 1000

You can try submitting the mutations to the CRAVAT web server. It provides an interactive viewer for mutations, including a lollipop diagram of the mutations with protein domain annotations. The results also link to MuPIT, so you can visualize your mutations on all available protein structures in PDB. The protein structure viewer also maps functional annotations from UniProt onto the protein structure.
