Question

Transcript/Protein Expression Databases For Up/Downregulation In Certain Disease

2

11.0 years ago

mjktfw ▴ 20

Hi, I'm a medical student, definitely not a programmer, and didn't have any experience with bioinformatics during my studies, but I thought that bioinformatic databases could be extremely useful in our research. However, I'm not so sure how to use them properly.

We're interested in investigating the impact of miRNAs on a certain disease and are going to measure their expression in our samples (diseased vs non-diseased). However, to determine which ones are most likely causal, it would be useful to find out which of their target genes are also up- or down-regulated in the same disease. Unfortunately, there are no funds for that, so I thought we could infer those data from databases of previously performed experiments, covering both transcript and protein expression (since miRNAs regulate expression at both levels).

Here are some newbie questions:

1. Is it appropriate to perform such an analysis using databases? Or should I just stick to PubMed and search manually for the most reliable articles?
2. Which databases should I use? GEO, Gene Expression Atlas, Protein Atlas, Human Proteinpedia, any others?
3. How should I prioritize the results? Should I look at the experimental method, number of samples, size of the expression change, homogeneity of results?
4. Are there any other approaches you would recommend?

Thank you for your help, Marcin

transcript protein expression database disease • 2.7k views

updated 11.0 years ago by Jelena Aleksic ▴ 920 • written 11.0 years ago by mjktfw ▴ 20

score 2 · Answer 1 · 2013-05-01

1. Yes, absolutely, use databases: a lot of effort goes into curating these things. Biocuration is a full-time job, and unless you happen to be working on a particularly rare disease, the literature can just be overwhelming. Which isn't to say that you shouldn't use the literature at all. Do have a look at some of the main and most recent publications about your topics of interest, and see if what you're finding is in sync with what they report. Just don't manually curate full gene sets unless there's a really good reason to. However, if you're looking for individual studies that have a similar experimental setup to yours (which sounds like what you're doing?), I would do that through the literature first, then go to GEO through the links provided in the paper.
2. It depends a bit on exactly which genes and disease you're working on. My first step would probably be to put the list of genes you get from your experiment into metabolicmine.org and see what comes out, but sadly it's quite skewed towards metabolic disease (they're in the process of expanding it into a broader tool). A tool like DAVID should also let you see what biological processes your data is enriched for. I think Reactome also has some disease stuff.
3. I'm not sure I understand the question. Once you've processed your data, you should have a series of fold-changes and p-values. I would use these to filter the data, and get your gene list that way. If you want to compare your dataset to other gene expression datasets, I would cluster them together, to see whether similar genes are changing expression in a similar direction.
4. Maybe check out cMAP for drug prediction once you've got your data? They use expression data as input, I believe.
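
To make the filtering and comparison idea above a bit more concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions: the gene names and numbers are made-up placeholders, the cutoffs (|log2 fold-change| >= 1, adjusted p-value <= 0.05) are common conventions rather than anything prescribed in this thread, and `filter_genes` and `direction_agreement` are hypothetical helper names. It also does a simple direction-of-change comparison between two datasets, which is a lighter stand-in for proper clustering.

```python
def filter_genes(results, min_abs_log2fc=1.0, max_padj=0.05):
    """Keep genes that pass both the fold-change and the p-value cutoff.

    `results` maps gene name -> (log2 fold-change, adjusted p-value).
    The default cutoffs are assumptions; tune them for your own data.
    """
    kept = {}
    for gene, (log2fc, padj) in results.items():
        if abs(log2fc) >= min_abs_log2fc and padj <= max_padj:
            kept[gene] = log2fc
    return kept


def direction_agreement(a, b):
    """Split genes found in both filtered sets by same/opposite direction."""
    shared = sorted(set(a) & set(b))
    same = [g for g in shared if (a[g] > 0) == (b[g] > 0)]
    opposite = [g for g in shared if (a[g] > 0) != (b[g] > 0)]
    return same, opposite


# Made-up placeholder values, NOT real measurements.
study_1 = {
    "GENE_A": (2.1, 0.001),
    "GENE_B": (-1.5, 0.01),
    "GENE_C": (0.3, 0.2),    # small, non-significant change
    "GENE_D": (1.8, 0.04),
}
study_2 = {
    "GENE_A": (1.4, 0.02),
    "GENE_B": (1.2, 0.03),   # significant, but opposite direction to study_1
    "GENE_C": (-2.0, 0.001),
    "GENE_D": (1.1, 0.2),    # large change, but not significant
}

hits_1 = filter_genes(study_1)
hits_2 = filter_genes(study_2)
same, opposite = direction_agreement(hits_1, hits_2)

print("Same direction in both:", same)
print("Opposite direction:", opposite)
```

Genes that change in the same direction in several independent datasets would be the more convincing candidates. In practice the input tables would come from your own differential expression analysis or from a processed public dataset, not from hand-typed dictionaries.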