Clinically-Associated SNPs
4.6 years ago
Vova Naumov • 220
Russia, Moscow

Hi! We are trying to work out which Illumina chip is better for medical condition testing. I used this MySQL query to get a list of clinically-associated SNPs:

mysql --user=genome --host=genome-mysql.cse.ucsc.edu -A -D hg19 -e '
SELECT *
FROM
  snp132 s
WHERE
  s.bitfields LIKE "clin%"'

So now I have a list of about 22,000 rs IDs, and I would like to know which association the database means for each of them. There was a question on Biostar (http://biostar.stackexchange.com/questions/1289/disease-associated-snps) that could have helped, but since 16 July the OMIM table is no longer in the genome database. So the question is: how can I get a list of diseases/conditions from this SNP list?

hg18 does not have table snp132; I think you must have used hg19.

Sure, sorry, I'll change it.

17 months ago
France/Nantes/Institut du Thorax - INSE…

1) Register for access to the OMIM FTP site (http://omim.org/downloads) and download mim2gene:

$ curl -s  "ftp://anonymous:xxxxxxx@xxxxx.edu/OMIM/mim2gene.txt" | head
# Mim Number    Type    Gene IDs    Approved Gene Symbols
100050    phenotype    -    -
100070    phenotype    100329167    -
100100    phenotype    -    -
100200    phenotype    -    -
100300    phenotype    100188340    -
100500    moved/removed    -    -
100600    phenotype    -    -
100640    gene    216    ALDH1A1
100650    gene/phenotype    217    ALDH2

Get a list of the gene symbols (the file is tab-separated, which is the default delimiter for cut):

$ curl -s  "ftp://anonymous:xxxxx@xxxxxx.edu/OMIM/mim2gene.txt" |\
   grep -v '^#' | cut -f 4 | grep -v '^-$' |\
   sort | uniq > list1.txt
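
The symbol-extraction step can also be tried offline before touching the FTP site. This is only a sketch: it assumes the four tab-separated columns shown in the sample above, `extract_symbols` is a helper name made up for this example, and the input is just a few of those sample rows.

```shell
# Hypothetical helper: read mim2gene-style rows on stdin, print unique gene symbols.
# Assumes four tab-separated columns, symbol in column 4, "-" meaning "no symbol".
extract_symbols() {
  awk -F '\t' '!/^#/ && $4 != "" && $4 != "-" { print $4 }' | LC_ALL=C sort -u
}

# Rows copied from the mim2gene sample above, rebuilt with real tabs.
printf '%s\t%s\t%s\t%s\n' \
  '# Mim Number' 'Type' 'Gene IDs' 'Approved Gene Symbols' \
  100050 phenotype - - \
  100640 gene 216 ALDH1A1 \
  100650 gene/phenotype 217 ALDH2 \
  | extract_symbols
```

On these rows it should print ALDH1A1 and ALDH2, one per line; the header and the phenotype-only row are dropped.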

2) Get your list of SNPs associated with each gene symbol. Something like:

mysql -N --user=genome --host=genome-mysql.cse.ucsc.edu -A  -D hg19 -e 'select  distinct
  G.geneSymbol,
  S.name
from snp132 as S,
kgXref as G,
knownGene as K where
    S.chrom=K.chrom and
    S.chromStart>=K.txStart and
    S.chromEnd<=K.txEnd and
    K.name=G.kgId 
    /* AND something to restrict the result to YOUR list of SNPs or gene */
' | sort -t $'\t' -k1,1 > list2.txt

3) Use Unix join to join the two lists:

join -1 1 -2 1 list1.txt list2.txt

You should get a list with two columns: the OMIM gene symbol and your SNP.
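
To see what step 3 produces without running the real queries, here is a toy run in a scratch directory. It is a sketch under a few assumptions: the rs IDs and GENE_X are placeholders rather than real records, both files are sorted in the C locale so that sort and join agree on ordering, and `-t` is passed to join so the output stays tab-separated (the plain `join` above splits on any whitespace instead).

```shell
# Toy end-to-end run of step 3 in a scratch directory.
# rs0000001..rs0000003 and GENE_X are placeholders, not real records.
workdir=$(mktemp -d)
tab=$(printf '\t')

# list1.txt: OMIM gene symbols, one per line, sorted in the C locale.
printf '%s\n' ALDH2 ALDH1A1 | LC_ALL=C sort -u > "$workdir/list1.txt"

# list2.txt: symbol<TAB>rs ID, sorted on the symbol column in the same locale.
printf '%s\t%s\n' GENE_X rs0000003 ALDH2 rs0000001 ALDH1A1 rs0000002 \
  | LC_ALL=C sort -t "$tab" -k1,1 > "$workdir/list2.txt"

# Join on column 1 of both files; only symbols present in both survive.
joined=$(LC_ALL=C join -t "$tab" -1 1 -2 1 \
  "$workdir/list1.txt" "$workdir/list2.txt")
printf '%s\n' "$joined"

rm -r "$workdir"
```

The GENE_X row has no match in list1.txt, so it is left out of the result; the two ALDH rows come through with their rs IDs.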

Thank you very much! I always knew that these Unix commands are very useful. I also tried using the /OMIM/genemap file to get rs numbers from the 12th column, but there were only 209 rs numbers in common between the clinically-associated list and this file.
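
For reference, an overlap count like that can be obtained with sort and comm. A minimal sketch, assuming each list is a plain file with one rs ID per line; `count_common` is a made-up helper name and the IDs are placeholders.

```shell
# Hypothetical helper: count IDs present in both files (one ID per line).
count_common() {
  # comm needs both inputs sorted the same way; -12 keeps only lines common to both.
  tmp_a=$(mktemp) && tmp_b=$(mktemp) || return 1
  LC_ALL=C sort -u "$1" > "$tmp_a"
  LC_ALL=C sort -u "$2" > "$tmp_b"
  LC_ALL=C comm -12 "$tmp_a" "$tmp_b" | wc -l | tr -d ' '
  rm -f "$tmp_a" "$tmp_b"
}

# Placeholder IDs, not real dbSNP records.
clin=$(mktemp)
genemap=$(mktemp)
printf '%s\n' rs0000001 rs0000002 rs0000003 > "$clin"
printf '%s\n' rs0000003 rs0000004 rs0000001 > "$genemap"
count_common "$clin" "$genemap"   # prints 2
rm -f "$clin" "$genemap"
```

Dropping the `wc -l` stage prints the shared IDs themselves instead of the count.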

6.1 years ago
Boston, MA USA

dbSNP includes clinically significant variations and you can now filter search results on clinical significance, allele origin, minor allele frequency, and suspected false SNPs. See http://www.ncbi.nlm.nih.gov/projects/SNP/docs/rs_attributes.html for more.

From http://www.ncbi.nlm.nih.gov/projects/SNP/docs/rs_attributes.html : Clinical significance: The significance of the indicated allele.

The supported values are:

unknown 
untested
non-pathogenic
probable-non-pathogenic
probable-pathogenic
pathogenic
drug-response
histocompatibility
other

In dbSNP build 132, there are 13105 such rs entries. While no good definition of "clinical significance" is given, the above examples of what NCBI classifies as such can help to form a picture of what is meant by this term.

Edit added 13 Oct 2011: I have just learned from following the International Congress of Human Genetics meeting on Twitter that Rong Chen is painstakingly manually curating 5,478 disease-SNP association papers and adding the info to a database of 67,678 SNPs associated with 1,563 diseases.

17 months ago
Manhattan, NY

What do you mean by clinical association? What are your criteria?

Mendelian disease, complex disease, pharmacogenomic variants, or a combination of two or more?

If you are interested in a combined dataset, you will need to do some raw-data munging. OMIM is ideal for Mendelian variants; for complex-disease variants you should check GWAS resources; for pharmacogenomic variants, check PharmGKB. To identify clinically-associated variants from GWAS, see my discussions 1, 2 and 3. For pharmacogenomic variants, see the list of Annotated SNPs by Disease in PharmGKB here. A combination of the 3 resources will give you complete coverage of SNPs for your study.

  • I recently integrated such a data-set for a manuscript using the approach discussed above.
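
One way the combining step could look once each resource has been reduced to a file of rs IDs is sketched below. This is an illustration only: `merge_sources` is a made-up helper, the `LABEL=file` argument convention is an assumption of this sketch, and the IDs are placeholders rather than real records from OMIM, GWAS resources, or PharmGKB.

```shell
# Hypothetical helper: merge "LABEL=file" arguments into one table of
# rs ID <TAB> comma-separated labels. Each file holds one rs ID per line.
merge_sources() {
  for pair in "$@"; do
    label=${pair%%=*}
    file=${pair#*=}
    # Tag every ID with its resource, dropping repeats within a file.
    LC_ALL=C sort -u "$file" | awk -v src="$label" '{ print $1 "\t" src }'
  done | LC_ALL=C sort | awk -F '\t' '
    # Rows for the same ID are adjacent after sorting; collapse them into one.
    $1 != prev { if (NR > 1) print prev "\t" acc; prev = $1; acc = $2; next }
    { acc = acc "," $2 }
    END { if (NR > 0) print prev "\t" acc }'
}

# Placeholder IDs, not real records.
omim=$(mktemp)
gwas=$(mktemp)
pgkb=$(mktemp)
printf '%s\n' rs0000001 rs0000002 > "$omim"
printf '%s\n' rs0000002 rs0000003 > "$gwas"
printf '%s\n' rs0000002 > "$pgkb"
merge_sources OMIM="$omim" GWAS="$gwas" PharmGKB="$pgkb"
rm -f "$omim" "$gwas" "$pgkb"
```

Each output row carries the rs ID and the resources that list it, so rs0000002 here comes out tagged GWAS,OMIM,PharmGKB while the other two IDs carry a single label each.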

I'm also interested in what snp132 means by clinically-associated.

@Vova: Please refer to Larry's answer!
