Loss of Heterozygosity Calls on a SNP-Annotated File
13.1 years ago
Jane ▴ 10

I have a dbSNP-annotated SNP calls file. The file contains the following columns:

Hugo_Symbol    Entrez_Gene_Id    Chromosome    Start_position    End_position    Strand    Variant_Classification        Reference_Allele    Tumor_Seq_Allele1    Tumor_Seq_Allele2    dbSNP_RS    Genome_Change    Annotation_Transcript    Transcript_Strand    cDNA_Change    Codon_Change    Protein_Change    NP_id    NP_region(note)    NP_bond    NP_site    CDD_structure_id    cosmic_total_mutations_in_gene

Now I want to make LOH (loss of heterozygosity) calls on the basis of this data.

Is this possible? If so, which tool can be used?

Thanks.

snp dbsnp r • 6.0k views

Is there a specific reason you tagged this as "r" too?

Yes, the tag for "r" is odd.

For me, it would help to see some data as well because I am trying to guess what information you'd have in the columns Tumor_Seq_Allele1 and Tumor_Seq_Allele2. Could there be missing data - which would indicate possible loss of heterozygosity (LOH)?

13.1 years ago
Jan Oosting ▴ 920

Loss of heterozygosity implies an event where a state of heterozygosity has transformed into a state of homozygosity. This happens a lot in tumors. In order to detect LOH properly you also need the SNP calls from the original (non-tumor) tissue.

The column names you show look like this is next gen sequencing data. In this case you look for all variants in the normal sample that are heterozygous, and then you check whether these basepair positions are homozygous in the paired tumor sample.

If you also do this the other way around (look for heterozygous variants in tumor which are homozygous in normal) you get an indication of the error rate. In an ideal situation no variants are heterozygous in the tumor and homozygous in the normal tissue.
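The two comparisons described above can be sketched in a few lines of Python. This is only a minimal illustration, assuming that genotypes are already available for both samples as pairs of alleles keyed by position; the `normal` and `tumor` dictionaries, the function names, and the positions in the toy data are all hypothetical and do not come from any tool or file mentioned in this thread.

```python
def is_het(genotype):
    """A genotype is a pair of alleles; heterozygous means the two differ."""
    allele1, allele2 = genotype
    return allele1 != allele2


def loh_candidates(normal, tumor):
    """Positions that are heterozygous in normal and homozygous in the paired tumor."""
    return sorted(
        pos for pos, gt in normal.items()
        if is_het(gt) and pos in tumor and not is_het(tumor[pos])
    )


def reverse_discordant(normal, tumor):
    """Positions that are homozygous in normal but heterozygous in tumor.

    This is the same comparison with the roles of the samples swapped;
    the size of this list gives a rough indication of the error rate.
    """
    return loh_candidates(tumor, normal)


# Toy data (made-up positions and genotypes, for illustration only).
normal = {
    ("chr17", 100): ("A", "G"),  # heterozygous in normal
    ("chr17", 200): ("C", "T"),  # heterozygous in normal
    ("chr12", 300): ("T", "T"),  # homozygous in normal
    ("chr12", 400): ("G", "G"),  # homozygous in normal
}
tumor = {
    ("chr17", 100): ("G", "G"),  # now homozygous: LOH candidate
    ("chr17", 200): ("C", "T"),  # still heterozygous
    ("chr12", 300): ("T", "G"),  # heterozygous only in tumor: discordant
    ("chr12", 400): ("G", "G"),  # unchanged
}

print(loh_candidates(normal, tumor))
print(reverse_discordant(normal, tumor))
```

Positions missing from either sample are skipped rather than counted, since a missing call is not evidence of homozygosity.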

But I do not have data from the original (non-tumor) tissues. Can I use the dChip software for this?

What kind of data do you have? SNP arrays (which chip type) or next-gen sequencing (what coverage, exome or full genome)? Please show some example rows. How many rows per sample? How many samples? Do you have any normal tissue samples? The information you have given so far only leads to general advice.

If you don't have the control data (SNP alleles for non-neoplasm), then you can't calculate LOH. Without controls, your data is useless.

I also note that you have only two tumor alleles in your data. That's probably also wrong. Neoplasms tend to have lots of aneuploidy.

@Jan I have next-gen sequencing data (exome). Some example rows are:

Hugo_Symbol Entrez_Gene_Id NCBI_Build Chromosome Start_position End_position Strand Variant_Classification Variant_Type Reference_Allele Tumor_Seq_Allele1 Tumor_Seq_Allele2 dbSNP_RS dbSNP_Val_Status Tumor_Sample_Barcode Matched_Norm_Sample_Barcode Sequence_Source Sequencer Genome_Change Annotation_Transcript Transcript_Strand cDNA_Change Codon_Change Protein_Change NP_id NP_region(note) NP_bond NP_site CDD_structure_id cosmic_mutations_within_0_bp cosmic_samesub|diffsub|indel_or_diffAA|unknown cosmic_total_mutations_in_gene

TP53 7157 37 chr17 7578525 7578525 + Nonsense_Mutation SNP G T T novel none normal normal Capture Illumina_GAIIx g.chr17:7578525G>T NM_001126112.1 - c.405C>A c.(403-405)TGC>TGA p.C135* NP_001119584.1 P53(P53_DNA-binding_domain;_cd08367) None None CDD:176262 TP53(4) 0|0|1|3 21274

PTPN11 5781 37 chr12 112892407 112892407 + Missense_Mutation SNP T G G novel none normal normal Capture Illumina_GAIIx g.chr12:112892407T>G NM_002834.3 + c.565T>G c.(565-567)TCT>GCT p.S189A NP_002825.3 SH2(Src_homology_2_domains;_Signal_transd...) None hydrophobic_binding_pocket CDD:29135 0 --- 378

There is no allele frequency column in your data, so you cannot tell which variants are heterozygous.

You cannot infer LOH from this data (you would not be able to do so even if you did have normal samples).
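To make the missing piece concrete: when per-site read counts are available, zygosity is commonly judged from the fraction of reads supporting the alternate allele. The sketch below assumes hypothetical `ref_count` and `alt_count` values (columns that are not present in the file shown above) and uses illustrative cutoffs of 0.2 and 0.8 plus a minimum depth of 10; these numbers are assumptions and would need tuning for real data, for example to account for tumor purity and sequencing depth.

```python
def classify_site(ref_count, alt_count, min_depth=10, het_low=0.2, het_high=0.8):
    """Classify one site from read counts using the alternate allele fraction.

    All thresholds are illustrative assumptions, not recommended values.
    """
    depth = ref_count + alt_count
    if depth < min_depth:
        return "no_call"  # too few reads to say anything
    alt_fraction = alt_count / depth
    if alt_fraction < het_low:
        return "hom_ref"
    if alt_fraction > het_high:
        return "hom_alt"
    return "het"


# Made-up read counts, for illustration only.
for ref_count, alt_count in [(20, 18), (1, 39), (40, 0), (3, 2)]:
    print(ref_count, alt_count, classify_site(ref_count, alt_count))
```

A file that reports only the two tumor alleles, without counts or frequencies, gives no input for this kind of classification.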

How can I get genotypes for these SNPs? Please help.

In order to call a region as LOH in WGS data, what is the minimum number of homozygous SNPs that should be present?

13.1 years ago

You may be interested in this post by Don Conrad, an expert on this topic.

While the post and the links contained therein may not exactly answer your question, they will provide some very useful information.

13.1 years ago
Rima • 0

How can I get the genotypes of these SNPs? Could you please help?
