Biostar
What are REF and ALT? Are all variations mutations?
0
16 months ago
Hyderabad

I have gone through some publications and done a literature survey. I have a basic question and would like guidance from people working in computational genetics or molecular biology.

What are REF and ALT? Are all variations considered mutations? A variant may be likely pathogenic, pathogenic or neutral, but is it wise to call all of them mutations? After annotation with Annovar, some records have the same REF and ALT (G > G), so they are not variations, yet they are reported as mutations by PolyPhen, SIFT and other databases.

Could anyone help me understand this concept?

Thanks, Rk

4
14 months ago
EMBL-EBI

Ref = the allele in the reference genome.

Alt = any other allele found at that locus.
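
As a minimal illustration of these two definitions, the sketch below reads REF and ALT from one VCF data line. The record shown is made up for illustration, and `parse_ref_alt` is a hypothetical helper, not part of any library.

```python
def parse_ref_alt(vcf_line):
    """Return (ref, [alt, ...]) from one tab-separated VCF data line."""
    fields = vcf_line.rstrip("\n").split("\t")
    # VCF fixed columns: CHROM, POS, ID, REF, ALT, QUAL, FILTER, INFO
    ref = fields[3]
    # "." in ALT means no alternate allele was observed;
    # several alternate alleles are comma-separated.
    alts = [] if fields[4] == "." else fields[4].split(",")
    return ref, alts


# Made-up record: reference allele A, two alternate alleles G and T.
example = "1\t1000\t.\tA\tG,T\t.\tPASS\t."
print(parse_ref_alt(example))
```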

People use mutation in different ways and there seems to be no real consensus. Some people call any variant with a frequency lower than a certain value a mutation, and those with a higher frequency polymorphisms. Personally, I stick to the word variant for everything, and only use mutation when referring to the act of mutation, such as when talking about a somatic mutation or a de novo mutation (although they are both still variants).

Can you give an example of an input you used in Annovar that gave ref/ref, please?

0

I used Illumina GenomeStudio microarray data (v1) as input. I took CHR, POS, RSID and GENOTYPE from the input file and converted them to VCF. During the conversion I mapped the positions against hg19.fa (the reference genome), and the resulting VCF was used in Annovar to annotate the variants (~6 lakh, i.e. about 600,000) of one sample.
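
One way a genotype-to-VCF conversion like this could assign REF, ALT and GT is sketched below. This is only a sketch under stated assumptions, not the conversion that was actually run: `genotype_to_vcf_alleles` is a hypothetical helper, genotypes are assumed to be two-letter SNP calls already on the same strand as the reference, and indels are ignored.

```python
def genotype_to_vcf_alleles(ref_base, genotype):
    """Derive (REF, ALT, GT) for one SNP from the reference base and an
    array genotype such as "AG". Returns None for no-calls.

    Assumption: the genotype is reported on the reference strand.
    """
    ref = ref_base.upper()
    calls = [allele.upper() for allele in genotype]
    if any(allele not in "ACGT" for allele in calls):
        return None  # no-call such as "--"

    # ALT holds only alleles that differ from the reference base.
    alts = []
    for allele in calls:
        if allele != ref and allele not in alts:
            alts.append(allele)

    alleles = [ref] + alts  # index 0 = REF, 1.. = ALT alleles
    gt = "/".join(str(i) for i in sorted(alleles.index(a) for a in calls))
    # A homozygous-reference call has no ALT allele, written as "."
    alt_field = ",".join(alts) if alts else "."
    return ref, alt_field, gt


print(genotype_to_vcf_alleles("G", "GG"))
print(genotype_to_vcf_alleles("G", "AG"))
```

Under these assumptions a homozygous-reference genotype never yields an ALT equal to REF; it yields ALT "." and GT 0/0, and such sites can be dropped before annotation.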

Can you please tell me whether this analysis is correct? Here I have REF and ALT as G>G, and after annotation I am getting a PolyPhen, SIFT and MutationTaster prediction of "D" (deleterious) for it.

0

What was the actual input that gave that result? Just the one line of your VCF.

And was it that VCF you put into SIFT etc, or did you use something else as input?

0

No, I gave all the variants (~6 lakh) to Annovar as a VCF. The case I shared is a single variant with its SNP ID, chr, pos, ref, alt and the annotation scores produced by Annovar.

1

What @Emily is asking for is a real line from your VCF file (that shows an actual SNP). Please post that. You could even post 2 or more.

0
SNPS    Chr Start   End Ref Alt Func.refGene    Gene.refGene    GeneDetail.refGene  ExonicFunc.refGene  AAChange.refGene    cytoBand    SIFT_score  SIFT_pred   Polyphen2_HDIV_score    Polyphen2_HDIV_pred Polyphen2_HVAR_score    Polyphen2_HVAR_pred LRT_score   LRT_pred    MutationTaster_score    MutationTaster_pred MutationAssessor_score  MutationAssessor_pred   FATHMM_score    FATHMM_pred PROVEAN_score   PROVEAN_pred    VEST3_score CADD_raw    CADD_phred  DANN_score  fathmm-MKL_coding_score fathmm-MKL_coding_pred  MetaSVM_score   MetaSVM_pred    MetaLR_score    MetaLR_pred integrated_fitCons_score    integrated_confidence_value GERP++_RS   phyloP7way_vertebrate   phyloP20way_mammalian   phastCons7way_vertebrate    phastCons20way_mammalian    SiPhy_29way_logOdds Otherinfo   Chr_inp Pos_inp Ref Alt Unnamed: 56 Unnamed: 57 Unnamed: 58 STRONGEST SNP-RISK ALLELE   CHR_ID  REPORTED GENE(S)    MAPPED_GENE MERGED  SNP_ID_CURRENT  CONTEXT RISK ALLELE FREQUENCY   P-VALUE OR or BETA  95% CI (TEXT)   MAPPED_TRAIT    DISEASE/TRAIT

rs123456    10  648285  648285  G   G   exonic  APOH    .   nonsynonymous SNV   APOH:NM_000042:exon8:c.G1004C:p.W335S   17q24.2 0.022   D   0.999   D   0.954   D   0.01    N   0   P   2.3 M   1.96    T   -3.7    D   0.379   4.74    24.7    0.988   0.828   D   -1.109  T   0.013   T   0.516   0   5.7 0.871   0.852   1   0.998   15.345  0   17  64208285    G   .   PR  GT  0/0 rs1801690-? 17  APOH    APOH    0   1801690 missense_variant    NR  5E-13   0.08555 [0.062-0.109] unit increase novel   partial thromboplastin time Activated partial thromboplastin time
1

sofie_carolina : People on this forum help others freely but it does not mean you can take advantage of their generosity. It is your responsibility to provide real/accurate information so you can get usable answers.

Please don't post fake examples/incorrect data. You stand to lose more here.

0

Sorry for the inconvenience. I kept the real data confidential as per the policies of my institute. Please understand; I apologize for this.

0

That's your output. Can you send the corresponding input, please?

It's also a fake example: rs123456 is not a real variant and APOH is on chr17. Could you also send us a real example, not a fake one? I can't check what you've put in against the databases unless it's actually real data.

0

It's OK, Ma'am. Thank you for your honest help; I respect your generosity. As a researcher, I also help many of my colleagues and juniors. I asked this question just to clarify how bioinformaticians describe the difference between mutation and variation. Please don't misunderstand me. Thanks a lot for your help.

0

Without a real sample of input I can only guess at what's causing the ref/ref problem. I suspect that your input file has the reference allele listed as the alternative. Annovar takes whatever you call the alt and calculates consequences for that, which means it's comparing its reference to the allele you've inputted, which is also the reference.

What I was going to do was look up the variant ID in a public database to see what alleles were listed. It seems likely that the variants where this has occurred are those where the reference allele is actually the minor allele, and particularly those where the alleles have flipped between GRCh38 and GRCh37. Without seeing a real example, I cannot confirm this. Perhaps you can do this analysis yourself and confirm it.
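
A first step in such a self-check could be to list the records in the converted VCF whose ALT is the same as REF. The sketch below assumes a plain tab-separated VCF; `find_ref_equals_alt` is a hypothetical helper and the records are made up for illustration.

```python
def find_ref_equals_alt(vcf_lines):
    """Return (chrom, pos, id, ref, alt) for every record whose ALT
    field contains the reference allele itself."""
    flagged = []
    for line in vcf_lines:
        # Skip blank lines and header lines starting with "#".
        if not line.strip() or line.startswith("#"):
            continue
        chrom, pos, var_id, ref, alt = line.rstrip("\n").split("\t")[:5]
        # ALT may list several comma-separated alleles.
        if ref.upper() in alt.upper().split(","):
            flagged.append((chrom, int(pos), var_id, ref, alt))
    return flagged


# Made-up records: only the second has ALT equal to REF.
records = [
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "1\t1000\tsite1\tA\tG\t.\t.\t.",
    "1\t2000\tsite2\tG\tG\t.\t.\t.",
    "1\t3000\tsite3\tC\t.\t.\t.\t.",
]
for hit in find_ref_equals_alt(records):
    print(hit)
```

The IDs of the flagged records can then be looked up in a public database to compare the listed alleles with what was written to the VCF.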

It would have made this a lot easier for everyone if you had told us from the start that you weren't allowed to share an example.

0

Thanks for your kind consideration. I will perform the analysis as you have suggested. My query was just to confirm the difference between variation and mutation. I also have some more questions :) which I can ask if you are kind enough. Here is one of them.

Should we use the mutation thresholds of PolyPhen, SIFT, MutationTaster and CADD for microarray data, or are these only for WES/WGS/TES?

0

New questions should go in new posts.
