How To Calculate Genetic Heterogeneity From Genotype Data - How Useful Is This Measure?
13.2 years ago

After providing answers to over 100 questions here, I now have one of my own. Actually, this is a two-part question. What tool(s) do you use to calculate genetic heterogeneity from SNP genotype data collected across an entire chromosome or genome? If the measure of heterogeneity is at or near zero, then the individual (human, animal, plant) is a product of inbreeding. This number will rise as the parents come from increasingly divergent genetic backgrounds.

That then brings up the second part of the question. For those of you who have looked into such measures of genetic diversity or heterogeneity, how useful is this and what kinds of values can I expect from the human genome-wide SNP genotypes I have? A preliminary and crude analysis gave me 69% of SNPs across chromosome 7 as homozygous, but that value rises to 92+% across two small HLA loci. That seems interesting but I don't know where to go with this.
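For readers who want to reproduce a crude per-region figure like the 69% and 92+% above, here is a minimal sketch. The record layout `(chromosome, position, genotype)`, the two-letter genotype strings, and the `--` no-call code are assumptions made for illustration; they are not taken from any particular genotyping platform or from the poster's data.

```python
from typing import Iterable, Optional, Tuple

# Hypothetical record layout: (chromosome, position, genotype).
# Genotypes are two-letter calls such as "AG" or "CC"; "--" marks a no-call.
Call = Tuple[str, int, str]


def homozygous_fraction(
    calls: Iterable[Call],
    chrom: str,
    start: Optional[int] = None,
    end: Optional[int] = None,
) -> Optional[float]:
    """Fraction of called SNPs in chrom:start-end that are homozygous."""
    hom = 0
    total = 0
    for c, pos, gt in calls:
        if c != chrom:
            continue
        if start is not None and pos < start:
            continue
        if end is not None and pos > end:
            continue
        if len(gt) != 2 or "-" in gt:
            continue  # skip no-calls and malformed genotypes
        total += 1
        if gt[0] == gt[1]:
            hom += 1
    return hom / total if total else None


# Toy data, invented for the example.
example_calls = [
    ("7", 100, "AA"),
    ("7", 200, "AG"),
    ("7", 300, "CC"),
    ("7", 400, "--"),
    ("6", 50, "TT"),
]
whole_chr7 = homozygous_fraction(example_calls, "7")
sub_region = homozygous_fraction(example_calls, "7", start=250, end=500)
```

A raw fraction like this does not account for allele frequencies, so regions whose SNPs have low minor allele frequencies will look more homozygous regardless of ancestry.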

Thanks in advance for any insight or advice.

genetics snp genotyping • 10k views

I know that without any doubt. My feeling is the MHC will have high homogeneity. I'm interested in any tools that can do the calculations across any range of input SNPs, provided those are in genome order, and I'm curious about others' experiences with these calculations. Thanks, Al.


Hi Larry, I am not sure that comparing the HLA loci to the genome as a whole is a fair comparison. It would be more interesting to compare them to other loci in the MHC, which are likely to have undergone similar historic selection.


I think this level of homozygosity is quite odd in a population sense. Are you using the entire HapMap? How many haplotype blocks?


I have genotype data for an individual across the entire genome. So, I could look at heterozygosity vs homozygosity (or rates of heterogeneity) for that individual across a chromosome or gene region or region of any size.
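Scanning "a region of any size" for a single individual can be sketched as a sliding window over SNPs in genome order. The function name, the window defined by SNP count (rather than base pairs), and the toy inputs are all assumptions for illustration; no-calls are assumed to have been removed beforehand.

```python
from typing import List, Sequence, Tuple


def windowed_heterozygosity(
    positions: Sequence[int],
    is_het: Sequence[bool],
    window: int = 50,
    step: int = 10,
) -> List[Tuple[int, int, float]]:
    """Heterozygosity rate in sliding windows of `window` consecutive SNPs.

    `positions` must be sorted along one chromosome and parallel to `is_het`.
    Returns one (first_position, last_position, het_rate) tuple per window.
    """
    if len(positions) != len(is_het):
        raise ValueError("positions and is_het must have the same length")
    results = []
    for i in range(0, len(positions) - window + 1, step):
        chunk = is_het[i : i + window]
        rate = sum(chunk) / window
        results.append((positions[i], positions[i + window - 1], rate))
    return results


# Toy data, invented for the example.
toy_positions = [10, 20, 30, 40]
toy_is_het = [True, False, False, True]
toy_windows = windowed_heterozygosity(toy_positions, toy_is_het, window=2, step=1)
```

Long runs of windows with a rate at or near zero are the kind of signal one would look at when asking whether a region is unusually homozygous, subject to the allele-frequency caveats raised elsewhere in this thread.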


As a follow-up question: are there ways to also quantify heterogeneity from RNA-seq (or transcriptomics) data?

13.2 years ago
Jan Oosting ▴ 920

For each SNP, you will have to correct the observed homozygosity for the level of homozygosity expected within the population of interest. When I have few samples I use the HapMap frequencies for that, but with many samples it is probably better to calculate the population frequencies for each allele from your own data.

For several chips I've noticed that the minor allele frequencies for the HLA region SNPs are quite low. This will give a high rate of homozygous SNPs if you do not correct for that.

I have an R script that takes population frequencies into account, but I will have to polish it up a bit before I can post it here.
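One common way to make the correction described in this answer is a method-of-moments inbreeding coefficient: compare the observed number of homozygous SNPs with the number expected under Hardy-Weinberg proportions, given a population allele frequency for each SNP. The sketch below is not the R script mentioned above; the function name and the toy frequencies are assumptions for illustration.

```python
from typing import Optional, Sequence


def inbreeding_f(is_hom: Sequence[bool], allele_freqs: Sequence[float]) -> Optional[float]:
    """Estimate F = (O - E) / (N - E) for one individual.

    is_hom[i]       : True if the individual is homozygous at SNP i
    allele_freqs[i] : population frequency p of one allele at SNP i
                      (e.g. from a reference panel or from your own samples)
    O = observed homozygous count, N = number of SNPs,
    E = sum over SNPs of the Hardy-Weinberg expectation p^2 + (1 - p)^2.
    """
    if len(is_hom) != len(allele_freqs):
        raise ValueError("is_hom and allele_freqs must have the same length")
    n = len(is_hom)
    observed = sum(is_hom)
    expected = sum(p * p + (1.0 - p) * (1.0 - p) for p in allele_freqs)
    if n == expected:  # every SNP is monomorphic in the population
        return None
    return (observed - expected) / (n - expected)


# Same raw homozygosity (3 of 4 SNPs), two different frequency backgrounds.
calls = [True, True, True, False]
f_common = inbreeding_f(calls, [0.5, 0.5, 0.5, 0.5])   # common alleles
f_rare = inbreeding_f(calls, [0.05, 0.05, 0.05, 0.05])  # low minor allele frequency
```

With common alleles, 75% homozygosity is an excess over expectation and gives a positive F; with low minor allele frequencies, the same 75% is below expectation and gives a negative F. That is why an uncorrected homozygosity rate in a low-frequency region can be misleading.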


Thanks, Jan, for your comments and insight. If you wish to share your script, please contact me as I would be curious to give it a try.

13.2 years ago

Well,

As I understand it, genetic heterogeneity is a population-level measure. For haplotype imputation, I favor BEAGLE. I think that getting good and sufficient data is the hard part of the business.

Honestly, I don't know of a tool that can really calculate genetic diversity/heterogeneity in a population genetics sense. Only R has useful packages/tools (DEMEtics, popgen, genetics, pegas), and even those must be hacked most of the time to accept SNP data. So I normally develop my own approach based on the ideas in this paper. Nevertheless, there are a lot of problems with such analyses. The effective number of genes per locus is highly variable across a chromosome/genome, and the discrepancy is even greater between regions with quite different recombination rates. Low diversity could simply reflect insufficient population sampling or biased haplotype reconstruction.

In addition, biased gene conversion and/or genetic hitchhiking could give you the same impression. Hence, low diversity could be the result of excess recombination in the presence of homology, selection at linked loci, or low effective population size at that locus. You cannot distinguish between them without a linkage map or similar.
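As a point of reference for the population-level measure discussed in this answer, here is a minimal sketch of per-SNP gene diversity (expected heterozygosity) with the usual small-sample correction, alongside observed heterozygosity. It is a textbook estimator, not the approach from the linked paper, and the function names and two-letter genotype strings are assumptions for illustration.

```python
from collections import Counter
from typing import Optional, Sequence


def gene_diversity(genotypes: Sequence[str]) -> Optional[float]:
    """Unbiased expected heterozygosity at one SNP.

    genotypes: diploid calls such as "AA", "AG", "GG" (no missing data).
    Computes (m / (m - 1)) * (1 - sum of squared allele frequencies),
    where m is the number of sampled allele copies (2 per individual).
    """
    alleles = Counter()
    for gt in genotypes:
        alleles.update(gt)  # counts each of the two allele characters
    m = sum(alleles.values())
    if m < 2:
        return None
    sum_sq = sum((count / m) ** 2 for count in alleles.values())
    return m / (m - 1) * (1.0 - sum_sq)


def observed_heterozygosity(genotypes: Sequence[str]) -> Optional[float]:
    """Fraction of individuals carrying two different alleles at one SNP."""
    if not genotypes:
        return None
    return sum(gt[0] != gt[1] for gt in genotypes) / len(genotypes)


# Toy sample of four individuals, invented for the example.
sample = ["AA", "AG", "GG", "AG"]
h_exp = gene_diversity(sample)
h_obs = observed_heterozygosity(sample)
```

Averaging such per-SNP values over a region gives a diversity estimate, but the caveats in this answer still apply: with few sampled individuals or biased haplotype reconstruction, a low value may say more about the sampling than about the locus.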


Hi,

Is there currently any software or tool that can calculate genetic heterogeneity from genotype or SNP data?

Hope to hear from you


Thank you for your comments and link to the paper by Lynch (Estimation of Nucleotide Diversity, Disequilibrium Coefficients, and Mutation Rates from High-Coverage Genome-Sequencing Projects). I will definitely have to read that.
