Why is it important to control ethnicity when studying rare variants?
8.2 years ago
MAPK ★ 2.1k

I am trying to understand which factors can lead to the erroneous identification of a high mutation burden when ethnicity is not controlled for, and how this should be addressed. What other QC metrics need to be considered when studying driver mutations in cancer from next-generation sequencing data?

association study driver mutation • 2.0k views
8.2 years ago
DG 7.3k

Really this is two questions, as ethnicity in rare variant analysis and the study of driver mutations in cancer are quite different from one another. Ethnicity and rare variant analysis is a very broad topic that does, of course, affect cancer driver mutation studies, as it affects all of genomics. In your case, for burden testing, the main thing is that we filter out common polymorphisms, and less common but not 'rare' variants, from our genomic dataset. This is primarily to remove germline variants when we didn't sequence a matched normal sample along with the tumour. Even when we did do matched normal sequencing, we would be suspicious of a putative somatic mutation that is a known polymorphism in normal control populations. Either it is a germline variant that should have been called in the normal tissue but wasn't (and is therefore a false positive as a somatic variant), or it is a genuinely somatically acquired mutation that, given its status as a polymorphism in normal populations, is unlikely to be pathogenic and is just one of the many passenger mutations tumour cells acquire. Either way, it is extremely unlikely to be a driver mutation.

In cancer genomics, and even when studying Mendelian diseases and the like, we always try to have a control population that is representative of the population we are studying, but we also typically filter for rare mutations by comparing against all studied populations, with some caveats at times. Typically, if we see a mutation that looks rare or novel in our population but is common in another ethnic population, we would still consider it 'not rare' and filter it from our results, as it is unlikely to be pathogenic. When you get into studying common diseases, highly polygenic diseases, etc., ethnic matching and allele frequency control become much more involved.
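To make the "rare in our population but common elsewhere" rule concrete, here is a minimal sketch of filtering on the maximum allele frequency across populations. The variant records, the population labels, and the 0.001 cutoff are all made up for illustration; they are not taken from any particular database or pipeline, and the right threshold depends on your study.

```python
from typing import Dict, List

# Hypothetical cutoff: a variant is kept only if it is below this
# frequency in EVERY reference population (an assumption, not a standard).
RARE_AF_THRESHOLD = 0.001


def max_population_af(pop_afs: Dict[str, float]) -> float:
    """Return the highest allele frequency seen in any population.

    A variant absent from all reference populations (empty dict) is
    treated as frequency 0.0, i.e. novel.
    """
    return max(pop_afs.values(), default=0.0)


def filter_rare(variants: List[dict], threshold: float = RARE_AF_THRESHOLD) -> List[dict]:
    """Keep variants whose maximum population frequency is below the threshold."""
    return [v for v in variants if max_population_af(v["pop_af"]) < threshold]


# Toy records with made-up population labels and frequencies.
variants = [
    {"id": "var_A", "pop_af": {"pop1": 0.0, "pop2": 0.0002}},   # rare everywhere
    {"id": "var_B", "pop_af": {"pop1": 0.0001, "pop2": 0.05}},  # common in pop2
    {"id": "var_C", "pop_af": {}},                              # not seen in any population
]

rare = filter_rare(variants)
rare_ids = [v["id"] for v in rare]
print(rare_ids)
```

Here `var_B` is dropped even though it looks rare in `pop1`, because it is common in `pop2`.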


Thank you so much, Dan. This is exactly what I needed to know. Could you please elaborate on how this burden test differs from GWAS, which looks for rare variants with small effects in germline samples?


Burden tests are typically done on a gene-wise basis: you look for genes with a higher proportion of somatic mutations than you would expect by chance.
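As a rough illustration of "more than you would expect by chance", here is a minimal sketch that compares the observed mutation count in a gene with a Poisson expectation. It assumes a single, uniform background mutation rate per base per sample, which is a strong simplification; published methods model gene-specific background rates and other covariates. The function names and all numbers below are hypothetical.

```python
import math


def poisson_upper_tail(observed: int, expected: float) -> float:
    """Return P(X >= observed) for X ~ Poisson(expected)."""
    below = sum(
        math.exp(-expected) * expected**i / math.factorial(i)
        for i in range(observed)
    )
    # Guard against tiny negative values caused by floating-point rounding.
    return max(0.0, 1.0 - below)


def gene_burden_p(observed: int, gene_length_bp: int, n_samples: int,
                  background_rate: float) -> float:
    """One-sided p-value for seeing at least `observed` somatic mutations
    in a gene, under a uniform background rate (an assumption)."""
    expected = gene_length_bp * n_samples * background_rate
    return poisson_upper_tail(observed, expected)


# Hypothetical inputs: a 2,000 bp gene, 100 tumours, and a made-up
# background rate of 1e-6 mutations per base per sample (0.2 expected).
p_quiet = gene_burden_p(1, 2000, 100, 1e-6)  # one mutation observed
p_hit = gene_burden_p(8, 2000, 100, 1e-6)    # eight mutations observed

print(p_quiet, p_hit)
```

In a real analysis you would run this per gene and correct for multiple testing across all genes tested.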

I would recommend reading as many papers as possible to see what sorts of QC and methods they are doing. The best way to learn these things is to immerse yourself in the literature and see what the current standards are in the field. Are you working with tumour-only sequencing data?


Thank you, Dan.

8.2 years ago

MAPK, if you have not already read this Nature Genetics article, perhaps it will be of use.

Differential confounding of rare and common variants in spatially structured populations
