Hi! I have not worked much with eQTLs yet, so I am not sure if my question makes sense. Based on the eQTL variant composition of a genome, is it possible to at least roughly estimate the potential expression of a gene in a sample? Thanks!
Well, it seems some people at Google were able to do it (disclaimer: I didn't read more than the abstract):
Effective gene expression prediction from sequence by integrating long-range interactions
Žiga Avsec, Vikram Agarwal, Daniel Visentin, Joseph R. Ledsam, Agnieszka Grabska-Barwinska, Kyle R. Taylor, Yannis Assael, John Jumper, Pushmeet Kohli, David R. Kelley
https://doi.org/10.1101/2021.04.07.438649
Note, as this is on bioRxiv: this article is a preprint and has not been certified by peer review.
Edit:
Another paper on the subject:
Predicting mRNA Abundance Directly from Genomic Sequence Using Deep Convolutional Neural Networks
Vikram Agarwal, Jay Shendure
https://doi.org/10.1016/j.celrep.2020.107663
An example of how population stratification influences the prediction accuracy:
On the cross-population generalizability of gene expression prediction models
Keys KL, Mak ACY, White MJ, Eckalbar WL, Dahl AW, et al. (2020) On the cross-population generalizability of gene expression prediction models. PLOS Genetics 16(8): e1008927.
It's been a while since I worked in the field, so here are just a few rough thoughts on that.
Your eQTLs might have varying influence on the expression level, and some variance may not be readily explained. For example, locus A might be associated with 50% of the variation in expression of gene XYZ, while loci B and C account for no more than 10% each, and the remaining 30% can't be explained (or is associated with many more small-effect eQTLs). As long as you don't know their influence upfront, you can assume equal influence for every locus, with all the risks associated with that.
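To make that a bit more concrete, here is a minimal sketch of a purely additive model, assuming expression is a baseline plus the sum of per-locus effects times the alternate-allele dosage (0, 1 or 2). The function name, the locus labels and all effect sizes are made up for illustration, and the unexplained part of the variance is simply ignored.

```python
def predict_expression(dosages, effects=None, baseline=0.0):
    """Additive sketch: baseline + sum(effect * alternate-allele dosage)."""
    if effects is None:
        # Effect sizes unknown: fall back to equal influence for every locus.
        effects = {locus: 1.0 for locus in dosages}
    return baseline + sum(effects[locus] * dosage
                          for locus, dosage in dosages.items())


# Hypothetical effect sizes: locus A dominates, B and C are minor.
effects = {"A": 0.5, "B": 0.25, "C": 0.25}

# Alternate-allele dosage per locus for one (made-up) sample.
sample = {"A": 2, "B": 0, "C": 1}

weighted = predict_expression(sample, effects)  # uses the assumed effect sizes
equal = predict_expression(sample)              # equal-influence fallback

print(weighted, equal)
```

The two numbers differ quite a bit for the same genotype, which is the risk of assuming equal influence when the real effect sizes are uneven.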
Also, you would be deliberately ignoring the influence of external factors, assuming similar stratification in both your study and sample populations, and assuming you know all or many of the existing eQTLs.
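One way to see why stratification matters: under an additive model and Hardy-Weinberg proportions (both assumptions on my side), a locus with effect size beta and alternate-allele frequency p contributes 2p(1-p) * beta^2 to the expression variance. The same eQTL can therefore explain a different share of the variance in a population where the allele frequency differs. The frequencies and effect size below are invented for illustration.

```python
def variance_explained(beta, alt_freq):
    """Variance contributed by one additive locus under Hardy-Weinberg:
    Var(dosage) * beta^2 = 2p(1-p) * beta^2."""
    return 2 * alt_freq * (1 - alt_freq) * beta ** 2


# Same hypothetical effect size, different allele frequencies per population.
study = variance_explained(beta=1.0, alt_freq=0.5)    # study population
target = variance_explained(beta=1.0, alt_freq=0.25)  # sample's population

print(study, target)
```

The locus contributes less variance where the allele is rarer, so weights tuned in one population will not necessarily carry over to another.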
It's a bit of a long shot, but not impossible. If you're aware of the risks, lean toward the optimistic side, and can find someone to fund it, you could give it a try.