Dear all,
A few companies have reduced the cost of personalized genome sequencing using NGS to promote its adoption by retail consumers - examples would be 30X shotgun human genome sequencing by Nebula Genomics for $300 or by Dante Labs for $600. These are only 2 companies from a much longer (though slightly outdated) list at George Church's lab website. A few others, like Veritas Genetics, seem to have pivoted at least temporarily to COVID-19 testing, away from their MyGenome offering.
Anyway, I am seriously considering going down this path. My goal is to identify and interpret clinically relevant variants in my own genome.
This Nebula Genomics link says, and I quote:
30x Whole Genome Sequencing We decode 100% of your DNA at 30x coverage using next-generation DNA sequencing technology (150bp paired-end reads), reconstruct your genome (using hg38 assembly) and identify all genetic variants. You get full access to all your DNA data including FASTQ, BAM and VCF files (> 100GB) which you can download anytime.
Before taking the plunge, I want to fully understand the strengths and limitations of these data, and whether and how I may add value through my own analyses. Hence, I seek your suggestions, specifically to learn about any of your experiences with personalized genomics projects, in the context of my questions / comments below:
- Have any of you used any of these companies or similar ones, and how has your experience been?
- Have you donated your DNA samples to the NIH's All of Us program instead, and benefitted in any way?
- Is 30X coverage sufficient to identify most if not all clinically relevant SNPs?
- Since some service providers return FASTQ, BAM and VCF files, I might want to re-run alignment, annotation and variant identification against a South Asian / Indian reference genome. In that case, what resources are available apart from the relatively small studies described here - link1, link2 and link3?
- Would the re-analysis in the previous point even be necessary, or would (nearly) all clinically relevant variants already have been identified in my genome as reconstructed by Nebula using hg38 as the reference (it is not going to be a de novo assembly)?
- To convert VCF info to clinically relevant findings, what open-source resources are available?
- Of course, any of my data storage and analyses will need to be performed in a manner that ensures privacy. So if you have thoughts on how to achieve that, please share them.
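To make the 30X coverage question concrete, here is a minimal back-of-the-envelope sketch of how I currently think about it. It assumes read depth follows a simple Poisson distribution, which I know is optimistic (real coverage is less uniform because of GC bias, repeats and mappability), and the function names and the read count in the usage note are my own illustrations, not anything provided by Nebula or Dante.

```python
import math


def mean_depth(n_reads, read_length, genome_size):
    """Average depth = total sequenced bases / genome size."""
    return n_reads * read_length / genome_size


def poisson_cdf(k, mean):
    """P(X <= k) for X ~ Poisson(mean)."""
    return sum(
        math.exp(-mean) * mean**i / math.factorial(i) for i in range(k + 1)
    )


def fraction_covered_at_least(min_depth, depth):
    """Expected fraction of bases with depth >= min_depth.

    Assumes idealized Poisson-distributed coverage; treat the result
    as an upper bound on what real data would show.
    """
    if min_depth <= 0:
        return 1.0
    return 1.0 - poisson_cdf(min_depth - 1, depth)
```

For example, `mean_depth(620_000_000, 150, 3_100_000_000)` gives 30.0 for a hypothetical run of 620 million 150 bp reads over a roughly 3.1 Gb genome, and `fraction_covered_at_least(10, 30)` estimates the share of positions with at least 10 reads under this idealized model. What I cannot judge is how far real 30X data departs from this, especially in clinically important but hard-to-map regions.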
Examples of some links that I've started to explore include:
- https://www.personalgenomes.org/
- https://genomelink.io/
- https://hudsonalpha.org/kit-development/
- http://igvbrowser.igib.res.in/cgi-bin/gb2/gbrowse/newigvdb/
- Gene lists under these links - Veritas, Centogene, Prevention Genetics
Because I am trained in plant and microbial genomics and bioinformatics, I'd like to think I can execute most if not all of your suggested pipelines, if not on my laptop, then on a Galaxy server or a paid Amazon cloud account. But I will not assume I can climb the steep learning curve of human clinical genetics without help from subject matter experts in the forum. So, thank you very much, in advance.
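To illustrate the kind of VCF-to-clinical-findings step I have in mind, here is a rough sketch of a naive lookup of my variants against a clinical significance VCF. I am assuming a ClinVar-style file whose INFO column carries a `CLNSIG` tag and whose chromosomes are named without the `chr` prefix. All function names are hypothetical, and exact matching on (chrom, pos, ref, alt) will miss indels that are represented differently in the two files, so please treat this as a statement of intent and not a working pipeline.

```python
def split_record(line):
    """Split one tab-delimited VCF data line into its columns."""
    return line.rstrip("\n").split("\t")


def normalize_chrom(chrom):
    """Strip a leading 'chr' so 'chr1' and '1' compare equal."""
    return chrom[3:] if chrom.startswith("chr") else chrom


def parse_info(info):
    """Turn 'A=1;B=2;FLAG' into {'A': '1', 'B': '2'} (flags are skipped)."""
    pairs = (item.split("=", 1) for item in info.split(";") if "=" in item)
    return {key: value for key, value in pairs}


def variant_keys(fields):
    """Yield one (chrom, pos, ref, alt) key per ALT allele."""
    chrom, pos, _id, ref, alts = fields[:5]
    for alt in alts.split(","):
        yield (normalize_chrom(chrom), int(pos), ref, alt)


def is_data_line(line):
    """True for non-empty lines that are not VCF headers."""
    return bool(line.strip()) and not line.startswith("#")


def index_clinical(lines):
    """Map each variant key to its CLNSIG value (assumed INFO tag)."""
    index = {}
    for line in filter(is_data_line, lines):
        fields = split_record(line)
        significance = parse_info(fields[7]).get("CLNSIG", "not_provided")
        for key in variant_keys(fields):
            index[key] = significance
    return index


def annotate(personal_lines, clinical_index):
    """Return (key, significance) for personal variants found in the index."""
    hits = []
    for line in filter(is_data_line, personal_lines):
        for key in variant_keys(split_record(line)):
            if key in clinical_index:
                hits.append((key, clinical_index[key]))
    return hits
```

In practice I would expect to pass open file handles, e.g. `annotate(open("my.vcf"), index_clinical(open("clinical.vcf")))` with hypothetical, uncompressed file names. If established open-source tools already do this properly, including variant normalization, those are exactly the pointers I am hoping for.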
-- K.Shan