Off topic: Differences in covered regions in exome and whole genome sequencing
5.8 years ago
miaowzai ▴ 390

I have a VCF dataset from whole exome sequencing of a cohort of people. I am considering taking some individuals from the 1000 Genomes data and adding them to my data so that I have a larger cohort.

To keep the variant loci consistent, I subset the 1000 Genomes data to the variant positions in my exome VCF.
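A minimal sketch of this kind of position-based subsetting, assuming plain-text VCF lines and matching on chromosome and position only (not on REF/ALT alleles). The helper names `normalize_chrom`, `site_key`, `load_sites`, and `subset_by_sites` are hypothetical, and stripping a leading `chr` is an assumption meant to guard against hg19-style (`chr1`) versus b37-style (`1`) contig names; it is not necessarily how the subsetting was actually done here.

```python
def normalize_chrom(chrom):
    """Strip a leading 'chr' so 'chr1' and '1' compare equal (assumption)."""
    return chrom[3:] if chrom.startswith("chr") else chrom


def site_key(line):
    """Return (chromosome, position) for one VCF record line."""
    chrom, pos = line.split("\t", 2)[:2]
    return normalize_chrom(chrom), int(pos)


def load_sites(vcf_lines):
    """Collect the set of variant sites from VCF lines, skipping headers."""
    return {
        site_key(line)
        for line in vcf_lines
        if line.strip() and not line.startswith("#")
    }


def subset_by_sites(vcf_lines, sites):
    """Yield header lines plus the records whose site is in `sites`."""
    for line in vcf_lines:
        if line.startswith("#"):
            yield line
        elif line.strip() and site_key(line) in sites:
            yield line


# Toy data, invented for illustration only.
exome = [
    "##fileformat=VCFv4.1",
    "#CHROM\tPOS\tID\tREF\tALT",
    "chr1\t100\t.\tA\tG",
    "chr1\t200\t.\tC\tT",
]
kg = [
    "##fileformat=VCFv4.1",
    "#CHROM\tPOS\tID\tREF\tALT",
    "1\t100\t.\tA\tG",
    "1\t150\t.\tG\tA",
]

subset = list(subset_by_sites(kg, load_sites(exome)))
```

In this toy case only the record at position 100 survives; the exome site at position 200 has no counterpart in the second file.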

Since the 1000 Genomes data came from whole genome sequencing, I assumed it would cover all variant loci in my exome VCF. But when I checked the resulting file, I found that many variant loci (roughly 40-50% of all variant loci in the exome VCF) are present in the exome VCF but absent from the 1000 Genomes VCF. (Both datasets are hg19 or b37.)
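The check described above can be sketched as a set difference over variant sites. This is an illustration only: `missing_sites` is a hypothetical helper, sites are assumed to be `(chromosome, position)` tuples with consistent contig naming, and the toy values below are invented rather than taken from the real data.

```python
def missing_sites(exome_sites, wgs_sites):
    """Return (sorted missing sites, fraction of exome sites that are missing)."""
    exome_sites = set(exome_sites)
    if not exome_sites:
        return [], 0.0
    missing = exome_sites - set(wgs_sites)
    return sorted(missing), len(missing) / len(exome_sites)


# Toy site sets, invented for illustration only.
exome_sites = {("1", 100), ("1", 200), ("2", 300), ("2", 400)}
wgs_sites = {("1", 100), ("2", 300), ("3", 500)}

missing, fraction = missing_sites(exome_sites, wgs_sites)
print(f"{len(missing)} of {len(exome_sites)} exome sites missing ({fraction:.0%})")
```

Listing the missing sites, not just their fraction, makes it possible to inspect whether they share some property, such as being indels or sitting on particular contigs.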

What are the possible reasons for this?

Is it because the 1000 Genomes whole genome sequencing does not have enough coverage to call all possible variants? Are there other reasons? Thanks!

exome-sequencing WES genome-sequencing WGS