Individual or population level variant calling?
9.3 years ago
Floydian_slip ▴ 170

Hi,

I have a non-technical question about SNP variant calling: what is the advantage of population-level variant calling (samples from all sub-species called together) over individual-level calling (one sample at a time) with GATK? What additional information does the former provide?

Also, I have to do this for ~3000 samples. Is it feasible to run population-level variant calling on all 3000 at once with GATK? Will it take 3000 times longer than a single sample?

Thanks in advance,

SNP population gatk variant-calling • 3.0k views

See: Variation & Genotype Calling From Ngs Data - Per Sample Or Multi Sample?. The benefit depends on the per-sample depth.

9.3 years ago
bcosc ▴ 20

I think that GATK alone will not solve the problem of running 3000 samples. You can scatter-gather within each sample, which reduces the run time per sample. Ideally, you would run this on a cluster, a supercomputer allocation, or AWS, so that you can run as many jobs in parallel as you have nodes. For example, with 3000 nodes you could run GATK on one sample per node, and the whole batch would take only as long as the slowest sample.
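To make the timing argument concrete, here is a rough sketch of how wall-clock time scales with the number of nodes. It assumes per-sample run times are known in advance and that each sample goes to whichever node frees up first; `estimate_wall_time` and the hours used below are made up for illustration, not measured GATK timings.

```python
import heapq


def estimate_wall_time(sample_hours, n_nodes):
    """Estimate total wall-clock hours when samples are spread over n_nodes.

    Greedy longest-first scheduling: each sample is assigned to the node
    that becomes free earliest. This is an approximation, not an optimal
    schedule.
    """
    if n_nodes < 1:
        raise ValueError("need at least one node")
    if not sample_hours:
        return 0.0

    # Each heap entry is the time at which a node becomes free.
    nodes = [0.0] * min(n_nodes, len(sample_hours))
    heapq.heapify(nodes)

    for hours in sorted(sample_hours, reverse=True):
        free_at = heapq.heappop(nodes)
        heapq.heappush(nodes, free_at + hours)

    return max(nodes)


# Hypothetical per-sample run times in hours.
hours = [5.0, 3.0, 4.0]
print(estimate_wall_time(hours, 1))  # serial: sum of all samples
print(estimate_wall_time(hours, 3))  # one node per sample: the slowest sample
```

With one node per sample the estimate collapses to the longest single run, which is the point made above; with fewer nodes it falls somewhere between that and the serial total.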

http://arvados.org is solving this by creating a completely open-source platform for managing and running computational pipelines (our code can be found here). With a few clicks, you can run your 3000 samples and track provenance for each one.

9.3 years ago

Have a look at using HaplotypeCaller in GVCF mode (run on each sample separately) followed by GenotypeGVCFs (joint genotyping across all samples). I think this relatively new workflow was designed for exactly the situation you describe.
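A minimal sketch of what that two-stage workflow could look like, written as a Python helper that only builds the command lines. Several things here are assumptions rather than part of the original suggestion: it uses GATK4-style syntax (the `gatk` launcher, `-ERC GVCF`), it merges the per-sample GVCFs with CombineGVCFs before GenotypeGVCFs, and all file names are placeholders. Older GATK releases use a different invocation, so check the documentation for your version.

```python
import shlex


def haplotypecaller_gvcf_cmd(reference, bam, out_gvcf):
    """Per-sample step: call one BAM in GVCF mode."""
    return ["gatk", "HaplotypeCaller",
            "-R", reference, "-I", bam,
            "-O", out_gvcf, "-ERC", "GVCF"]


def combine_gvcfs_cmd(reference, gvcfs, out_gvcf):
    """Merge per-sample GVCFs into one cohort GVCF (assumed intermediate step)."""
    cmd = ["gatk", "CombineGVCFs", "-R", reference]
    for gvcf in gvcfs:
        cmd += ["-V", gvcf]
    return cmd + ["-O", out_gvcf]


def genotype_gvcfs_cmd(reference, cohort_gvcf, out_vcf):
    """Joint step: genotype all samples together from the cohort GVCF."""
    return ["gatk", "GenotypeGVCFs",
            "-R", reference, "-V", cohort_gvcf, "-O", out_vcf]


# Placeholder inputs; replace with real paths.
reference = "ref.fasta"
samples = ["s1", "s2", "s3"]
gvcfs = [f"{s}.g.vcf.gz" for s in samples]

# Stage 1: independent per-sample jobs (these can run in parallel).
for sample, gvcf in zip(samples, gvcfs):
    print(shlex.join(haplotypecaller_gvcf_cmd(reference, f"{sample}.bam", gvcf)))

# Stage 2: one joint job over the merged GVCF.
print(shlex.join(combine_gvcfs_cmd(reference, gvcfs, "cohort.g.vcf.gz")))
print(shlex.join(genotype_gvcfs_cmd(reference, "cohort.g.vcf.gz", "cohort.vcf.gz")))
```

The expensive per-sample calling is done once per sample and parallelises trivially, while the joint genotyping step is what uses information from all samples at once. Each list can be passed to `subprocess.run` or written into a cluster job script.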


Only works for high-coverage samples, I believe.


Good point.
