My VCF file has SNPs for several populations (Africa, America, Europe, East Asia and South Asia). I want to extract the data for Europe and East Asia together. Please let me know the possible ways.
Thanks in advance.
You can do this easily using vcftools, GATK tools, plinkseq etc.
You first have to generate a text file listing the samples (one sample ID per line) that belong to the populations of your choice, let's say "population_of_interest.txt". Then use either of the following:
vcf-subset -e -c population_of_interest.txt input.vcf > output.vcf
vcftools --vcf input.vcf --keep population_of_interest.txt --recode --stdout > output.vcf
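If you don't have the sample list yet, here is a rough sketch of one way to build it. It assumes you have a tab-delimited panel file with a header line, the sample ID in column 1 and a population code in column 3 (the layout, the file name and the codes EUR/EAS are my assumptions, so adjust them to your data). The function name make_sample_list is just made up for this example.

```shell
# Hypothetical helper: print the sample IDs that belong to the given populations.
# Assumed panel layout: tab-delimited, one header line,
# column 1 = sample ID, column 3 = population code.
make_sample_list() {
    panel="$1"      # path to the panel file (assumed name, e.g. samples.panel)
    shift
    codes="$*"      # remaining arguments: population codes to keep, e.g. EUR EAS
    awk -F'\t' -v codes="$codes" '
        BEGIN {
            # Turn the space-separated codes into a lookup table
            n = split(codes, c, " ")
            for (i = 1; i <= n; i++) keep[c[i]] = 1
        }
        # Skip the header line, print the sample ID when the code matches
        NR > 1 && ($3 in keep) { print $1 }
    ' "$panel"
}
```

You would then call it as: make_sample_list samples.panel EUR EAS > population_of_interest.txt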
Thanks a ton Nandini ... it works :)
This code works fine when I run it for one chromosome at a time. But I want to extract SNPs for all chromosomes together. Please let me know if there is any other option.
It should work for all chromosomes. Does your VCF input file have all chromosomes?
@Nandini .. I have a VCF file for each chromosome separately
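With one VCF per chromosome, a simple option is to loop over the files and apply the same --keep filter to each. A minimal sketch, assuming the files are named chr1.vcf, chr2.vcf, ... in one directory (the naming pattern and the function name subset_each_chromosome are only for illustration):

```shell
# Hypothetical helper: run the same sample filter on every per-chromosome VCF.
# Assumes input files match chr*.vcf inside in_dir (naming is an assumption).
subset_each_chromosome() {
    keep_file="$1"   # sample list, one ID per line
    in_dir="$2"      # directory holding the per-chromosome VCFs
    out_dir="$3"     # directory for the filtered files
    mkdir -p "$out_dir"
    for vcf in "$in_dir"/chr*.vcf; do
        [ -e "$vcf" ] || continue            # nothing matched the pattern
        name=$(basename "$vcf" .vcf)
        # vcftools appends ".recode.vcf" to the --out prefix
        vcftools --vcf "$vcf" --keep "$keep_file" --recode --out "$out_dir/$name"
    done
}
```

For example: subset_each_chromosome population_of_interest.txt . subset
That should leave one filtered file per chromosome (subset/chr1.recode.vcf and so on). If you need a single file afterwards, the per-chromosome outputs can be joined with a concatenation tool such as vcf-concat or bcftools concat.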