Comparing VCF files between two groups (15 vcf files against 15 vcf files)
5.3 years ago
Pin.Bioinf ▴ 340

Hello,

I have 15 VCF files for one population and 15 VCF files for another. I want to check both the differences and the similarities between the two groups: what changes from one group to the other, what stays the same, and, if possible, a significance score.

I have read about PLINK, but I am not sure what the pipeline should look like. Which steps should I follow? I read the documentation and it is not clear to me.

I also read about bcftools isec, which is useful for intersecting multiple VCF files. Could I merge the 15 VCF files within each group, ending up with two files (population1_variants.vcf and population2_variants.vcf), and then compare those two against each other to check for differences and similarities?

Which approach is better? Is this the way people usually analyze variants among populations? How can I assess the significance of the results? Are there any other approaches?

Thank you

vcf SNP variants PLINK • 2.7k views
5.3 years ago

It really depends on what you want to achieve with this comparison. You could merge all the VCFs and run an association analysis between the two populations with plink to find differences between the groups, or you could run a PCA on all samples to see whether the two populations separate clearly.
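Before merging, it can help to see how much the variant sites of the two groups overlap at all. Here is a minimal sketch in plain Python, assuming uncompressed, tab-separated VCF text. The function names (`site_keys`, `group_sites`, `compare_groups`) are made up for illustration and are not part of bcftools or PLINK.

```python
def site_keys(vcf_lines):
    """Return the set of (CHROM, POS, REF, ALT) keys found in VCF text lines."""
    keys = set()
    for line in vcf_lines:
        # Skip blank lines, meta lines (##...) and the #CHROM header line.
        if not line.strip() or line.startswith("#"):
            continue
        chrom, pos, _id, ref, alt = line.rstrip("\n").split("\t")[:5]
        # Split multi-allelic records so each ALT allele is its own key.
        for allele in alt.split(","):
            keys.add((chrom, int(pos), ref, allele))
    return keys


def group_sites(per_sample_lines):
    """Union of the sites seen in any sample of one group."""
    sites = set()
    for lines in per_sample_lines:
        sites |= site_keys(lines)
    return sites


def compare_groups(group1, group2):
    """Split sites into shared, group-1-only and group-2-only sets."""
    s1, s2 = group_sites(group1), group_sites(group2)
    return {"shared": s1 & s2, "only_group1": s1 - s2, "only_group2": s2 - s1}


if __name__ == "__main__":
    # Tiny inline records standing in for real per-sample files.
    sample_a = ["#CHROM\tPOS\tID\tREF\tALT\n", "1\t100\t.\tA\tG\n", "1\t200\t.\tC\tT\n"]
    sample_b = ["#CHROM\tPOS\tID\tREF\tALT\n", "1\t100\t.\tA\tG\n", "2\t50\t.\tG\tA\n"]
    result = compare_groups([sample_a], [sample_b])
    for label, sites in result.items():
        print(label, sorted(sites))
```

Keep in mind that a site missing from a per-sample VCF may be homozygous reference or simply not called, so a presence/absence comparison like this is only a rough first look and not a significance test.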

Try doing an association analysis:

plink --file mydata --assoc

Look for SNPs with statistical significance between the two groups.

http://zzz.bwh.harvard.edu/plink/anal.shtml
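To make the statistic less of a black box: the basic case/control test behind `--assoc` is a 1-degree-of-freedom chi-square test on allele counts. The sketch below reproduces that idea for a single biallelic SNP in plain Python. It is an illustration under stated assumptions, not PLINK's code: the function names are hypothetical, no continuity correction is applied, and the genotypes are made-up example values.

```python
import math


def count_alleles(genotypes):
    """Count (alt, ref) alleles in GT strings such as '0/1' or '1|1'.

    Missing alleles ('.') are skipped.
    """
    alt = ref = 0
    for gt in genotypes:
        for allele in gt.replace("|", "/").split("/"):
            if allele == "0":
                ref += 1
            elif allele == "1":
                alt += 1
    return alt, ref


def allelic_chi2(case_alt, case_ref, ctrl_alt, ctrl_ref):
    """1-df chi-square test on a 2x2 table of allele counts."""
    a, b, c, d = case_alt, case_ref, ctrl_alt, ctrl_ref
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        return 0.0, 1.0  # monomorphic site or an empty group: nothing to test
    chi2 = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # Survival function of the chi-square distribution with 1 df.
    p = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p


if __name__ == "__main__":
    # Made-up genotypes for one SNP, 15 samples per group.
    cases = ["1/1"] * 6 + ["0/1"] * 8 + ["0/0"] * 1
    controls = ["1/1"] * 1 + ["0/1"] * 8 + ["0/0"] * 6
    chi2, p = allelic_chi2(*count_alleles(cases), *count_alleles(controls))
    print(f"chi2={chi2:.3f} p={p:.4f}")
```

Two practical notes. `--file` expects PED/MAP input; PLINK 1.9 can read a merged VCF directly with `--vcf`. And with only 15 samples per group the chi-square approximation is shaky for rare alleles, so an exact test (PLINK has `--fisher`) and a multiple-testing correction across SNPs are worth considering.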


Thank you! This seems like a nice approach, and it is what I was looking for. Would the mydata input be the merge of 15samplescase.vcf and 15samplescontrol.vcf? And should those merged VCFs contain only the variants common to all 15 samples in each group?

Thank you
