Hi everyone,
I am evaluating queries on variants and genotypes, which requires importing the test VCF files into an SQL database. I am using MySQL and phpMyAdmin for this, but I am just a beginner.
My questions are:
If I have a .vcf file, how can I convert it to .csv? Can this be done with Python or with vcftools, and how? (I think CSV is the most convenient format for importing into MySQL later.)
How can I get the list of individuals and all the variants that belong to each of them?
I know that vcftools can filter the variants of a specific individual/sample, but what if I want the whole sample list from 1000 Genomes, so that I can create one table assigning individuals to their variants and another table holding the metadata for each variant? (I think this information matches what is in a VCF file.)
Your help would be much appreciated.
Hi,
What should your MySQL table look like?
Since the columns in a VCF file are tab-separated, you already have a format that you can import via phpMyAdmin. But if you want to split the individual fields of the INFO column, more work is needed.
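If a comma-separated file is still preferred, the conversion can be sketched in plain Python. This is only a minimal sketch under assumptions: the function name `vcf_to_csv` and the two-sample example data are made up for illustration, the `##` meta-information lines are simply dropped, and the leading `#` is removed from the `#CHROM` header so it can serve as the column-name row.

```python
import csv
import io

def vcf_to_csv(vcf_lines, out):
    """Write the VCF column header and data rows to `out` as CSV (hypothetical helper)."""
    writer = csv.writer(out, lineterminator="\n")
    for line in vcf_lines:
        line = line.rstrip("\r\n")
        if not line or line.startswith("##"):
            continue  # skip blank lines and '##' meta-information lines
        if line.startswith("#CHROM"):
            line = line[1:]  # drop the leading '#' so the first column is named CHROM
        # csv.writer quotes any field that itself contains a comma (e.g. ALT = "G,T")
        writer.writerow(line.split("\t"))

# Made-up example data, not taken from a real file
example_vcf = [
    "##fileformat=VCFv4.2\n",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tS1\tS2\n",
    "1\t100\trs1\tA\tG\t50\tPASS\tAC=1;AF=0.25;DB\tGT\t0|1\t0|0\n",
]

buf = io.StringIO()
vcf_to_csv(example_vcf, buf)
csv_text = buf.getvalue()
print(csv_text)
```

For a real file, the same function could be called with an open file handle for reading and another one for writing (opened with `newline=""`, as the `csv` module recommends).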
fin swimmer
Thank you for asking. I want to make a table that combines, for instance, individual ID and variant ID. But now I am confused: as far as I understand, a VCF file assigns many variants to each individual, so these two columns are not 1-to-1, and I am not sure how to arrange the structure.
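A relationship that is not 1-to-1 is usually stored in a "long" table with one row per (individual, variant) pair. The sketch below shows one way such rows could be produced from a VCF; it is an illustration under assumptions, not a definitive method. The function name `genotype_rows`, the `chrom:pos:ref:alt` variant key, and the example data are all made up, `GT` is assumed to be present in the FORMAT column, and only genotypes that carry at least one non-reference allele are kept.

```python
def genotype_rows(vcf_lines):
    """Yield (sample, variant_key, genotype) for every non-reference call (hypothetical helper)."""
    samples = []
    for line in vcf_lines:
        line = line.rstrip("\r\n")
        if not line or line.startswith("##"):
            continue  # skip blank lines and meta-information lines
        fields = line.split("\t")
        if line.startswith("#CHROM"):
            samples = fields[9:]  # sample names start at the 10th column
            continue
        chrom, pos, _var_id, ref, alt = fields[:5]
        variant_key = f"{chrom}:{pos}:{ref}:{alt}"  # assumed key; the ID column is often '.'
        gt_index = fields[8].split(":").index("GT")  # assumes GT is listed in FORMAT
        for sample, call in zip(samples, fields[9:]):
            gt = call.split(":")[gt_index]
            alleles = gt.replace("|", "/").split("/")
            # keep the pair only if the sample carries a non-reference allele
            if any(a not in ("0", ".") for a in alleles):
                yield (sample, variant_key, gt)

# Made-up example data with two samples and two variants
example_vcf = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tS1\tS2",
    "1\t100\trs1\tA\tG\t50\tPASS\tAC=1\tGT\t0|1\t0|0",
    "1\t200\t.\tC\tT\t.\tPASS\t.\tGT:DP\t1|1:12\t0|1:8",
]

pairs = list(genotype_rows(example_vcf))
for row in pairs:
    print(row)
```

Each yielded tuple would become one row of the individual-to-variant table, while the fixed columns (CHROM, POS, ID, REF, ALT, QUAL, FILTER, INFO) would go into a separate variant table with one row per variant.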
I have now imported some data via phpMyAdmin, and I think that in order to run more kinds of queries I also need to split the fields in the INFO column.
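Splitting the INFO column could look roughly like the sketch below. INFO is a semicolon-separated list of `key=value` entries plus bare flags without a value. The function name `parse_info` is made up for illustration, values are kept as strings (multi-valued entries such as `AC=1,2` are not split further), and flags are stored as `True`; these are choices of this sketch, not requirements of the format.

```python
def parse_info(info):
    """Split a VCF INFO string into a dict (hypothetical helper)."""
    if info in ("", "."):
        return {}  # '.' marks an empty INFO column
    result = {}
    for entry in info.split(";"):
        if not entry:
            continue  # tolerate a stray trailing ';'
        key, sep, value = entry.partition("=")
        # entries without '=' are flags (e.g. DB); store them as True
        result[key] = value if sep else True
    return result

info_fields = parse_info("AC=1;AF=0.25;DB")
print(info_fields)
```

Each key could then become its own column in the variant table, or the key/value pairs could be stored in a separate table keyed by variant.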