Question: Extract SNPs and SNP IDs from a VCF file

Hi all,

I have a VCF file. I am trying to extract the subject IDs, SNPs, and SNP IDs into a table that conforms to the following format:

i) The first row should contain the IDs of subjects.
ii) The first column should contain the IDs of SNPs.
iii) Entry (i,j) should indicate the value of subject j in SNP i. The entry in the first row and first column is the string "SNP".

Please help me do this using vcftools or bcftools. Thanks in advance.

2.2 years ago by aadhirareddy1323 • 20 • updated 13 months ago by zx8754 7.5k

You are not going to be able to get that format with just vcftools or bcftools. It's also not clear what you mean by 'value of subject j in SNP i'. Do you mean the genotype?

You can probably do most of this with a combination of Excel and GATK's VariantsToTable tool.

2.2 years ago by jared.andrews07 ♦ 2.4k

Please post example input and expected output.

2.2 years ago by cpad0112 11k

i) The first row should contain the IDs of subjects. ii) The first column should contain the IDs of SNPs. iii) Entry (i,j) should indicate the value of subject j in SNP i.

You are actually describing the format of a VCF file here.

2.2 years ago by WouterDeCoster 39k

I have done it myself. Thank you for the reply :)

2.2 years ago by aadhirareddy1323 • 20
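
The thread does not record the solution the original poster arrived at. For readers who want a starting point, the table described in the question can be sketched in plain Python. This is only an illustration under stated assumptions: the "value" of a subject at a SNP is taken to be the genotype (GT) field, as one reply guessed; the input is an uncompressed, multi-sample VCF with a FORMAT column; and SNPs with no ID (".") are labeled CHROM:POS. The function name `vcf_to_matrix` is made up for this example and is not part of any tool mentioned above.

```python
import io


def vcf_to_matrix(lines):
    """Build rows: a header row ("SNP" + sample IDs), then one row per variant."""
    rows = []
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("##"):
            continue  # skip blank lines and meta-information lines
        fields = line.split("\t")
        if line.startswith("#CHROM"):
            # Columns 10 onward of the header line hold the sample (subject) IDs.
            rows.append(["SNP"] + fields[9:])
            continue
        snp_id = fields[2]
        if snp_id == ".":
            # Assumption: fall back to CHROM:POS when the ID column is empty.
            snp_id = f"{fields[0]}:{fields[1]}"
        fmt_keys = fields[8].split(":")
        gt_index = fmt_keys.index("GT") if "GT" in fmt_keys else None
        # Assumption: report "." for every sample when the record has no GT key.
        genotypes = [
            sample.split(":")[gt_index] if gt_index is not None else "."
            for sample in fields[9:]
        ]
        rows.append([snp_id] + genotypes)
    return rows


# Tiny in-memory VCF used only for illustration (not real data).
example = io.StringIO(
    "##fileformat=VCFv4.2\n"
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tS1\tS2\n"
    "1\t100\trs1\tA\tG\t.\tPASS\t.\tGT:DP\t0/1:12\t1/1:8\n"
    "1\t200\t.\tC\tT\t.\tPASS\t.\tGT\t0/0\t0/1\n"
)
matrix = vcf_to_matrix(example)
for row in matrix:
    print("\t".join(row))
```

For a real file, passing an open file handle (for example `open("input.vcf")`, where the file name is a placeholder) in place of `example` should work the same way. If numeric allele counts are wanted instead of strings such as `0/1`, the genotype strings would need a further conversion step, which this sketch leaves out.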
